From: Juri Linkov
To: Ted Zlatanov
Cc: emacs-devel@gnu.org
Newsgroups: gmane.emacs.devel
Subject: Re: composed characters question and suggestions for quail-cyrillic-*
Date: Tue, 08 Jul 2008 00:42:09 +0300
Organization: JURTA
Message-ID: <87fxql8j7y.fsf@jurta.org>
References: <86lk19mmua.fsf@lifelogs.com> <485298A4.30000@gnu.org> <867ict8awn.fsf@lifelogs.com> <87zlouswvk.fsf@jurta.org> <86skulfo7j.fsf@lifelogs.com>
In-Reply-To: <86skulfo7j.fsf@lifelogs.com> (Ted Zlatanov's message of "Mon, 07 Jul 2008 15:12:32 -0500")

> JL> 1. It uses the acute accent to put the grave accent above letters,
> JL> e.g. ("'a" ?à) ("'o" ?ò). A correct way to implement this is to use the
> JL> acute accent to put the acute accent above letters, and to use the grave
> JL> accent to put the grave accent above letters, as all Latin input methods
> JL> do, e.g. ("'a" ?á) ("'o" ?ó) ("`a" ?à) ("`o" ?ò).
>
> You are right.  But please note that AFAIK in Cyrillic it's rare to find
> acute accents, so the idea was "accent the next letter" and the ' key is
> much more convenient on modern keyboards.
> For Cyrillic in particular,
> it may make sense to use ' as the accent prefix or accept it in addition
> to `.  If you still think only ` should be used, I'll commit a patch
> immediately.

Instead of the grave accent `, most Cyrillic languages (including
Bulgarian, Russian, Ukrainian) use the acute accent ' to mark the
stressed vowel.  Please see http://en.wikipedia.org/wiki/Acute_accent#Stress
for more information.

> JL> 2. It uses accented Latin letters à, ò that is inappropriate for
> JL> Cyrillic texts.  The only valid way (as I understand according to
> JL> Unicode specifications) is to use combining characters.
>
> I think I mentioned this in an earlier post.  Combining characters look
> inconsistent and sometimes take up two lines of text in Emacs, so I
> thought it would be acceptable to use the accented Latin letters.  If
> not, I'm OK with replacing them with the combining versions.  Please
> note I'm not an expert on this topic, so I greatly appreciate your
> recommendations.

If combining characters take two lines, then it is a bug.  I remember
that rendering of combining characters was correct before the Unicode
merge.  If it was possible to do right before the merge, maybe it will
be possible to fix this in current code using the same logic?

> JL> 3. It turns "'" into a prefix key, but it is used to input "ь" according
> JL> to the rule ("'" ?ь).
>
> Would it be possible to move ь under the ' prefix?  As I mentioned the '
> key is very convenient and ь is not a frequently-needed letter.  It
> actually works fine for me as it is (unless I need to type something
> like ьо, which is rare), but I see the problem.

In Bulgarian it is rare, but in Russian and Ukrainian it is a very
frequently used letter ;-)

> JL> 4. «»“„‘‚§№ is too limited set of necessary characters and this set is
> JL> not specific to `cyrillic-translit'.
> JL> Different styles of quotation
> JL> marks are required by typographic rules in other several languages and
> JL> scripts besides Cyrillic, and these rules also require using other
> JL> symbols like dashes of different lengths, nbsp, 1/2, 1/4, subscripts,
> JL> copyright, currency signs, and many more.
>
> In the specific cases I know (I only write in Bulgarian frequently), the
> characters I added are most needed.  If you or others want to add more
> characters, go ahead or tell me what needs to be added.

Thanks, the characters you added are much needed.  Others that should
be added are at least ”’–—•…

> JL> So instead of copying the same rules to all input method a better
> JL> way is to create a separate common input method with all these
> JL> special symbols and to share it with language specific input
> JL> methods.
>
> My suggestion was essentially to build a prefix tree for Slavic
> languages, since they share enough typographic rules, and to insert it
> into every specific input method.  Using a secondary input method works
> better so I hope it can happen (if Kenichi Handa's patch is OK).

And in another message you wrote:

> If this can go into the trunk, I'll be glad to use it (my changes will
> then be unnecessary).  The only caution is that universal sequences are
> not always intuitive; a good example is that I put "/ab" for paragraph
> because that makes sense in Bulgarian ("абзац" means paragraph,
> pronounced "abzatz").  So it would be nice to have a universal input
> method plus custom rules at the intermediate level (e.g. cyrillic-*).

It might be funny, but in Russian § is called the "paragraph sign", so
your mnemonic doesn't work here.  And "абзац" is used for a different
character, namely the pilcrow (¶).  Please compare:

http://ru.wikipedia.org/wiki/%C2%B6
http://ru.wikipedia.org/wiki/%D0%97%D0%BD%D0%B0%D0%BA_%D0%BF%D0%B0%D1%80%D0%B0%D0%B3%D1%80%D0%B0%D1%84%D0%B0

-- 
Juri Linkov
http://www.jurta.org/emacs/
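
P.S. To make points 1 and 2 concrete, here is a rough sketch of what
rules with combining characters might look like.  It is only an
illustration, not the actual `cyrillic-translit' definition: the package
name `cyrillic-stress-sketch' and its docstring are made up, and only
two vowels are shown.  Each translation is the Cyrillic letter followed
by U+0301 (COMBINING ACUTE ACCENT) rather than a precomposed Latin
letter.

```elisp
;; Hypothetical sketch only; not the rules shipped with Emacs.
(require 'quail)

(quail-define-package
 "cyrillic-stress-sketch"   ; hypothetical input method name
 "Cyrillic"                 ; language
 "Ж"                        ; title shown in the mode line
 nil                        ; no guidance
 "Sketch: ' followed by a vowel inserts the Cyrillic vowel plus U+0301.")

(quail-define-rules
 ;; A vector holds one multi-character translation; a bare string would
 ;; be read as a list of single-character candidates.
 ("'a" ["а\u0301"])   ; CYRILLIC SMALL LETTER A + COMBINING ACUTE ACCENT
 ("'o" ["о\u0301"]))  ; CYRILLIC SMALL LETTER O + COMBINING ACUTE ACCENT
```

Whether the result is displayed on one line still depends on the
rendering problem discussed under point 2.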