* composed characters question and suggestions for quail-cyrillic-*
@ 2008-06-13 14:27 Ted Zlatanov
  2008-06-13 15:11 ` Eli Zaretskii
  2008-06-13 15:56 ` Jason Rumney
  0 siblings, 2 replies; 77+ messages in thread
From: Ted Zlatanov @ 2008-06-13 14:27 UTC
To: emacs-devel

Accented characters are infrequently used in Cyrillic writing, but I
think the Quail input methods should support them.  I am puzzled by the
look of such characters:

а̀у̀

        character: а (1072, #o2060, #x430)
preferred charset: unicode (Unicode (ISO10646))
       code point: 0x0430
           syntax: w    which means: word
         category: Y:Cyrillic characters of 2-byte character sets
                   c:Chinese h:Korean j:Japanese y:Cyrillic
      buffer code: #xD0 #xB0
        file code: #xD0 #xB0 (encoded by coding system utf-8-emacs)
          display: composed to form "а̀" (see below)

Composed with the following character(s) "̀" by the rule:
        (?а (tc . bc) ?̀)
The component character(s) are displayed by these fonts (glyph codes):
 а: -misc-fixed-medium-r-normal--20-200-75-75-c-100-iso10646-1 (#x430)
 ̀: -misc-fixed-medium-r-normal--20-200-75-75-c-100-iso10646-1 (#x300)
See the variable `reference-point-alist' for the meaning of the rule.

Character code properties are not shown: customize what to show

There are text properties here:
  auto-composed        t
  composition          [Show]
  fontified            t

It really looks unpleasant in the font recorded here, taking up two rows
so the accent can be displayed alone on top of the letter, so I'm
curious if Emacs will pick the right appearance if the font has an
accented version of the character.  I don't have such a font.  I don't
know all the nuances here, sorry if this is an obvious question.

I also think Quail Cyrillic input methods should support the „"
characters, which are not the same as the Latin quote marks AFAIK, and
are important in Cyrillic writing (at least in Bulgarian, to my
knowledge).

I'll be glad to implement the Quail changes if they are OK with
everyone.  I didn't go ahead and implement them because I'm not familiar
with any existing conventions; maybe there's a reason why they have not
been added yet.  I'd like to know about the font look but it's not an
impediment to the Quail changes.

Thanks
Ted
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Eli Zaretskii @ 2008-06-13 15:11 UTC
To: Ted Zlatanov; +Cc: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Fri, 13 Jun 2008 09:27:41 -0500
>
> I also think Quail Cyrillic input methods should support the „"
> characters

100% agreement.

> I'll be glad to implement the Quail changes if they are OK with
> everyone.  I didn't go ahead and implement them because I'm not familiar
> with any existing conventions; maybe there's a reason why they have not
> been added yet.

Please go ahead.  In my experience, most omissions in the input
methods are not intentional, they are just because no one cared enough
to include them.
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Jason Rumney @ 2008-06-13 15:56 UTC
To: Ted Zlatanov; +Cc: emacs-devel

Ted Zlatanov wrote:
> It really looks unpleasant in the font recorded here, taking up two rows
> so the accent can be displayed alone on top of the letter, so I'm
> curious if Emacs will pick the right appearance if the font has an
> accented version of the character.

No.  There are no unicode codepoints for those accented characters
AFAICS so we can't do the substitution within Emacs.  The only way to
combine them would be if the font contained GSUB (glyph substitution)
tables with entries for those sequences of characters, and Emacs had
been told to use font-shape-text for displaying Cyrillic (currently it
is only used for Indic and some South East Asian scripts, but should
be for Arabic as well).  The library that Emacs uses for shaping
(libotf/m17n-flt or on Windows uniscribe) might need some knowledge of
those accented characters as well, I'm not entirely sure of how glyph
shaping all fits together.

Different fonts might have different metrics for the accent characters
though, which could improve the appearance.
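Jason's remark about missing code points can be checked mechanically.  The
following is a minimal sketch, not something posted in the thread: it assumes
an Emacs recent enough to ship ucs-normalize.el (it was not available when
this thread was written), and the helper name `my-nfc-precomposed-p' is
hypothetical.  It asks whether a base letter plus a combining mark normalizes
(NFC) to a single precomposed character.

```elisp
;; Sketch under stated assumptions: needs ucs-normalize.el from a newer Emacs.
(require 'ucs-normalize)

(defun my-nfc-precomposed-p (base combining)
  "Return non-nil if BASE followed by COMBINING has a one-character NFC form.
`my-nfc-precomposed-p' is a hypothetical helper used only for illustration."
  (= 1 (length (ucs-normalize-NFC-string (string base combining)))))

;; U+0438 CYRILLIC SMALL LETTER I + U+0300 COMBINING GRAVE ACCENT
;; composes to U+045D, so this returns non-nil.
(my-nfc-precomposed-p ?и #x300)

;; U+0430 CYRILLIC SMALL LETTER A + U+0300 has no precomposed form,
;; so the pair stays two characters and this returns nil.
(my-nfc-precomposed-p ?а #x300)
```

A nil result means the pair can only be rendered through composition or font
shaping, which is the situation described above for а̀ and у̀.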
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Ted Zlatanov @ 2008-06-13 18:09 UTC
To: emacs-devel

On Fri, 13 Jun 2008 16:56:20 +0100 Jason Rumney <jasonr@gnu.org> wrote:

JR> Ted Zlatanov wrote:
>> It really looks unpleasant in the font recorded here, taking up two rows
>> so the accent can be displayed alone on top of the letter, so I'm
>> curious if Emacs will pick the right appearance if the font has an
>> accented version of the character.

JR> No.  There are no unicode codepoints for those accented characters
JR> AFAICS so we can't do the substitution within Emacs.  The only way to
JR> combine them would be if the font contained GSUB (glyph substitution)
JR> tables with entries for those sequences of characters, and Emacs had
JR> been told to use font-shape-text for displaying Cyrillic (currently it
JR> is only used for Indic and some South East Asian scripts, but should
JR> be for Arabic as well).  The library that Emacs uses for shaping
JR> (libotf/m17n-flt or on Windows uniscribe) might need some knowledge of
JR> those accented characters as well, I'm not entirely sure of how glyph
JR> shaping all fits together.

I understand better now.  I'll work without composition.

How do I generate composed characters for Quail mappings?  I couldn't
figure it out with compose-chars.  I found accented a and o in the
Latin-1 map, and accented u (у) is not available by itself.  I'd like to
get the accented у character for completeness, but Latin-* only has an
accented y with the accent going in the wrong direction (ý).  It's
probably more proper to do the Cyrillic аоu(у) as composed characters,
but I don't know.  Accented e and i were in the Cyrillic map and look
fine.

These accented characters are rare enough that it doesn't matter; in
common usage only ѝ is found (in Bulgarian, at least).  The others are
used only if the writer wants to emphasize which syllable to stress upon
pronunciation, as in a dictionary, AFAIK.

On Fri, 13 Jun 2008 18:11:19 +0300 Eli Zaretskii <eliz@gnu.org> wrote:

>> I'll be glad to implement the Quail changes if they are OK with
>> everyone.  I didn't go ahead and implement them because I'm not familiar
>> with any existing conventions; maybe there's a reason why they have not
>> been added yet.

EZ> Please go ahead.  In my experience, most omissions in the input
EZ> methods are not intentional, they are just because no one cared enough
EZ> to include them.

Thanks.  I did:

        * quail/cyrillic.el: Add quotation marks, paragraph symbol, angled
        brackets, number symbol, and accented aeio to cyrillic-translit.

The quotation marks are both ‚‘ and „“ (plus «» which are also common).
Let me know if I've missed anything.

I can add these to the other cyrillic-* methods if people think it's
useful.

Thanks
Ted
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Eli Zaretskii @ 2008-06-14 9:44 UTC
To: Ted Zlatanov; +Cc: emacs-devel

> From: Ted Zlatanov <tzz@lifelogs.com>
> Date: Fri, 13 Jun 2008 13:09:12 -0500
>
>         * quail/cyrillic.el: Add quotation marks, paragraph symbol, angled
>         brackets, number symbol, and accented aeio to cyrillic-translit.
>
> The quotation marks are both ‚‘ and „“ (plus «» which are also common).
> Let me know if I've missed anything.
>
> I can add these to the other cyrillic-* methods if people think it's
> useful.

Thanks.

cyrillic-translit is what I use all the time (because it makes my life
so easy on any keyboard), but I think all cyrillic-* input methods
should in general support the same repertoire of characters.
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Stephen J. Turnbull @ 2008-06-14 18:55 UTC
To: Eli Zaretskii; +Cc: Ted Zlatanov, emacs-devel

Eli Zaretskii writes:

 > cyrillic-translit is what I use all the time (because it makes my life
 > so easy on any keyboard), but I think all cyrillic-* input methods
 > should in general support the same repertoire of characters.

This is true for the input methods of each language family.

Somebody could write a script to do repertoire checks.  Sorry, it
isn't going to be me today.
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Eli Zaretskii @ 2008-06-14 19:45 UTC
To: Stephen J. Turnbull; +Cc: tzz, emacs-devel

> From: "Stephen J. Turnbull" <stephen@xemacs.org>
> Cc: Ted Zlatanov <tzz@lifelogs.com>,
>     emacs-devel@gnu.org
> Date: Sun, 15 Jun 2008 03:55:49 +0900
>
> Eli Zaretskii writes:
>
>  > cyrillic-translit is what I use all the time (because it makes my life
>  > so easy on any keyboard), but I think all cyrillic-* input methods
>  > should in general support the same repertoire of characters.
>
> This is true for the input methods of each language family.

Yes, that's what I meant to say (but did it in a way that could be
interpreted otherwise).
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Ted Zlatanov @ 2008-06-18 20:17 UTC
To: emacs-devel

On Sat, 14 Jun 2008 22:45:04 +0300 Eli Zaretskii <eliz@gnu.org> wrote:

>> From: "Stephen J. Turnbull" <stephen@xemacs.org>
>> Cc: Ted Zlatanov <tzz@lifelogs.com>,
>>     emacs-devel@gnu.org
>> Date: Sun, 15 Jun 2008 03:55:49 +0900
>>
>> Eli Zaretskii writes:
>>
>> > cyrillic-translit is what I use all the time (because it makes my
>> > life so easy on any keyboard), but I think all cyrillic-* input
>> > methods should in general support the same repertoire of
>> > characters.
>>
>> This is true for the input methods of each language family.

EZ> Yes, that's what I meant to say (but did it in a way that could be
EZ> interpreted otherwise).

I'll set up quail-define-rules to run with the append argument for each
method.  Something like

(dolist (method '(cyrillic-translit cyrillic-yawerty ...))
  (... switch to Quail method ...)
  (quail-define-rules
   (append . t)
   (... characters ...)))

Does anyone know how to list all the Quail methods matching a regex, and
how to switch to a Quail package?  All the examples use
quail-define-package to switch to the package and I hope I don't have to
dig into the macros to figure out how to switch to an already-defined
package.

I'll abstract the common elements so they are not entered multiple times
and find which elements are not covered by which methods once I get the
above working.

Thanks
Ted
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Kenichi Handa @ 2008-06-19 11:45 UTC
To: Ted Zlatanov; +Cc: emacs-devel

In article <86lk12zee7.fsf@lifelogs.com>, Ted Zlatanov <tzz@lifelogs.com> writes:

> I'll set up quail-define-rules to run with the append argument for each
> method.  Something like

> (dolist (method '(cyrillic-translit cyrillic-yawerty ...))
>   (... switch to Quail method ...)
>   (quail-define-rules
>    (append . t)
>    (... characters ...)))

> Does anyone know how to list all the Quail methods matching a regex,

You can check elements of input-method-alist.

> and
> how to switch to a Quail package?
> All the examples use
> quail-define-package to switch to the package and I hope I don't have to
> dig into the macros to figure out how to switch to an already-defined
> package.

You can use quail-defrule.
emacs/leim/leim-ext.el adds extra rules to the existing input method by:

(eval-after-load "quail/..."
  '(quail-defrule "..." ...))

---
Kenichi Handa
handa@ni.aist.go.jp
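The two answers above (scan `input-method-alist', then add rules with
`quail-defrule' after the defining file is loaded) can be sketched as
follows.  This is an illustration rather than code from the thread: the
helper name `my-input-methods-matching' is hypothetical, and the load name
"quail/cyrillic" and the method name "cyrillic-translit" are assumptions
based on the file leim/quail/cyrillic.el mentioned elsewhere in this thread.

```elisp
;; Hypothetical helper: list registered input methods whose name matches
;; REGEXP.  Each element of `input-method-alist' starts with the method
;; name as a string.  `string-match-p' requires Emacs 23 or later.
(defun my-input-methods-matching (regexp)
  "Return the names of registered input methods that match REGEXP."
  (let (names)
    (dolist (entry input-method-alist (nreverse names))
      (when (string-match-p regexp (car entry))
        (push (car entry) names)))))

;; Add one rule to an existing package once its file has been loaded.
;; Assumption: "quail/cyrillic" is how leim/quail/cyrillic.el is loaded,
;; and it defines the "cyrillic-translit" package.
(eval-after-load "quail/cyrillic"
  '(quail-defrule ",," ?„ "cyrillic-translit"))
```

Passing the package name as the third argument of `quail-defrule' avoids
having to make that package current first; the rule is only installed when
the package is actually loaded.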
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Ted Zlatanov @ 2008-07-02 20:25 UTC
To: emacs-devel

On Thu, 19 Jun 2008 20:45:20 +0900 Kenichi Handa <handa@m17n.org> wrote:

KH> In article <86lk12zee7.fsf@lifelogs.com>, Ted Zlatanov <tzz@lifelogs.com> writes:
>> and how to switch to a Quail package?  All the examples use
>> quail-define-package to switch to the package and I hope I don't have
>> to dig into the macros to figure out how to switch to an
>> already-defined package.

KH> You can use quail-defrule.
KH> emacs/leim/leim-ext.el adds extra rules to the existing input method by:

KH> (eval-after-load "quail/..."
KH>   '(quail-defrule "..." ...))

Thank you.  I have this:

(dolist (method
         ;; there must be a better way to do this grep :)
         (remove nil
                 (mapcar
                  (lambda(m)
                    (let ((name (car-safe m)))
                      (when (string-match "cyrillic" name)
                        name)))
                  input-method-alist)))
  (quail-defrule ",," ?„ method)
  (quail-defrule "\"\"" ?“ method))

but it is not working, because the method name is not the right thing to
specify for quail-defrule.  I hope I don't have to use a macro to do

(eval-after-load "quail/..."
  '(quail-defrule "A" ?B nil t))

for each such method, as leim-ext.el does.  I hope also I don't have to
modify input-method-alist directly, but it seems right now that is the
best option.

Thanks
Ted
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Kenichi Handa @ 2008-07-03 2:29 UTC
To: Ted Zlatanov; +Cc: emacs-devel

In article <86r6acqbif.fsf@lifelogs.com>, Ted Zlatanov <tzz@lifelogs.com> writes:

KH> You can use quail-defrule.
KH> emacs/leim/leim-ext.el adds extra rules to the existing input method by:

KH> (eval-after-load "quail/..."
KH>   '(quail-defrule "..." ...))

> Thank you.  I have this:

> (dolist (method
>          ;; there must be a better way to do this grep :)
>          (remove nil
>                  (mapcar
>                   (lambda(m)
>                     (let ((name (car-safe m)))
>                       (when (string-match "cyrillic" name)
>                         name)))
>                   input-method-alist)))
>   (quail-defrule ",," ?„ method)
>   (quail-defrule "\"\"" ?“ method))

> but it is not working, because the method name is not the right thing to
> specify for quail-defrule.

NAME surely should work as far as that input method is
already loaded.

> I hope I don't have to use a macro to do

> (eval-after-load "quail/..."
>   '(quail-defrule "A" ?B nil t))

> for each such method, as leim-ext.el does.

Why?

> I hope also I don't have to modify input-method-alist
> directly, but it seems right now that is the best option.

In what way, are you going to modify input-method-alist?

---
Kenichi Handa
handa@ni.aist.go.jp
* adding consistent extra symbols to input methods (cyrillic-*, croatian-*, slov*, czech-* etc.) input methods
From: Ted Zlatanov @ 2008-07-03 19:53 UTC
To: emacs-devel

On Thu, 03 Jul 2008 11:29:53 +0900 Kenichi Handa <handa@m17n.org> wrote:

KH> In article <86r6acqbif.fsf@lifelogs.com>, Ted Zlatanov <tzz@lifelogs.com> writes:
KH> You can use quail-defrule.
KH> emacs/leim/leim-ext.el adds extra rules to the existing input method by:

KH> (eval-after-load "quail/..."
KH>   '(quail-defrule "..." ...))

>> Thank you.  I have this:
...
>> but it is not working, because the method name is not the right thing to
>> specify for quail-defrule.

KH> NAME surely should work as far as that input method is
KH> already loaded.

Right, I'm trying to do the above for input methods that are in
input-method-alist but not necessarily loaded.  So this works:

(dolist (method
         (remove nil
                 (mapcar
                  (lambda(m)
                    (let ((name (car-safe m)))
                      (when (and (string-match "cyrillic" name)
                                 (quail-package name))
                        name)))
                  input-method-alist)))
  (message "Defining rules for method %s" method)
  (quail-defrule ",," ?„ method))

The question is, should I load those input methods explicitly, or just
assume that it's enough to handle what's in leim/quail/cyrillic.el?

Specifically, cyrillic-jis-russian wouldn't be handled this way, and it
can benefit from the extended mappings I am creating (proper double
quotes „ “, single quotes ‚‘ and the № § « » symbols).  If I should
special-case it in leim-ext.el that's fine, I'm just trying to approach
it programmatically.

I think the croatian-*, slovak-*, czech-* and slovenian-* methods
(broadly speaking, the Slavic languages) can also use the extra symbols
(their names don't match "cyrillic" in the above loop so I'd have to
add them to the regex).  If any speakers of those languages or other
languages from the region have an opinion, please let us know.  I
changed the subject accordingly.  AFAIK the symbols I mentioned above
are widely used in the region, but I only know Russian and Bulgarian use
them for sure.

Thanks
Ted
* Re: adding consistent extra symbols to input methods (cyrillic-*, croatian-*, slov*, czech-* etc.) input methods
From: Kenichi Handa @ 2008-07-05 12:54 UTC
To: Ted Zlatanov; +Cc: emacs-devel

In article <864p76lp60.fsf_-_@lifelogs.com>, Ted Zlatanov <tzz@lifelogs.com> writes:

> Right, I'm trying to do the above for input methods that are in
> input-method-alist but not necessarily loaded.  So this works:

> (dolist (method
>          (remove nil
>                  (mapcar
>                   (lambda(m)
>                     (let ((name (car-safe m)))
>                       (when (and (string-match "cyrillic" name)
>                                  (quail-package name))
>                         name)))
>                   input-method-alist)))
>   (message "Defining rules for method %s" method)
>   (quail-defrule ",," ?„ method))

> The question is, should I load those input methods explicitly, or just
> assume that it's enough to handle what's in leim/quail/cyrillic.el?

You can't assume that all "cyrillic*" input methods are in
leim/quail/cyrillic.el.  So the safest (and more efficient)
way is something like this:

(mapc (lambda(m)
        (let ((name (car m)))
          (when (string-match "cyrillic" name)
            (message "Defining rules for method %s" name)
            (activate-input-method name)
            (quail-defrule ",," ?„)
            (inactivate-input-method))))
      input-method-alist)

But, I'm now thinking about introducing this variable to
avoid eval-after-load in leim-ext.el:

;;;###autoload
(defvar quail-additional-rule-alist nil
  "Alist of Quail package names vs. the rules to add after loading the package.
Each element has the form (PACKAGE-NAME RULE ...), where
PACKAGE-NAME is a Quail package name (string representing an input method),
and RULE is a translation rule of the form (KEY TRANSLATION APPEND).
See the documentation of the function `quail-defrule' for the meanings
of KEY, TRANSLATION, and APPEND.")

With this, you can do:

(mapc (lambda(m)
        (let ((name (car m)))
          (when (string-match "cyrillic" name)
            (message "Defining rules for method %s" name)
            (push (list name '(",," ?„)) quail-additional-rule-alist))))
      input-method-alist)

> Specifically, cyrillic-jis-russian wouldn't be handled this way, and it
> can benefit from the extended mappings I am creating (proper double
> quotes „ “, single quotes ‚‘ and the № § « » symbols).  If I should
> special-case it in leim-ext.el that's fine, I'm just trying to approach
> it programmatically.

I think cyrillic-jis-russian is now obsolete because all
Cyrillic characters in JIS are now unified into Unicode.

---
Kenichi Handa
handa@ni.aist.go.jp
* Re: adding consistent extra symbols to input methods (cyrillic-*, croatian-*, slov*, czech-* etc.) input methods
From: Juri Linkov @ 2008-07-06 18:40 UTC
To: Kenichi Handa; +Cc: Ted Zlatanov, emacs-devel

> But, I'm now thinking about introducing this variable to
> avoid eval-after-load in leim-ext.el:
>
> ;;;###autoload
> (defvar quail-additional-rule-alist nil
>   "Alist of Quail package names vs. the rules to add after loading the package.
> Each element has the form (PACKAGE-NAME RULE ...), where
> PACKAGE-NAME is a Quail package name (string representing an input method),
> and RULE is a translation rule of the form (KEY TRANSLATION APPEND).
> See the documentation of the function `quail-defrule' for the meanings
> of KEY, TRANSLATION, and APPEND.")
>
> With this, you can do:
>
> (mapc (lambda(m)
>         (let ((name (car m)))
>           (when (string-match "cyrillic" name)
>             (message "Defining rules for method %s" name)
>             (push (list name '(",," ?„)) quail-additional-rule-alist))))
>       input-method-alist)

Since a list of necessary Unicode characters is too large and it is not
limited to Cyrillic, what do you think about creating a new input method
that could be activated simultaneously with language specific input method?
In case of conflicting rules we could specify the priority by using the
order of active input methods e.g. "unicode-map,cyrillic-translit" vs
"cyrillic-translit,unicode-map" in a new variable like
`quail-additional-input-methods'.

As I see now there are at least two main types of input methods:

1. mapping a keyboard layout to language letters (like `cyrillic-jcuken');
2. multi-key input methods to compose letters of one specific language.

What is missing is a multi-key input method to input arbitrary Unicode
characters in addition to the active language specific input method.

The closest Unicode character input method I see is `sgml-input.el'
but it relies on remembering the names of SGML entities.  A better
method could use mnemonics, and a good candidate is
X11/locale/en_US.UTF-8/Compose for the X Window Input Method.

For example, it uses the following mnemonic rules for quotation marks:

  <<  "«"  LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
  >>  "»"  RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
  '<  "‘"  LEFT SINGLE QUOTATION MARK
  '>  "’"  RIGHT SINGLE QUOTATION MARK
  ',  "‚"  SINGLE LOW-9 QUOTATION MARK
  "<  "“"  LEFT DOUBLE QUOTATION MARK
  ">  "”"  RIGHT DOUBLE QUOTATION MARK
  ",  "„"  DOUBLE LOW-9 QUOTATION MARK

and ~5000 other rules for many Unicode characters.

Do you know a better method to input Unicode characters?

-- 
Juri Linkov
http://www.jurta.org/emacs/
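The mnemonic rules listed above map directly onto Quail rules.  The sketch
below is an illustration only, not an existing Emacs input method: the
package name "my-compose-quotes" and its title are hypothetical, and it
covers just the eight quotation-mark sequences quoted above.  It does not
attempt the "several input methods active at once" idea; it only shows the
rule set as a standalone Quail package.

```elisp
(require 'quail)

;; Hypothetical package; "UTF-8" is used as the language name in the same
;; way the existing rfc1345 Quail package does.
(quail-define-package
 "my-compose-quotes" "UTF-8" "q" t
 "Illustrative input method using X Compose-style mnemonics for quotation marks.")

;; Key sequences and characters copied from the Compose excerpt above.
(quail-define-rules
 ("<<" ?«)    ; LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
 (">>" ?»)    ; RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
 ("'<" ?‘)    ; LEFT SINGLE QUOTATION MARK
 ("'>" ?’)    ; RIGHT SINGLE QUOTATION MARK
 ("'," ?‚)    ; SINGLE LOW-9 QUOTATION MARK
 ("\"<" ?“)   ; LEFT DOUBLE QUOTATION MARK
 ("\">" ?”)   ; RIGHT DOUBLE QUOTATION MARK
 ("\"," ?„))  ; DOUBLE LOW-9 QUOTATION MARK
```

After evaluating this, the method can be selected with
C-u C-\ my-compose-quotes RET.  Because it replaces the current input method
rather than layering on top of it, it does not by itself solve the priority
question raised above.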
* Re: adding consistent extra symbols to input methods (cyrillic-*, croatian-*, slov*, czech-* etc.) input methods
From: Miles Bader @ 2008-07-06 22:54 UTC
To: Juri Linkov; +Cc: Ted Zlatanov, emacs-devel, Kenichi Handa

Juri Linkov <juri@jurta.org> writes:
> and ~5000 other rules for many Unicode characters.
>
> Do you know a better method to input Unicode characters?

rfc1345 isn't too bad.  It also uses mnemonics to a degree (though they
are often kind of weird), and has pretty good coverage.

-Miles

-- 
Christian, n. One who follows the teachings of Christ so long as they are
not inconsistent with a life of sin.
* Re: adding consistent extra symbols to input methods (cyrillic-*, croatian-*, slov*, czech-* etc.) input methods
From: Juri Linkov @ 2008-07-10 0:09 UTC
To: Miles Bader; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 461 bytes --]

>> and ~5000 other rules for many Unicode characters.
>>
>> Do you know a better method to input Unicode characters?
>
> rfc1345 isn't too bad.  It also uses mnemonics to a degree (though they
> are often kind of weird), and has pretty good coverage.

I noticed rfc1345 is already implemented in leim/quail/rfc1345.el.

BTW, this file is detected as binary.  I think it's better to replace
non-printable control characters with equivalent text-only notations:

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Type: text/x-patch, Size: 433 bytes --]

Index: leim/quail/rfc1345.el
===================================================================
RCS file: /sources/emacs/emacs/leim/quail/rfc1345.el,v
retrieving revision 1.11
diff -u -a -r1.11 rfc1345.el
--- leim/quail/rfc1345.el 7 May 2008 03:37:06 -0000 1.11
+++ leim/quail/rfc1345.el 10 Jul 2008 00:08:43 -0000
@@ -37,38 +37,38 @@
 (quail-define-rules
 ;; There doesn't seem to be any point in including ASCII.
-;; ("&NU" ?\
* Re: adding consistent extra symbols to input methods (cyrillic-*, croatian-*, slov*, czech-* etc.) input methods
From: Kenichi Handa @ 2008-07-10 0:37 UTC
To: Juri Linkov; +Cc: emacs-devel, miles

In article <87y74a38hd.fsf@jurta.org>, Juri Linkov <juri@jurta.org> writes:

> I noticed rfc1345 is already implemented in leim/quail/rfc1345.el.

> BTW, this file is detected as binary.

???  Really?  As it has coding tag utf-8, it should be read
as utf-8.  If not, there's a bug, but I can't reproduce it.

---
Kenichi Handa
handa@ni.aist.go.jp
* Re: adding consistent extra symbols to input methods (cyrillic-*, croatian-*, slov*, czech-* etc.) input methods
From: Juri Linkov @ 2008-07-10 0:52 UTC
To: Kenichi Handa; +Cc: emacs-devel, miles

>> I noticed rfc1345 is already implemented in leim/quail/rfc1345.el.
>
>> BTW, this file is detected as binary.
>
> ???  Really?  As it has coding tag utf-8, it should be read
> as utf-8.  If not, there's a bug, but I can't reproduce it.

Sorry, I should have been more explicit: it is detected as binary
not by Emacs, but by external utilities.  For example, I once missed
rfc1345.el when I searched in the quail directory for UTF-8 input methods
using grep because it didn't display the match for rfc1345.el treated as
binary, and also I had to add the `-a' option to the cvs diff command line
to treat it as non-binary when creating a patch.  So control characters
in source files cause too much trouble.

-- 
Juri Linkov
http://www.jurta.org/emacs/
* Re: adding consistent extra symbols to input methods (cyrillic-*, croatian-*, slov*, czech-* etc.) input methods
From: Kenichi Handa @ 2008-07-10 1:44 UTC
To: Juri Linkov; +Cc: miles, emacs-devel

In article <87fxqiy2ja.fsf@jurta.org>, Juri Linkov <juri@jurta.org> writes:

>>> I noticed rfc1345 is already implemented in leim/quail/rfc1345.el.
> >
>>> BTW, this file is detected as binary.
> >
> > ???  Really?  As it has coding tag utf-8, it should be read
> > as utf-8.  If not, there's a bug, but I can't reproduce it.

> Sorry, I should have been more explicit: it is detected as binary
> not by Emacs, but by external utilities.

Ah, I see.

> For example, I once missed
> rfc1345.el when I searched in the quail directory for UTF-8 input methods
> using grep because it didn't display the match for rfc1345.el treated as
> binary, and also I had to add the `-a' option to the cvs diff command line
> to treat it as non-binary when creating a patch.  So control characters
> in source files cause too much trouble.

Agreed.

---
Kenichi Handa
handa@ni.aist.go.jp
* Re: adding consistent extra symbols to input methods (cyrillic-*, croatian-*, slov*, czech-* etc.) input methods
From: Stefan Monnier @ 2008-07-10 1:15 UTC
To: Juri Linkov; +Cc: emacs-devel, Miles Bader

> I think it's better to replace non-printable control characters with
> equivalent text-only notations:

Agreed,

        Stefan
* Re: adding consistent extra symbols to input methods (cyrillic-*, croatian-*, slov*, czech-* etc.) input methods
From: Juri Linkov @ 2008-07-10 0:27 UTC
To: Miles Bader; +Cc: emacs-devel

>> and ~5000 other rules for many Unicode characters.
>>
>> Do you know a better method to input Unicode characters?
>
> rfc1345 isn't too bad.  It also uses mnemonics to a degree (though they
> are often kind of weird), and has pretty good coverage.

I see rfc1345 is criticized and deprecated in favor of a new standard
REPERTOIREMAP from ISO/IEC FCD 14652.  Its symbolic names are quite
widespread already.  Maybe we should implement a new input method
based on its symbolic names too?  I can't find an existing Emacs
implementation.

-- 
Juri Linkov
http://www.jurta.org/emacs/
* Re: adding consistent extra symbols to input methods (cyrillic-*, croatian-*, slov*, czech-* etc.) input methods
From: Miles Bader @ 2008-07-10 1:16 UTC
To: Juri Linkov; +Cc: emacs-devel

Juri Linkov <juri@jurta.org> writes:
>> rfc1345 isn't too bad.  It also uses mnemonics to a degree (though they
>> are often kind of weird), and has pretty good coverage.
>
> I see rfc1345 is criticized and deprecated in favor of a new standard
> REPERTOIREMAP from ISO/IEC FCD 14652.

"Criticized and deprecated" where?  Is there a new rfc?

-Miles

-- 
Barometer, n. An ingenious instrument which indicates what kind of weather
we are having.
* Re: adding consistent extra symbols to input methods (cyrillic-*, croatian-*, slov*, czech-* etc.) input methods 2008-07-10 1:16 ` Miles Bader @ 2008-07-10 18:43 ` Juri Linkov 2008-07-11 2:52 ` Miles Bader 0 siblings, 1 reply; 77+ messages in thread From: Juri Linkov @ 2008-07-10 18:43 UTC (permalink / raw) To: Miles Bader; +Cc: emacs-devel >>> rfc1345 isn't too bad. It also uses mnemonics to a degree (though they >>> are often kind of weird), and has pretty good coverage. >> >> I see rfc1345 is criticized and deprecated in favor of a new standard >> REPERTOIREMAP from ISO/IEC FCD 14652. > > "Criticized and deprecated" where? Is there a new rfc? As I see, there is no new rfc. And rfc1345 is criticized because it was published without review, contains errors, and is inconsistent with the Unicode data. Without these drawbacks it would be a good input method. There is a relevant thread: http://thread.gmane.org/gmane.ietf.general/26127 -- Juri Linkov http://www.jurta.org/emacs/ ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: adding consistent extra symbols to input methods (cyrillic-*, croatian-*, slov*, czech-* etc.) input methods 2008-07-10 18:43 ` Juri Linkov @ 2008-07-11 2:52 ` Miles Bader 0 siblings, 0 replies; 77+ messages in thread From: Miles Bader @ 2008-07-11 2:52 UTC (permalink / raw) To: Juri Linkov; +Cc: emacs-devel Juri Linkov <juri@jurta.org> writes: > As I see, there is no new rfc. And rfc1345 is criticized because it was > published without review, contains errors, and is inconsistent with the > Unicode data. Without these drawbacks it would be a good input method. > There is a relevant thread: > http://thread.gmane.org/gmane.ietf.general/26127 I read through that thread, and it comes to no real conclusion. rfc1345 clearly has flaws, but it actually _does_ have some practical utility. This was pointed out (repeatedly) in that thread, but the objectors seemed unable to suggest a practical alternative (and the most vocal objectors seemed rather clueless, to tell the truth). Unless there is a better replacement in-kind, there seems to be no point in removing the rfc1345 input method. -Miles -- 80% of success is just showing up. --Woody Allen ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: adding consistent extra symbols to input methods (cyrillic-*, croatian-*, slov*, czech-* etc.) input methods 2008-07-06 18:40 ` Juri Linkov 2008-07-06 22:54 ` Miles Bader @ 2008-07-07 1:57 ` Kenichi Handa 2008-07-07 4:39 ` Stefan Monnier 1 sibling, 1 reply; 77+ messages in thread From: Kenichi Handa @ 2008-07-07 1:57 UTC (permalink / raw) To: Juri Linkov; +Cc: tzz, emacs-devel In article <87ej66ubkv.fsf@jurta.org>, Juri Linkov <juri@jurta.org> writes: > Since a list of necessary Unicode characters is too large and it is > not limited to Cyrillic, what do you think about creating a new > input method that could be activated simultaneously with language > specific input method? In case of conflicting rules we could specify > the priority by using the order of active input methods e.g. > "unicode-map,cyrillic-translit" vs "cyrillic-translit,unicode-map" > in a new variable like `quail-additional-input-methods'. I remember that activating multiple input methods was discussed a while ago on this list, but don't remember the conclusion. Unfortunately, I don't have time to implement it at the moment. If someone wants to work on it, I'll gladly help. --- Kenichi Handa handa@ni.aist.go.jp ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: adding consistent extra symbols to input methods (cyrillic-*, croatian-*, slov*, czech-* etc.) input methods 2008-07-07 1:57 ` Kenichi Handa @ 2008-07-07 4:39 ` Stefan Monnier 2008-07-07 5:25 ` Kenichi Handa 0 siblings, 1 reply; 77+ messages in thread From: Stefan Monnier @ 2008-07-07 4:39 UTC (permalink / raw) To: Kenichi Handa; +Cc: Juri Linkov, tzz, emacs-devel > I remember that activating multiple input methods was > discussed a while ago on this list, but don't remember the > conclusion. :-( My memory was telling me that you had implemented and installed it already. Stefan ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: adding consistent extra symbols to input methods (cyrillic-*, croatian-*, slov*, czech-* etc.) input methods 2008-07-07 4:39 ` Stefan Monnier @ 2008-07-07 5:25 ` Kenichi Handa 2008-07-07 19:42 ` Ted Zlatanov 2008-07-07 22:05 ` Juri Linkov 0 siblings, 2 replies; 77+ messages in thread From: Kenichi Handa @ 2008-07-07 5:25 UTC (permalink / raw) To: Stefan Monnier; +Cc: juri, tzz, emacs-devel In article <jwvzlouwbof.fsf-monnier+emacs@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes: > > I remember that activating multiple input methods was > > discussed a while ago on this list, but don't remember the > > conclusion. > :-( My memory was telling me that you had implemented and installed > it already. Ah!!! Now I remembered that I wrote a experimental code in a little bit tricky way. I dig out the attached mail. But, I completely forgot about it because there was no response at that time. --- Kenichi Handa handa@ni.aist.go.jp ------------------------------------------------------------ From: Kenichi Handa <handa@m17n.org> To: rms@gnu.org In-reply-to: <E1IR12U-0000a1-Fv@fencepost.gnu.org> (message from Richard Stallman on Fri, 31 Aug 2007 03:35:34 -0400) MIME-Version: 1.0 (generated by SEMI 1.14.3 - "Ushinoya") Content-Type: text/plain; charset=US-ASCII Date: Tue, 04 Sep 2007 15:48:38 +0900 Cc: blais@furius.ca, emacs-devel@gnu.org Subject: Re: smartquotes.el -- Insertion of unicode quotes in text documents In article <E1IR12U-0000a1-Fv@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes: > What kind of user interface should be provided for > activating and deactivating multiple input methods? > Here's one idea: the user enables and disables one input method, > directly, and the others are enabled or disabled by a Lisp interface. > That Lisp interface can be called by other commands. > Does this solve that one problem? Yes. I implemented it as an add-on code (with a little bit tricky way). 
Once it is found that it works well, I'll merge the code into
mule-cmds.el while cleaning the code.

Please try the attached code with the latest emacs-unicode-2. The
Lisp interfaces are the functions activate-preposition-input-method
and inactivate-preposition-input-method.

---
Kenichi Handa
handa@m17n.org

(defvar local-input-method-list nil
  "List of local preposition input methods.")
(make-variable-buffer-local 'local-input-method-list)

(defvar global-input-method-list nil
  "List of global preposition input methods.")

(defvar normal-input-method nil
  "Currently active normal (i.e. non-preposition) input method.")

(defun multi-input-method-function (event)
  "Input method function used while some preposition input method is active."
  (let ((disable-input-method-hook t)
        unread-command-events)
    (dolist (elt (append global-input-method-list local-input-method-list))
      (let (input-method-history
            default-input-method)
        (activate-input-method elt)
        (setq unread-command-events
              (nreverse (funcall input-method-function event)))
        (setq event (car unread-command-events)
              unread-command-events (cdr unread-command-events))))
    (let (input-method-history)
      (inactivate-input-method))
    (setq current-input-method nil
          current-input-method-title nil)
    (unwind-protect
        (if normal-input-method
            (progn
              (activate-input-method normal-input-method)
              (append (funcall input-method-function event)
                      unread-command-events))
          (cons event unread-command-events))
      (setq input-method-function 'multi-input-method-function))))

;; If non-nil, disable input-method-hook.
(defvar disable-input-method-hook nil)

;; A function for input-method-activate-hook used while some preposition
;; input method is active.
(defun input-method-activate-hook ()
  (unless disable-input-method-hook
    (setq normal-input-method current-input-method)
    (if (or local-input-method-list global-input-method-list)
        (setq input-method-function 'multi-input-method-function))))

;; A function for input-method-inactivate-hook used while some
;; preposition input method is active.
(defun input-method-inactivate-hook ()
  (unless disable-input-method-hook
    (setq normal-input-method nil)
    (if (or local-input-method-list global-input-method-list)
        (setq input-method-function 'multi-input-method-function))))

(defun activate-preposition-input-method (input-method global)
  "Activate a preposition INPUT-METHOD.
INPUT-METHOD is handled before a normal input method.
If the second arg GLOBAL is non-nil, activate it for all buffers.
Otherwise, activate it only for the current buffer."
  (if (and input-method (symbolp input-method))
      (setq input-method (symbol-name input-method)))
  (or (assoc input-method input-method-alist)
      (error "Unknown input method: %s" input-method))
  (unless (assoc-string input-method (if global global-input-method-list
                                       local-input-method-list))
    (or global-input-method-list
        local-input-method-list
        (progn
          (add-hook 'input-method-activate-hook 'input-method-activate-hook)
          (add-hook 'input-method-inactivate-hook
                    'input-method-inactivate-hook)))
    (setq normal-input-method current-input-method)
    (if global
        (progn
          (push input-method global-input-method-list)
          (setq-default input-method-function 'multi-input-method-function))
      (push input-method local-input-method-list))
    (make-local-variable 'input-method-function)
    (setq input-method-function 'multi-input-method-function)))

(defun inactivate-preposition-input-method (input-method global)
  "Inactivate a preposition INPUT-METHOD.
If the second arg GLOBAL is non-nil, inactivate it for all buffers.
Otherwise, inactivate it only for the current buffer."
  (if (and input-method (symbolp input-method))
      (setq input-method (symbol-name input-method)))
  (if global
      (setq global-input-method-list
            (delete input-method global-input-method-list))
    (setq local-input-method-list
          (delete input-method local-input-method-list)))
  (or global-input-method-list
      local-input-method-list
      (progn
        (remove-hook 'input-method-activate-hook 'input-method-activate-hook)
        (remove-hook 'input-method-inactivate-hook
                     'input-method-inactivate-hook)
        (setq input-method-function nil)
        (when normal-input-method
          (inactivate-input-method)
          (activate-input-method normal-input-method)))))

^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: adding consistent extra symbols to input methods (cyrillic-*, croatian-*, slov*, czech-* etc.) input methods 2008-07-07 5:25 ` Kenichi Handa @ 2008-07-07 19:42 ` Ted Zlatanov 2008-07-07 22:05 ` Juri Linkov 1 sibling, 0 replies; 77+ messages in thread From: Ted Zlatanov @ 2008-07-07 19:42 UTC (permalink / raw) To: emacs-devel On Mon, 07 Jul 2008 14:25:02 +0900 Kenichi Handa <handa@m17n.org> wrote: KH> In article <jwvzlouwbof.fsf-monnier+emacs@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes: >> > I remember that activating multiple input methods was >> > discussed a while ago on this list, but don't remember the >> > conclusion. >> :-( My memory was telling me that you had implemented and installed >> it already. KH> Ah!!! Now I remembered that I wrote a experimental code in KH> a little bit tricky way. I dig out the attached mail. But, KH> I completely forgot about it because there was no response KH> at that time. If this can go into the trunk, I'll be glad to use it (my changes will then be unnecessary). The only caution is that universal sequences are not always intuitive; a good example is that I put "/ab" for paragraph because that makes sense in Bulgarian ("абзац" means paragraph, pronounced "abzatz"). So it would be nice to have a universal input method plus custom rules at the intermediate level (e.g. cyrillic-*). Ted ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: adding consistent extra symbols to input methods (cyrillic-*, croatian-*, slov*, czech-* etc.) input methods 2008-07-07 5:25 ` Kenichi Handa 2008-07-07 19:42 ` Ted Zlatanov @ 2008-07-07 22:05 ` Juri Linkov 2008-07-13 5:11 ` Eli Zaretskii 1 sibling, 1 reply; 77+ messages in thread From: Juri Linkov @ 2008-07-07 22:05 UTC (permalink / raw) To: Kenichi Handa; +Cc: tzz, Stefan Monnier, emacs-devel > Ah!!! Now I remembered that I wrote a experimental code in > a little bit tricky way. I dig out the attached mail. But, > I completely forgot about it because there was no response > at that time. Hmm, I somehow missed this thread. I see now your questions: > What kind of user interface should be provided for > activating and deactivating multiple input methods? I think a good interface is to use `completing-read-multiple' in C-\ to read multiple input methods in a way similar to reading multiple faces in `read-face-name'. > What to do for C-h C-\? This could describe all active input methods. > What to show in the modeline? All codes of active input methods separated by some character like a comma. -- Juri Linkov http://www.jurta.org/emacs/ ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: adding consistent extra symbols to input methods (cyrillic-*, croatian-*, slov*, czech-* etc.) input methods 2008-07-07 22:05 ` Juri Linkov @ 2008-07-13 5:11 ` Eli Zaretskii 2008-07-13 5:17 ` Miles Bader 0 siblings, 1 reply; 77+ messages in thread From: Eli Zaretskii @ 2008-07-13 5:11 UTC (permalink / raw) To: Juri Linkov; +Cc: tzz, emacs-devel, monnier, handa > From: Juri Linkov <juri@jurta.org> > Date: Tue, 08 Jul 2008 01:05:11 +0300 > Cc: tzz@lifelogs.com, Stefan Monnier <monnier@iro.umontreal.ca>, > emacs-devel@gnu.org > > > What to show in the modeline? > > All codes of active input methods separated by some character > like a comma. There's no space on the mode line for such long indicators. I think we should rather display some special character there, meaning that several methods are active, and only list all of them in the tooltip that pops when the mouse is hovering above that indicator. ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: adding consistent extra symbols to input methods (cyrillic-*, croatian-*, slov*, czech-* etc.) input methods 2008-07-13 5:11 ` Eli Zaretskii @ 2008-07-13 5:17 ` Miles Bader 2008-07-13 21:27 ` Juri Linkov 0 siblings, 1 reply; 77+ messages in thread From: Miles Bader @ 2008-07-13 5:17 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Juri Linkov, tzz, handa, monnier, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> > What to show in the modeline? >> >> All codes of active input methods separated by some character >> like a comma. > > There's no space on the mode line for such long indicators. I think > we should rather display some special character there, meaning that > several methods are active, and only list all of them in the tooltip > that pops when the mouse is hovering above that indicator. Since currently it shows a single character for the input method, how about just appending a "+" in the case where there are others active as well... This presumes that one is the "main" input method, and others are "subsidiary", but that seems likely to be true in many (most?) cases, and it's certainly more useful than _only_ displaying a "+" or something... -Miles -- Year, n. A period of three hundred and sixty-five disappointments. ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: adding consistent extra symbols to input methods (cyrillic-*, croatian-*, slov*, czech-* etc.) input methods 2008-07-13 5:17 ` Miles Bader @ 2008-07-13 21:27 ` Juri Linkov 2008-07-14 3:18 ` Miles Bader 0 siblings, 1 reply; 77+ messages in thread From: Juri Linkov @ 2008-07-13 21:27 UTC (permalink / raw) To: Miles Bader; +Cc: Eli Zaretskii, handa, tzz, monnier, emacs-devel >> There's no space on the mode line for such long indicators. I think >> we should rather display some special character there, meaning that >> several methods are active, and only list all of them in the tooltip >> that pops when the mouse is hovering above that indicator. > > Since currently it shows a single character for the input method, how > about just appending a "+" in the case where there are others active as > well... > > This presumes that one is the "main" input method, and others are > "subsidiary", but that seems likely to be true in many (most?) cases, > and it's certainly more useful than _only_ displaying a "+" or > something... As `list-input-methods' shows, there is only one input method with a "+" in its name: "ucs" ("U+" in the mode line). But since a "+" is a good indicator for additional input methods, so perhaps it is not a big problem when it will be displayed as "U++". PS: I could submit a patch that provides UI for using multiple input methods after Handa-san will install multi-input-method-function and other related code, possibly renaming to better names. I propose the following changes in names: activate-preposition-input-method -> activate-additional-input-method inactivate-preposition-input-method -> inactivate-additional-input-method local-input-method-list -> current-additional-input-method-list global-input-method-list -> remove this variable since there is no global variant of `current-input-method'. -- Juri Linkov http://www.jurta.org/emacs/ ^ permalink raw reply [flat|nested] 77+ messages in thread
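[Editorial sketch] The two mode-line suggestions above (append a "+" to the main input method's title, and list the other methods only in a tooltip) could be combined roughly as follows. This is only an illustration under stated assumptions: `my-input-method-indicator' and its EXTRA-METHODS argument are hypothetical names, not part of Emacs or of Handa's patch; `propertize' and the `help-echo' text property are standard.

```elisp
;; Hypothetical helper, not part of Emacs or of the patch in this thread.
(defun my-input-method-indicator (title extra-methods)
  "Return a mode-line string for TITLE plus EXTRA-METHODS.
TITLE is the title of the main input method, e.g. \"U+\".
EXTRA-METHODS is a list of names of additionally active input
methods; it is an assumed argument used only for illustration."
  (if (null extra-methods)
      ;; Only the main method is active: show its title unchanged.
      title
    ;; Otherwise append \"+\" and list the other methods in the tooltip.
    (propertize (concat title "+")
                'help-echo
                (concat "Also active: "
                        (mapconcat #'identity extra-methods ", ")))))
```

With the "ucs" method as the main one (title "U+") and one additional method, the indicator becomes "U++", as noted in the message above, and the full list stays out of the mode line.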
* Re: adding consistent extra symbols to input methods (cyrillic-*, croatian-*, slov*, czech-* etc.) input methods 2008-07-13 21:27 ` Juri Linkov @ 2008-07-14 3:18 ` Miles Bader 2008-07-14 4:43 ` Kenichi Handa 0 siblings, 1 reply; 77+ messages in thread From: Miles Bader @ 2008-07-14 3:18 UTC (permalink / raw) To: Juri Linkov; +Cc: Eli Zaretskii, emacs-devel, tzz, monnier, handa Juri Linkov <juri@jurta.org> writes: > activate-preposition-input-method -> activate-additional-input-method > inactivate-preposition-input-method -> inactivate-additional-input-method > local-input-method-list -> current-additional-input-method-list Those names seem to have rather different implications though... `activate-additional-input-method' makes it sound vaguely like the new input method will somehow be "equal" to the existing input method, whereas kenichi's name somewhat implies otherwise, that the new input method being activated is somehow "special" (and I think also suggests that it will be "subsidiary" to any main input method). Which of those is true? -Miles -- You can hack anything you want, with TECO and DDT. ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: adding consistent extra symbols to input methods (cyrillic-*, croatian-*, slov*, czech-* etc.) input methods 2008-07-14 3:18 ` Miles Bader @ 2008-07-14 4:43 ` Kenichi Handa 2008-07-14 21:51 ` Juri Linkov 0 siblings, 1 reply; 77+ messages in thread From: Kenichi Handa @ 2008-07-14 4:43 UTC (permalink / raw) To: Miles Bader; +Cc: juri, eliz, tzz, monnier, emacs-devel In article <buotzetcfwx.fsf@dhapc248.dev.necel.com>, Miles Bader <miles.bader@necel.com> writes: > Juri Linkov <juri@jurta.org> writes: > > activate-preposition-input-method -> activate-additional-input-method > > inactivate-preposition-input-method -> inactivate-additional-input-method > > local-input-method-list -> current-additional-input-method-list > Those names seem to have rather different implications though... > `activate-additional-input-method' makes it sound vaguely like the new > input method will somehow be "equal" to the existing input method, > whereas kenichi's name somewhat implies otherwise, that the new input > method being activated is somehow "special" (and I think also suggests > that it will be "subsidiary" to any main input method). > Which of those is true? My original patch intends that "preposition" input methods are what handled before the normal input method (as documented). Juri Linkov <juri@jurta.org> writes: > global-input-method-list -> remove this variable since there is > no global variant of `current-input-method'. ??? A preposition input method can be activated globally when recorded in global-input-method-list. In that case, in any buffer, when you activate a normal input method, that preposition input method is also activated automatically. Isn't it useful? --- Kenichi Handa handa@ni.aist.go.jp ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: adding consistent extra symbols to input methods (cyrillic-*, croatian-*, slov*, czech-* etc.) input methods 2008-07-14 4:43 ` Kenichi Handa @ 2008-07-14 21:51 ` Juri Linkov 2008-07-15 1:24 ` Kenichi Handa 0 siblings, 1 reply; 77+ messages in thread From: Juri Linkov @ 2008-07-14 21:51 UTC (permalink / raw) To: Kenichi Handa; +Cc: eliz, emacs-devel, tzz, monnier, Miles Bader >> > activate-preposition-input-method -> activate-additional-input-method >> > inactivate-preposition-input-method -> inactivate-additional-input-method >> > local-input-method-list -> current-additional-input-method-list > >> Those names seem to have rather different implications though... > >> `activate-additional-input-method' makes it sound vaguely like the new >> input method will somehow be "equal" to the existing input method, >> whereas kenichi's name somewhat implies otherwise, that the new input >> method being activated is somehow "special" (and I think also suggests >> that it will be "subsidiary" to any main input method). > >> Which of those is true? > > My original patch intends that "preposition" input methods > are what handled before the normal input method (as > documented). I'm not sure if "preposition" is the right English word here. >> global-input-method-list -> remove this variable since there is >> no global variant of `current-input-method'. > > ??? A preposition input method can be activated globally > when recorded in global-input-method-list. In that case, in > any buffer, when you activate a normal input method, that > preposition input method is also activated automatically. > Isn't it useful? This is useful, but I think it should work exactly like a single input method works now, i.e. since a global single input method is defined by the variable `default-input-method', so an additional/preposition/subsidiary input method should be defined by a similar global variable like e.g. 
`default-input-methods' (note the plural `s' at the end of the name) or `default-input-method-list'. This also suggests creating a new similar buffer-local variable `current-input-method-list' (or `current-input-methods'). Why I propose to create these parallel versions is because when I tried to create UI to define multiple input methods it was obvious that they are really very similar variables. -- Juri Linkov http://www.jurta.org/emacs/ ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: adding consistent extra symbols to input methods (cyrillic-*, croatian-*, slov*, czech-* etc.) input methods 2008-07-14 21:51 ` Juri Linkov @ 2008-07-15 1:24 ` Kenichi Handa 2008-07-28 13:30 ` multiple input methods (was: adding consistent extra symbols to input methods) Juri Linkov 0 siblings, 1 reply; 77+ messages in thread From: Kenichi Handa @ 2008-07-15 1:24 UTC (permalink / raw) To: Juri Linkov; +Cc: eliz, emacs-devel, tzz, monnier, miles In article <87skuckusq.fsf@jurta.org>, Juri Linkov <juri@jurta.org> writes: > > My original patch intends that "preposition" input methods > > are what handled before the normal input method (as > > documented). > I'm not sure if "preposition" is the right English word here. Me neither. Anyway, it is an input method handled before the normal input method. >>> global-input-method-list -> remove this variable since there is >>> no global variant of `current-input-method'. > > > > ??? A preposition input method can be activated globally > > when recorded in global-input-method-list. In that case, in > > any buffer, when you activate a normal input method, that > > preposition input method is also activated automatically. > > Isn't it useful? > This is useful, but I think it should work exactly like a single input > method works now, i.e. since a global single input method is defined by the > variable `default-input-method', so an additional/preposition/subsidiary > input method should be defined by a similar global variable like e.g. > `default-input-methods' (note the plural `s' at the end of the name) or > `default-input-method-list'. Ah, I see your point. But, I think that the preposition input methods is a little bit different from the normal single input method because even if we activate the different single input method, the same preposition input methods are activated. Though, I'm not sure that this behaviour is the best. 
It looks convenient for those people who switches multiple input methods but wants a consistent key binding for inputting specific characters. But, perhaps most users don't switches multiple input methods. --- Kenichi Handa handa@ni.aist.go.jp ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: multiple input methods (was: adding consistent extra symbols to input methods) 2008-07-15 1:24 ` Kenichi Handa @ 2008-07-28 13:30 ` Juri Linkov 0 siblings, 0 replies; 77+ messages in thread From: Juri Linkov @ 2008-07-28 13:30 UTC (permalink / raw) To: Kenichi Handa; +Cc: eliz, emacs-devel, tzz, monnier, miles >> > My original patch intends that "preposition" input methods >> > are what handled before the normal input method (as >> > documented). > >> I'm not sure if "preposition" is the right English word here. > > Me neither. Anyway, it is an input method handled before > the normal input method. So it has the same meaning as e.g. the command `pre-command-hook' that runs a hook before each command? But I doubt that the word "pre" in its name stands for "preposition". Maybe we could do the same and name new methods using the same short word "pre"? E.g. activate-pre-input-method inactivate-pre-input-method >>>> global-input-method-list -> remove this variable since there is >>>> no global variant of `current-input-method'. >> > >> > ??? A preposition input method can be activated globally >> > when recorded in global-input-method-list. In that case, in >> > any buffer, when you activate a normal input method, that >> > preposition input method is also activated automatically. >> > Isn't it useful? > >> This is useful, but I think it should work exactly like a single input >> method works now, i.e. since a global single input method is defined by the >> variable `default-input-method', so an additional/preposition/subsidiary >> input method should be defined by a similar global variable like e.g. >> `default-input-methods' (note the plural `s' at the end of the name) or >> `default-input-method-list'. > > Ah, I see your point. But, I think that the preposition > input methods is a little bit different from the normal > single input method because even if we activate the > different single input method, the same preposition input > methods are activated. 
Though, I'm not sure that this > behaviour is the best. It looks convenient for those people > who switches multiple input methods but wants a consistent > key binding for inputting specific characters. But, perhaps > most users don't switches multiple input methods. I think we should provide UI that activates the normal single input method and all pre input methods at the same time. So a sequence `C-\ IM1,IM2,IM3 RET' will set the normal single input method to IM1, and set pre input methods to a list '(IM2 IM3). -- Juri Linkov http://www.jurta.org/emacs/ ^ permalink raw reply [flat|nested] 77+ messages in thread
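[Editorial sketch] The `C-\ IM1,IM2,IM3 RET' idea above amounts to treating the first name as the normal input method and the rest as pre input methods. A minimal sketch of that split, assuming hypothetical helper names (`my-input-method-plist', `my-parse-input-method-spec', `my-read-input-method-spec'); only `split-string', `completing-read-multiple' and `input-method-alist' are existing Emacs facilities, and nothing here activates any input method.

```elisp
;; Hypothetical helpers; they only split names, they do not activate anything.
(defun my-input-method-plist (names)
  "Return a plist (:normal FIRST :pre REST) for the list NAMES."
  (list :normal (car names) :pre (cdr names)))

(defun my-parse-input-method-spec (spec)
  "Parse SPEC such as \"IM1,IM2,IM3\" into (:normal \"IM1\" :pre (\"IM2\" \"IM3\"))."
  ;; Split on commas, drop empty items, trim surrounding blanks.
  (my-input-method-plist (split-string spec "," t "[ \t]+")))

(defun my-read-input-method-spec ()
  "Read several input method names with completion and split them.
Uses `completing-read-multiple' over `input-method-alist'."
  (my-input-method-plist
   (completing-read-multiple "Input methods (comma-separated): "
                             input-method-alist nil t)))
```

A caller could then pass the :normal value to `activate-input-method' and hand the :pre list to whatever interface ends up managing the pre input methods.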
* Re: composed characters question and suggestions for quail-cyrillic-* 2008-06-13 18:09 ` Ted Zlatanov 2008-06-14 9:44 ` Eli Zaretskii @ 2008-07-06 18:41 ` Juri Linkov 2008-07-07 20:12 ` Ted Zlatanov 1 sibling, 1 reply; 77+ messages in thread From: Juri Linkov @ 2008-07-06 18:41 UTC (permalink / raw) To: Ted Zlatanov; +Cc: emacs-devel > Thanks. I did: > > * quail/cyrillic.el: Add quotation marks, paragraph symbol, angled > brackets, number symbol, and accented aeio to cyrillic-translit. > > The quotation marks are both ‚‘ and „“ (plus «» which are also common). > Let me know if I've missed anything. I am disappointed by this change. It has several drawbacks: 1. It uses the acute accent to put the grave accent above letters, e.g. ("'a" ?à) ("'o" ?ò). A correct way to implement this is to use the acute accent to put the acute accent above letters, and to use the grave accent to put the grave accent above letters, as all Latin input methods do, e.g. ("'a" ?á) ("'o" ?ó) ("`a" ?à) ("`o" ?ò). 2. It uses accented Latin letters à, ò that is inappropriate for Cyrillic texts. The only valid way (as I understand according to Unicode specifications) is to use combining characters. 3. It turns "'" into a prefix key, but it is used to input "ь" according to the rule ("'" ?ь). 4. «»“„‘‚§№ is too limited set of necessary characters and this set is not specific to `cyrillic-translit'. Different styles of quotation marks are required by typographic rules in other several languages and scripts besides Cyrillic, and these rules also require using other symbols like dashes of different lengths, nbsp, 1/2, 1/4, subscripts, copyright, currency signs, and many more. So instead of copying the same rules to all input method a better way is to create a separate common input method with all these special symbols and to share it with language specific input methods. -- Juri Linkov http://www.jurta.org/emacs/ ^ permalink raw reply [flat|nested] 77+ messages in thread
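[Editorial sketch] Points 1 and 2 above (acute key for acute accent, grave key for grave accent, and combining characters rather than accented Latin letters) can be illustrated by generating the rules as data. This is a sketch under assumptions: `my-accent-rules' and `my-cyrillic-vowels' are hypothetical names, and the four vowels are simply the "aeio" set mentioned in the change log. The translations are wrapped in vectors because, per the `quail-define-rules' docstring, a plain string is treated as a set of one-character candidates. Note that using "'" as a prefix still conflicts with the existing ("'" ?ь) rule (point 3).

```elisp
(defconst my-combining-grave #x300 "U+0300 COMBINING GRAVE ACCENT.")
(defconst my-combining-acute #x301 "U+0301 COMBINING ACUTE ACCENT.")

;; Latin key -> Cyrillic letter, given as code points:
;; U+0430 a, U+0435 ie, U+0438 i, U+043E o.
(defconst my-cyrillic-vowels
  '(("a" . #x430) ("e" . #x435) ("i" . #x438) ("o" . #x43e)))

(defun my-accent-rules (prefix combining pairs)
  "Return Quail-style rules (KEYSEQ [TRANSLATION]) for PAIRS.
PREFIX is the accent key, COMBINING the combining character to
append, and PAIRS a list of (KEY . LETTER) conses."
  (mapcar (lambda (pair)
            (list (concat prefix (car pair))
                  ;; Base letter followed by the combining accent.
                  (vector (string (cdr pair) combining))))
          pairs))

;; Acute key gives acute accent, grave key gives grave accent.
(defconst my-cyrillic-accent-rules
  (append (my-accent-rules "'" my-combining-acute my-cyrillic-vowels)
          (my-accent-rules "`" my-combining-grave my-cyrillic-vowels)))
```

Each resulting translation is a two-character string (Cyrillic letter plus combining mark), so how it looks on screen depends on the composition and font issues discussed earlier in the thread.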
* Re: composed characters question and suggestions for quail-cyrillic-* 2008-07-06 18:41 ` composed characters question and suggestions for quail-cyrillic-* Juri Linkov @ 2008-07-07 20:12 ` Ted Zlatanov 2008-07-07 21:42 ` Juri Linkov 0 siblings, 1 reply; 77+ messages in thread From: Ted Zlatanov @ 2008-07-07 20:12 UTC (permalink / raw) To: emacs-devel On Sun, 06 Jul 2008 21:41:45 +0300 Juri Linkov <juri@jurta.org> wrote: JL> 1. It uses the acute accent to put the grave accent above letters, JL> e.g. ("'a" ?à) ("'o" ?ò). A correct way to implement this is to use the JL> acute accent to put the acute accent above letters, and to use the grave JL> accent to put the grave accent above letters, as all Latin input methods JL> do, e.g. ("'a" ?á) ("'o" ?ó) ("`a" ?à) ("`o" ?ò). You are right. But please note that AFAIK in Cyrillic it's rare to find acute accents, so the idea was "accent the next letter" and the ' key is much more convenient on modern keyboards. For Cyrillic in particular, it may make sense to use ' as the accent prefix or accept it in addition to `. If you still think only ` should be used, I'll commit a patch immediately. JL> 2. It uses accented Latin letters à, ò that is inappropriate for JL> Cyrillic texts. The only valid way (as I understand according to JL> Unicode specifications) is to use combining characters. I think I mentioned this in an earlier post. Combining characters look inconsistent and sometimes take up two lines of text in Emacs, so I thought it would be acceptable to use the accented Latin letters. If not, I'm OK with replacing them with the combining versions. Please note I'm not an expert on this topic, so I greatly appreciate your recommendations. JL> 3. It turns "'" into a prefix key, but it is used to input "ь" according JL> to the rule ("'" ?ь). Would it be possible to move ь under the ' prefix? As I mentioned the ' key is very convenient and ь is not a frequently-needed letter. 
It actually works fine for me as it is (unless I need to type something like ьо, which is rare), but I see the problem. JL> 4. «»“„‘‚§№ is too limited set of necessary characters and this set is JL> not specific to `cyrillic-translit'. Different styles of quotation JL> marks are required by typographic rules in other several languages and JL> scripts besides Cyrillic, and these rules also require using other JL> symbols like dashes of different lengths, nbsp, 1/2, 1/4, subscripts, JL> copyright, currency signs, and many more. In the specific cases I know (I only write in Bulgarian frequently), the characters I added are most needed. If you or others want to add more characters, go ahead or tell me what needs to be added. JL> So instead of copying the same rules to all input method a better JL> way is to create a separate common input method with all these JL> special symbols and to share it with language specific input JL> methods. My suggestion was essentially to build a prefix tree for Slavic languages, since they share enough typographic rules, and to insert it into every specific input method. Using a secondary input method works better so I hope it can happen (if Kenichi Handa's patch is OK). Thanks again Ted ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: composed characters question and suggestions for quail-cyrillic-* 2008-07-07 20:12 ` Ted Zlatanov @ 2008-07-07 21:42 ` Juri Linkov 2008-07-08 0:48 ` Kenichi Handa ` (3 more replies) 0 siblings, 4 replies; 77+ messages in thread From: Juri Linkov @ 2008-07-07 21:42 UTC (permalink / raw) To: Ted Zlatanov; +Cc: emacs-devel > JL> 1. It uses the acute accent to put the grave accent above letters, > JL> e.g. ("'a" ?à) ("'o" ?ò). A correct way to implement this is to use the > JL> acute accent to put the acute accent above letters, and to use the grave > JL> accent to put the grave accent above letters, as all Latin input methods > JL> do, e.g. ("'a" ?á) ("'o" ?ó) ("`a" ?à) ("`o" ?ò). > > You are right. But please note that AFAIK in Cyrillic it's rare to find > acute accents, so the idea was "accent the next letter" and the ' key is > much more convenient on modern keyboards. For Cyrillic in particular, > it may make sense to use ' as the accent prefix or accept it in addition > to `. If you still think only ` should be used, I'll commit a patch > immediately. Instead of the grave accent `, most Cyrillic languages (including Bulgarian, Russian, Ukrainian) use the acute accent ' to mark the stressed vowel. Please see http://en.wikipedia.org/wiki/Acute_accent#Stress for more information. > JL> 2. It uses accented Latin letters à, ò that is inappropriate for > JL> Cyrillic texts. The only valid way (as I understand according to > JL> Unicode specifications) is to use combining characters. > > I think I mentioned this in an earlier post. Combining characters look > inconsistent and sometimes take up two lines of text in Emacs, so I > thought it would be acceptable to use the accented Latin letters. If > not, I'm OK with replacing them with the combining versions. Please > note I'm not an expert on this topic, so I greatly appreciate your > recommendations. If combining characters take two lines, then it is a bug. 
I remember that rendering of combining characters was correct before the Unicode merge. If it was possible to do right before the merge, maybe it will be possible to fix this in current code using the same logic? > JL> 3. It turns "'" into a prefix key, but it is used to input "ь" according > JL> to the rule ("'" ?ь). > > Would it be possible to move ь under the ' prefix? As I mentioned the ' > key is very convenient and ь is not a frequently-needed letter. It > actually works fine for me as it is (unless I need to type something > like ьо, which is rare), but I see the problem. In Bulgarian it is rare, but in Russian and Ukrainian it is very frequently used letter ;-) > JL> 4. «»“„‘‚§№ is too limited set of necessary characters and this set is > JL> not specific to `cyrillic-translit'. Different styles of quotation > JL> marks are required by typographic rules in other several languages and > JL> scripts besides Cyrillic, and these rules also require using other > JL> symbols like dashes of different lengths, nbsp, 1/2, 1/4, subscripts, > JL> copyright, currency signs, and many more. > > In the specific cases I know (I only write in Bulgarian frequently), the > characters I added are most needed. If you or others want to add more > characters, go ahead or tell me what needs to be added. Thanks, the characters you added are very needed. Other needed characters to add are at least ”’–—•… > JL> So instead of copying the same rules to all input method a better > JL> way is to create a separate common input method with all these > JL> special symbols and to share it with language specific input > JL> methods. > > My suggestion was essentially to build a prefix tree for Slavic > languages, since they share enough typographic rules, and to insert it > into every specific input method. Using a secondary input method works > better so I hope it can happen (if Kenichi Handa's patch is OK). 
And in another message you wrote: > If this can go into the trunk, I'll be glad to use it (my changes will > then be unnecessary). The only caution is that universal sequences are > not always intuitive; a good example is that I put "/ab" for paragraph > because that makes sense in Bulgarian ("абзац" means paragraph, > pronounced "abzatz"). So it would be nice to have a universal input > method plus custom rules at the intermediate level (e.g. cyrillic-*). It might be funny but in Russian § is named as a "paragraph sign", so your mnemonics don't work here. And "абзац" is used for a different character, actually the pilcrow. Please compare: http://ru.wikipedia.org/wiki/%C2%B6 http://ru.wikipedia.org/wiki/%D0%97%D0%BD%D0%B0%D0%BA_%D0%BF%D0%B0%D1%80%D0%B0%D0%B3%D1%80%D0%B0%D1%84%D0%B0 -- Juri Linkov http://www.jurta.org/emacs/ ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: composed characters question and suggestions for quail-cyrillic-* 2008-07-07 21:42 ` Juri Linkov @ 2008-07-08 0:48 ` Kenichi Handa 2008-07-08 10:46 ` Werner LEMBERG ` (2 subsequent siblings) 3 siblings, 0 replies; 77+ messages in thread From: Kenichi Handa @ 2008-07-08 0:48 UTC (permalink / raw) To: Juri Linkov; +Cc: tzz, emacs-devel In article <87fxql8j7y.fsf@jurta.org>, Juri Linkov <juri@jurta.org> writes: > If combining characters take two lines, then it is a bug. I remember > that rendering of combining characters was correct before the Unicode > merge. If it was possible to do right before the merge, maybe it will be > possible to fix this in current code using the same logic? You are right. I'm going to fix this problem as soon as I finish the current work. --- Kenichi Handa handa@ni.aist.go.jp ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: composed characters question and suggestions for quail-cyrillic-* 2008-07-07 21:42 ` Juri Linkov 2008-07-08 0:48 ` Kenichi Handa @ 2008-07-08 10:46 ` Werner LEMBERG 2008-07-08 21:47 ` David Kastrup 2008-07-08 15:37 ` Ted Zlatanov 2008-07-08 15:49 ` James Cloos 3 siblings, 1 reply; 77+ messages in thread From: Werner LEMBERG @ 2008-07-08 10:46 UTC (permalink / raw) To: juri; +Cc: tzz, emacs-devel > It might be funny but in Russian § is named as a "paragraph sign", > so your mnemonics don't work here. And "абзац" is used for a > different character, actually the pilcrow. Both are possibly of German origin: We also call § the `paragraph sign' to enumerate laws. And абзац is definitely a transliteration of the German word `Absatz', meaning paragraph. Werner ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: composed characters question and suggestions for quail-cyrillic-* 2008-07-08 10:46 ` Werner LEMBERG @ 2008-07-08 21:47 ` David Kastrup 0 siblings, 0 replies; 77+ messages in thread From: David Kastrup @ 2008-07-08 21:47 UTC (permalink / raw) To: Werner LEMBERG; +Cc: juri, tzz, emacs-devel Werner LEMBERG <wl@gnu.org> writes: >> It might be funny but in Russian § is named as a "paragraph sign", >> so your mnemonics don't work here. And "абзац" is used for a >> different character, actually the pilcrow. > > Both are possibly of German origin: We also call § the `paragraph sign' > to enumerate laws. No, we don't. We call it "Paragraph" which can be translated to "paragraph sign". > And абзац is definitely a transliteration of the German word `Absatz', > meaning paragraph. Yup. Law navigation is usually done by "Paragraph/Absatz", typically written as "§23, Abs. 3" IIRC. ¶ is not used for navigation, merely for printer instructions. It denotes a paragraph (namely the end of an "Absatz"), not a "Paragraph". The problem in German is that a "Paragraph" really is sort of a numbered section rather than just a paragraph. -- David Kastrup, Kriemhildstr. 15, 44793 Bochum ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: composed characters question and suggestions for quail-cyrillic-* 2008-07-07 21:42 ` Juri Linkov 2008-07-08 0:48 ` Kenichi Handa 2008-07-08 10:46 ` Werner LEMBERG @ 2008-07-08 15:37 ` Ted Zlatanov 2008-07-08 17:38 ` James Cloos 2008-07-08 22:54 ` Juri Linkov 2008-07-08 15:49 ` James Cloos 3 siblings, 2 replies; 77+ messages in thread From: Ted Zlatanov @ 2008-07-08 15:37 UTC (permalink / raw) To: emacs-devel On Tue, 08 Jul 2008 00:42:09 +0300 Juri Linkov <juri@jurta.org> wrote: JL> 1. It uses the acute accent to put the grave accent above letters, JL> e.g. ("'a" ?à) ("'o" ?ò). A correct way to implement this is to use the JL> acute accent to put the acute accent above letters, and to use the grave JL> accent to put the grave accent above letters, as all Latin input methods JL> do, e.g. ("'a" ?á) ("'o" ?ó) ("`a" ?à) ("`o" ?ò). >> >> You are right. But please note that AFAIK in Cyrillic it's rare to find >> acute accents, so the idea was "accent the next letter" and the ' key is >> much more convenient on modern keyboards. For Cyrillic in particular, >> it may make sense to use ' as the accent prefix or accept it in addition >> to `. If you still think only ` should be used, I'll commit a patch >> immediately. JL> Instead of the grave accent `, most Cyrillic languages (including Bulgarian, JL> Russian, Ukrainian) use the acute accent ' to mark the stressed vowel. JL> Please see http://en.wikipedia.org/wiki/Acute_accent#Stress for more JL> information. Take a look at the Unicode Cyrillic chart. Only the grave is available for ѝ for example. They all have to be done with combining. We still need the grave-accented ѝ Ѝ letters, too. So both ' and ` (or something similar) will be needed as prefix keys. JL> 2. It uses accented Latin letters à, ò that is inappropriate for JL> Cyrillic texts. The only valid way (as I understand according to JL> Unicode specifications) is to use combining characters. >> >> I think I mentioned this in an earlier post. 
Combining characters look >> inconsistent and sometimes take up two lines of text in Emacs, so I >> thought it would be acceptable to use the accented Latin letters. If >> not, I'm OK with replacing them with the combining versions. Please >> note I'm not an expert on this topic, so I greatly appreciate your >> recommendations. JL> If combining characters take two lines, then it is a bug. I remember JL> that rendering of combining characters was correct before the Unicode JL> merge. If it was possible to do right before the merge, maybe it will be JL> possible to fix this in current code using the same logic? OK. Furthermore, I can do (insert (compose-chars ?а ?̀)) but if I try the resulting character in quail-define-rules, it's not a valid character read sequence, being two characters. I also can't specify the `compose-chars' function call or a string there. How do I specify a combined character in the quail rules? JL> 3. It turns "'" into a prefix key, but it is used to input "ь" according JL> to the rule ("'" ?ь). >> >> Would it be possible to move ь under the ' prefix? As I mentioned the ' >> key is very convenient and ь is not a frequently-needed letter. It >> actually works fine for me as it is (unless I need to type something >> like ьо, which is rare), but I see the problem. JL> In Bulgarian it is rare, but in Russian and Ukrainian it is very JL> frequently used letter ;-) Understood, but ' is the most sensible prefix for accents as well. Can we have `' generate acute accents and ` generate grave? That's a decent compromise since accented letters are rarely needed. JL> 4. «»“„‘‚§№ is too limited set of necessary characters and this set is JL> not specific to `cyrillic-translit'. 
Different styles of quotation JL> marks are required by typographic rules in other several languages and JL> scripts besides Cyrillic, and these rules also require using other JL> symbols like dashes of different lengths, nbsp, 1/2, 1/4, subscripts, JL> copyright, currency signs, and many more. JL> Thanks, the characters you added are very needed. Other needed characters JL> to add are at least ”’–—•… See at end for proposed mappings. JL> So instead of copying the same rules to all input method a better JL> way is to create a separate common input method with all these JL> special symbols and to share it with language specific input JL> methods. >> >> My suggestion was essentially to build a prefix tree for Slavic >> languages, since they share enough typographic rules, and to insert it >> into every specific input method. Using a secondary input method works >> better so I hope it can happen (if Kenichi Handa's patch is OK). JL> And in another message you wrote: >> If this can go into the trunk, I'll be glad to use it (my changes will >> then be unnecessary). The only caution is that universal sequences are >> not always intuitive; a good example is that I put "/ab" for paragraph >> because that makes sense in Bulgarian ("абзац" means paragraph, >> pronounced "abzatz"). So it would be nice to have a universal input >> method plus custom rules at the intermediate level (e.g. cyrillic-*). JL> It might be funny but in Russian § is named as a "paragraph sign", JL> so your mnemonics don't work here. And "абзац" is used for a different JL> character, actually the pilcrow. Please compare: JL> http://ru.wikipedia.org/wiki/%C2%B6 JL> http://ru.wikipedia.org/wiki/%D0%97%D0%BD%D0%B0%D0%BA_%D0%BF%D0%B0%D1%80%D0%B0%D0%B3%D1%80%D0%B0%D1%84%D0%B0 According to http://en.wiktionary.org/wiki/%D0%B0%D0%B1%D0%B7%D0%B0%D1%86 "абзац" is a synonym for paragraph in Russian (and comes from German, so I learned something new :). 
I don't know what's exactly right here, but we can certainly accommodate /pa as a paragraph prefix that produces §. I would prefer to leave /ab as § as well since (AFAIK) the pilcrow is not as common. Do you agree?

The goal is convenience for the users, so I hope we don't build a large prefix tree. Just the limited repertoire here is already hard to remember. How about:

"' -> ” (compare to "" for “)
/` -> ’ (compare to /' for ‘)
/- -> –
/-- -> —
/. -> • ("big fat dot")
/.. -> … (nice typographically)
/1{2,4,8,16,32,64} -> 1/fraction (the slash moves to the beginning)
/c -> copyright
/tm -> trademark
/rub-> ruble
/kop-> kopek
/lev-> leva (л AFAIK)
/sto-> stotinki
/e -> euro
/ce -> cents
/pa -> §

plus existing

,, -> „
"" -> “
/, -> ‚
/' -> ‘
/& -> §
/ab -> §
/# -> №
/no -> №
<< -> «
>> -> »

Ted

^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: composed characters question and suggestions for quail-cyrillic-* 2008-07-08 15:37 ` Ted Zlatanov @ 2008-07-08 17:38 ` James Cloos 2008-07-08 22:54 ` Juri Linkov 1 sibling, 0 replies; 77+ messages in thread From: James Cloos @ 2008-07-08 17:38 UTC (permalink / raw) To: emacs-devel; +Cc: Ted Zlatanov >>>>> "Ted" == Ted Zlatanov <tzz@lifelogs.com> writes: Ted> The only caution is that universal sequences are not always Ted> intuitive; a good example is that I put "/ab" for paragraph because Ted> that makes sense in Bulgarian ("абзац" means paragraph, pronounced Ted> "abzatz"). JL> It might be funny but in Russian § is named as a "paragraph sign", JL> so your mnemonics don't work here. And "абзац" is used for a JL> different character, actually the pilcrow. Ted> "абзац" is a synonym for paragraph in Russian (and comes from Ted> German, In English, the pilcrow sign (¶) is more widely known as the paragraph sign, and is frequently used (especially in legal texts) to specify a numbered paragraph in a citation, such as in [§4.5¶7] for the seventh paragraph of the fifth subsection of the fourth section. Or [§8ii¶C1] for the first sub-paragraph of the third paragraph of the second subsection of the eighth section. I'd bet, auf Deutsch, der Absatz is used for ¶, hence абзац. Did the two symbols end up with reversed meanings in different cultures? -JimC -- James Cloos <cloos@jhcloos.com> OpenPGP: 1024D/ED7DAEA6 ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: composed characters question and suggestions for quail-cyrillic-* 2008-07-08 15:37 ` Ted Zlatanov 2008-07-08 17:38 ` James Cloos @ 2008-07-08 22:54 ` Juri Linkov 2008-07-09 16:02 ` Ted Zlatanov 1 sibling, 1 reply; 77+ messages in thread From: Juri Linkov @ 2008-07-08 22:54 UTC (permalink / raw) To: Ted Zlatanov; +Cc: emacs-devel > JL> Instead of the grave accent `, most Cyrillic languages (including Bulgarian, > JL> Russian, Ukrainian) use the acute accent ' to mark the stressed vowel. > JL> Please see http://en.wikipedia.org/wiki/Acute_accent#Stress for more > JL> information. > > Take a look at the Unicode Cyrillic chart. Only the grave is available > for ѝ for example. They all have to be done with combining. > > We still need the grave-accented ѝ Ѝ letters, too. So both ' and ` (or > something similar) will be needed as prefix keys. Yes, I agree. We need both ` and '. For some letters the grave accent is part of the Unicode character (Ѝ, Ѐ), for some letters the acute accent is part of the Unicode character (Ѓ, Ќ), and for vowels the acute accent is used to mark the primary stress, and the grave accent is used to mark the secondary stress. > JL> If combining characters take two lines, then it is a bug. I remember > JL> that rendering of combining characters was correct before the Unicode > JL> merge. If it was possible to do right before the merge, maybe it will be > JL> possible to fix this in current code using the same logic? > > OK. Furthermore, I can do > > (insert (compose-chars ?а ?̀)) > > but if I try the resulting character in quail-define-rules, it's not a > valid character read sequence, being two characters. I also can't > specify the `compose-chars' function call or a string there. How do I > specify a combined character in the quail rules? Maybe something like this should work ("a`" "а̀") ("a'" "а́"). 
> JL> In Bulgarian it is rare, but in Russian and Ukrainian it is very > JL> frequently used letter ;-) > > Understood, but ' is the most sensible prefix for accents as well. I think we should use ` and ' as the postfix character, like latin-postfix vs latin-prefix. > Can we have `' generate acute accents and ` generate grave? > That's a decent compromise since accented letters are rarely needed. Since ` and ' are more important for accented letters, we should find an alternative key for ь. > According to > > http://en.wiktionary.org/wiki/%D0%B0%D0%B1%D0%B7%D0%B0%D1%86 > > "абзац" is a synonym for paragraph in Russian (and comes from German, so > I learned something new :). I don't know what's exactly right here, but > we can certainly accomodate /pa as a paragraph prefix that produces §. > I would prefer to leave /ab as § as well since (AFAIK) the pilcrow is > not as common. Do you agree? Good, but I wonder if we should provide this kind of mnemonics for all languages (e.g. "/se" for English since it is the section sign :-) > The goal is convenience for the users, so I hope we don't build a large > prefix tree. Just the limited repertoire here is already hard to > remember. I now noticed that I can't type a pair of double quotes in cyrillic-translit that I often do. Maybe this rule should use the slash prefix key /"" -> “ ? -- Juri Linkov http://www.jurta.org/emacs/ ^ permalink raw reply [flat|nested] 77+ messages in thread
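The point above about which accented Cyrillic letters exist as single Unicode characters can be checked from Emacs itself. The following is only a sketch: it assumes an Emacs that ships ucs-normalize.el (which postdates this thread) and its `ucs-normalize-NFC-string` function, and the helper name `sketch-precomposed-p` is made up for illustration.

```elisp
(require 'ucs-normalize)

(defun sketch-precomposed-p (base accent)
  "Return non-nil if BASE followed by combining ACCENT has a precomposed form.
BASE and ACCENT are characters.  The test is whether NFC normalization
collapses the two-character sequence into a single character."
  (= 1 (length (ucs-normalize-NFC-string (string base accent)))))

;; и + COMBINING GRAVE ACCENT: Unicode has U+045D (ѝ) for this pair.
(sketch-precomposed-p ?и ?\u0300)

;; г + COMBINING ACUTE ACCENT: Unicode has U+0453 (ѓ) for this pair.
(sketch-precomposed-p ?г ?\u0301)

;; а + COMBINING GRAVE ACCENT: there is no precomposed Cyrillic letter,
;; so the sequence has to stay as base letter plus combining character.
(sketch-precomposed-p ?а ?\u0300)
```

Pairs for which the helper returns nil are the ones an input method would have to emit as a base letter followed by a combining accent.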
* Re: composed characters question and suggestions for quail-cyrillic-* 2008-07-08 22:54 ` Juri Linkov @ 2008-07-09 16:02 ` Ted Zlatanov 2008-07-09 18:02 ` James Cloos ` (2 more replies) 0 siblings, 3 replies; 77+ messages in thread From: Ted Zlatanov @ 2008-07-09 16:02 UTC (permalink / raw) To: emacs-devel On Wed, 09 Jul 2008 01:54:35 +0300 Juri Linkov <juri@jurta.org> wrote: >> I can do >> >> (insert (compose-chars ?а ?̀)) >> >> but if I try the resulting character in quail-define-rules, it's not a >> valid character read sequence, being two characters. I also can't >> specify the `compose-chars' function call or a string there. How do I >> specify a combined character in the quail rules? JL> Maybe something like this should work ("a`" "а̀") ("a'" "а́"). It doesn't, according to the docs and it's the first thing I tried :) JL> I think we should use ` and ' as the postfix character, like JL> latin-postfix vs latin-prefix. OK, I changed things accordingly and will commit to CVS when the problem above with doing combining characters in Quail is resolved. >> Can we have `' generate acute accents and ` generate grave? >> That's a decent compromise since accented letters are rarely needed. JL> Since ` and ' are more important for accented letters, we should JL> find an alternative key for ь. I often see it written as q or u in manually transliterated text. Maybe // or /q or /u would work? The corresponding /? /Q /U would uppercase it. >> The goal is convenience for the users, so I hope we don't build a large >> prefix tree. Just the limited repertoire here is already hard to >> remember. JL> I now noticed that I can't type a pair of double quotes in cyrillic-translit JL> that I often do. Maybe this rule should use the slash prefix key /"" -> “ ? OK, it's more consistent that way. Let's eliminate all but << and >> as non-prefixed special characters for consistency (those two are too convenient IMO). So we'd have: /- -> – /-- -> — /. -> • ("big fat dot") /.. 
-> … (or whatever is appropriate for this one)
/1{2,4,8,16,32,64} -> 1/fraction (the slash moves to the beginning)
/c -> copyright
/tm -> trademark
/rub-> ruble
/kop-> kopek
/lev-> leva (л AFAIK)
/sto-> stotinki (с AFAIK)
/e -> euro
/ce -> cents
/pa -> §
/`` -> ” (high 9 double quote)
/` -> ’ (high 9 single quote)
/, -> ‚ (low 9 single quote)
/,, -> „ (low 9 double quote)
/' -> ‘ (high 6 single quote)
/'' -> “ (high 6 double quote)
/& -> §
/ab -> §
/# -> №
/no -> №
<< -> «
>> -> »

^ permalink raw reply [flat|nested] 77+ messages in thread
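On the open question of how to give a Quail rule a base letter plus a combining accent: the `quail-define-rules` docstring says a string translation is read as a set of one-character candidates, while each element of a vector translation may itself be a string. A minimal sketch under that reading follows. The package name `cyrillic-accent-sketch` is a throwaway invented for this example, not an existing input method, and the postfix key order follows the convention proposed in this thread.

```elisp
(require 'quail)

;; Hypothetical package, defined only so the rules below have a home.
(quail-define-package
 "cyrillic-accent-sketch" "Cyrillic" "acc" t
 "Sketch: postfix accents on Cyrillic letters using combining characters.")

(quail-define-rules
 ;; A bare string such as "а\u0300" would offer а and the accent as two
 ;; separate candidates.  Wrapping the string in a vector makes the
 ;; whole two-character sequence a single candidate.
 ("a`" ["а\u0300"])   ; а + COMBINING GRAVE ACCENT
 ("a'" ["а\u0301"])   ; а + COMBINING ACUTE ACCENT
 ;; Letters that exist precomposed in Unicode can stay plain characters.
 ("i`" ?ѝ)
 ("I`" ?Ѝ))
```

Whether the composed result then renders on one line is a separate display question, as discussed earlier in the thread.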
* Re: composed characters question and suggestions for quail-cyrillic-* 2008-07-09 16:02 ` Ted Zlatanov @ 2008-07-09 18:02 ` James Cloos 2008-07-09 18:49 ` Ted Zlatanov 2008-07-09 19:51 ` Juri Linkov 2008-07-09 18:48 ` Ted Zlatanov 2008-07-09 19:21 ` Juri Linkov 2 siblings, 2 replies; 77+ messages in thread From: James Cloos @ 2008-07-09 18:02 UTC (permalink / raw) To: Ted Zlatanov; +Cc: emacs-devel >>>>> "Ted" == Ted Zlatanov <tzz@lifelogs.com> writes: Ted> /.. -> … (or whatever is appropriate for this one) After some further discussions and research, it is looking like the single character is better after all. Irregardless of the character's provenance, it is what users expect. -JimC -- James Cloos <cloos@jhcloos.com> OpenPGP: 1024D/ED7DAEA6 ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: composed characters question and suggestions for quail-cyrillic-* 2008-07-09 18:02 ` James Cloos @ 2008-07-09 18:49 ` Ted Zlatanov 2008-07-09 19:51 ` Juri Linkov 1 sibling, 0 replies; 77+ messages in thread From: Ted Zlatanov @ 2008-07-09 18:49 UTC (permalink / raw) To: emacs-devel On Wed, 09 Jul 2008 14:02:42 -0400 James Cloos <cloos@jhcloos.com> wrote: >>>>>> "Ted" == Ted Zlatanov <tzz@lifelogs.com> writes: Ted> /.. -> … (or whatever is appropriate for this one) JC> After some further discussions and research, it is looking like the JC> single character is better after all. Irregardless of the character's JC> provenance, it is what users expect. OK, I'll leave it as it was. Ted ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: composed characters question and suggestions for quail-cyrillic-* 2008-07-09 18:02 ` James Cloos 2008-07-09 18:49 ` Ted Zlatanov @ 2008-07-09 19:51 ` Juri Linkov 1 sibling, 0 replies; 77+ messages in thread From: Juri Linkov @ 2008-07-09 19:51 UTC (permalink / raw) To: James Cloos; +Cc: Ted Zlatanov, emacs-devel

> Ted> /.. -> … (or whatever is appropriate for this one)
>
> After some further discussions and research, it is looking like the
> single character is better after all. Irregardless of the character's
> provenance, it is what users expect.

Yes, it is natural to expect a Unicode input method to produce a single Unicode character. Then how about the following rules:

/.   "․" U2024    # ONE DOT LEADER
/..  "‥" U2025    # TWO DOT LEADER
/... "…" ellipsis # HORIZONTAL ELLIPSIS

-- 
Juri Linkov http://www.jurta.org/emacs/

^ permalink raw reply [flat|nested] 77+ messages in thread
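For reference, the three dot rules proposed above could be written in Quail syntax roughly as follows. This is a sketch only: the package name `dots-sketch` is invented for the example, and in practice the rules would live in whichever input method ends up carrying the typographic characters.

```elisp
(require 'quail)

;; Hypothetical package, defined only so the rules below have a home.
(quail-define-package
 "dots-sketch" "Cyrillic" "dots" t
 "Sketch: slash-prefixed dot leaders and ellipsis.")

(quail-define-rules
 ("/."   ?\u2024)   ; ONE DOT LEADER
 ("/.."  ?\u2025)   ; TWO DOT LEADER
 ("/..." ?\u2026))  ; HORIZONTAL ELLIPSIS
```

Each shorter key here is a prefix of the next one, so the three rules form a small prefix chain of the kind discussed earlier for the quote marks and dashes.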
* Re: composed characters question and suggestions for quail-cyrillic-* 2008-07-09 16:02 ` Ted Zlatanov 2008-07-09 18:02 ` James Cloos @ 2008-07-09 18:48 ` Ted Zlatanov 2008-07-09 19:33 ` Juri Linkov 2008-07-09 19:21 ` Juri Linkov 2 siblings, 1 reply; 77+ messages in thread From: Ted Zlatanov @ 2008-07-09 18:48 UTC (permalink / raw) To: emacs-devel On Wed, 09 Jul 2008 11:02:29 -0500 Ted Zlatanov <tzz@lifelogs.com> wrote: TZ> On Wed, 09 Jul 2008 01:54:35 +0300 Juri Linkov <juri@jurta.org> wrote: >>> I can do >>> >>> (insert (compose-chars ?а ?̀)) >>> >>> but if I try the resulting character in quail-define-rules, it's not a >>> valid character read sequence, being two characters. I also can't >>> specify the `compose-chars' function call or a string there. How do I >>> specify a combined character in the quail rules? JL> Maybe something like this should work ("a`" "а̀") ("a'" "а́"). TZ> It doesn't, according to the docs and it's the first thing I tried :) JL> I think we should use ` and ' as the postfix character, like JL> latin-postfix vs latin-prefix. TZ> OK, I changed things accordingly and will commit to CVS when the problem TZ> above with doing combining characters in Quail is resolved. This is the only thing stopping me. I adjusted the accents and added some more special characters; here's the special characters with some comments. ("/c" ?©) ("/tm" ?®) ;; I couldn't find a TM glyph in Unicode ("/rub" ?R) ;; see http://en.wikipedia.org/wiki/Russian_ruble, new glyph may be under development ("/kop" ?к) ;; not sure ("/lev" ?л) ("/sto" ?с) ("/e" ?€) ("/ce" ?¢) ;; is this the right glyph for Euro cents? I think so. 
;; fractions (non-combined)
("/78" ?⅞)
("/58" ?⅝)
("/38" ?⅜)
("/18" ?⅛)
("/56" ?⅚)
("/16" ?⅙)
("/45" ?⅘)
("/35" ?⅗)
("/25" ?⅖)
("/15" ?⅕)
("/23" ?⅔)
("/13" ?⅓)
("/34" ?¾)
("/12" ?½)
("/14" ?¼)
;; we'll have combined fractions later I guess

;; Roman numerals, commonly used for months
("/I" ?Ⅰ)
("/II" ?Ⅱ)
("/III" ?Ⅲ)
("/IV" ?Ⅳ)
("/V" ?Ⅴ)
("/VI" ?Ⅵ)
("/VII" ?Ⅶ)
("/VIII" ?Ⅷ)
("/IX" ?Ⅸ)
("/X" ?Ⅹ)
("/XI" ?Ⅺ)
("/XII" ?Ⅻ)

("/-" ?–)
("/--" ?—)
("/." ?•)
("/.." ?…)
("/``" ?”)
("/`" ?’)
("/,," ?„)
("/''" ?“)
("/," ?‚)
("/'" ?‘)

("/&" ?§)
("/ab" ?§) ; _аб_зац
("/pa" ?§) ; _pa_ragraph
("/#" ?№)
("/no" ?№) ; _но_мер
("<<" ?«)
(">>" ?»)

^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: composed characters question and suggestions for quail-cyrillic-* 2008-07-09 18:48 ` Ted Zlatanov @ 2008-07-09 19:33 ` Juri Linkov 2008-07-09 22:14 ` Ted Zlatanov 0 siblings, 1 reply; 77+ messages in thread From: Juri Linkov @ 2008-07-09 19:33 UTC (permalink / raw) To: Ted Zlatanov; +Cc: emacs-devel > TZ> OK, I changed things accordingly and will commit to CVS when the problem > TZ> above with doing combining characters in Quail is resolved. > > This is the only thing stopping me. I adjusted the accents and added > some more special characters; here's the special characters with some > comments. > > ("/c" ?©) > ("/tm" ?®) ;; I couldn't find a TM glyph in Unicode It is ?\u2122. > ("/rub" ?R) ;; see http://en.wikipedia.org/wiki/Russian_ruble, > new glyph may be under development There is no Unicode character yet, but since a new glyph is gaining popularity, we can expect a codepoint for it in next Unicode versions (BTW, a new glyph is very similar to the glyph I designed many years ago). And the Hryvnia sign was already added to Unicode in 2004, it is ?\u20B4. http://en.wikipedia.org/wiki/%E2%82%B4 > ("/kop" ?к) ;; not sure No Unicode character, and never will be I guess. > ("/lev" ?л) > ("/sto" ?с) Is it more correct to write with a period as "л." and "с.", or even "лв."? > ("/e" ?€) > ("/ce" ?¢) ;; is this the right glyph for Euro cents? I think so. Yes, it is. > ;; fractions (non-combined) > ("/78" ?⅞) > ("/58" ?⅝) > ("/38" ?⅜) > ("/18" ?⅛) > ("/56" ?⅚) > ("/16" ?⅙) > ("/45" ?⅘) > ("/35" ?⅗) > ("/25" ?⅖) > ("/15" ?⅕) > ("/23" ?⅔) > ("/13" ?⅓) > ("/34" ?¾) > ("/12" ?½) > ("/14" ?¼) > ;; we'll have combined fractions later I guess > > ;; Roman numerals, commonly used for months and sometimes for chapter/section numbers > ("/I" ?Ⅰ) > ("/II" ?Ⅱ) > ("/III" ?Ⅲ) > ("/IV" ?Ⅳ) > ("/V" ?Ⅴ) > ("/VI" ?Ⅵ) > ("/VII" ?Ⅶ) > ("/VIII" ?Ⅷ) > ("/IX" ?Ⅸ) > ("/X" ?Ⅹ) > ("/XI" ?Ⅺ) > ("/XII" ?Ⅻ) > > ("/-" ?–) > ("/--" ?—) > ("/." ?•) > ("/.." 
?…)
> ("/``" ?”)
> ("/`" ?’)
> ("/,," ?„)
> ("/''" ?“)
> ("/," ?‚)
> ("/'" ?‘)
>
> ("/&" ?§)
> ("/ab" ?§) ; _аб_зац
> ("/pa" ?§) ; _pa_ragraph
> ("/#" ?№)
> ("/no" ?№) ; _но_мер
> ("<<" ?«)
> (">>" ?»)

-- 
Juri Linkov http://www.jurta.org/emacs/

^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: composed characters question and suggestions for quail-cyrillic-* 2008-07-09 19:33 ` Juri Linkov @ 2008-07-09 22:14 ` Ted Zlatanov 2008-07-09 23:52 ` Juri Linkov 0 siblings, 1 reply; 77+ messages in thread From: Ted Zlatanov @ 2008-07-09 22:14 UTC (permalink / raw) To: emacs-devel On Wed, 09 Jul 2008 22:33:38 +0300 Juri Linkov <juri@jurta.org> wrote: TZ> OK, I changed things accordingly and will commit to CVS when the problem TZ> above with doing combining characters in Quail is resolved. >> >> This is the only thing stopping me. I adjusted the accents and added >> some more special characters; here's the special characters with some >> comments. >> >> ("/c" ?©) >> ("/tm" ?®) ;; I couldn't find a TM glyph in Unicode JL> It is ?\u2122. OK, I changed to: ("/tm" ?™) ("/reg" ?®) >> ("/kop" ?к) ;; not sure JL> No Unicode character, and never will be I guess. Probably not worth the trouble then. Same for rubles, leva, and stotinki, their symbols are currently trivial so I'll remove them. The situation is different for Roman numerals which have distinct code points from the corresponding ASCII characters. I'm not including lowercase versions but I could (downcase-region is so nice, this took 2 seconds): ("/i" ?ⅰ) ("/ii" ?ⅱ) ("/iii" ?ⅲ) ("/iv" ?ⅳ) ("/v" ?ⅴ) ("/vi" ?ⅵ) ("/vii" ?ⅶ) ("/viii" ?ⅷ) ("/ix" ?ⅸ) ("/x" ?ⅹ) ("/xi" ?ⅺ) ("/xii" ?ⅻ) I don't recall them being used for anything, but then again it's been a while since I wrote in Bulgarian every day, and Russian and others may want them. >> I often see it written as q or u in manually transliterated text. Maybe >> // or /q or /u would work? The corresponding /? /Q /U would uppercase it. JL> I've never seen it written as q or u. And indeed there is no such rule JL> on http://en.wikipedia.org/wiki/Translit or on this page in other JL> languages (Bulgarian, Russian) where q is used for ю, and q for я. It's ad hoc I guess. No matter. JL> One possible letter for ь is x.
I guess it is from jcuken-only JL> keyboards, but since it is already taken for x in cyrillic-translit, JL> maybe we should replace the rule ("x" ?х) with ("x" ?ь)? OK, ditto for X -> Ь. We'll be lynched, surely. JL> I like your current set of rules, they are easy to remember and input. JL> But I have doubts about << and >> since they are inconsistent with JL> other sequences with the leading slash. Yes, but it's very rare to need << or >> in any other context. Let's try it, I'm sure only shell users will complain :) At worst you have to do C-\ Shift-, Shift-, C-\ to get the regular << meaning. Compared to / Shift-, Shift-, / Shift-. Shift-. all the time while typing, I think it's better. JL> Since these rules will be needed in parallel with other input methods, JL> I suggest you to create two separate input methods: one with the name JL> like `typographic' (if you are going to provide only typographic JL> characters) for rules without the leading slash and `typographic-pre' JL> for rules with the leading slash. And with the code Handa-san provided JL> it will be possible to activate multiple input methods. When that code is in the CVS trunk I can do that, but I'm not convinced having two methods (with/without prefix) would be any better than a single common one. A single method means less to remember, less to confuse the user. JL> Then how about the following rules: JL> /. "․" U2024 # ONE DOT LEADER JL> /.. "‥" U2025 # TWO DOT LEADER JL> /... "…" ellipsis # HORIZONTAL ELLIPSIS Added also. Ted ^ permalink raw reply [flat|nested] 77+ messages in thread
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Juri Linkov @ 2008-07-09 23:52 UTC
To: Ted Zlatanov; +Cc: emacs-devel

> The situation is different for Roman numerals which have distinct code
> points from the corresponding ASCII characters. I'm not including
> lowercase versions but I could (downcase-region is so nice, this took 2
> seconds):
>
> ("/i" ?ⅰ)
> ("/ii" ?ⅱ)
> ("/iii" ?ⅲ)
> ("/iv" ?ⅳ)
> ("/v" ?ⅴ)
> ("/vi" ?ⅵ)
> ("/vii" ?ⅶ)
> ("/viii" ?ⅷ)
> ("/ix" ?ⅸ)
> ("/x" ?ⅹ)
> ("/xi" ?ⅺ)
> ("/xii" ?ⅻ)
>
> I don't recall them being used for anything, but then again it's been a
> while since I wrote in Bulgarian every day, and Russian and others may
> want them.

You could include lowercase versions since they are sometimes used to numerate subsections. But please be careful to not create conflicts with the existing rules in cyrillic-translit like ("/i" ?і) and ("/I" ?І) where І is CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I.

> JL> One possible letter for ь is x. I guess it is from jcuken-only
> JL> keyboards, but since it is already taken for x in cyrillic-translit,
> JL> maybe we should replace the rule ("x" ?х) with ("x" ?ь)?
>
> OK, ditto for X -> Ь. We'll be lynched, surely.

This is why I suggest you to move out most new rules to a separate input method, or at least reduce the likelihood of possible annoyance.

> JL> I like your current set of rules, they are easy to remember and input.
> JL> But I have doubts about << and >> since they are inconsistent with
> JL> other sequences with the leading slash.
>
> Yes, but it's very rare to need << or >> in any other context. Let's
> try it, I'm sure only shell users will complain :) At worst you have to
> do
>
> C-\ Shift-, Shift-, C-\
>
> to get the regular << meaning. Compared to
>
> / Shift-, Shift-,
> / Shift-. Shift-.
>
> all the time while typing, I think it's better.

I'm not convinced of the need of creating an exception for these two characters. Such exceptions do more harm than good, e.g. the user has to remember which quotes require the prefix slash and which don't.

--
Juri Linkov
http://www.jurta.org/emacs/
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Ted Zlatanov @ 2008-07-10 12:47 UTC
To: emacs-devel

On Thu, 10 Jul 2008 02:52:57 +0300 Juri Linkov <juri@jurta.org> wrote:

JL> You could include lowercase versions since they are sometimes used to
JL> numerate subsections. But please be careful to not create conflicts
JL> with the existing rules in cyrillic-translit like ("/i" ?і) and ("/I" ?І)
JL> where І is CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I.

How about

;; Roman numerals, commonly used for months and section/subsection numbers
("/RI" ?Ⅰ)
("/RII" ?Ⅱ)
("/RIII" ?Ⅲ)
("/RIV" ?Ⅳ)
("/RV" ?Ⅴ)
("/RVI" ?Ⅵ)
("/RVII" ?Ⅶ)
("/RVIII" ?Ⅷ)
("/RIX" ?Ⅸ)
("/RX" ?Ⅹ)
("/RXI" ?Ⅺ)
("/RXII" ?Ⅻ)

("/ri" ?ⅰ)
("/rii" ?ⅱ)
("/riii" ?ⅲ)
("/riv" ?ⅳ)
("/rv" ?ⅴ)
("/rvi" ?ⅵ)
("/rvii" ?ⅶ)
("/rviii" ?ⅷ)
("/rix" ?ⅸ)
("/rx" ?ⅹ)
("/rxi" ?ⅺ)
("/rxii" ?ⅻ)

JL> One possible letter for ь is x. I guess it is from jcuken-only
JL> keyboards, but since it is already taken for x in cyrillic-translit,
JL> maybe we should replace the rule ("x" ?х) with ("x" ?ь)?
>>
>> OK, ditto for X -> Ь. We'll be lynched, surely.

JL> This is why I suggest you to move out most new rules to a separate
JL> input method, or at least reduce the likelihood of possible annoyance.

Users still need to use the method at some point. Delaying the annoyance until then doesn't make it any less of a problem--we're still interfering with x/X in the secondary method. I'll make the change in the core method for now, and let's see if it's a problem.

JL> I like your current set of rules, they are easy to remember and input.
JL> But I have doubts about << and >> since they are inconsistent with
JL> other sequences with the leading slash.
...
JL> I'm not convinced of the need of creating an exception for these two
JL> characters. Such exceptions do more harm than good, e.g. the user has
JL> to remember which quotes require the prefix slash and which don't.

OK, I've made it /<< and />> for those two.

Ted
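The 24 Roman-numeral rules in this message follow a regular pattern: the capital forms occupy U+2160 through U+216B and the small forms U+2170 through U+217B. A minimal Emacs Lisp sketch, assuming a hypothetical helper named `demo-roman-numeral-rules` (it is not part of quail or cyrillic.el), shows how the same key/character pairs could be generated instead of typed by hand:

```elisp
;; Hypothetical helper for illustration; not part of quail or cyrillic.el.
(defun demo-roman-numeral-rules ()
  "Return a list of (KEY CHAR) pairs for Roman numerals 1 to 12.
Capital keys look like \"/RIV\", small keys like \"/riv\"."
  (let ((names '("I" "II" "III" "IV" "V" "VI"
                 "VII" "VIII" "IX" "X" "XI" "XII"))
        (offset 0)
        (rules '()))
    (dolist (name names)
      ;; U+2160 is ROMAN NUMERAL ONE, U+2170 is SMALL ROMAN NUMERAL ONE.
      (push (list (concat "/R" name) (+ #x2160 offset)) rules)
      (push (list (concat "/r" (downcase name)) (+ #x2170 offset)) rules)
      (setq offset (1+ offset)))
    ;; Pairs were pushed in reverse, so restore numeric order.
    (nreverse rules)))
```

The result interleaves capital and small forms, so `(car (demo-roman-numeral-rules))` is the pair for "/RI" and code point #x2160. Whether generated rules are preferable to a literal list in the input method file is a matter of taste; the literal list is easier to read and grep.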
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Juri Linkov @ 2008-07-10 18:45 UTC
To: Ted Zlatanov; +Cc: emacs-devel

> JL> You could include lowercase versions since they are sometimes used to
> JL> numerate subsections. But please be careful to not create conflicts
> JL> with the existing rules in cyrillic-translit like ("/i" ?і) and ("/I" ?І)
> JL> where І is CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I.
>
> How about
>
> ;; Roman numerals, commonly used for months and section/subsection numbers
> ("/RI" ?Ⅰ)
> ("/RII" ?Ⅱ)
> ("/RIII" ?Ⅲ)
> ("/RIV" ?Ⅳ)
> ("/RV" ?Ⅴ)
> ("/RVI" ?Ⅵ)
> ("/RVII" ?Ⅶ)
> ("/RVIII" ?Ⅷ)
> ("/RIX" ?Ⅸ)
> ("/RX" ?Ⅹ)
> ("/RXI" ?Ⅺ)
> ("/RXII" ?Ⅻ)
>
> ("/ri" ?ⅰ)
> ("/rii" ?ⅱ)
> ("/riii" ?ⅲ)
> ("/riv" ?ⅳ)
> ("/rv" ?ⅴ)
> ("/rvi" ?ⅵ)
> ("/rvii" ?ⅶ)
> ("/rviii" ?ⅷ)
> ("/rix" ?ⅸ)
> ("/rx" ?ⅹ)
> ("/rxi" ?ⅺ)
> ("/rxii" ?ⅻ)

This looks better.

> JL> One possible letter for ь is x. I guess it is from jcuken-only
> JL> keyboards, but since it is already taken for x in cyrillic-translit,
> JL> maybe we should replace the rule ("x" ?х) with ("x" ?ь)?
>>>
>>> OK, ditto for X -> Ь. We'll be lynched, surely.
>
> JL> This is why I suggest you to move out most new rules to a separate
> JL> input method, or at least reduce the likelihood of possible annoyance.
>
> Users still need to use the method at some point. Delaying the
> annoyance until then doesn't make it any less of a problem--we're still
> interfering with x/X in the secondary method. I'll make the change in
> the core method for now, and let's see if it's a problem.

Ok. And what about combining characters? Is it possible to insert them using input-method rules?

> JL> I like your current set of rules, they are easy to remember and input.
> JL> But I have doubts about << and >> since they are inconsistent with
> JL> other sequences with the leading slash.
> ...
> JL> I'm not convinced of the need of creating an exception for these two
> JL> characters. Such exceptions do more harm than good, e.g. the user has
> JL> to remember which quotes require the prefix slash and which don't.
>
> OK, I've made it /<< and />> for those two.

Thanks.

--
Juri Linkov
http://www.jurta.org/emacs/
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Ted Zlatanov @ 2008-07-10 19:10 UTC
To: emacs-devel

On Thu, 10 Jul 2008 21:45:46 +0300 Juri Linkov <juri@jurta.org> wrote:

JL> Ok. And what about combining characters? Is it possible to insert them
JL> using input-method rules?

I figured out how to do it (with a vector, e.g. ["full string"] is the mapping).

All the changes we've discussed are now in cyrillic-translit.

When Handa-san commits his multiple input method support, I'll split rules into multiple methods as we discussed.

Thanks
Ted
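The vector form mentioned in this message can be sketched as follows. This is a minimal illustration, not the committed cyrillic-translit rules: the package name `demo-cyrillic-grave` and its title are made up, and the characters are written as `\u` escapes so the example does not depend on the file's encoding. It relies on the documented behaviour of `quail-define-rules` that a translation may be a character or a vector whose elements are strings or characters.

```elisp
(require 'quail)

;; "demo-cyrillic-grave" is a made-up package name for illustration;
;; the real rules discussed in the thread live in cyrillic-translit.
(quail-define-package
 "demo-cyrillic-grave" "Cyrillic" "dg" nil
 "Demo: Cyrillic vowels followed by a combining grave accent.")

(quail-define-rules
 ;; A character translation inserts exactly one character.
 ("a" ?\u0430)                  ; CYRILLIC SMALL LETTER A
 ;; A one-element vector holding a string inserts the whole string:
 ;; the base letter followed by U+0300 COMBINING GRAVE ACCENT.
 ("a`" ["\u0430\u0300"])
 ("u`" ["\u0443\u0300"]))       ; CYRILLIC SMALL LETTER U + U+0300
```

A plain two-character string would not work here, because Quail treats each character of a string translation as a separate candidate. To try the package interactively it would probably also need to be registered with `register-input-method`, which this sketch leaves out.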
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Juri Linkov @ 2008-07-10 19:52 UTC
To: Ted Zlatanov; +Cc: emacs-devel

> I figured out how to do it (with a vector, e.g. ["full string"] is the
> mapping).
>
> All the changes we've discussed are now in cyrillic-translit.

Thanks! I have a few comments:

1. Could you also add rules to input vowels with the combining acute accent like you did for the combining grave accent, i.e. could you add ("a'" ["а́"]) and other vowels with the primary stress?

2. There are now two conflicting rules: ("E`" ["Ѐ"]) and ("E`" ?Э).

3. I just realized that we could leave the rule ("'" ?ь), because it has no conflict with ("a'" ["а́"]). The letter `ь' is never used after a vowel.

4. Please swap the mappings between ("/``" ?”) and ("/''" ?“), and also between ("/'" ?‘) and ("/`" ?’), because usually backquotes are used for left quotation marks, and apostrophes are used for right quotation marks.

5. Something is wrong with encoding of small roman numeral 1-10. Maybe we should recode cyrillic.el from iso-2022-7bit to utf-8 to avoid these problems?

> When Handa-san commits his multiple input method support, I'll split
> rules into multiple methods as we discussed.

I tried the patch that Handa-san submitted in 2007, and it works well (with only one change of removing unnecessary `nreverse'). So I think we should now design a good UI to use multiple input method support.

--
Juri Linkov
http://www.jurta.org/emacs/
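Conflicts like the one in point 2, where two rules share the key E`, can be found mechanically. The following is a minimal sketch, assuming the rules are available as a plain list of (KEY TRANSLATION) pairs; `demo-duplicate-rule-keys` is a hypothetical helper, not a quail function, and it only detects identical keys, not a key that is a prefix of another.

```elisp
;; Hypothetical helper for illustration; not part of quail.
(defun demo-duplicate-rule-keys (rules)
  "Return the keys that occur more than once in RULES.
RULES is a list of (KEY TRANSLATION) pairs.  Each duplicated key is
reported once, in the order its second occurrence was seen."
  (let ((seen '())
        (duplicates '()))
    (dolist (rule rules)
      (let ((key (car rule)))
        (if (member key seen)
            ;; Report a key only once, however often it repeats.
            (unless (member key duplicates)
              (push key duplicates))
          (push key seen))))
    (nreverse duplicates)))

;; The two rules from point 2 plus one unrelated rule.
(demo-duplicate-rule-keys
 '(("E`" ["\u0400"])    ; CYRILLIC CAPITAL LETTER IE WITH GRAVE
   ("e'" ?\u044d)       ; CYRILLIC SMALL LETTER E
   ("E`" ?\u042d)))     ; CYRILLIC CAPITAL LETTER E
;; => ("E`")
```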
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Ted Zlatanov @ 2008-07-10 20:40 UTC
To: emacs-devel

On Thu, 10 Jul 2008 22:52:11 +0300 Juri Linkov <juri@jurta.org> wrote:

JL> 1. Could you also add rules to input vowels with the combining acute
JL> accent like you did for the combining grave accent, i.e. could you add
JL> ("a'" ["а́"]) and other vowels with the primary stress?

Done. I had to remove some conflicting rules.

JL> 2. There are now two conflicting rules: ("E`" ["Ѐ"]) and ("E`" ?Э).

Moved Э to @@ (since э was on @ already).

JL> 3. I just realized that we could leave the rule ("'" ?ь), because
JL> it has no conflict with ("a'" ["а́"]). The letter `ь' is never used
JL> after a vowel.

Done, and '' is also back to Ь. x/X again input Cyrillic х/Х.

JL> 4. Please swap the mappings between ("/``" ?”) and ("/''" ?“),
JL> and also between ("/'" ?‘) and ("/`" ?’), because usually backquotes
JL> are used for left quotation marks, and apostrophes are used for right
JL> quotation marks.

OK. Please check my work, I did my best to proofread the rules.

JL> 5. Something is wrong with encoding of small roman numeral 1-10.
JL> Maybe we should recode cyrillic.el from iso-2022-7bit to utf-8
JL> to avoid these problems?

Looks OK to me, but do whatever you think is necessary.

>> When Handa-san commits his multiple input method support, I'll split
>> rules into multiple methods as we discussed.

JL> I tried the patch that Handa-san submitted in 2007, and it works well
JL> (with only one change of removing unnecessary `nreverse'). So I think
JL> we should now design a good UI to use multiple input method support.

I think the variable default-input-method should accept a list with implied priority (first is highest) or a function (returning a list or a string), and DTRT.

C-u C-\ should only enable the first method in the list.
C-u 2 C-\ should enable the first two methods.
C-\ enables all the methods in the chain.

All of these should collapse to the current C-\ behavior if default-input-method is a string or a function returning a string.

Ted
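The selection logic proposed in this message can be sketched as a pure function. This is only an illustration of the proposal, not existing Emacs behaviour: `demo-input-methods-to-enable` is a hypothetical name, and the sketch assumes the raw prefix argument is passed in the usual form (nil for no prefix, a cons such as (4) for a bare C-u, a number for C-u N).

```elisp
;; Hypothetical helper illustrating the proposal; not part of Emacs.
(defun demo-input-methods-to-enable (spec &optional raw-prefix)
  "Return the list of input method names to enable.
SPEC is a string, a list of strings (highest priority first), or a
function returning either.  RAW-PREFIX is the raw prefix argument."
  (let* ((value (if (functionp spec) (funcall spec) spec))
         ;; A single string collapses to a one-element list.
         (methods (if (stringp value) (list value) value))
         (count (cond ((null raw-prefix) (length methods)) ; no prefix: all
                      ((consp raw-prefix) 1)               ; bare C-u: first
                      (t (prefix-numeric-value raw-prefix)))) ; C-u N: first N
         (excess (- (length methods) count)))
    ;; `butlast' returns the list unchanged when EXCESS is zero or negative.
    (butlast methods excess)))
```

With the method names used elsewhere in the thread, `(demo-input-methods-to-enable '("cyrillic-translit" "typographic") '(4))` returns a list holding only "cyrillic-translit", while the same call without a prefix returns both names. A plain string such as "cyrillic-translit" always yields a one-element list, which matches the "collapse to the current behavior" requirement.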
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Juri Linkov @ 2008-07-10 22:01 UTC
To: Ted Zlatanov; +Cc: emacs-devel

> JL> 1. Could you also add rules to input vowels with the combining acute
> JL> accent like you did for the combining grave accent, i.e. could you add
> JL> ("a'" ["а́"]) and other vowels with the primary stress?
>
> Done. I had to remove some conflicting rules.

Thanks, this is close to perfect now. However, I noticed yet another conflicting rule: ("u'" ?ў) ("U'" ?Ў). Since it is У with breve, we could use the same key tilde ~ as used by Latin input methods to input Latin letters with breve: ("u~" ?ў) ("U~" ?Ў).

> JL> 2. There are now two conflicting rules: ("E`" ["Ѐ"]) and ("E`" ?Э).
>
> Moved Э to @@ (since э was on @ already).

It is too bad to drop ("e'" ?э) since it is a frequently used letter. I propose to add a rule ("e\\" ?э) because `э' is named REVERSED E, and REVERSE SOLIDUS `\' has similar mnemonics.

> JL> 3. I just realized that we could leave the rule ("'" ?ь), because
> JL> it has no conflict with ("a'" ["а́"]). The letter `ь' is never used
> JL> after a vowel.
>
> Done, and '' is also back to Ь. x/X again input Cyrillic х/Х.

Thanks, it is important to keep mappings backward-compatible if possible.

> JL> 4. Please swap the mappings between ("/``" ?”) and ("/''" ?“),
> JL> and also between ("/'" ?‘) and ("/`" ?’), because usually backquotes
> JL> are used for left quotation marks, and apostrophes are used for right
> JL> quotation marks.
>
> OK. Please check my work, I did my best to proofread the rules.

Everything seems correct now.

> JL> I tried the patch that Handa-san submitted in 2007, and it works well
> JL> (with only one change of removing unnecessary `nreverse'). So I think
> JL> we should now design a good UI to use multiple input method support.
>
> I think the variable default-input-method should accept a list with
> implied priority (first is highest) or a function (returning a list or a
> string), and DTRT.

I agree that this is the most reasonable way to implement this feature. I only have doubts about backward-compatibility if it might break some existing code.

> C-u C-\ should only enable the first method in the list.
> C-u 2 C-\ should enable the first two methods.
> C-\ enables all the methods in the chain.

I think there is no need in such complexity since in the minibuffer after typing C-\ you can select multiple input methods.

--
Juri Linkov
http://www.jurta.org/emacs/
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Juri Linkov @ 2008-07-12 20:51 UTC
To: Ted Zlatanov; +Cc: emacs-devel

> I noticed yet another conflicting rule: ("u'" ?ў) ("U'" ?Ў). Since it
> is У with breve, we could use the same key tilde ~ as used by Latin
> input methods to input Latin letters with breve: ("u~" ?ў) ("U~" ?Ў).

and

> It is too bad to drop ("e'" ?э) since it is a frequently used letter.
> I propose to add a rule ("e\\" ?э) because `э' is named REVERSED E,
> and REVERSE SOLIDUS `\' has similar mnemonics.

I fixed these remaining problems.

--
Juri Linkov
http://www.jurta.org/emacs/
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Ted Zlatanov @ 2008-07-14 14:01 UTC
To: emacs-devel

On Fri, 11 Jul 2008 01:01:36 +0300 Juri Linkov <juri@jurta.org> wrote:

JL> Thanks, this is close to perfect now. However, I noticed yet another
JL> conflicting rule: ("u'" ?ў) ("U'" ?Ў). Since it is У with breve,
JL> we could use the same key tilde ~ as used by Latin input methods to
JL> input Latin letters with breve: ("u~" ?ў) ("U~" ?Ў).

That's fine. These are rarely used anyhow.

JL> 2. There are now two conflicting rules: ("E`" ["Ѐ"]) and ("E`" ?Э).
>>
>> Moved Э to @@ (since э was on @ already).

JL> It is too bad to drop ("e'" ?э) since it is a frequently used letter.
JL> I propose to add a rule ("e\\" ?э) because `э' is named REVERSED E,
JL> and REVERSE SOLIDUS `\' has similar mnemonics.

OK. Thanks for fixing these two issues. I think this is a very nice input method for Cyrillic now, after all your help.

Ted
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Juri Linkov @ 2008-07-14 21:47 UTC
To: Ted Zlatanov; +Cc: emacs-devel

> JL> It is too bad to drop ("e'" ?э) since it is a frequently used letter.
> JL> I propose to add a rule ("e\\" ?э) because `э' is named REVERSED E,
> JL> and REVERSE SOLIDUS `\' has similar mnemonics.
>
> OK. Thanks for fixing these two issues. I think this is a very nice
> input method for Cyrillic now, after all your help.

Hmm, the more I think about our latest changes, the more problems I see ;-)

One main problem is that only 5 vowels have the rules with the combining accent. But there are more vowels that require it. And most of them are multi-key, so e.g. a rule like ("ju" ?ю) will be extended into 3-key sequence ("ju'" ["ю́"]). This will over-complicate the current rules.

A better solution is to create a separate rule for the combining accent. Since the combining accent is a separate character we can create a separate rule for it! So we can leave only two composite rules ("i`" ?ѝ) ("I`" ?Ѝ) because their accent is not a separate character. And then remove composite rules for the existing 5 vowels, and use a new rule to input the combining accent character.

But currently I see no good key to input a separate accent because a natural key sequence /' is already assigned to the single quote character ’. Could you propose a good key for the combining acute accent and for combining grave accent? Maybe ("\\'" ?́) and ("\\`" ?̀)?

--
Juri Linkov
http://www.jurta.org/emacs/
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Ted Zlatanov @ 2008-07-15 15:06 UTC
To: emacs-devel

On Tue, 15 Jul 2008 00:47:57 +0300 Juri Linkov <juri@jurta.org> wrote:

JL> A better solution is to create a separate rule for the combining
JL> accent. Since the combining accent is a separate character we can create
JL> a separate rule for it! So we can leave only two composite rules
JL> ("i`" ?ѝ) ("I`" ?Ѝ) because their accent is not a separate character.
JL> And then remove composite rules for the existing 5 vowels, and use
JL> a new rule to input the combining accent character.

JL> But currently I see no good key to input a separate accent because
JL> a natural key sequence /' is already assigned to the single quote
JL> character ’. Could you propose a good key for the combining acute accent
JL> and for combining grave accent? Maybe ("\\'" ?́) and ("\\`" ?̀)?

//' and //` would work. Let's stick with / as the "extended charset" trigger, so // would be the "character attributes" trigger. It could also do //~, //c (cedilla), //^ (superscript), //_ (subscript), etc. in a more generic multi-use input method. But for now those two are sensible.

Ted
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Juri Linkov @ 2008-07-15 20:32 UTC
To: Ted Zlatanov; +Cc: emacs-devel

> JL> A better solution is to create a separate rule for the combining
> JL> accent. Since the combining accent is a separate character we can create
> JL> a separate rule for it! So we can leave only two composite rules
> JL> ("i`" ?ѝ) ("I`" ?Ѝ) because their accent is not a separate character.
> JL> And then remove composite rules for the existing 5 vowels, and use
> JL> a new rule to input the combining accent character.
>
> JL> But currently I see no good key to input a separate accent because
> JL> a natural key sequence /' is already assigned to the single quote
> JL> character ’. Could you propose a good key for the combining acute accent
> JL> and for combining grave accent? Maybe ("\\'" ?́) and ("\\`" ?̀)?
>
> //' and //` would work. Let's stick with / as the "extended charset"
> trigger, so // would be the "character attributes" trigger. It could
> also do //~, //c (cedilla), //^ (superscript), //_ (subscript), etc. in
> a more generic multi-use input method. But for now those two are
> sensible.

I suggested \\' because this rule is already used by latin-ltx LaTeX-like input method. But since \ is part of TeX syntax, it makes sense only for latin-ltx. So I see no better rule than your proposed //' and //`.

--
Juri Linkov
http://www.jurta.org/emacs/
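The design agreed on here, where the accent has its own rule and can follow any vowel, can be sketched like this. The package name `demo-cyrillic-accents` is made up for illustration and only two letters are defined; the keys //' and //` are the ones proposed in the thread, and the characters are written as `\u` escapes to keep the example independent of file encoding.

```elisp
(require 'quail)

;; "demo-cyrillic-accents" is a made-up package name for illustration.
(quail-define-package
 "demo-cyrillic-accents" "Cyrillic" "da" nil
 "Demo: vowels plus separate rules for combining accents.")

(quail-define-rules
 ("a" ?\u0430)     ; CYRILLIC SMALL LETTER A
 ("ju" ?\u044e)    ; CYRILLIC SMALL LETTER YU
 ;; The accents are ordinary one-character translations, so no
 ;; per-vowel composite rule such as "ju'" is needed.
 ("//'" ?\u0301)   ; COMBINING ACUTE ACCENT
 ("//`" ?\u0300))  ; COMBINING GRAVE ACCENT
```

Under these rules, typing ju followed by //' should insert U+044E and then U+0301, which the display engine composes into an accented ю when the font supports it. The cost compared with per-vowel rules is three keystrokes per accent instead of one.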
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Ted Zlatanov @ 2008-08-01 21:07 UTC
To: emacs-devel

I updated NEWS with a brief summary of the cyrillic-translit method changes. Juri, feel free to adjust as needed.

Ted
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Ted Zlatanov @ 2008-08-05 21:00 UTC
To: emacs-devel

What's the ETA for alternative input methods, after the next release or before?

It would be nice to provide a list of the new bindings in the Quail cyrillic-translit method. Is that already available as a refcard or in a manual? I couldn't find it. It should be generated directly from the input method description, if possible.

Ted
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Chong Yidong @ 2008-08-05 22:05 UTC
To: Ted Zlatanov; +Cc: emacs-devel

Ted Zlatanov <tzz@lifelogs.com> writes:

> What's the ETA for alternative input methods, after the next release
> or before?

After.
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Stefan Monnier @ 2008-07-10 22:09 UTC
To: Juri Linkov; +Cc: Ted Zlatanov, emacs-devel

> 5. Something is wrong with encoding of small roman numeral 1-10.
> Maybe we should recode cyrillic.el from iso-2022-7bit to utf-8
> to avoid these problems?

In general Elisp files should now use utf-8 except for the rare rare cases where this is not an option.

Stefan
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Juri Linkov @ 2008-07-10 22:54 UTC
To: Stefan Monnier; +Cc: Ted Zlatanov, emacs-devel

>> 5. Something is wrong with encoding of small roman numeral 1-10.
>> Maybe we should recode cyrillic.el from iso-2022-7bit to utf-8
>> to avoid these problems?
>
> In general Elisp files should now use utf-8 except for the rare rare
> cases where this is not an option.

Then it also makes sense to convert HELLO because the section "Unicode charset" in the beginning of HELLO is now too confusing given that Unicode is now the internal coding.

However, when I tried to convert it to UTF-8, it choked on some characters. Is it the rare case where this is not an option?

--
Juri Linkov
http://www.jurta.org/emacs/
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Stefan Monnier @ 2008-07-11 1:26 UTC
To: Juri Linkov; +Cc: Ted Zlatanov, emacs-devel

> Then it also makes sense to convert HELLO because the section
> "Unicode charset" in the beginning of HELLO is now too confusing
> given that Unicode is now the internal coding.

> However, when I tried to convert it to UTF-8, it choked on some
> characters. Is it the rare case where this is not an option?

Could very well be, indeed,

Stefan
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Kenichi Handa @ 2008-07-11 2:08 UTC
To: Juri Linkov; +Cc: tzz, monnier, emacs-devel

In article <87wsjte4fh.fsf@jurta.org>, Juri Linkov <juri@jurta.org> writes:

> Then it also makes sense to convert HELLO because the section
> "Unicode charset" in the beginning of HELLO is now too confusing
> given that Unicode is now the internal coding.

> However, when I tried to convert it to UTF-8, it choked on some
> characters. Is it the rare case where this is not an option?

That's because the Arabic chars in Non-ASCII examples were in `arabic-1-column' charset (this charset is kept just for backward compatibility). I changed them to iso-8859-6 characters (of course they are unified with Unicode).

But, please don't change the encoding. As the file is in iso-2022-7bit, when Emacs decodes it, `charset' text properties (e.g. japanese-jisx0208 to Japanese characters, chinese-gb2312 to Chinese characters) are added. So, the font-selection works better.

---
Kenichi Handa
handa@ni.aist.go.jp
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Juri Linkov @ 2008-07-09 19:21 UTC
To: Ted Zlatanov; +Cc: emacs-devel

> JL> Since ` and ' are more important for accented letters, we should
> JL> find an alternative key for ь.
>
> I often see it written as q or u in manually transliterated text. Maybe
> // or /q or /u would work? The corresponding /? /Q /U would uppercase it.

I've never seen it written as q or u. And indeed there is no such rule on http://en.wikipedia.org/wiki/Translit or on this page in other languages (Bulgarian, Russian) where q is used for ю, and q for я.

One possible letter for ь is x. I guess it is from jcuken-only keyboards, but since it is already taken for x in cyrillic-translit, maybe we should replace the rule ("x" ?х) with ("x" ?ь)?

> JL> I now noticed that I can't type a pair of double quotes in cyrillic-translit
> JL> that I often do. Maybe this rule should use the slash prefix key /"" -> “ ?
>
> OK, it's more consistent that way. Let's eliminate all but << and >> as
> non-prefixed special characters for consistency (those two are too
> convenient IMO). So we'd have:

I like your current set of rules, they are easy to remember and input. But I have doubts about << and >> since they are inconsistent with other sequences with the leading slash.

Since these rules will be needed in parallel with other input methods, I suggest you to create two separate input methods: one with the name like `typographic' (if you are going to provide only typographic characters) for rules without the leading slash and `typographic-pre' for rules with the leading slash. And with the code Handa-san provided it will be possible to activate multiple input methods.

--
Juri Linkov
http://www.jurta.org/emacs/
* Re: composed characters question and suggestions for quail-cyrillic-*
From: James Cloos @ 2008-07-08 15:49 UTC
To: emacs-devel; +Cc: Juri Linkov, Ted Zlatanov

>>>>> "Juri" == Juri Linkov <juri@jurta.org> writes:

Juri> Other needed characters to add are at least ”’–—•…

A recent thread on the unicode list informs that the character U+2026 HORIZONTAL ELLIPSIS was added to the UCS to represent a THREE DOT LEADER (along with U+2024 ONE DOT LEADER and U+2025 TWO DOT LEADER). That it ended up named ELLIPSIS, according to that thread, is a mistake.

The strong recommendation is to always use a series of U+002E FULL STOPs for an ellipsis, possibly with some space between. U+202F NARROW NO-BREAK SPACE is likely appropriate.

One can argue, then, that there is no need for an input system to support U+2026.

Therefore, it should either be optional, or shouldn't get in the way of typing '...'.

-JimC
--
James Cloos <cloos@jhcloos.com> OpenPGP: 1024D/ED7DAEA6
* Re: composed characters question and suggestions for quail-cyrillic-*
From: Ted Zlatanov @ 2008-07-08 18:50 UTC
To: emacs-devel

On Tue, 08 Jul 2008 11:49:08 -0400 James Cloos <cloos@jhcloos.com> wrote:

>>>>>> "Juri" == Juri Linkov <juri@jurta.org> writes:

Juri> Other needed characters to add are at least ”’–—•…

JC> A recent thread on the unicode list informs that the character
JC> U+2026 HORIZONTAL ELLIPSIS was added to the UCS to represent a
JC> THREE DOT LEADER (along with U+2024 ONE DOT LEADER and U+2025
JC> TWO DOT LEADER). That it ended up named ELLIPSIS, according to
JC> that thread, is a mistake.

JC> The strong recommendation is to always use a series of U+002E
JC> FULL STOPs for an ellipsis, possibly with some space between.
JC> U+202F NARROW NO-BREAK SPACE is likely appropriate.

JC> One can argue, then, that there is no need for an input system
JC> to support U+2026.

JC> Therefore, it should either be optional, or shouldn't get in
JC> the way of typing '...'.

I didn't know about the ELLIPSIS issue, thanks. My proposed shortcut would not get in the way of normal typing.

Do I understand correctly, though, that people should just type ... (with or without NARROW NO-BREAK SPACE in between) instead of using the … character? Is there value in providing that sequence with the /.. quail key sequence instead of … in Emacs, or is that unnecessary?

Ted
* Re: composed characters question and suggestions for quail-cyrillic-*
  2008-07-08 18:50 ` Ted Zlatanov
@ 2008-07-08 19:50 ` James Cloos
  2008-07-08 20:26 ` composed characters question and suggestions for quail-cyrillic-* Teemu Likonen
  0 siblings, 1 reply; 77+ messages in thread
From: James Cloos @ 2008-07-08 19:50 UTC
To: emacs-devel; +Cc: Ted Zlatanov

>>>>> "Ted" == Ted Zlatanov <tzz@lifelogs.com> writes:

Ted> I didn't know about the ELLIPSIS issue, thanks.

It was news to me, too.

Ted> My proposed shortcut would not get in the way of normal typing.

Cool.

Ted> Do I understand correctly, though, that people should just type
Ted> ... (with or without NARROW NO-BREAK SPACE in between) instead of
Ted> using the … character?

Essentially.  Most who use a space, though, tend to use an ASCII SPACE
rather than any type of nonbreaking space.  (I've worked with
publishers who insisted on ". . ." or " . . .".)  And some might prefer
to use U+A0.  But I expect a narrower space looks better, and you
/really/ want to avoid linebreaks inside an ellipsis. :)

Ted> Is there value in providing that sequence with the /.. quail key
Ted> sequence instead of … in Emacs, or is that unnecessary?

Hmmm.  I doubt anyone would use it over ... or . . ., but I suppose it
would be useful for those who do want to ensure non-breaking spaces.

And . . . does look better than . . . in DejaVu Serif (by way of
variable-pitch-mode), and not only because ' ' looks different than
' ' since it (unlike U+202F) is displayed in the nobreak-space face.

So, I’d say yes.

I’m thinking of adding:

  <Multi_key> <period> <1> : "․" U2024 # ONE DOT LEADER
  <Multi_key> <period> <2> : "‥" U2025 # TWO DOT LEADER
  <Multi_key> <period> <3> : "…" ellipsis # HORIZONTAL ELLIPSIS

to the UTF-8 Compose.pre files.  The X11 compose framework converts
the compose sequence to a string (rather than a character), so I could
also have <Multi_key> <period> <period> insert ". . ." rather than
"…", so I could make it match the proposed change to Emacs.  I'll have
to give that some thought.

(As another aside: I just remembered that at least some typographers
maintain that the leader glyphs should not be the same as the period
(or full stop) glyphs, but should be a bit lighter/smaller.  Probably
a related issue.)

-JimC
-- 
James Cloos <cloos@jhcloos.com> OpenPGP: 1024D/ED7DAEA6
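As an illustration of the alternatives discussed in the message above, here is a minimal Python sketch that builds an ellipsis from U+002E FULL STOPs with different separators and compares it with U+2026. The helper name `spaced_ellipsis` and the variant labels are hypothetical, chosen only for this example; nothing here reflects how Quail or the X11 compose tables are implemented.

```python
import unicodedata

NNBSP = "\u202f"  # NARROW NO-BREAK SPACE
NBSP = "\u00a0"   # NO-BREAK SPACE (the "U+A0" mentioned in the message)


def spaced_ellipsis(sep: str = NNBSP, dots: int = 3) -> str:
    """Return `dots` FULL STOP characters joined by `sep` (hypothetical helper)."""
    return sep.join("." * dots)


# The candidate spellings of an ellipsis that come up in the thread.
variants = {
    "plain": spaced_ellipsis(""),
    "ascii-space": spaced_ellipsis(" "),
    "nbsp": spaced_ellipsis(NBSP),
    "nnbsp": spaced_ellipsis(NNBSP),
    "u2026": "\u2026",
}

for label, text in variants.items():
    # Show the code points so the invisible separators are distinguishable.
    names = [unicodedata.name(ch) for ch in text]
    print(label, len(text), names)
```

Printing the character names makes the difference between the ASCII space and the two non-breaking variants visible, which is otherwise easy to lose when the text is copied between programs.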
* Re: composed characters question and suggestions for quail-cyrillic-*
  2008-07-08 19:50 ` James Cloos
@ 2008-07-08 20:26 ` Teemu Likonen
  0 siblings, 0 replies; 77+ messages in thread
From: Teemu Likonen @ 2008-07-08 20:26 UTC
To: emacs-devel

James Cloos wrote:

> I’m thinking of adding:
>
>   <Multi_key> <period> <1> : "․" U2024 # ONE DOT LEADER
>   <Multi_key> <period> <2> : "‥" U2025 # TWO DOT LEADER
>   <Multi_key> <period> <3> : "…" ellipsis # HORIZONTAL ELLIPSIS
>
> to the UTF-8 Compose.pre files.  The X11 compose framework converts
> the compose sequence to a string (rather than a character), so I could
> also have <Multi_key> <period> <period> insert ". . ." rather than
> "…", so I could make it match the proposed change to Emacs.  I'll have
> to give that some thought.

Just FYI: HORIZONTAL ELLIPSIS has been added to libX11, though I'm not
sure if it's in any released version yet.  The key sequence is this:

  <Multi_key> <period> <period> : "…" ellipsis # HORIZONTAL ELLIPSIS

See the relevant commit in libX11's Git repository:

http://cgit.freedesktop.org/xorg/lib/libX11/commit/?id=b0a8f2ec4ba698841683f8ce389f9d72e6bce53e

(You can click the diffstat for file nls/en_US.UTF-8/Compose.pre.)
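The Compose.pre entries quoted above share one shape: a run of `<key>` names, a colon, a quoted result string, an optional keysym name, and an optional `#` comment. A minimal Python sketch of reading one such line follows. The regular expression and the function name `parse_compose_line` are assumptions made for this example; the sketch does not handle backslash escapes inside the quoted string and is not a complete parser for the Compose file format.

```python
import re

# Assumed shape: <key> <key> ... : "result" [keysym] [# comment]
COMPOSE_LINE = re.compile(
    r'^\s*(?P<keys>(?:<[^>]+>\s*)+):\s*"(?P<result>[^"]*)"'
    r'(?:\s+(?P<keysym>[^\s#]+))?'
    r'(?:\s*#\s*(?P<comment>.*))?$'
)


def parse_compose_line(line):
    """Split one Compose-style line into its parts, or return None (hypothetical helper)."""
    m = COMPOSE_LINE.match(line)
    if m is None:
        return None
    return {
        # Key names without the angle brackets, in typing order.
        "keys": re.findall(r"<([^>]+)>", m.group("keys")),
        # The string the sequence inserts; may be longer than one character.
        "result": m.group("result"),
        "keysym": m.group("keysym"),
        "comment": (m.group("comment") or "").strip(),
    }


entry = parse_compose_line(
    '<Multi_key> <period> <period> : "\u2026" ellipsis # HORIZONTAL ELLIPSIS'
)
print(entry)
```

Because the result field is a string rather than a single character, the same line shape could also carry a spaced sequence of full stops, which is the alternative James Cloos mentions.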