* How to recognize keyboard insertion? @ 2009-10-31 15:57 Eli Zaretskii 2009-10-31 16:58 ` David De La Harpe Golden 0 siblings, 1 reply; 28+ messages in thread From: Eli Zaretskii @ 2009-10-31 15:57 UTC (permalink / raw) To: emacs-devel Do we have infrastructure for detecting, inside one of the functions that insert text into buffers, characters that were inserted via the keyboard or keyboard macros? Failing that, can I safely assume that self-insert-command and its optimized variant in command_loop_1 are the only ways to insert characters from keyboard and keyboard macros, and that self-insert-command is only supposed to be invoked by characters typed at the keyboard? I'm asking because, in bidirectional editing, characters that are mirrored at display time need to be mirrored at keyboard input time. For example, when typing right-to-left text, the character `)' should be mirrored so that what ends up in the buffer is `(', because what the user means is to produce an open parenthesis. (Displaying this text will then mirror again, and display `)'; this last part already works in the bidi Emacs I'm working on). So I need to mirror characters typed at the keyboard, but not characters yanked from the kill ring or pasted from X selections. How can I discern the first kind from the second? TIA ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How to recognize keyboard insertion? 2009-10-31 15:57 How to recognize keyboard insertion? Eli Zaretskii @ 2009-10-31 16:58 ` David De La Harpe Golden 2009-10-31 17:20 ` Eli Zaretskii 0 siblings, 1 reply; 28+ messages in thread
From: David De La Harpe Golden @ 2009-10-31 16:58 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel

Eli Zaretskii wrote:
> I'm asking because, in bidirectional editing, characters that are
> mirrored at display time need to be mirrored at keyboard input time.
> For example, when typing right-to-left text, the character `)' should
> be mirrored so that what ends up in the buffer is `(', because what
> the user means is to produce an open parenthesis. (Displaying this
> text will then mirror again, and display `)'; this last part already
> works in the bidi Emacs I'm working on).
>

Do you? I'm not really knowledgeable about RtL, but the reason I ask is because when I switch on an arabic OS-level keyboard layout, Shift-9 actually generates a ) parenright keysym and shift-0 a ( parenleft, which I think is then displayed mirrored as per the last bit of your post in RtL contexts.

ثثثث(321)ثثث

You can see the transposition in /usr/share/X11/xkb/symbols on typical gnu+linux distros.

So the right place to do such keyboard mirroring intra-emacs might be in quail, i.e. for when people are trying to work RtL only intra-emacs still with a western os-level keymap. And indeed, the transposition is shown in the commentary in emacs/leim/quail/arabic.el , though I think it's missing from the actual map at present (possibly because emacs lacks RtL until you're done!)

So, since one can assume either the OS keymap or quail will be pre-mirroring in practice, you probably don't need to distinguish keyboard vs. paste here.

N.B. I could be quite wrong here, not expert by any means.
* Re: How to recognize keyboard insertion? 2009-10-31 16:58 ` David De La Harpe Golden @ 2009-10-31 17:20 ` Eli Zaretskii 2009-10-31 17:37 ` David De La Harpe Golden ` (2 more replies) 0 siblings, 3 replies; 28+ messages in thread From: Eli Zaretskii @ 2009-10-31 17:20 UTC (permalink / raw) To: David De La Harpe Golden; +Cc: emacs-devel > Date: Sat, 31 Oct 2009 16:58:45 +0000 > From: David De La Harpe Golden <david@harpegolden.net> > Cc: emacs-devel@gnu.org > > Do you? I'm not really knowledgeable about RtL, but reason I ask is > because when I switch on an arabic OS-level keyboard layout, Shift-9 > actually generates a ) parenright keysym and shift-0 a ( parenleft, > which I think is then displayed mirrored as per the last bit of your > post in RtL contexts. But that is wrong: per the Unicode Bidirectional Algorithm (a.k.a. UAX#9), a `(' should only be mirrored if its resolved directionality is R: L4. A character is depicted by a mirrored glyph if and only if (a) the resolved directionality of that character is R, and (b) the Bidi_Mirrored property value of that character is true. To simplify, this means that a `(' should be mirrored when surrounded by strong R2L characters, but not when surrounded by Latin characters or European digits. What you describe above means that, when typing mixed Arabic and Latin text, the user needs to switch back from Arabic when she types mirrored characters, even if these characters are surrounded by digits, for example. In Emacs, this means that we would need to switch away from the input method, even when typing characters whose keys are not translated by the input method. That sounds like a nuisance. Alternatively, we will need to mirror characters even if their directionality is L, which is against UAX#9 and will cause incorrect display in some not-so-rare cases. For example, try typing "9*(4+5)" after switching to Arabic keyboard. What do you get? 
> So the right place to do such keyboard mirroring intra-emacs might be > in quail, i.e. for when people are trying to work RtL only intra-emacs > still with a western os-level keymap. Quail cannot easily know the context: it can only mirror these characters always, which is not right, since the display will mirror them only if they are surrounded by strong R2L characters. ^ permalink raw reply [flat|nested] 28+ messages in thread
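As an aside for readers following the UAX#9 argument: the two properties that rule L4 refers to can be inspected per character with Python's standard `unicodedata` module. This is only a sketch. It reports each character's own bidi class and its Bidi_Mirrored flag; it does not compute the resolved directionality, which requires running the full bidirectional algorithm over the surrounding text. The helper name `describe` is made up for this illustration.

```python
import unicodedata


def describe(ch):
    """Return (bidi class, Bidi_Mirrored flag) for a single character."""
    return unicodedata.bidirectional(ch), bool(unicodedata.mirrored(ch))


# Parentheses, a European digit, a Latin letter, a Hebrew letter (alef)
# and an Arabic letter (beh).
for ch in "()9a\u05d0\u0628":
    print(f"U+{ord(ch):04X}", describe(ch))
```

Parentheses are neutral characters with Bidi_Mirrored set, so whether they are shown mirrored depends entirely on the direction they resolve to from context, which is the point being argued in the thread.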
* Re: How to recognize keyboard insertion? 2009-10-31 17:20 ` Eli Zaretskii @ 2009-10-31 17:37 ` David De La Harpe Golden 2009-10-31 17:43 ` David De La Harpe Golden 2009-10-31 18:15 ` Eli Zaretskii 2009-11-01 1:30 ` Jason Rumney [not found] ` <837huac8gg.fsf@gnu.org> 2 siblings, 2 replies; 28+ messages in thread From: David De La Harpe Golden @ 2009-10-31 17:37 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel [-- Attachment #1: Type: text/plain, Size: 1395 bytes --] Eli Zaretskii wrote: > To simplify, this means that a `(' should be mirrored when surrounded > by strong R2L characters, but not when surrounded by Latin characters > or European digits. > It IS only mirrored when surrounded by rtl characters, that was what I included the nonsense string for. When not so surrounded, it, Shift-9 generates ")" /and it is shown as ")"/. Arabic keyboards still have ( printed on 9, so I guess they work majority-rtl (which would make sense). > Alternatively, we will need to mirror characters even if their > directionality is L I don't see why. Other apps don't. > For example, try typing "9*(4+5)" after switching to Arabic keyboard. > What do you get? > 9*)4+5( surrounded: ثثث9*)4+5(ثثث - but that was when I typed the expression as if LtR (i.e. hitting 9 first), I suspect an arabic person might type ثثث(5+4)*9ثثث - i.e. hitting ")" first when transcribing "9*(4+5)". Then it just works I think as above (I'm including a screenshot from icedove just in case) > Quail cannot easily know the context: it can only mirror these > characters always, which is not right, since the display will mirror > them only if they are surrounded by strong R2L characters. > > I expect that's in fact what arabic users expect, though an actual arabic person might want to speak up... [-- Attachment #2: rtlicedove1.png --] [-- Type: image/png, Size: 68247 bytes --] ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How to recognize keyboard insertion? 2009-10-31 17:37 ` David De La Harpe Golden @ 2009-10-31 17:43 ` David De La Harpe Golden 2009-10-31 18:15 ` Eli Zaretskii 1 sibling, 0 replies; 28+ messages in thread From: David De La Harpe Golden @ 2009-10-31 17:43 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel David De La Harpe Golden wrote: > ثثث(5+4)*9ثثث > > - i.e. hitting ")" first when transcribing "9*(4+5)". Just in case, there I meant hitting KEY with ")" physically printed on it, producing "(" of course, then shown as ")" ... ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How to recognize keyboard insertion? 2009-10-31 17:37 ` David De La Harpe Golden 2009-10-31 17:43 ` David De La Harpe Golden @ 2009-10-31 18:15 ` Eli Zaretskii 2009-10-31 19:26 ` David De La Harpe Golden 2009-11-01 5:44 ` tomas 1 sibling, 2 replies; 28+ messages in thread From: Eli Zaretskii @ 2009-10-31 18:15 UTC (permalink / raw) To: David De La Harpe Golden; +Cc: emacs-devel > Date: Sat, 31 Oct 2009 17:37:07 +0000 > From: David De La Harpe Golden <david@harpegolden.net> > CC: emacs-devel@gnu.org > > > To simplify, this means that a `(' should be mirrored when surrounded > > by strong R2L characters, but not when surrounded by Latin characters > > or European digits. > > > > It IS only mirrored when surrounded by rtl characters, that was > what I included the nonsense string for. When not so surrounded, > it, Shift-9 generates ")" /and it is shown as ")"/. My understanding is that Shift-9 generates `(' or `)' depending on whether the current keyboard is Latin or Arabic, not depending on the characters surrounding the parenthesis. All your examples show that (and I see the same on my Windows box if I switch the keyboard to Hebrew). Do you agree? > > For example, try typing "9*(4+5)" after switching to Arabic keyboard. > > What do you get? > > > > > 9*)4+5( Which is wrong, don't you think? > surrounded: > > ثثث9*)4+5(ثثث > - but that was when I typed the expression as if LtR (i.e. hitting 9 > first) This is how digits and other mathematical expressions are typed in bidirectional text. > I suspect an arabic person might type > > ثثث(5+4)*9ثثث > > - i.e. hitting ")" first when transcribing "9*(4+5)". Maybe if the digits are Arabic digits. I don't know enough Arabic to judge this example. Hebrew uses European digits, and they are typed left to right, exactly like in Latin scripts. We could, of course, tell users to switch off Hebrew input method when typing math, but that's an annoyance, IMO. 
> > Quail cannot easily know the context: it can only mirror these > > characters always, which is not right, since the display will mirror > > them only if they are surrounded by strong R2L characters. > > I expect that's in fact what arabic users expect, though an actual > arabic person might want to speak up... Maybe, I really don't know. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How to recognize keyboard insertion? 2009-10-31 18:15 ` Eli Zaretskii @ 2009-10-31 19:26 ` David De La Harpe Golden 2009-10-31 20:01 ` Eli Zaretskii 2009-11-01 3:40 ` Stephen J. Turnbull 2009-11-01 5:44 ` tomas 1 sibling, 2 replies; 28+ messages in thread From: David De La Harpe Golden @ 2009-10-31 19:26 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel Eli Zaretskii wrote: > My understanding is that Shift-9 generates `(' or `)' depending on > whether the current keyboard is Latin or Arabic, not depending on the > characters surrounding the parenthesis. All your examples show that > (and I see the same on my Windows box if I switch the keyboard to > Hebrew). Do you agree? Probably - Shift-9 generates parenleft (#x28) or parenright (#x29) depending on current keyboard layout. #x28 is then displayed as ( in ltr context, or ) in rtl context. > >>> For example, try typing "9*(4+5)" after switching to Arabic keyboard. >>> What do you get? >>> >> >> 9*)4+5( > > Which is wrong, don't you think? > It's clearly not a valid arithmetical expression... It is however how mature bidi capable apps I tried behave, for better or worse. I don't think this is an area where emacs, bidi latecomer, should diverge from established practice, especially not by default - rtl-native users presumably by now expect to press the key labelled ")" to get "(" when using their native keymap but in an ltr context. Maybe they regard that as an annoyance, I dunno, or maybe it's a semantic-map feature, since the same shift-0 keypress still makes an opening paren (modern hebrew text, at least, seems to sometimes use () in text, not just arithmetic much like english, at least judging by wikipedia hebrew texts). But therefore it's not necessary to track whether the character was entered by keyboard unless you want to provide a further unusual "smart" layer that doesn't work like typical bidi apps. >> I suspect an arabic person might type >> >> ثثث(5+4)*9ثثث >> >> - i.e. 
hitting ")" first when transcribing "9*(4+5)". > > Maybe if the digits are Arabic digits. I don't know enough Arabic to > judge this example. Note that western "arabic numerals" vs. eastern arabic numerals is apparently a matter of font+bidi display (again for better or worse) - i.e. if I then copy just the expression above from within the rtl string and paste it into a ltr context, I get: (5+4)*9 ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How to recognize keyboard insertion? 2009-10-31 19:26 ` David De La Harpe Golden @ 2009-10-31 20:01 ` Eli Zaretskii 2009-10-31 20:42 ` David De La Harpe Golden 2009-11-01 3:40 ` Stephen J. Turnbull 1 sibling, 1 reply; 28+ messages in thread From: Eli Zaretskii @ 2009-10-31 20:01 UTC (permalink / raw) To: David De La Harpe Golden; +Cc: emacs-devel > Date: Sat, 31 Oct 2009 19:26:02 +0000 > From: David De La Harpe Golden <david@harpegolden.net> > CC: emacs-devel@gnu.org > > But therefore it's not necessary to track whether the character was > entered by keyboard unless you want to provide a further unusual "smart" > layer that doesn't work like typical bidi apps. I'm not convinced, sorry. There are mirrored characters that are not part of the localized keyboards, at least. They are also not supported by most language-oriented input methods. We still need to DTRT with them, even if they are inserted as Unicode codepoints or in some other way. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How to recognize keyboard insertion? 2009-10-31 20:01 ` Eli Zaretskii @ 2009-10-31 20:42 ` David De La Harpe Golden 2009-10-31 21:23 ` Eli Zaretskii 0 siblings, 1 reply; 28+ messages in thread
From: David De La Harpe Golden @ 2009-10-31 20:42 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel

Eli Zaretskii wrote:
>> Date: Sat, 31 Oct 2009 19:26:02 +0000
>> From: David De La Harpe Golden <david@harpegolden.net>
>> CC: emacs-devel@gnu.org
>>
>> But therefore it's not necessary to track whether the character was
>> entered by keyboard unless you want to provide a further unusual "smart"
>> layer that doesn't work like typical bidi apps.
>
> I'm not convinced, sorry. There are mirrored characters that are not
> part of the localized keyboards, at least. They are also not
> supported by most language-oriented input methods. We still need to
> DTRT with them, even if they are inserted as Unicode codepoints or in
> some other way.
>

Well, you're writing the code; I'm in an ltr (not counting btt standing stones) area. I was just pointing out how the existing crop of bidi apps "handle" the issue (i.e. they don't do anything clever).

Maybe one way to handle it would be to make an (emacs level) input method autoswitcher, that swaps emacs input methods as the rtl/ltr context switches with point movement. i.e. allow (but don't require) rtl and ltr contexts to have different emacs input methods.

Then there could be variant emacs input methods with various transpositions suitable for use with various os-level keymaps.

And you don't have to be able to _record_ whether an inserted character came from the keyboard.
* Re: How to recognize keyboard insertion? 2009-10-31 20:42 ` David De La Harpe Golden @ 2009-10-31 21:23 ` Eli Zaretskii 2009-10-31 21:49 ` David De La Harpe Golden 0 siblings, 1 reply; 28+ messages in thread From: Eli Zaretskii @ 2009-10-31 21:23 UTC (permalink / raw) To: David De La Harpe Golden; +Cc: emacs-devel > Date: Sat, 31 Oct 2009 20:42:48 +0000 > From: David De La Harpe Golden <david@harpegolden.net> > CC: emacs-devel@gnu.org > > Maybe one way to handle it would be to make an (emacs level) input > method autoswitcher, that swaps emacs input methods as the rtl/ltr > context switches with point movement. i.e. allow (but don't require) > rtl and ltr contexts to have different emacs input methods. We can have input methods switched on and off depending on surrounding characters, but how will this solve the problem that different methods of inputting the same character behave differently with mirrored characters? A user can conceptually type a character either via an Emacs input method or via the OS keyboard, in the same place, can't she? > Then there could be variant emacs input methods with various > transpositions suitable for use with various os-level keymaps. Are you saying that Emacs should have a way of knowing which OS-level keyboard layout (or keyboard language, in Windows parlance) was used to insert the character? If so, how to do that? Or are you saying that switching on a suitable input method, depending on surrounding characters, will eliminate the need to know how the character was inserted? If so, please explain why you think so, because I don't follow. > And you don't have to be able to _record_ whether an inserted > character came from the keyboard. I don't need to record that, I just need to know that when the character is inserted. After it's inserted, this information is not needed anymore, because display-time mirroring has enough information to DTRT. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How to recognize keyboard insertion? 2009-10-31 21:23 ` Eli Zaretskii @ 2009-10-31 21:49 ` David De La Harpe Golden 2009-11-01 3:44 ` Eli Zaretskii 0 siblings, 1 reply; 28+ messages in thread
From: David De La Harpe Golden @ 2009-10-31 21:49 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel

Eli Zaretskii wrote:
> Are you saying that Emacs should have a way of knowing which OS-level
> keyboard layout (or keyboard language, in Windows parlance) was used
> to insert the character? If so, how to do that?
>

Not as such (though asking the os what the current keyboard layout is should be possible on any reasonable os via the platform analog of XkbGetKeyboard() ?)

- A user with a hebrew os keyboard layout who liked auto-switching could define that in an ltr context, an emacs input method revpar* should be switched to. That input method would yield "(" when the os sends ")" to emacs. A user with a us os keyboard layout who wanted to use the hebrew emacs input method and also auto-switch could use "hebrew" and "hebrew-revpar" rtl and ltr input methods.

> Or are you saying that switching on a suitable input method, depending
> on surrounding characters, will eliminate the need to know how the
> character was inserted? If so, please explain why you think so,
> because I don't follow.

Because it would always insert the appropriate character?

* (require 'quail)

  ;; Name "revpar", language "revpar", mode-line title ")(", guidance t,
  ;; then the docstring.  The final t is the SIMPLE flag.
  (quail-define-package
   "revpar" "revpar" ")(" t
   "transpose inserted parens"
   nil nil nil nil nil nil nil nil nil nil t)

  ;; Typing "(" inserts ")" and vice versa.
  (quail-define-rules
   ("(" ?\))
   (")" ?\())
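To make the auto-switching proposal above concrete, here is a rough Python model of it, offered as a sketch rather than as how Emacs or quail would do it. It assumes a deliberately crude notion of "context": the direction of the nearest strong character before point, which is not the resolved directionality that UAX#9 computes. The names `SWAP`, `context_direction` and `translate` are hypothetical, as is the `swap_in` parameter that stands for the user's choice of which context gets the paren-swapping input method.

```python
import unicodedata

# Same pairs as the revpar rules: "(" <-> ")".
SWAP = {"(": ")", ")": "("}


def context_direction(text, pos, default="L"):
    """Direction ('L' or 'R') of the nearest strong character before pos."""
    for ch in reversed(text[:pos]):
        cls = unicodedata.bidirectional(ch)
        if cls == "L":
            return "L"
        if cls in ("R", "AL"):
            return "R"
    # No strong character found (e.g. empty buffer): fall back to default.
    return default


def translate(ch, text, pos, swap_in="L"):
    """Swap parens only when the context direction equals swap_in."""
    if context_direction(text, pos) == swap_in:
        return SWAP.get(ch, ch)
    return ch


# A Hebrew OS layout sends ")" for the key that should open a paren in LTR text.
print(translate(")", "abc ", 4))            # swapped, since context is L
print(translate(")", "\u05d0\u05d1 ", 3))   # left alone, since context is R
```

Note that digits and spaces are skipped when looking for context here, so this heuristic treats a paren typed after "Hebrew letters, then digits" as being in an RTL context; whether that matches what users expect is exactly the open question in this thread.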
* Re: How to recognize keyboard insertion? 2009-10-31 21:49 ` David De La Harpe Golden @ 2009-11-01 3:44 ` Eli Zaretskii 2009-11-01 5:24 ` David De La Harpe Golden 0 siblings, 1 reply; 28+ messages in thread From: Eli Zaretskii @ 2009-11-01 3:44 UTC (permalink / raw) To: David De La Harpe Golden; +Cc: emacs-devel > Date: Sat, 31 Oct 2009 21:49:20 +0000 > From: David De La Harpe Golden <david@harpegolden.net> > CC: emacs-devel@gnu.org > > - A user with a hebrew os keyboard layout who liked auto-switching could > define that in an ltr context, an emacs input method revpar* should be > switched to. That input method would yield "(" when the os sends ")" to > emacs. A user with a us os keyboard layout who wanted to use the hebrew > emacs input method and also auto-switch could use "hebrew" and > "hebrew-revpar" rtl and ltr input methods. OK, but knowing whether to mirror or not requires information about whether a given keyboard already mirrors characters. Can this be found somewhere, or queried at run time? ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How to recognize keyboard insertion? 2009-11-01 3:44 ` Eli Zaretskii @ 2009-11-01 5:24 ` David De La Harpe Golden 2009-11-01 19:59 ` Eli Zaretskii 0 siblings, 1 reply; 28+ messages in thread From: David De La Harpe Golden @ 2009-11-01 5:24 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel Eli Zaretskii wrote: > OK, but knowing whether to mirror or not requires information about > whether a given keyboard already mirrors characters. Can this be > found somewhere, or queried at run time? [Only if you want to do it automatically, if it was a user preference the user would just be setting ltr and rtl current input methods as desired? Or maybe it would be more intuitive to use just one input method, e.g. hebrew-parenjuggle, expanding the input method layer to support rtl/ltr context sensitive definitions for individual rules] It is possible to find out if the current os keyboard layout is us or hebrew or whatever which could be coupled with prior information that it is standard for certain layouts to mirror. I don't know exhaustively which ones do, though a lot could probably be extracted by inspection of the xkb database. If OTOH you wanted to find out whether the code a keypress returns under the current os layout actually corresponds to the glyph printed on the keyboard, you can't really - Only the user knows that at present as current keyboards don't really inform the computer what glyphs they have physically printed on them AFAIK (though it would certainly be technically feasible for a keyboard to e.g. say "Hi, I am physically a standard british qwerty 105 key keyboard" to a computer with some well-defined wire protocol, I don't think typical PC ones do). ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How to recognize keyboard insertion? 2009-11-01 5:24 ` David De La Harpe Golden @ 2009-11-01 19:59 ` Eli Zaretskii 2009-11-01 20:19 ` David De La Harpe Golden 0 siblings, 1 reply; 28+ messages in thread From: Eli Zaretskii @ 2009-11-01 19:59 UTC (permalink / raw) To: David De La Harpe Golden; +Cc: emacs-devel > Date: Sun, 01 Nov 2009 05:24:44 +0000 > From: David De La Harpe Golden <david@harpegolden.net> > CC: emacs-devel@gnu.org > > If OTOH you wanted to find out whether the code a keypress returns under > the current os layout actually corresponds to the glyph printed on the > keyboard, you can't really Isn't there some API to get the key's symbol, rather than the character it produced? ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How to recognize keyboard insertion? 2009-11-01 19:59 ` Eli Zaretskii @ 2009-11-01 20:19 ` David De La Harpe Golden 0 siblings, 0 replies; 28+ messages in thread From: David De La Harpe Golden @ 2009-11-01 20:19 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel Eli Zaretskii wrote: >> Date: Sun, 01 Nov 2009 05:24:44 +0000 >> From: David De La Harpe Golden <david@harpegolden.net> >> CC: emacs-devel@gnu.org >> >> If OTOH you wanted to find out whether the code a keypress returns under >> the current os layout actually corresponds to the glyph printed on the >> keyboard, you can't really > > Isn't there some API to get the key's symbol, rather than the > character it produced? There are only properties of the logical os keyboard layout available, not the physical keyboard. If I set my os keyboard layout to "US", there's presently no way for the computer to interrogate my keyboard to find out I really have "£" printed above 3 not "#" despite my os keyboard layout setting. It'll just have to take my word for it that I've got a US keyboard. Keyboards just don't say "I am physically british layout" down the wire to the computer. They easily could and probably should, but don't. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How to recognize keyboard insertion? 2009-10-31 19:26 ` David De La Harpe Golden 2009-10-31 20:01 ` Eli Zaretskii @ 2009-11-01 3:40 ` Stephen J. Turnbull 2009-11-01 5:46 ` David De La Harpe Golden 1 sibling, 1 reply; 28+ messages in thread From: Stephen J. Turnbull @ 2009-11-01 3:40 UTC (permalink / raw) To: David De La Harpe Golden; +Cc: Eli Zaretskii, emacs-devel David De La Harpe Golden writes: > Eli Zaretskii wrote: > >>> For example, try typing "9*(4+5)" after switching to Arabic keyboard. > >>> What do you get? > >> > >> 9*)4+5( > > > > Which is wrong, don't you think? > > It's clearly not a valid arithmetical expression... It is however how > mature bidi capable apps I tried behave, for better or worse. I think you should name the apps, so that people can judge for themselves whether those are "generally high quality" implementations if they have experience with them. I'm only interested in bidi in an academic sense, but I see an analogy to development of MUA features for handling mailing list traffic. Many "mature" MUAs impose substantial user pain because they don't recognize the RFC 2369 List-Post header as a signal to prefer to reply to list, although that header was standardized in 1998, and making this the default would essentially eliminate all demand for Reply-To munging. (Eg, Thunderbird 3 finally got this feature in the "Reply" button but it is still not bound in the key shortcuts.) It's possible that (like reply to list) the current audience of Emacs would prefer to learn context-dependent typing idioms for mirrored characters in bidi rather than be able to use the same "logical" sequence of keystrokes for "9*(4+5)" regardless of context. OTOH, that may be a barrier to reaching a new audience. An advanced algorithm certainly should be the default in betas. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How to recognize keyboard insertion? 2009-11-01 3:40 ` Stephen J. Turnbull @ 2009-11-01 5:46 ` David De La Harpe Golden 0 siblings, 0 replies; 28+ messages in thread
From: David De La Harpe Golden @ 2009-11-01 5:46 UTC (permalink / raw)
To: Stephen J. Turnbull; +Cc: Eli Zaretskii, emacs-devel

Stephen J. Turnbull wrote:
> I think you should name the apps, so that people can judge for
> themselves whether those are "generally high quality" implementations
> if they have experience with them.

Actually, I think they just break down to gtk+/pango based, qt based, and openoffice. qt and openoffice aren't magically displaying eastern arabic numerals sometimes like gtk+/pango does, but otherwise seem similar in my (cursory) tests.
* Re: How to recognize keyboard insertion? 2009-10-31 18:15 ` Eli Zaretskii 2009-10-31 19:26 ` David De La Harpe Golden @ 2009-11-01 5:44 ` tomas 2009-11-01 18:48 ` Eli Zaretskii 1 sibling, 1 reply; 28+ messages in thread
From: tomas @ 2009-11-01 5:44 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel, David De La Harpe Golden

On Sat, Oct 31, 2009 at 08:15:58PM +0200, Eli Zaretskii wrote:
> > Date: Sat, 31 Oct 2009 17:37:07 +0000
> > From: David De La Harpe Golden <david@harpegolden.net>

[...]

> > ثثث9*)4+5(ثثث
> > - but that was when I typed the expression as if LtR (i.e. hitting 9
> > first)
>
> This is how digits and other mathematical expressions are typed in
> bidirectional text.
>
> > I suspect an arabic person might type
> >
> > ثثث(5+4)*9ثثث
> >
> > - i.e. hitting ")" first when transcribing "9*(4+5)".
>
> Maybe if the digits are Arabic digits. I don't know enough Arabic to
> judge this example [...]

I don't either, but FWIW, I can say that although Arabic uses different glyphs to represent digits, the write direction for numerals is the same as in Latin and Hebrew, i.e. most significant digit to the left.

Regards
-- tomás
* Re: How to recognize keyboard insertion? 2009-11-01 5:44 ` tomas @ 2009-11-01 18:48 ` Eli Zaretskii 2009-11-01 20:09 ` David De La Harpe Golden 0 siblings, 1 reply; 28+ messages in thread From: Eli Zaretskii @ 2009-11-01 18:48 UTC (permalink / raw) To: tomas; +Cc: emacs-devel, david > Date: Sun, 1 Nov 2009 06:44:44 +0100 > From: tomas@tuxteam.de > Cc: David De La Harpe Golden <david@harpegolden.net>, emacs-devel@gnu.org > > > > I suspect an arabic person might type > > > > > > ثثث(5+4)*9ثثث > > > > > > - i.e. hitting ")" first when transcribing "9*(4+5)". > > > > Maybe if the digits are Arabic digits. I don't know enough Arabic to > > judge this example [...] > > I don't either, but FWIW, i can say that although Arabic uses different > glyphs to represent digits, the write direction for numerals is the same > as in Latin and Hebrew, i.e. most significant digit to the left. Maybe so, but UAX#9 treats European digits and Arabic digits differently. They have different bidirectional properties. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How to recognize keyboard insertion? 2009-11-01 18:48 ` Eli Zaretskii @ 2009-11-01 20:09 ` David De La Harpe Golden 2009-11-02 5:03 ` tomas 0 siblings, 1 reply; 28+ messages in thread From: David De La Harpe Golden @ 2009-11-01 20:09 UTC (permalink / raw) To: Eli Zaretskii; +Cc: tomas, emacs-devel Eli Zaretskii wrote: > Maybe so, but UAX#9 treats European digits and Arabic digits > differently. They have different bidirectional properties. The eastern arabic digits U+06F0 to U+06F9 and usual western digits U+0030 to U+0039 do have separate code point ranges. Apparently the former code points are less used - the arabic keyboard layout returns the western codes for the 0-9 keypresses. gtk+/pango seems to be choosing to merely _display_ the latter with the glyphs of the former depending on surrounding language. Based on post dug up here http://markmail.org/message/72on34u7nupadioh they might be availing of a "higher level protocol" freedom granted to them. Or something. ^ permalink raw reply [flat|nested] 28+ messages in thread
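For readers who want to check the point about digits having different bidirectional properties, the character names and bidi classes can be looked up with Python's standard `unicodedata` module. A small sketch follows; the helper name `digit_info` is made up for this example. Note that the Unicode database distinguishes two non-European ranges: the Arabic-Indic digits at U+0660 to U+0669, and the Extended Arabic-Indic digits at U+06F0 to U+06F9 (the range cited in the message above).

```python
import unicodedata


def digit_info(ch):
    """Return (Unicode name, bidi class) for a single character."""
    return unicodedata.name(ch), unicodedata.bidirectional(ch)


# ASCII zero, Arabic-Indic zero, Extended Arabic-Indic zero.
for ch in ("0", "\u0660", "\u06f0"):
    name, cls = digit_info(ch)
    print(f"U+{ord(ch):04X}  {cls:<2}  {name}")
```

EN is "European Number" and AN is "Arabic Number"; UAX#9 has separate rules for the two classes, which is the difference Eli refers to.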
* Re: How to recognize keyboard insertion? 2009-11-01 20:09 ` David De La Harpe Golden @ 2009-11-02 5:03 ` tomas 0 siblings, 0 replies; 28+ messages in thread
From: tomas @ 2009-11-02 5:03 UTC (permalink / raw)
To: David De La Harpe Golden; +Cc: Eli Zaretskii, tomas, emacs-devel

On Sun, Nov 01, 2009 at 08:09:19PM +0000, David De La Harpe Golden wrote:
> Eli Zaretskii wrote:
>
>> Maybe so, but UAX#9 treats European digits and Arabic digits
>> differently. They have different bidirectional properties.

Thanks for the info. Every day something new, I guess :-)

> The eastern arabic digits U+06F0 to U+06F9 and usual western digits U+0030
> to U+0039 do have separate code point ranges. Apparently
> the former code points are less used - the arabic keyboard layout returns
> the western codes for the 0-9 keypresses.
>
> gtk+/pango seems to be choosing to merely _display_ the latter with the
> glyphs of the former depending on surrounding language. [...]

Yes, that's how I imagined things work, from my limited experience. Seems my view was a bit naïve.

Thanks for the insights
-- tomás
* Re: How to recognize keyboard insertion? 2009-10-31 17:20 ` Eli Zaretskii 2009-10-31 17:37 ` David De La Harpe Golden @ 2009-11-01 1:30 ` Jason Rumney 2009-11-01 4:02 ` Eli Zaretskii [not found] ` <837huac8gg.fsf@gnu.org> 2 siblings, 1 reply; 28+ messages in thread
From: Jason Rumney @ 2009-11-01 1:30 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel, David De La Harpe Golden

Eli Zaretskii wrote:
> But that is wrong: per the Unicode Bidirectional Algorithm
> (a.k.a. UAX#9), a `(' should only be mirrored if its resolved
> directionality is R:
>

I don't think you can do that mirroring on input, as the directionality will change as the user types (assume the letters below represent Arabic or Hebrew characters):

User types: ABCD(
Displayed as: )DCBA

User types: ABCD(4
Displayed as: 4)DCBA or (4DCBA? I suspect the first, as the user might type something other than a number next

User types: ABCD(4+5)
Displayed as: (4+5)DCBA regardless of how directionality of parens is interpreted.

User types: ABCD(4+5)*9
Displayed as: (4+5)*9DCBA Parens here must be LTR

I guess this is why the mirroring happens at keyboard driver level and applications do not try to do it correctly, because in practice doing it correctly results in text jumping around, confusing the user more than it confuses them to manually fix the problems of a dumb implementation. And there are always going to be ambiguous cases, where leaving the user to manually DTRT will be the only option.
* Re: How to recognize keyboard insertion? 2009-11-01 1:30 ` Jason Rumney @ 2009-11-01 4:02 ` Eli Zaretskii 2009-11-01 5:25 ` Stephen J. Turnbull 0 siblings, 1 reply; 28+ messages in thread
From: Eli Zaretskii @ 2009-11-01 4:02 UTC (permalink / raw)
To: Jason Rumney; +Cc: emacs-devel, david

> Date: Sun, 01 Nov 2009 09:30:35 +0800
> From: Jason Rumney <jasonr@gnu.org>
> CC: David De La Harpe Golden <david@harpegolden.net>,
>     emacs-devel@gnu.org
>
> User types: ABCD(
> Displayed as: )DCBA

Yes.

> User types: ABCD(4
> Displayed as: 4)DCBA or (4DCBA? I suspect the first, as the user might
> type something other than a number next

The first, yes.

> User types: ABCD(4+5)
> Displayed as: (4+5)DCBA regardless of how directionality of parens is
> interpreted.

Yes. But I don't understand the ``regardless'' part. If you want to know the resolved directionality of each paren, I can tell you what the current algorithm does (what UAX#9 requires).

> User types: ABCD(4+5)*9
> Displayed as: (4+5)*9DCBA Parens here must be LTR

No, it's displayed as 9*(4+5)DCBA.

> I guess this is why the mirroring happens at keyboard driver level and
> applications do not try to do it correctly, because in practice doing it
> correctly results in text jumping around, confusing the user more than
> it confuses them to manually fix the problems of a dumb implementation.

Sorry, I don't understand: what jumping around are we talking about, and how is mirroring related to that?

^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How to recognize keyboard insertion? 2009-11-01 4:02 ` Eli Zaretskii @ 2009-11-01 5:25 ` Stephen J. Turnbull 2009-11-01 13:59 ` David De La Harpe Golden 2009-11-01 19:57 ` Eli Zaretskii 0 siblings, 2 replies; 28+ messages in thread
From: Stephen J. Turnbull @ 2009-11-01 5:25 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: david, emacs-devel, Jason Rumney

Eli Zaretskii writes:

 > > User types: ABCD(4+5)*9
 > > Displayed as: (4+5)*9DCBA Parens here must be LTR
 >
 > No, it's displayed as 9*(4+5)DCBA.

That seems weird to me. From my (probably imperfect) understanding of UAX#9 I would expect the following sequence of displays starting with an empty buffer (notation: uppercase letters are RTL, lowercase letters and digits are LTR, -!- is point):

-!-
-!-A
-!-BA
-!-CBA
-!-DCBA
-!-)DCBA
4-!-)DCBA
4+-!-)DCBA
4+5-!-)DCBA       [1]
-!-(4+5)DCBA      <-- point jumps
(4+5)*-!-DCBA     <-- point jumps again
(4+5)*9-!-DCBA

I gather you're saying the correct interpretation of UAX#9 is (starting from [1]):

4+5-!-)DCBA       [1]
-!-(4+5)DCBA      <-- point jumps
-!-*(4+5)DCBA
9-!-*(4+5)DCBA

^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How to recognize keyboard insertion? 2009-11-01 5:25 ` Stephen J. Turnbull @ 2009-11-01 13:59 ` David De La Harpe Golden 2009-11-01 19:57 ` Eli Zaretskii 1 sibling, 0 replies; 28+ messages in thread
From: David De La Harpe Golden @ 2009-11-01 13:59 UTC (permalink / raw)
To: Stephen J. Turnbull; +Cc: Eli Zaretskii, Jason Rumney, emacs-devel

Turning on arabic layout and using keypresses "ABCD(5+4)*96EF" i.e. pressing a then b then c then d then Shift-0 (so generating code for "(" not ")", remembering the arabic layout has them reversed) then 5 then + then 4 then Shift-9 (so code ")" not "(") then * then 9 then 6 then e then f.

And now using "(" and ")" to indicate displayed orientation not underlying code: -!- is cursor:

icedove / kmail / openoffice all do:

-!-
-!-A
-!-BA
-!-CBA
-!-DCBA
DCBA(-!-
5-!-)DCBA
5)DCBA+-!-
4-!-+5)DCBA
4+5)DCBA)-!-
4+5)DCBA)*-!-
9-!-*(4+5)DCBA
96-!-*(4+5)DCBA
-!-E96*(4+5)DCBA
-!-FE96*(4+5)DCBA

^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How to recognize keyboard insertion? 2009-11-01 5:25 ` Stephen J. Turnbull 2009-11-01 13:59 ` David De La Harpe Golden @ 2009-11-01 19:57 ` Eli Zaretskii 1 sibling, 0 replies; 28+ messages in thread
From: Eli Zaretskii @ 2009-11-01 19:57 UTC (permalink / raw)
To: Stephen J. Turnbull; +Cc: david, emacs-devel, jasonr

> From: "Stephen J. Turnbull" <stephen@xemacs.org>
> Cc: Jason Rumney <jasonr@gnu.org>,
>     emacs-devel@gnu.org,
>     david@harpegolden.net
> Date: Sun, 01 Nov 2009 14:25:42 +0900
>
> Eli Zaretskii writes:
>
>  > > User types: ABCD(4+5)*9
>  > > Displayed as: (4+5)*9DCBA Parens here must be LTR
>  >
>  > No, it's displayed as 9*(4+5)DCBA.
>
> That seems weird to me. From my (probably imperfect) understanding of
> UAX#9 I would expect the following sequence of displays starting with
> an empty buffer (notation: uppercase letters are RTL, lowercase
> letters and digits are LTR, -!- is point):
>
> -!-
> -!-A
> -!-BA
> -!-CBA
> -!-DCBA
> -!-)DCBA
> 4-!-)DCBA
> 4+-!-)DCBA
> 4+5-!-)DCBA       [1]
> -!-(4+5)DCBA      <-- point jumps
> (4+5)*-!-DCBA     <-- point jumps again
> (4+5)*9-!-DCBA

(When you say "point jumps", you actually mean "cursor jumps", right? Because point does not jump at all, it always is after the last character typed on each of the above lines.)

> I gather you're saying the correct interpretation of UAX#9 is
> (starting from [1]):
>
> 4+5-!-)DCBA       [1]
> -!-(4+5)DCBA      <-- point jumps
> -!-*(4+5)DCBA
> 9-!-*(4+5)DCBA

Not exactly. I didn't say anything about point or cursor location. UAX#9 does not specify where to put the cursor and how it should move during text insertion, and different implementations do it differently for various reasons, some valid, some less so. (There are two equally ``correct'' locations of the cursor, because buffer position changes non-linearly with screen position, and "between two adjacent characters" is no longer well defined.)
I didn't yet implement in Emacs anything beyond basic logical-order cursor motion, whereby C-f moves to the next character in the logical order. I expect some quite heated debates regarding this, when the time comes. But for now I'm deliberately ignoring this issue, because it's not a fundamental one. It's a usability and UI issue, and all I care about at this point is to provide enough infrastructure to implement any behavior we will want (and probably more than one) when the time comes.

Coming back to the example, cursor motion is not important here. Assume that this text comes from a file, where you have ABCD(4+5)*9 in logical order. The way this will be displayed depends on the properties of the characters. The key reason for the fact that * and 9 are to the left of the (4+5) is that (, ), and + are all "neutral" characters, in UAX#9 parlance, while * is a "weak separator" character. That, and the fact that numbers get higher resolved levels than the surrounding text, see 3.3.5 in UAX#9. That's why "*9" is not rendered to the right of "(4+5)".

^ permalink raw reply [flat|nested] 28+ messages in thread
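The display order 9*(4+5)DCBA can be reproduced with a short Python sketch of the final reordering step only. This is an illustration, not the Emacs implementation: the embedding levels are assigned by hand (1 for the right-to-left letters, the parentheses and `*`; 2 for the digits and the `+` between them) rather than computed from the UAX#9 resolution rules, the mirror table covers just four ASCII pairs, and the names `MIRROR` and `reorder` are hypothetical. As in the thread, uppercase ASCII letters stand in for right-to-left letters.

```python
# Hypothetical mirror table covering only the four ASCII bracket pairs.
MIRROR = {"(": ")", ")": "(", "[": "]", "]": "[",
          "{": "}", "}": "{", "<": ">", ">": "<"}


def reorder(text, levels):
    """Reorder logical text for display, given one embedding level per character.

    Characters at odd (right-to-left) levels are replaced by their mirror
    image, then every contiguous run at a given level or higher is reversed,
    from the highest level down to the lowest odd level.
    """
    if len(text) != len(levels):
        raise ValueError("need exactly one level per character")
    odd_levels = [lvl for lvl in levels if lvl % 2]
    if not odd_levels:
        return text  # purely left-to-right text needs no reordering
    # Mirroring depends only on a character's own level, so do it up front.
    cells = [(MIRROR.get(ch, ch) if lvl % 2 else ch, lvl)
             for ch, lvl in zip(text, levels)]
    for level in range(max(levels), min(odd_levels) - 1, -1):
        i = 0
        while i < len(cells):
            if cells[i][1] >= level:
                j = i
                while j < len(cells) and cells[j][1] >= level:
                    j += 1
                cells[i:j] = cells[i:j][::-1]  # reverse this run in place
                i = j
            else:
                i += 1
    return "".join(ch for ch, _ in cells)


if __name__ == "__main__":
    logical = "ABCD(4+5)*9"
    levels = [1, 1, 1, 1, 1, 2, 2, 2, 1, 1, 2]  # hand-assigned for this sketch
    print(reorder(logical, levels))
```

Because the digits sit one level above the surrounding text, the run "4+5" is reversed twice and keeps its left-to-right reading order, while everything at level 1, including `*` and the parentheses, is laid out right to left.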
* Re: How to recognize keyboard insertion? [not found] ` <837huac8gg.fsf@gnu.org> @ 2009-11-02 14:49 ` Ehud Karni 2009-11-02 19:02 ` Eli Zaretskii 0 siblings, 1 reply; 28+ messages in thread
From: Ehud Karni @ 2009-11-02 14:49 UTC (permalink / raw)
To: eliz; +Cc: emacs-devel

On Sun, 01 Nov 2009 22:10:23 Eli Zaretskii wrote:
>
> Ehud, I'd appreciate your opinion on this matter.

I read all the messages in the thread from the beginning.

First, I want to remind everyone that UAX#9 only deals with converting logical order to visual order, and not with how to create the "Logical" text.

I think that we should separate the ordering for display (your code) from getting the input (input method or keyboard layout).

It seems that both Microsoft and the Xorg developers decided to use mirroring for Hebrew keyboard (see /usr/share/X11/xkb/symbols/il).

So if the user uses an external "input method" (i.e. keyboard map) the 4 pairs - () [] {} <>, are already mirrored. If she prefers to use an Emacs internal input method (like I use with my hebeng.el) the mirroring should be an option.

Ehud

BTW. The keying of the RTL text and arithmetic expression discussed
     previously on the thread, Typing (from left to right):
     "A B C D ( 4 + 5 ) * 9" results in: 9*)4+5(DCBA
     That is because of parens mirroring (at the keyboard).

--
 Ehud Karni           Tel: +972-3-7966-561  /"\
 Mivtach - Simon      Fax: +972-3-7976-561  \ /  ASCII Ribbon Campaign
 Insurance agencies   (USA) voice mail and   X   Against   HTML   Mail
 http://www.mvs.co.il  FAX:  1-815-5509341  / \
 GnuPG: 98EA398D <http://www.keyserver.net/>    Better Safe Than Sorry

^ permalink raw reply [flat|nested] 28+ messages in thread
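The optional input-side mirroring described above could look roughly like the following Python sketch. It is only an illustration of the idea, not Emacs or quail code: the functions `insert_typed` and `insert_pasted`, the `mirror_enabled` flag, and the treatment of a buffer as a plain string are all assumptions made for this example. Only the four pairs named in the message are swapped, and only for typed input.

```python
# The four pairs named in the message; each member maps to its partner.
PAIRS = ["()", "[]", "{}", "<>"]
SWAP = {}
for left, right in PAIRS:
    SWAP[left] = right
    SWAP[right] = left


def insert_typed(buffer, typed, mirror_enabled):
    """Append typed characters, swapping paired characters when enabled.

    `mirror_enabled` models the per-input-method option; when it is off,
    the keyboard layout is assumed to have done any mirroring already.
    """
    if mirror_enabled:
        typed = "".join(SWAP.get(ch, ch) for ch in typed)
    return buffer + typed


def insert_pasted(buffer, pasted):
    """Append yanked or pasted text unchanged; it is never mirrored."""
    return buffer + pasted


if __name__ == "__main__":
    print(insert_typed("", "ABCD(4+5)*9", mirror_enabled=True))
    print(insert_pasted("", "ABCD(4+5)*9"))
```

With mirroring enabled, the typed sequence "A B C D ( 4 + 5 ) * 9" is stored with its parentheses swapped, which is the logical text behind the display shown in the postscript above.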
* Re: How to recognize keyboard insertion? 2009-11-02 14:49 ` Ehud Karni @ 2009-11-02 19:02 ` Eli Zaretskii 0 siblings, 0 replies; 28+ messages in thread
From: Eli Zaretskii @ 2009-11-02 19:02 UTC (permalink / raw)
To: ehud; +Cc: emacs-devel

> Date: Mon, 2 Nov 2009 16:49:46 +0200
> From: "Ehud Karni" <ehud@unix.mvs.co.il>
> Cc: emacs-devel@gnu.org
>
> It seems that both Microsoft and the Xorg developers decided to use
> mirroring for Hebrew keyboard (see /usr/share/X11/xkb/symbols/il).
>
> So if the user uses an external "input method" (i.e. keyboard map)
> the 4 pairs - () [] {} <>, are already mirrored. If she prefers to
> use an Emacs internal input method (like I use with my hebeng.el)
> the mirroring should be an option.

I tend to agree. The automatic mirroring is not 100% correct, but it's probably right 80% of the time, and the rest could be fixed by introducing a command to mirror the character at point.

> BTW. The keying of the RTL text and arithmetic expression discussed
>      previously on the thread, Typing (from left to right):
>      "A B C D ( 4 + 5 ) * 9" results in: 9*)4+5(DCBA
>      That is because of parens mirroring (at the keyboard).

Yes, that's exactly one manifestation of why the automatic mirroring is wrong: it assumes too much about the application which will get this input.

^ permalink raw reply [flat|nested] 28+ messages in thread