On Fri, Aug 27, 2010 at 12:56 PM, Eli Zaretskii wrote:
> > From: Kenichi Handa
> > Date: Thu, 26 Aug 2010 10:10:05 +0900
> >
> > I've just committed changes to trunk for Arabic shaping. If
> > there're any Arabic users in this list, please check the
> > displaying of Arabic text. On GNU/Linux system, you must
> > compile Emacs with libotf and m17n-lib (configure script
> > should detect them automatically).
>
> Thanks. However, today's build behaves very strangely in a GUI
> session on MS-Windows. For starters, cursor motion seems to jump
> across many characters in the "Arabic" line of etc/HELLO. For
> example, typing C-f in that line, I first move one character at a time
> across "Arabic", as expected, then the cursor jumps to the right paren
> of the leftmost parenthesized part, again as expected, and then I see
> the following strange behavior:
>
>  . C-f moves one character to the left, to buffer position 758, as
>    expected.
>
>  . the next C-f jumps across many characters on the screen and lands
>    on position 764.
>
>  . another C-f jumps to what is reported as position 765, but on the
>    screen those are several characters, maybe 5 or 6.
>
>  . another C-f moves to the left paren at position 766, as expected.
>
>  . yet another C-f moves to position 767, but on the screen the
>    cursor jumps back into one of the characters it jumped across when
>    it landed on position 765 two C-f keypresses earlier.
>
>  . if I type C-b 4 times from this point, I enter a "trap", whereby
>    typing C-b jumps between two characters, whose buffer positions
>    are 764 and 765. The only way to get out of the trap is with C-a
>    or C-e or C-f.
>
> I don't read Arabic, so I cannot really say whether any of this is
> expected behavior. (The "trap" with C-b is certainly not the expected
> behavior.) Do you see anything similar on X?
>

1) I confirm that Arabic shaping seems to work fine on my build
   (27/8/10, rev. 101200, on Linux+X (Debian unstable)).
2) Logical movement with C-f/C-b in the HELLO file seems fine (I do not
   see the trap described above).
3) My Arabic is very basic, and I am not familiar with Arabic computing
   (keyboards etc.). I noticed the following points, but I am not sure
   what the expected behavior is (I can only compare to other programs,
   gedit in this case):

   a) Column numbers (column-number-mode) behave strangely (I suspect
      that m17n-lib's invisible markup consumes column numbers). For
      example, as you move with C-f through the word "هذا", the column
      numbers go through "0,1,4,5" (i.e. the second character takes up
      3 columns). If I change that to "بهذا", the column positions are
      "0,1,4,6,7" (the second and third characters take up 3 and 2
      columns respectively?). In gedit, column positions advance by one
      per character and do not depend on the shaping.

   b) The Arabic keyboard has the ligature "Lam-Alef" (U+FEFB) on the
      key marked "B" on qwerty keyboards. When I type this in Emacs, I
      get Lam and Alef as two characters (which are auto-shaped
      correctly as the proper ligature). C-d with the cursor on the
      ligature erases the Alef, and another C-d erases the Lam. This
      seems like proper behavior to me. However, in gedit the "B" key
      produces a single U+FEFB character, which is always displayed as
      a ligature, is deleted with a single Del press, and is never
      connected to the previous character. Cutting and pasting this
      character into Emacs, I get the same behavior there.

      The questions are: do Arabic users expect to be able to produce
      this "stiff" ligature? Is the behavior of gedit a bug? Should the
      Emacs "Lam-Alef" key behave as it does now (i.e. produce two
      characters)?

Thanks,
Amit Aronovitch
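
For reference, the difference between the two Lam-Alef inputs described in 3b can be inspected outside of any editor. The following is only a minimal sketch in Python using the standard unicodedata module. The labels "pasted from gedit" and "typed in Emacs" and the helper names describe and unstiffen are illustrative assumptions based on the description above, not anything taken from Emacs or gedit. It shows that U+FEFB is a single presentation-form code point whose compatibility normalization (NFKC) is the two-letter sequence Lam + Alef.

```python
import unicodedata

# Single presentation-form code point (the "stiff" ligature).
LAM_ALEF_LIGATURE = "\uFEFB"
# The two ordinary letters that a shaping engine joins at display time.
LAM = "\u0644"
ALEF = "\u0627"


def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


def unstiffen(text):
    """Map presentation forms to their compatibility equivalents (NFKC).

    Hypothetical helper name. Note that NFKC also rewrites every other
    compatibility character in the text, not just Arabic ligatures.
    """
    return unicodedata.normalize("NFKC", text)


# Assumed inputs, labeled after the behavior described in point 3b.
pasted = LAM_ALEF_LIGATURE
typed = LAM + ALEF

for label, s in (("pasted from gedit", pasted), ("typed in Emacs", typed)):
    print(label, "length:", len(s), describe(s))

# The ligature normalizes to the same two letters that were typed.
print("NFKC(pasted) == typed:", unstiffen(pasted) == typed)
```

This would be consistent with the deletion behavior reported above: one character to delete in the U+FEFB case, two in the Lam + Alef case. Whether an editor should normalize pasted presentation forms this way is exactly the open question, so the sketch takes no position on it.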