* bidi and shaping problems in describe-input-method
@ 2012-03-06 22:17 Mohsen BANAN
  2012-03-07  4:05 ` Eli Zaretskii
  2012-03-08  4:30 ` Miles Bader
  0 siblings, 2 replies; 41+ messages in thread
From: Mohsen BANAN @ 2012-03-06 22:17 UTC (permalink / raw)
  To: emacs-devel

There are two minor problems in describe-input-method which I think
we can easily fix.

The first problem is bidi related:

Try:
  (describe-input-method 'arabic)
and then try:
  (describe-input-method 'hebrew)

In the case of 'arabic note how the entire
keyboard is flipped to the right.

The second problem is shaping related:

Inside a cell on the keyboard layout, when
there are two characters that can be joined, they
are joined by default. They should not be.

Consider for example غإ which should have been غ‌إ instead.

The fix can involve inserting a ZERO WIDTH NON-JOINER,
i.e. (ucs-insert 8204), between the two characters in each cell.

I can help with this and work with the maintainer of
describe-input-method to make the above changes.

Good input method help and documentation for bidi languages
will be important when emacs24 comes out.

Thanks.

...Mohsen

^ permalink raw reply	[flat|nested] 41+ messages in thread
* Re: bidi and shaping problems in describe-input-method 2012-03-06 22:17 bidi and shaping problems in describe-input-method Mohsen BANAN @ 2012-03-07 4:05 ` Eli Zaretskii 2012-03-07 18:49 ` Eli Zaretskii 2012-03-07 21:32 ` Mohsen BANAN 2012-03-08 4:30 ` Miles Bader 1 sibling, 2 replies; 41+ messages in thread From: Eli Zaretskii @ 2012-03-07 4:05 UTC (permalink / raw) To: Mohsen BANAN; +Cc: emacs-devel > From: Mohsen BANAN <list-general@mohsen.1.banan.byname.net> > Date: Tue, 06 Mar 2012 14:17:41 -0800 > > Try: > (describe-input-method 'arabic) > and then try: > (describe-input-method 'hebrew) > > In the case of 'arabic note how the entire > keyboard is flipped to the right. That's easy to fix. > The second problem is shaping related: > > Inside of a cell on the keyboard layout, when > there are two characters that can be joined, they > are joined -- be default. They should not be. How can one know when they should be joined and when not? ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: bidi and shaping problems in describe-input-method 2012-03-07 4:05 ` Eli Zaretskii @ 2012-03-07 18:49 ` Eli Zaretskii 2012-03-07 21:32 ` Mohsen BANAN 1 sibling, 0 replies; 41+ messages in thread From: Eli Zaretskii @ 2012-03-07 18:49 UTC (permalink / raw) To: list-general; +Cc: emacs-devel > Date: Wed, 07 Mar 2012 06:05:19 +0200 > From: Eli Zaretskii <eliz@gnu.org> > Cc: emacs-devel@gnu.org > > > From: Mohsen BANAN <list-general@mohsen.1.banan.byname.net> > > Date: Tue, 06 Mar 2012 14:17:41 -0800 > > > > Try: > > (describe-input-method 'arabic) > > and then try: > > (describe-input-method 'hebrew) > > > > In the case of 'arabic note how the entire > > keyboard is flipped to the right. > > That's easy to fix. Done. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: bidi and shaping problems in describe-input-method
From: Mohsen BANAN @ 2012-03-07 21:32 UTC
  To: emacs-devel

>>>>> On Wed, 07 Mar 2012 06:05:19 +0200, Eli Zaretskii <eliz@gnu.org> said:

  >> From: Mohsen BANAN <list-general@mohsen.1.banan.byname.net>
  Mohsen> In the case of 'arabic note how the entire
  Mohsen> keyboard is flipped to the right.

  Eli> That's easy to fix.

Great! Thanks for having taken care of that.

  Mohsen> The second problem is shaping related:
  Mohsen>
  Mohsen> Inside a cell on the keyboard layout, when
  Mohsen> there are two characters that can be joined, they
  Mohsen> are joined by default. They should not be.

  Eli> How can one know when they should be joined and when not?

I think the simple answer is: always isolated, never joined.

For Persian and Arabic I am sure that they should
never be joined; they should always be isolated.

For other shaped languages, it is hard to imagine that
an input method designer would ever want them joined.

For non-shaped languages (e.g., Latin keyboards)
the insertion of a zero width non-joiner between
lower and upper case is harmless and invisible.

So, the simplest fix (and perhaps
the-right-thing-to-do) is to ALWAYS insert a
zero width non-joiner, i.e. (ucs-insert 8204),
between the two characters in each and every
keyboard cell.

Thanks,

...Mohsen
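A minimal sketch of the proposal above, for illustration only. The helper name `my-cell-label-zwnj` is hypothetical and is not part of Quail; it simply places U+200C (decimal 8204, ZERO WIDTH NON-JOINER) between the two labels of one keyboard cell.

```elisp
;; Hypothetical helper, not part of Quail or Emacs.
;; It joins the lower and upper labels of one keyboard cell with
;; U+200C ZERO WIDTH NON-JOINER (decimal 8204).
(defun my-cell-label-zwnj (lower upper)
  "Return LOWER and UPPER separated by a zero width non-joiner.
LOWER and UPPER may each be a character or a string."
  (concat (if (stringp lower) lower (string lower))
          (string #x200c)               ; 8204 decimal
          (if (stringp upper) upper (string upper))))

;; Example: build the label for a cell holding two joinable letters.
(my-cell-label-zwnj "غ" "إ")
```

Whether the non-joiner actually prevents joining on a given display depends on the shaping engine and font in use.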
* Re: bidi and shaping problems in describe-input-method
From: Kenichi Handa @ 2012-03-08 15:30 UTC
  To: Mohsen BANAN; +Cc: eliz, emacs-devel

In article <yx262eg9jxk.fsf@mohsen.1.banan.byname.net>, Mohsen BANAN <list-general@mohsen.1.banan.byname.net> writes:

  Mohsen> The second problem is shaping related:
  Mohsen>
  Mohsen> Inside of a cell on the keyboard layout, when
  Mohsen> there are two characters that can be joined, they
  Mohsen> are joined -- be default. They should not be.

  Eli> How can one know when they should be joined and when not?

> I think the simple answer is: always isolated -- never joined.

> For Persian and Arabic I am sure that they should
> never be joined -- always isolated.

Sure.

> For other shaped languages, it is hard to imagine
> an input method designer would ever want them joined.

I agree.

> For non-shaped languages (e.g., latin keyboards)
> the insertion of an zero width non-joiner between
> lower and upper case is harmless and invisible.

> So, the simplest fix (and perhaps
> the-right-thing-to-do) is to ALWAYS insert a
> (ucs-insert 8204) -- zero width non-joiner --
> between the two characters in each and every
> keyboard cell.

If we insert something unconditionally, I think inserting
(propertize " " 'invisible t) is safer.  It should work on
a tty too.

By the way, for this bug:

  Mohsen> In the case of 'arabic note how the entire
  Mohsen> keyboard is flipped to the right.

just setting bidi-paragraph-direction to 'left-to-right is
not enough, because keyboard cells in a row are still
re-ordered.

For this, the easiest fix is to set bidi-display-reordering to nil.
But then we can't use actual Arabic and Hebrew words in the
docstrings of those input methods.  What we want is to disable bidi
reordering only for the keyboard layout part.

Eli, do you have any good ideas?

---
Kenichi Handa
handa@m17n.org
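The invisible-space alternative suggested above can be sketched as follows. The helper name `my-cell-label-invisible-space` is hypothetical; only `propertize`, `concat`, and the `invisible` text property are standard Emacs facilities.

```elisp
;; Hypothetical helper, not part of Quail or Emacs.
;; It separates the two labels of a keyboard cell with a space that
;; carries the `invisible' text property, as suggested in the message.
(defun my-cell-label-invisible-space (lower upper)
  "Return LOWER and UPPER separated by an invisible space.
LOWER and UPPER may each be a character or a string."
  (concat (if (stringp lower) lower (string lower))
          (propertize " " 'invisible t)
          (if (stringp upper) upper (string upper))))

;; The separator is an ordinary ASCII space, so it does not rely on
;; the terminal or font knowing about U+200C.
(my-cell-label-invisible-space ?a ?A)
```

The property only hides the space when the string is inserted into a buffer whose `buffer-invisibility-spec` treats `t` as invisible, which is the default.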
* Re: bidi and shaping problems in describe-input-method 2012-03-08 15:30 ` Kenichi Handa @ 2012-03-08 18:24 ` Eli Zaretskii 2012-03-08 23:48 ` Kenichi Handa 2012-03-08 18:30 ` Eli Zaretskii 1 sibling, 1 reply; 41+ messages in thread From: Eli Zaretskii @ 2012-03-08 18:24 UTC (permalink / raw) To: Kenichi Handa; +Cc: list-general, emacs-devel > From: Kenichi Handa <handa@m17n.org> > Cc: emacs-devel@gnu.org, eliz@gnu.org > Date: Fri, 09 Mar 2012 00:30:25 +0900 > > By the way, for this bug: > > Mohsen> In the case of 'arabic note how the entire > Mohsen> keyboard is flipped to the right. > > just setting bidi-paragraph-direction to 'left-to-right is > not enough, because keyboard cells in a row are still > re-ordered. Right. I didn't notice it because I don't read Arabic. > For this, the easiest fix is to set bidi-display-reordering to nil. > But, then we can't use actual Arabic and Hebrew words in the > docstrings of those input methods. What we want is to display bidi > reordering only for the keyboard layout part. Eli, don't you have > any good idea? Revision 107535 is the best I can do. I'll let Mohsen judge if it's good enough. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: bidi and shaping problems in describe-input-method
From: Kenichi Handa @ 2012-03-08 23:48 UTC
  To: Eli Zaretskii; +Cc: list-general, emacs-devel

In article <83ipif0x46.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:

> > just setting bidi-paragraph-direction to 'left-to-right is
> > not enough, because keyboard cells in a row are still
> > re-ordered.

> Right.  I didn't notice it because I don't read Arabic.

That reordering happens for the Hebrew input method too. :-p

> > For this, the easiest fix is to set bidi-display-reordering to nil.
> > But, then we can't use actual Arabic and Hebrew words in the
> > docstrings of those input methods.  What we want is to display bidi
> > reordering only for the keyboard layout part.  Eli, don't you have
> > any good idea?

> Revision 107535 is the best I can do.  I'll let Mohsen judge if it's
> good enough.

If possible, I'd like to avoid inserting LRM unconditionally.
Is it possible to have this kind of function?

    (defun quail-help-require-LRM (char)
      (or (eq (get-char-code-property char 'bidi-class) 'L)
          ...))

Then we can use it in quail-insert-kbd-layout as below:

    (if (quail-help-require-LRM (if (stringp lower) (aref lower 0) lower))
        (insert #x200e))
    (insert lower)
    (if (quail-help-require-LRM (if (stringp upper) (aref upper 0) upper))
        (insert #x200e))
    (insert upper)

---
Kenichi Handa
handa@m17n.org
* Re: bidi and shaping problems in describe-input-method 2012-03-08 23:48 ` Kenichi Handa @ 2012-03-09 8:11 ` Eli Zaretskii 2012-03-09 14:03 ` Kenichi Handa 2012-03-09 8:17 ` Eli Zaretskii 1 sibling, 1 reply; 41+ messages in thread From: Eli Zaretskii @ 2012-03-09 8:11 UTC (permalink / raw) To: Kenichi Handa; +Cc: list-general, emacs-devel > From: Kenichi Handa <handa@m17n.org> > Cc: list-general@mohsen.1.banan.byname.net, emacs-devel@gnu.org > Date: Fri, 09 Mar 2012 08:48:35 +0900 > > If possible, I'd like to avoid inserting LRM unconditionally. Why? They are invisible, so they are not displayed at all. > Is it possible to have this kind of function? > > (defun quail-help-require-LRM (char) > (or (eq (get-char-code-property char 'bidi-class) 'L) > ...)) It's possible, but why bother? And with this function you will insert the LRM for many characters that don't need that, like punctuation, numbers, etc. Also, `lower' and `upper' could be strings, in which case you need a more complex test. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: bidi and shaping problems in describe-input-method
From: Kenichi Handa @ 2012-03-09 14:03 UTC
  To: Eli Zaretskii; +Cc: list-general, emacs-devel

In article <83d38m19dk.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:

> > If possible, I'd like to avoid inserting LRM unconditionally.

> Why?  They are invisible, so they are not displayed at all.

In general, it's smarter to use LRM only where necessary.
And when someone cuts and pastes the keyboard layout (or some part
of it) of l2r characters, they will be surprised by the LRM
characters.

> > Is it possible to have this kind of function?
> >
> > (defun quail-help-require-LRM (char)
> >   (or (eq (get-char-code-property char 'bidi-class) 'L)
> >       ...))

> It's possible, but why bother?  And with this function you will insert
> the LRM for many characters that don't need that, like punctuation,
> numbers, etc.

???  I want a function that returns t only for a character
that requires a preceding LRM in the keyboard layout.

> Also, `lower' and `upper' could be strings, in which case you need a
> more complex test.

We can give (if (stringp lower) (aref lower 0) lower) to that
function.

---
Kenichi Handa
handa@m17n.org
* Re: bidi and shaping problems in describe-input-method 2012-03-09 14:03 ` Kenichi Handa @ 2012-03-09 16:12 ` Eli Zaretskii 2012-03-10 2:55 ` Kenichi Handa 0 siblings, 1 reply; 41+ messages in thread From: Eli Zaretskii @ 2012-03-09 16:12 UTC (permalink / raw) To: Kenichi Handa; +Cc: list-general, emacs-devel > From: Kenichi Handa <handa@m17n.org> > Cc: list-general@mohsen.1.banan.byname.net, emacs-devel@gnu.org > Date: Fri, 09 Mar 2012 23:03:32 +0900 > > In article <83d38m19dk.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes: > > > > If possible, I'd like to avoid inserting LRM unconditionally. > > > Why? They are invisible, so they are not displayed at all. > > In general, it's smarter to use LRM only where necessary. Testing whether they are necessary is a problem in itself. You can easily avoid inserting the marks for strong L2R characters, but they are the minority. Most of the characters are not in that category. And of course keyboard layouts include such characters. > > > (defun quail-help-require-LRM (char) > > > (or (eq (get-char-code-property char 'bidi-class) 'L) > > > ...)) > > > It's possible, but why bother? And with this function you will insert > > the LRM for many characters that don't need that, like punctuation, > > numbers, etc. > > ??? I want a function that returns t only for a character > that require preceding LRM in the keyboard layout. Yes, I understand that. But the test you are suggesting, i.e. avoid the LRM only for characters whose bidi-class is L, will not catch numbers, punctuation, and other non-L characters. > > Also, `lower' and `upper' could be strings, in which case you need a > > more complex test. > > We can give (if (string lower) (aref lower 0) lower) to that > function. But that doesn't DTRT. Here's an example where it will fail: ".A". AFAIK, the only reliable way of telling whether a given string will be reordered is to actually reorder it, and then compare with the logical-order original. 
That's a nuisance, and also the results may well depend on the characters before and after the string in the buffer, so you need to know the context in advance, which you normally don't. I tried also a different solution: enclose each row of the keyboard layout in an L2R override embedding, LRO..PDF. This inserts only 2 control characters per row, and doesn't insert them inside the keyboard cells, so it is cleaner, I think. But using this means that no key description in the layout can be a string that requires reordering individually. (By contrast, inserting an LRM between the lower and the upper key still allows each description to be reordered.) Can we live with such a restriction? I don't know enough about Quail to tell. ^ permalink raw reply [flat|nested] 41+ messages in thread
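The per-row alternative described above can be sketched as below. The names `my-lro`, `my-pdf`, and `my-wrap-row-lro` are hypothetical; the code points are the standard Unicode ones for LEFT-TO-RIGHT OVERRIDE (U+202D) and POP DIRECTIONAL FORMATTING (U+202C).

```elisp
;; Hypothetical helpers, not part of Quail or Emacs.
(defconst my-lro (string #x202d)
  "U+202D LEFT-TO-RIGHT OVERRIDE, as a one-character string.")

(defconst my-pdf (string #x202c)
  "U+202C POP DIRECTIONAL FORMATTING, as a one-character string.")

(defun my-wrap-row-lro (row)
  "Return ROW, a string for one keyboard row, wrapped in LRO..PDF.
Only two control characters are added per row, and none of them
ends up inside a keyboard cell."
  (concat my-lro row my-pdf))

;; Example: one row of cells, each holding a lower and an upper label.
(my-wrap-row-lro "| aA | bB | cC |")
```

The trade-off is the one stated in the message: inside the override every character is treated as strong left-to-right, so a multi-character right-to-left label would not be reordered on its own.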
* Re: bidi and shaping problems in describe-input-method
From: Kenichi Handa @ 2012-03-10  2:55 UTC
  To: Eli Zaretskii; +Cc: list-general, emacs-devel

In article <83pqclzrb5.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:

> > In general, it's smarter to use LRM only where necessary.

> Testing whether they are necessary is a problem in itself.  You can
> easily avoid inserting the marks for strong L2R characters, but they
> are the minority.  Most of the characters are not in that category.
> And of course keyboard layouts include such characters.

> > > > (defun quail-help-require-LRM (char)
> > > >   (or (eq (get-char-code-property char 'bidi-class) 'L)
> > > >       ...))
> >
> > > It's possible, but why bother?  And with this function you will insert
> > > the LRM for many characters that don't need that, like punctuation,
> > > numbers, etc.
> >
> > ???  I want a function that returns t only for a character
> > that require preceding LRM in the keyboard layout.

> Yes, I understand that.  But the test you are suggesting, i.e. avoid
> the LRM only for characters whose bidi-class is L, will not catch
> numbers, punctuation, and other non-L characters.

The function body I wrote is just an idea, not a complete
solution, and of course checking against L is clearly
a bug.  At least we must check against R (and AL).

> > > Also, `lower' and `upper' could be strings, in which case you need a
> > > more complex test.
> >
> > We can give (if (string lower) (aref lower 0) lower) to that
> > function.

> But that doesn't DTRT.  Here's an example where it will fail: ".A".

Why?  Keyboard cells in the keyboard layout typically have
this form (L is for the lower key, U is for the upper
(shifted) key):

    ... | LU | LU | ...

What we want is to display the left LU to the left of the
right LU, and display each L (character or string) to the
left of the corresponding U.

Even if the L (of the left LU) is ".A", we don't need LRM
for it.  We have to insert LRM only before a character that
may reorder the previous characters, and after a character that
may reorder the following character.  Isn't that right?

> AFAIK, the only reliable way of telling whether a given string will be
> reordered is to actually reorder it, and then compare with the
> logical-order original.  That's a nuisance, and also the results may
> well depend on the characters before and after the string in the
> buffer, so you need to know the context in advance, which you normally
> don't.

> I tried also a different solution: enclose each row of the keyboard
> layout in an L2R override embedding, LRO..PDF.  This inserts only 2
> control characters per row, and doesn't insert them inside the
> keyboard cells, so it is cleaner, I think.  But using this means that
> no key description in the layout can be a string that requires
> reordering individually.  (By contrast, inserting an LRM between the
> lower and the upper key still allows each description to be
> reordered.)  Can we live with such a restriction?  I don't know enough
> about Quail to tell.

As it's possible to assign a string to a key, there will be
cases where the characters in the string must be
reordered.  In the above case, if L is the Hebrew word "שלום", it
must be reordered.  But even if we surround that word with
LRE and PDF, the word itself is reordered correctly, right?

---
Kenichi Handa
handa@m17n.org
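One possible reading of "check against R (and AL)" is sketched below. The function name `my-label-has-strong-rtl-p` is hypothetical, and the choice to scan every character of a string label (rather than only the first) is an assumption, not something the thread settles. As noted elsewhere in the discussion, a test of this kind looks at the label in isolation and cannot account for the surrounding buffer context.

```elisp
;; Hypothetical predicate, not part of Quail or Emacs.
;; It reports whether a key label contains a character whose Unicode
;; bidi class is strong right-to-left (R) or Arabic letter (AL).
(defun my-label-has-strong-rtl-p (label)
  "Return non-nil if LABEL contains a strong right-to-left character.
LABEL may be a character or a string."
  (let ((str (if (stringp label) label (string label)))
        (found nil))
    ;; Scan the whole label; `dotimes' returns FOUND when it finishes.
    (dotimes (i (length str) found)
      (when (memq (get-char-code-property (aref str i) 'bidi-class)
                  '(R AL))
        (setq found t)))))

;; Latin letters, digits and punctuation are neither R nor AL.
(my-label-has-strong-rtl-p ".A")
;; A Hebrew word consists of class R characters.
(my-label-has-strong-rtl-p "שלום")
```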
* Re: bidi and shaping problems in describe-input-method 2012-03-10 2:55 ` Kenichi Handa @ 2012-03-10 10:27 ` Eli Zaretskii 2012-03-12 7:47 ` Kenichi Handa 0 siblings, 1 reply; 41+ messages in thread From: Eli Zaretskii @ 2012-03-10 10:27 UTC (permalink / raw) To: Kenichi Handa; +Cc: list-general, emacs-devel > From: Kenichi Handa <handa@m17n.org> > Date: Sat, 10 Mar 2012 11:55:54 +0900 > Cc: list-general@mohsen.1.banan.byname.net, emacs-devel@gnu.org > > The function body I wrote is just an idea, not a complete > solution, and of cource checking against L is apparently > a bug. At least we must check against R (and AL). > > > > > Also, `lower' and `upper' could be strings, in which case you need a > > > > more complex test. > > > > > > We can give (if (string lower) (aref lower 0) lower) to that > > > function. > > > But that doesn't DTRT. Here's an example where it will fail: ".A". > > Why? I was explaining why testing for L is not TRT. > ... | LU | LU | ... > > What we want is to display the left LU to the left of the > right LU, and display each L (character or string) to the > right of the corresponding U. > > Even if the L (of the left LU) is ".A", we don't need LRM > for it. We have to insert LRM only before a character that > may reorder the previous characters, and after a character that > may reorder the following character. Isn't it right? You are describing what bidi-string-mark-left-to-right does, I believe. Note that it will still insert LRM in some cases where it is not strictly needed. > > I tried also a different solution: enclose each row of the keyboard > > layout in an L2R override embedding, LRO..PDF. This inserts only 2 > > control characters per row, and doesn't insert them inside the > > keyboard cells, so it is cleaner, I think. But using this means that > > no key description in the layout can be a string that requires > > reordering individually. 
(By contrast, inserting an LRM between the > > lower and the upper key still allows each description to be > > reordered.) Can we live with such a restriction? I don't know enough > > about Quail to tell. > > As it's possible to assign a string to a key, there will be > the case that the characters in the string must be > reordered. In the above case, if L is a hebrew "שלום", it > must be reordered. But, even if we surround that word with > LRE and PDF, the word itself is reordered correctly, right? Yes. But surrounding each `lower' and `upper' key labels in the layout with LRE..PDF inserts even more bidirectional control characters than just inserting LRM. By contrast, using LRO..PDF around the whole row of keys inserts just 2 such characters, so if it were not for the need to reorder the individual key labels, LRO..PDF would be a better alternative. I mentioned it because it does exactly what you originally asked for: it effectively disables bidi-display-reordering inside the embedded text, while still leaving the rest of the buffer reordered as usual. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: bidi and shaping problems in describe-input-method
From: Kenichi Handa @ 2012-03-12  7:47 UTC
  To: Eli Zaretskii; +Cc: list-general, emacs-devel

In article <8362eczr73.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:

> Yes.  But surrounding each `lower' and `upper' key labels in the
> layout with LRE..PDF inserts even more bidirectional control
> characters than just inserting LRM.  By contrast, using LRO..PDF
> around the whole row of keys inserts just 2 such characters, so if it
> were not for the need to reorder the individual key labels, LRO..PDF
> would be a better alternative.  I mentioned it because it does exactly
> what you originally asked for: it effectively disables
> bidi-display-reordering inside the embedded text, while still leaving
> the rest of the buffer reordered as usual.

I mixed up LRE and LRO, sorry.  Anyway, if LRO..PDF
works, it is surely better than many LRMs.  I've just
installed a proper change including the magic of
compose-string.  Please try the latest code.

---
Kenichi Handa
handa@m17n.org
* Re: bidi and shaping problems in describe-input-method 2012-03-12 7:47 ` Kenichi Handa @ 2012-03-12 17:42 ` Eli Zaretskii 2012-03-13 0:58 ` Kenichi Handa 2012-03-13 5:46 ` Mohsen BANAN 1 sibling, 1 reply; 41+ messages in thread From: Eli Zaretskii @ 2012-03-12 17:42 UTC (permalink / raw) To: Kenichi Handa; +Cc: list-general, emacs-devel > From: Kenichi Handa <handa@m17n.org> > Cc: list-general@mohsen.1.banan.byname.net, emacs-devel@gnu.org > Date: Mon, 12 Mar 2012 16:47:11 +0900 > > In article <8362eczr73.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes: > > > Yes. But surrounding each `lower' and `upper' key labels in the > > layout with LRE..PDF inserts even more bidirectional control > > characters than just inserting LRM. By contrast, using LRO..PDF > > around the whole row of keys inserts just 2 such characters, so if it > > were not for the need to reorder the individual key labels, LRO..PDF > > would be a better alternative. I mentioned it because it does exactly > > what you originally asked for: it effectively disables > > bidi-display-reordering inside the embedded text, while still leaving > > the rest of the buffer reordered as usual. > > I mixed up with LRE and LRO, sorry. Anyway, if LRO..PDF > works, it is surely better than many LRMs. I've just > installed a proper change including the magic of > compose-string. Please try the latest code. It works fine for me, thanks. However, using LRO..PDF means that no label on a key can use a string that needs to be reordered. That's because the LRO overrides the bidirectional properties of all the following characters to be strong L. If we can live with this limitation, I agree that this is better. But I think you said earlier that such a restriction is more than we can bear. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: bidi and shaping problems in describe-input-method
From: Kenichi Handa @ 2012-03-13  0:58 UTC
  To: Eli Zaretskii; +Cc: list-general, emacs-devel

In article <8362e9yaum.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:

> However, using LRO..PDF means that no label on a key can use a string
> that needs to be reordered.  That's because the LRO overrides the
> bidirectional properties of all the following characters to be strong
> L.

Ahh, ummm, that's not good.  I'm still misunderstanding LRO. :-(

> If we can live with this limitation, I agree that this is better.
> But I think you said earlier that such a restriction is more than we
> can bear.

What we need is to display (only capital letters are Hebrew):

    ... | HEB REW | ABC DEF | ...

as

    ... | BEH WER | CBA FED | ...

If neither LRO..PDF nor LRE..PDF works, and if there's no easy
way to determine when to insert LRM, the only way is to
insert LRMs unconditionally.

---
Kenichi Handa
handa@m17n.org
* Re: bidi and shaping problems in describe-input-method 2012-03-13 0:58 ` Kenichi Handa @ 2012-03-13 3:58 ` Eli Zaretskii 2012-03-22 4:26 ` Kenichi Handa 0 siblings, 1 reply; 41+ messages in thread From: Eli Zaretskii @ 2012-03-13 3:58 UTC (permalink / raw) To: Kenichi Handa; +Cc: list-general, emacs-devel > From: Kenichi Handa <handa@m17n.org> > Cc: list-general@mohsen.1.banan.byname.net, emacs-devel@gnu.org > Date: Tue, 13 Mar 2012 09:58:46 +0900 > > What we need is to display (only capital letters are Hebrew): > ... | HEB REW | ABC DEF | ... > as > ... | BEH WER | CBA FED | ... Right. > If none of LRO..PDF, LRE..PDF work, and if there's no easy > way to determine when to insert LRM, the only way is to > insert LRMs unconditionally. You can use bidi-string-mark-left-to-right, I think, which will refrain from inserting the LRM characters where possible. ^ permalink raw reply [flat|nested] 41+ messages in thread
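A rough sketch of how `bidi-string-mark-left-to-right` could be applied per label, as suggested above. The wrapper name `my-cell-label-marked` is hypothetical, and this is not the change that was actually committed to quail.el; it only shows the shape of the call.

```elisp
;; Hypothetical helper, not the actual quail.el change.
;; `bidi-string-mark-left-to-right' is a standard Emacs function
;; (Emacs 24 and later) that returns its string argument, possibly
;; followed by an invisible LEFT-TO-RIGHT MARK when that is needed
;; for safe insertion into left-to-right text.
(defun my-cell-label-marked (lower upper)
  "Return the labels LOWER and UPPER of one cell, each marked L2R.
LOWER and UPPER may each be a character or a string."
  (concat (bidi-string-mark-left-to-right
           (if (stringp lower) lower (string lower)))
          (bidi-string-mark-left-to-right
           (if (stringp upper) upper (string upper)))))

;; Purely left-to-right labels come back unchanged.
(my-cell-label-marked ?a ?A)
;; Labels with right-to-left characters may gain a trailing mark.
(my-cell-label-marked "ש" "A")
```

Because each label is marked separately, a multi-character right-to-left label can still be reordered internally, which is the property the LRO..PDF approach gives up.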
* Re: bidi and shaping problems in describe-input-method 2012-03-13 3:58 ` Eli Zaretskii @ 2012-03-22 4:26 ` Kenichi Handa 2012-03-22 17:23 ` Eli Zaretskii 2012-03-22 21:59 ` Mohsen BANAN 0 siblings, 2 replies; 41+ messages in thread From: Kenichi Handa @ 2012-03-22 4:26 UTC (permalink / raw) To: Eli Zaretskii; +Cc: list-general, emacs-devel In article <83ty1tw3rs.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes: > > If none of LRO..PDF, LRE..PDF work, and if there's no easy > > way to determine when to insert LRM, the only way is to > > insert LRMs unconditionally. > You can use bidi-string-mark-left-to-right, I think, which will > refrain from inserting the LRM characters where possible. I see. I've just committed a change to use bidi-string-mark-left-to-right. Mohsen, could you please try again with the latest code? --- Kenichi Handa handa@m17n.org ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: bidi and shaping problems in describe-input-method 2012-03-22 4:26 ` Kenichi Handa @ 2012-03-22 17:23 ` Eli Zaretskii 2012-03-23 1:41 ` Kenichi Handa 2012-03-22 21:59 ` Mohsen BANAN 1 sibling, 1 reply; 41+ messages in thread From: Eli Zaretskii @ 2012-03-22 17:23 UTC (permalink / raw) To: Kenichi Handa; +Cc: list-general, emacs-devel > From: Kenichi Handa <handa@m17n.org> > Cc: list-general@mohsen.1.banan.byname.net, emacs-devel@gnu.org > Date: Thu, 22 Mar 2012 13:26:32 +0900 > > In article <83ty1tw3rs.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes: > > > > If none of LRO..PDF, LRE..PDF work, and if there's no easy > > > way to determine when to insert LRM, the only way is to > > > insert LRMs unconditionally. > > > You can use bidi-string-mark-left-to-right, I think, which will > > refrain from inserting the LRM characters where possible. > > I see. I've just committed a change to use > bidi-string-mark-left-to-right. Looks good to me (but Mohsen should tell). Btw, there's some strange problem in displaying one label of the hebrew-biblical-tiro input method: the character u+05ba (inserted by Shift-5 key) is displayed as a blank rectangle. It looks like my fonts have no glyph for this character, but then why don't we display this like any other glyphless character: as a hex code inside a small rectangle? That's what I get if I insert this character into a buffer, but somehow the way we display it in the keyboard layout (and in the "C-u C-x =" display under "decomposition") behaves differently. Why is that? ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: bidi and shaping problems in describe-input-method
From: Kenichi Handa @ 2012-03-23  1:41 UTC
  To: Eli Zaretskii; +Cc: list-general, emacs-devel

In article <83fwd0wnwl.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:

> Btw, there's some strange problem in displaying one label of the
> hebrew-biblical-tiro input method: the character u+05ba (inserted by
> Shift-5 key) is displayed as a blank rectangle.  It looks like my
> fonts have no glyph for this character, but then why don't we display
> this like any other glyphless character: as a hex code inside a small
> rectangle?  That's what I get if I insert this character into a
> buffer, but somehow the way we display it in the keyboard layout (and
> in the "C-u C-x =" display under "decomposition") behaves differently.
> Why is that?

As that character is a non-spacing modifier, we display it
with a static composition, and a glyph in a static
composition is displayed as a blank rectangle if no font is
available.  This is because a hex code would make the resulting
display of the composition (several glyphs may occupy a single
column) unreadable.

It may be possible to change the current code to display a hex
code when a composition contains just one glyph and
that glyph has no font, but that may have to wait for 24.2.

---
Kenichi Handa
handa@m17n.org
* bug#11072: Display of glyphless non-spacing modifiers via a static composition 2012-03-23 1:41 ` Kenichi Handa @ 2012-03-23 10:12 ` Eli Zaretskii 2019-10-30 23:13 ` Stefan Kangas 2012-03-23 10:12 ` bidi and shaping problems in describe-input-method Eli Zaretskii 1 sibling, 1 reply; 41+ messages in thread From: Eli Zaretskii @ 2012-03-23 10:12 UTC (permalink / raw) To: Kenichi Handa; +Cc: 11072 > From: Kenichi Handa <handa@m17n.org> > Cc: list-general@mohsen.1.banan.byname.net, emacs-devel@gnu.org > Date: Fri, 23 Mar 2012 10:41:07 +0900 > > In article <83fwd0wnwl.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes: > > > Btw, there's some strange problem in displaying one label of the > > hebrew-biblical-tiro input method: the character u+05ba (inserted by > > Shift-5 key) is displayed as a blank rectangle. It looks like my > > fonts have no glyph for this character, but then why don't we display > > this like any other glyphless character: as a hex code inside a small > > rectangle? That's what I get if I insert this character into a > > buffer, but somehow the way we display it in the keyboard layout (and > > in the "C-u C-x =" display under "decomposition") behaves differently. > > Why is that? > > As that character is a non-spacing modifier, we display it > with a static composition, and a glyph in a static > composition are displayed by a blank rectangle if no font is > available. This is because a hex code makes the resulting > display of composition (several glyphs may occupy a single > column) unreadable. > > It may be possible to change the current code to use a hex > code displaying if a composition contains just one glyph and > that glyph has no font, but it may be for 24.2. So this bug will wait for after Emacs 24.1 release to be fixed. Thanks. ^ permalink raw reply [flat|nested] 41+ messages in thread
* bug#11072: Display of glyphless non-spacing modifiers via a static composition 2012-03-23 10:12 ` bug#11072: Display of glyphless non-spacing modifiers via a static composition Eli Zaretskii @ 2019-10-30 23:13 ` Stefan Kangas 2019-10-31 14:13 ` Eli Zaretskii 0 siblings, 1 reply; 41+ messages in thread From: Stefan Kangas @ 2019-10-30 23:13 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 11072, Kenichi Handa Eli Zaretskii <eliz@gnu.org> writes: >> From: Kenichi Handa <handa@m17n.org> >> Cc: list-general@mohsen.1.banan.byname.net, emacs-devel@gnu.org >> Date: Fri, 23 Mar 2012 10:41:07 +0900 >> >> In article <83fwd0wnwl.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes: >> >> > Btw, there's some strange problem in displaying one label of the >> > hebrew-biblical-tiro input method: the character u+05ba (inserted by >> > Shift-5 key) is displayed as a blank rectangle. It looks like my >> > fonts have no glyph for this character, but then why don't we display >> > this like any other glyphless character: as a hex code inside a small >> > rectangle? That's what I get if I insert this character into a >> > buffer, but somehow the way we display it in the keyboard layout (and >> > in the "C-u C-x =" display under "decomposition") behaves differently. >> > Why is that? >> >> As that character is a non-spacing modifier, we display it >> with a static composition, and a glyph in a static >> composition are displayed by a blank rectangle if no font is >> available. This is because a hex code makes the resulting >> display of composition (several glyphs may occupy a single >> column) unreadable. >> >> It may be possible to change the current code to use a hex >> code displaying if a composition contains just one glyph and >> that glyph has no font, but it may be for 24.2. > > So this bug will wait for after Emacs 24.1 release to be fixed. > > Thanks. Hi Eli, No update on this bug in 7 years. Has this been fixed in the intervening time? 
Best regards, Stefan Kangas
* bug#11072: Display of glyphless non-spacing modifiers via a static composition 2019-10-30 23:13 ` Stefan Kangas @ 2019-10-31 14:13 ` Eli Zaretskii 0 siblings, 0 replies; 41+ messages in thread From: Eli Zaretskii @ 2019-10-31 14:13 UTC (permalink / raw) To: Stefan Kangas; +Cc: 11072-done, handa > From: Stefan Kangas <stefan@marxist.se> > Cc: Kenichi Handa <handa@m17n.org>, 11072@debbugs.gnu.org > Date: Thu, 31 Oct 2019 00:13:16 +0100 > > >> As that character is a non-spacing modifier, we display it > >> with a static composition, and a glyph in a static > >> composition are displayed by a blank rectangle if no font is > >> available. This is because a hex code makes the resulting > >> display of composition (several glyphs may occupy a single > >> column) unreadable. > >> > >> It may be possible to change the current code to use a hex > >> code displaying if a composition contains just one glyph and > >> that glyph has no font, but it may be for 24.2. > > > > So this bug will wait for after Emacs 24.1 release to be fixed. > > > > Thanks. > > Hi Eli, > > No update on this bug in 7 years. Has this been fixed in the > intervening time? It seems so: I now see a box with a hex code. Closing. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: bidi and shaping problems in describe-input-method 2012-03-23 1:41 ` Kenichi Handa 2012-03-23 10:12 ` bug#11072: Display of glyphless non-spacing modifiers via a static composition Eli Zaretskii @ 2012-03-23 10:12 ` Eli Zaretskii 1 sibling, 0 replies; 41+ messages in thread From: Eli Zaretskii @ 2012-03-23 10:12 UTC (permalink / raw) To: Kenichi Handa; +Cc: list-general, emacs-devel > From: Kenichi Handa <handa@m17n.org> > Cc: list-general@mohsen.1.banan.byname.net, emacs-devel@gnu.org > Date: Fri, 23 Mar 2012 10:41:07 +0900 > > In article <83fwd0wnwl.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes: > > > Btw, there's some strange problem in displaying one label of the > > hebrew-biblical-tiro input method: the character u+05ba (inserted by > > Shift-5 key) is displayed as a blank rectangle. It looks like my > > fonts have no glyph for this character, but then why don't we display > > this like any other glyphless character: as a hex code inside a small > > rectangle? That's what I get if I insert this character into a > > buffer, but somehow the way we display it in the keyboard layout (and > > in the "C-u C-x =" display under "decomposition") behaves differently. > > Why is that? > > As that character is a non-spacing modifier, we display it > with a static composition, and a glyph in a static > composition are displayed by a blank rectangle if no font is > available. This is because a hex code makes the resulting > display of composition (several glyphs may occupy a single > column) unreadable. > > It may be possible to change the current code to use a hex > code displaying if a composition contains just one glyph and > that glyph has no font, but it may be for 24.2. Fair enough. I filed a bug report about this, so it doesn't get forgotten. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: bidi and shaping problems in describe-input-method 2012-03-22 4:26 ` Kenichi Handa 2012-03-22 17:23 ` Eli Zaretskii @ 2012-03-22 21:59 ` Mohsen BANAN 1 sibling, 0 replies; 41+ messages in thread From: Mohsen BANAN @ 2012-03-22 21:59 UTC (permalink / raw) To: Kenichi Handa; +Cc: Eli Zaretskii, list-general, emacs-devel >>>>> On Thu, 22 Mar 2012 13:26:32 +0900, Kenichi Handa <handa@m17n.org> said: Kenichi> In article <83ty1tw3rs.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes: >> You can use bidi-string-mark-left-to-right, I think, which will >> refrain from inserting the LRM characters where possible. Kenichi> I see. I've just committed a change to use Kenichi> bidi-string-mark-left-to-right. Kenichi> Mohsen, could you please try again with the latest code? I tried describe-input-method for arabic and persian input methods. All 3 looked correct. Thanks. ...Mohsen ^ permalink raw reply [flat|nested] 41+ messages in thread
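Editorial sketch of the `bidi-string-mark-left-to-right` suggestion quoted above. This is not the change that was committed to quail.el; the helper name `my-cell-label` is hypothetical, and the comment about an appended LRM describes the function's documented behaviour as I understand it, so treat it as an assumption.

```elisp
;; Hypothetical helper for illustration; not the actual quail.el code.
(defun my-cell-label (lower upper)
  "Return a label string for one key cell showing LOWER then UPPER.
LOWER and UPPER are characters.  The label is passed through
`bidi-string-mark-left-to-right' so that a right-to-left label is
less likely to reorder neighbouring cells in left-to-right text."
  (bidi-string-mark-left-to-right (string lower upper)))

;; A pure-ASCII label should come back unchanged.
(my-cell-label ?a ?A)

;; A label with RTL characters still starts with the original text;
;; the function may append an invisible LEFT-TO-RIGHT MARK after it.
(my-cell-label #x642 #x64C)
```

The point of going through the function rather than always appending U+200E by hand is the one made in the message: it refrains from adding the control character when the string does not need it.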
* Re: bidi and shaping problems in describe-input-method 2012-03-12 7:47 ` Kenichi Handa 2012-03-12 17:42 ` Eli Zaretskii @ 2012-03-13 5:46 ` Mohsen BANAN 1 sibling, 0 replies; 41+ messages in thread From: Mohsen BANAN @ 2012-03-13 5:46 UTC (permalink / raw) To: Kenichi Handa; +Cc: Eli Zaretskii, list-general, emacs-devel >>>>> On Mon, 12 Mar 2012 16:47:11 +0900, Kenichi Handa <handa@m17n.org> said: Kenichi> I've just installed a proper change Kenichi> including the magic of compose-string. Kenichi> Please try the latest code. I tried the latest code and both the shaping and the bidi problems in describe-input-method are properly fixed for both the Persian and Arabic keyboards. Thanks Kenichi. Thanks Eli. ...Mohsen ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: bidi and shaping problems in describe-input-method 2012-03-08 23:48 ` Kenichi Handa 2012-03-09 8:11 ` Eli Zaretskii @ 2012-03-09 8:17 ` Eli Zaretskii 1 sibling, 0 replies; 41+ messages in thread From: Eli Zaretskii @ 2012-03-09 8:17 UTC (permalink / raw) To: Kenichi Handa; +Cc: list-general, emacs-devel > From: Kenichi Handa <handa@m17n.org> > Cc: list-general@mohsen.1.banan.byname.net, emacs-devel@gnu.org > Date: Fri, 09 Mar 2012 08:48:35 +0900 > > In article <83ipif0x46.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes: > > > > just setting bidi-paragraph-direction to 'left-to-right is > > > not enough, because keyboard cells in a row are still > > > re-ordered. > > > Right. I didn't notice it because I don't read Arabic. > > That's re-ordering happens for Hebrew input method too. :-p But it's all but impossible to notice it there, because it only happens for a few key cells in the middle of a boring keyboard layout. You cannot see it unless you move the cursor past those cells. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: bidi and shaping problems in describe-input-method 2012-03-08 15:30 ` Kenichi Handa 2012-03-08 18:24 ` Eli Zaretskii @ 2012-03-08 18:30 ` Eli Zaretskii 2012-03-08 18:53 ` Eli Zaretskii 2012-03-08 23:19 ` Kenichi Handa 1 sibling, 2 replies; 41+ messages in thread From: Eli Zaretskii @ 2012-03-08 18:30 UTC (permalink / raw) To: Kenichi Handa; +Cc: list-general, emacs-devel > From: Kenichi Handa <handa@m17n.org> > Cc: emacs-devel@gnu.org, eliz@gnu.org > Date: Fri, 09 Mar 2012 00:30:25 +0900 > > > So, the simplest fix (and perhaps > > the-right-thing-to-do) is to ALWAYS insert a > > (ucs-insert 8204) -- zero width non-joiner -- > > between the two characters in each and every > > keyboard cell. > > If we insert something unconditionally, I think inserting > (propertize " " 'invisible t) is safer. Unfortunately, this doesn't work: invisible characters are not handed to the shaping engine, they are silently skipped by the display engine. So the characters are still joined. We need something smarter here. I'll let you and Mohsen find the solution to this one. ^ permalink raw reply [flat|nested] 41+ messages in thread
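Editorial sketch of the ZERO WIDTH NON-JOINER proposal quoted above (code point 8204, i.e. U+200C). The names `my-zwnj` and `my-join-cell-chars` are hypothetical. This only illustrates the proposal; later messages in the thread report that the fix eventually installed used `compose-string` instead.

```elisp
;; Hypothetical names for illustration; this is the proposal from the
;; thread, not what was eventually installed in quail.el.
(defconst my-zwnj (string #x200C)
  "ZERO WIDTH NON-JOINER (decimal 8204) as a one-character string.")

(defun my-join-cell-chars (lower upper)
  "Return a string with characters LOWER and UPPER separated by a ZWNJ.
The ZWNJ is meant to stop a shaping engine from joining the two."
  (concat (string lower) my-zwnj (string upper)))

;; ARABIC LETTER GHAIN and ARABIC LETTER ALEF WITH HAMZA BELOW,
;; the pair used as the example in the original report.
(my-join-cell-chars #x63A #x625)
```

Whether the two letters are actually rendered unjoined still depends on the font and the shaping engine in use.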
* Re: bidi and shaping problems in describe-input-method 2012-03-08 18:30 ` Eli Zaretskii @ 2012-03-08 18:53 ` Eli Zaretskii 2012-03-08 23:27 ` Kenichi Handa 2012-03-08 23:19 ` Kenichi Handa 1 sibling, 1 reply; 41+ messages in thread From: Eli Zaretskii @ 2012-03-08 18:53 UTC (permalink / raw) To: handa; +Cc: list-general, emacs-devel > Date: Thu, 08 Mar 2012 20:30:51 +0200 > From: Eli Zaretskii <eliz@gnu.org> > Cc: list-general@mohsen.1.banan.byname.net, emacs-devel@gnu.org > > > If we insert something unconditionally, I think inserting > > (propertize " " 'invisible t) is safer. > > Unfortunately, this doesn't work: invisible characters are not handed > to the shaping engine, they are silently skipped by the display > engine. So the characters are still joined. > > We need something smarter here. One obvious possibility is to turn off auto-composition-mode. But when I tried that, unexpected characters showed up in some cells, e.g. in the T cell and in the G cell. I guess some characters shown in the Arabic keyboard layout do need auto-composition-mode? ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: bidi and shaping problems in describe-input-method 2012-03-08 18:53 ` Eli Zaretskii @ 2012-03-08 23:27 ` Kenichi Handa 0 siblings, 0 replies; 41+ messages in thread From: Kenichi Handa @ 2012-03-08 23:27 UTC (permalink / raw) To: Eli Zaretskii; +Cc: list-general, emacs-devel In article <83fwdi2ac3.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes: > One obvious possibility is to turn off auto-composition-mode. But > when I tried that, unexpected characters showed up in some cells, > e.g. in the T cell and in the G cell. I guess some characters shown in > the Arabic keyboard layout do need auto-composition-mode? Yes. Some keys insert two characters that are composed into one glyph. --- Kenichi Handa handa@m17n.org ^ permalink raw reply [flat|nested] 41+ messages in thread
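Editorial sketch of the experiment discussed above: turning off `auto-composition-mode` in one buffer only. The helper name `my-call-without-auto-composition` is hypothetical. As the reply notes, this is not a usable fix for the Arabic layout, because some keys insert two characters that must be composed into one glyph.

```elisp
;; Hypothetical helper for illustration only.
(defun my-call-without-auto-composition (fn)
  "Call FN in a temporary buffer where `auto-composition-mode' is off.
Return a cons (MODE . TEXT): the value of `auto-composition-mode'
in that buffer after FN has run, and the text FN inserted."
  (with-temp-buffer
    ;; Disable automatic composition in this buffer before FN runs.
    (auto-composition-mode -1)
    (funcall fn)
    (cons auto-composition-mode (buffer-string))))

;; ARABIC LETTER QAF followed by ARABIC DAMMATAN, the pair used
;; elsewhere in this thread.
(my-call-without-auto-composition
 (lambda () (insert #x642 #x64C)))
```

The buffer text is the same either way; only the display-time composition of adjacent characters is affected, which is why the effect has to be judged on screen.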
* Re: bidi and shaping problems in describe-input-method 2012-03-08 18:30 ` Eli Zaretskii 2012-03-08 18:53 ` Eli Zaretskii @ 2012-03-08 23:19 ` Kenichi Handa 2012-03-09 8:15 ` Eli Zaretskii 1 sibling, 1 reply; 41+ messages in thread From: Kenichi Handa @ 2012-03-08 23:19 UTC (permalink / raw) To: Eli Zaretskii; +Cc: list-general, emacs-devel In article <83haxz0wtg.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes: > > If we insert something unconditionally, I think inserting > > (propertize " " 'invisible t) is safer. > Unfortunately, this doesn't work: invisible characters are not handed > to the shaping engine, they are silently skipped by the display > engine. So the characters are still joined. No, the shaping engine checks buffer/string contents. So, if there's a space between A and B, the rule for shaping AB sequence is not activated. Please try these two: (insert #x642 #x64C) (insert #x642 (propertize " " 'invisible t) #x64C) --- Kenichi Handa handa@m17n.org ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: bidi and shaping problems in describe-input-method 2012-03-08 23:19 ` Kenichi Handa @ 2012-03-09 8:15 ` Eli Zaretskii 2012-03-09 9:01 ` Juanma Barranquero 2012-03-09 13:54 ` Kenichi Handa 0 siblings, 2 replies; 41+ messages in thread From: Eli Zaretskii @ 2012-03-09 8:15 UTC (permalink / raw) To: Kenichi Handa; +Cc: list-general, emacs-devel > From: Kenichi Handa <handa@m17n.org> > Cc: list-general@mohsen.1.banan.byname.net, emacs-devel@gnu.org > Date: Fri, 09 Mar 2012 08:19:20 +0900 > > No, the shaping engine checks buffer/string contents. So, > if there's a space between A and B, the rule for shaping AB > sequence is not activated. Please try these two: > > (insert #x642 #x64C) > (insert #x642 (propertize " " 'invisible t) #x64C) This looks exactly identical to me (on MS-Windows), except that the second one causes annoying behavior of cursor motion around the inserted text. Does it work for you on GNU/Linux? If so, does it work for you to change quail-insert-kbd-layout to use this trick in order to separate the `lower' from the `upper' in the key cells? I tried that on my machine, and it didn't have the desired effect. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: bidi and shaping problems in describe-input-method 2012-03-09 8:15 ` Eli Zaretskii @ 2012-03-09 9:01 ` Juanma Barranquero 2012-03-09 9:45 ` Eli Zaretskii 2012-03-09 13:54 ` Kenichi Handa 1 sibling, 1 reply; 41+ messages in thread From: Juanma Barranquero @ 2012-03-09 9:01 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel, list-general, Kenichi Handa [-- Attachment #1: Type: text/plain, Size: 304 bytes --] On Fri, Mar 9, 2012 at 09:15, Eli Zaretskii <eliz@gnu.org> wrote: > This looks exactly identical to me (on MS-Windows), except that the > second one causes annoying behavior of cursor motion around the > inserted text. It does not look identical to me on W7. See attached image. Juanma [-- Attachment #2: bug.png --] [-- Type: image/png, Size: 2205 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: bidi and shaping problems in describe-input-method 2012-03-09 9:01 ` Juanma Barranquero @ 2012-03-09 9:45 ` Eli Zaretskii 2012-03-09 10:02 ` Eli Zaretskii 2012-03-09 11:19 ` Juanma Barranquero 0 siblings, 2 replies; 41+ messages in thread From: Eli Zaretskii @ 2012-03-09 9:45 UTC (permalink / raw) To: Juanma Barranquero; +Cc: emacs-devel, list-general, handa > From: Juanma Barranquero <lekktu@gmail.com> > Date: Fri, 9 Mar 2012 10:01:31 +0100 > Cc: Kenichi Handa <handa@m17n.org>, list-general@mohsen.1.banan.byname.net, > emacs-devel@gnu.org > > > This looks exactly identical to me (on MS-Windows), except that the > > second one causes annoying behavior of cursor motion around the > > inserted text. > > It does not look identical to me on W7. See attached image. What font is used on your machine to render the #x64C character? On my machine it is this: uniscribe:-outline-Courier New-normal-normal-normal-mono-13-*-*-*-c-*-iso10646-1 (#x2F2) If the font doesn't explain that, then perhaps what I see is a bug in the version of Uniscribe on XP. Btw, at least on the screenshot you sent, the display of #x64C is incorrect. Compare with what you see when you type "C-u C-x =" for that character. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: bidi and shaping problems in describe-input-method 2012-03-09 9:45 ` Eli Zaretskii @ 2012-03-09 10:02 ` Eli Zaretskii 2012-03-09 14:11 ` Kenichi Handa 2012-03-09 11:19 ` Juanma Barranquero 1 sibling, 1 reply; 41+ messages in thread From: Eli Zaretskii @ 2012-03-09 10:02 UTC (permalink / raw) To: handa; +Cc: lekktu, list-general, emacs-devel > Date: Fri, 09 Mar 2012 11:45:46 +0200 > From: Eli Zaretskii <eliz@gnu.org> > Cc: emacs-devel@gnu.org, list-general@mohsen.1.banan.byname.net, handa@m17n.org > > Btw, at least on the screenshot you sent, the display of #x64C is > incorrect. Compare with what you see when you type "C-u C-x =" for > that character. The display in "C-u C-x =" is generated by this snippet (from descr-text.el): (insert (char-code-property-description 'decomposition '(#x64C))) Somehow, using this produces a correct display of the character (albeit enclosed in quotes) without any problems. Perhaps Handa-san could explain what kind of magic the above does, as compared to simply inserting the same character into the buffer. The only sign of magic I see is this: (char-code-property-description 'decomposition '(#x64C)) => #("'ٌ'" 1 2 (composition ((1 . " ٌ ")))) So the string produced by char-code-property-description has the `composition' text property on the character we want to display. The value of the text property, in case you wonder, is this: ((1 . "\t\x64C\t")) But how does this countermand the problems is a mystery to me; the ELisp manual says about the value of this property: `composition' This text property is used to display a sequence of characters as a single glyph composed from components. But the value of the property itself is completely internal to Emacs and should not be manipulated directly by, for instance, `put-text-property'. (A.k.a.: "this is need-to-know only, and you don't need to know".) Anyway, maybe we could use something like this in generating the keyboard layouts by quail.el. 
* Re: bidi and shaping problems in describe-input-method 2012-03-09 10:02 ` Eli Zaretskii @ 2012-03-09 14:11 ` Kenichi Handa 0 siblings, 0 replies; 41+ messages in thread From: Kenichi Handa @ 2012-03-09 14:11 UTC (permalink / raw) To: Eli Zaretskii; +Cc: lekktu, list-general, emacs-devel In article <83399i149j.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes: > (insert (char-code-property-description 'decomposition '(#x64C))) > Somehow, using this produces a correct display of the character > (albeit enclosed in quotes) without any problems. Perhaps Handa-san > could explain what kind of magic the above does, as compared to simply > inserting the same character into the buffer. The only sign of magic > I see is this: > (char-code-property-description 'decomposition '(#x64C)) >>> #("'ٌ'" 1 2 (composition ((1 . " ٌ ")))) Yes, that function inserts a static composition that uses this magic (excerpt from the docstring of compose-region): ------------------------------------------------------------ If it is a string, the elements are alternate characters. In this case, TAB element has a special meaning. If the first character is TAB, the glyphs are displayed with left padding space so that no pixel overlaps with the previous column. If the last character is TAB, the glyphs are displayed with right padding space so that no pixel overlaps with the following column. ------------------------------------------------------------ And if there's a static composition, the automatic (dynamic) composition is suppressed. > Anyway, maybe we could use something like this in generating the > keyboard layouts by quail.el. I agree. I'll commit a proper fix soon. --- Kenichi Handa handa@m17n.org ^ permalink raw reply [flat|nested] 41+ messages in thread
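Editorial sketch of the static-composition technique described above: `compose-string` with a components string of TAB, the character, TAB. The helper name `my-static-compose-char` is hypothetical and this is not the code that was committed to quail.el; it only packages the pattern mentioned in the thread.

```elisp
;; Hypothetical helper for illustration; not the actual quail.el fix.
(defun my-static-compose-char (char)
  "Return a one-character string for CHAR with a static composition.
The components string is TAB CHAR TAB.  Per the `compose-region'
docstring quoted above, a leading and a trailing TAB ask for left
and right padding so the glyph does not overlap adjacent columns."
  (let ((str (string char)))
    ;; `compose-string' puts the `composition' text property on STR
    ;; and returns STR.
    (compose-string str 0 1 (string ?\t char ?\t))))

;; ARABIC LETTER QAF followed by a statically composed ARABIC
;; DAMMATAN, inserted into a scratch buffer of its own.
(with-temp-buffer
  (insert #x642 (my-static-compose-char #x64C))
  (buffer-string))
```

Because a fresh string is built with `string` on every call, the helper does not modify a string literal in place, which `compose-string` would otherwise do to its argument.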
* Re: bidi and shaping problems in describe-input-method 2012-03-09 9:45 ` Eli Zaretskii 2012-03-09 10:02 ` Eli Zaretskii @ 2012-03-09 11:19 ` Juanma Barranquero 2012-03-09 11:41 ` Eli Zaretskii 1 sibling, 1 reply; 41+ messages in thread From: Juanma Barranquero @ 2012-03-09 11:19 UTC (permalink / raw) To: Eli Zaretskii; +Cc: handa, list-general, emacs-devel On Fri, Mar 9, 2012 at 10:45, Eli Zaretskii <eliz@gnu.org> wrote: > What font is used on your machine to render the #x64C character? - In the first case: position: 212 of 267 (79%), column: 20 character: ق (displayed as ق) (codepoint 1602, #o3102, #x642) preferred charset: unicode (Unicode (ISO10646)) code point in charset: 0x0642 syntax: w which means: word category: .:Base, R:Right-to-left (strong), b:Arabic buffer code: #xD9 #x82 file code: #xD9 #x82 (encoded by coding system nil) display: composed to form "قٌ" (see below) Composed with the following character(s) "ٌ" using this font: uniscribe:-outline-Courier New-normal-normal-normal-mono-13-*-*-*-c-*-iso10646-1 by these glyphs: [0 1 1602 981 8 1 8 12 4 nil] [0 1 1602 754 0 2 6 12 4 [1 -1 0]] Character code properties: customize what to show name: ARABIC LETTER QAF general-category: Lo (Letter, Other) decomposition: (1602) ('ق') There are text properties here: fontified t - In the second one: position: 265 of 267 (99%), column: 50 character: ق (displayed as ق) (codepoint 1602, #o3102, #x642) preferred charset: unicode (Unicode (ISO10646)) code point in charset: 0x0642 syntax: w which means: word category: .:Base, R:Right-to-left (strong), b:Arabic buffer code: #xD9 #x82 file code: #xD9 #x82 (encoded by coding system nil) display: by this font (glyph code) uniscribe:-outline-Courier New-normal-normal-normal-mono-13-*-*-*-c-*-iso10646-1 (#x3D5) Character code properties: customize what to show name: ARABIC LETTER QAF general-category: Lo (Letter, Other) decomposition: (1602) ('ق') There are text properties here: fontified t Juanma ^ permalink raw reply 
[flat|nested] 41+ messages in thread
* Re: bidi and shaping problems in describe-input-method 2012-03-09 11:19 ` Juanma Barranquero @ 2012-03-09 11:41 ` Eli Zaretskii 2012-03-09 14:56 ` Juanma Barranquero 0 siblings, 1 reply; 41+ messages in thread From: Eli Zaretskii @ 2012-03-09 11:41 UTC (permalink / raw) To: Juanma Barranquero; +Cc: handa, list-general, emacs-devel > From: Juanma Barranquero <lekktu@gmail.com> > Date: Fri, 9 Mar 2012 12:19:41 +0100 > Cc: emacs-devel@gnu.org, list-general@mohsen.1.banan.byname.net, > handa@m17n.org > > On Fri, Mar 9, 2012 at 10:45, Eli Zaretskii <eliz@gnu.org> wrote: > > > What font is used on your machine to render the #x64C character? > > - In the first case: > > position: 212 of 267 (79%), column: 20 > character: ق (displayed as ق) (codepoint 1602, #o3102, #x642) > preferred charset: unicode (Unicode (ISO10646)) > code point in charset: 0x0642 > syntax: w which means: word > category: .:Base, R:Right-to-left (strong), b:Arabic > buffer code: #xD9 #x82 > file code: #xD9 #x82 (encoded by coding system nil) > display: composed to form "قٌ" (see below) > > Composed with the following character(s) "ٌ" using this font: > uniscribe:-outline-Courier > New-normal-normal-normal-mono-13-*-*-*-c-*-iso10646-1 > by these glyphs: > [0 1 1602 981 8 1 8 12 4 nil] > [0 1 1602 754 0 2 6 12 4 [1 -1 0]] > > Character code properties: customize what to show > name: ARABIC LETTER QAF > general-category: Lo (Letter, Other) > decomposition: (1602) ('ق') > > There are text properties here: > fontified t > > - In the second one: > > position: 265 of 267 (99%), column: 50 > character: ق (displayed as ق) (codepoint 1602, #o3102, #x642) > preferred charset: unicode (Unicode (ISO10646)) > code point in charset: 0x0642 > syntax: w which means: word > category: .:Base, R:Right-to-left (strong), b:Arabic > buffer code: #xD9 #x82 > file code: #xD9 #x82 (encoded by coding system nil) > display: by this font (glyph code) > uniscribe:-outline-Courier > 
New-normal-normal-normal-mono-13-*-*-*-c-*-iso10646-1 (#x3D5) > > Character code properties: customize what to show > name: ARABIC LETTER QAF > general-category: Lo (Letter, Other) > decomposition: (1602) ('ق') I asked about the #x64C character (1612), not about #x642. The latter is displayed just fine here, it's the former that causes some kind of trouble. Thanks. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: bidi and shaping problems in describe-input-method 2012-03-09 11:41 ` Eli Zaretskii @ 2012-03-09 14:56 ` Juanma Barranquero 0 siblings, 0 replies; 41+ messages in thread From: Juanma Barranquero @ 2012-03-09 14:56 UTC (permalink / raw) To: Eli Zaretskii; +Cc: handa, list-general, emacs-devel On Fri, Mar 9, 2012 at 12:41, Eli Zaretskii <eliz@gnu.org> wrote: > I asked about the #x64C character (1612), not about #x642. Yes, sorry. In the second case, describe-char says that it is a space: position: 266 of 267 (99%), column: 51 character: SPC (displayed as SPC) (codepoint 32, #o40, #x20) preferred charset: ascii (ASCII (ISO646 IRV)) code point in charset: 0x20 syntax: which means: whitespace category: .:Base, a:ASCII, l:Latin buffer code: #x20 file code: #x20 (encoded by coding system nil) display: by this font (glyph code) uniscribe:-outline-Courier New-normal-normal-normal-mono-13-*-*-*-c-*-iso8859-1 (#x03) Character code properties: customize what to show name: SPACE general-category: Zs (Separator, Space) decomposition: (32) (' ') There are text properties here: fontified t invisible t If inserted alone: position: 285 of 289 (98%), column: 0 character: ٌ (displayed as ٌ) (codepoint 1612, #o3114, #x64c) preferred charset: unicode (Unicode (ISO10646)) code point in charset: 0x064C syntax: w which means: word category: b:Arabic buffer code: #xD9 #x8C file code: #xD9 #x8C (encoded by coding system nil) display: composed to form "ٌ" (see below) Composed using this font: uniscribe:-outline-Courier New-normal-normal-normal-mono-13-*-*-*-c-*-iso10646-1 by these glyphs: [0 0 1612 2673 8 0 8 12 4 nil] [0 0 1612 754 0 2 6 12 4 nil] Character code properties: customize what to show name: ARABIC DAMMATAN general-category: Mn (Mark, Nonspacing) decomposition: (1612) ('ٌ') There are text properties here: fontified t ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: bidi and shaping problems in describe-input-method 2012-03-09 8:15 ` Eli Zaretskii 2012-03-09 9:01 ` Juanma Barranquero @ 2012-03-09 13:54 ` Kenichi Handa 2012-03-09 16:15 ` Eli Zaretskii 1 sibling, 1 reply; 41+ messages in thread From: Kenichi Handa @ 2012-03-09 13:54 UTC (permalink / raw) To: Eli Zaretskii; +Cc: list-general, emacs-devel In article <83boo61972.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes: > > (insert #x642 #x64C) > > (insert #x642 (propertize " " 'invisible t) #x64C) > This looks exactly identical to me (on MS-Windows), except that the > second one causes annoying behavior of cursor motion around the > inserted text. > Does it work for you on GNU/Linux? Yes. > If so, does it work for you to change > quail-insert-kbd-layout to use this trick in order to > separate the `lower' from the `upper' in the key cells? Yes. But it depends on the font selected for arabic and the shaping engine for that font. Some shapers display a glyph for an independent combining character with dotted circle (if the width of the glyph is zero). The better result is done by this: (insert #x642 (compose-string "\x64C" 0 1 "\t\x64C\t")) I tried it with 4 fonts on GNU/Linux and all were ok. > I tried that on my machine, and it didn't have the desired > effect. Please try above. --- Kenichi Handa handa@m17n.org ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: bidi and shaping problems in describe-input-method 2012-03-09 13:54 ` Kenichi Handa @ 2012-03-09 16:15 ` Eli Zaretskii 0 siblings, 0 replies; 41+ messages in thread From: Eli Zaretskii @ 2012-03-09 16:15 UTC (permalink / raw) To: Kenichi Handa; +Cc: list-general, emacs-devel > From: Kenichi Handa <handa@m17n.org> > Cc: list-general@mohsen.1.banan.byname.net, emacs-devel@gnu.org > Date: Fri, 09 Mar 2012 22:54:53 +0900 > > In article <83boo61972.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes: > > > > (insert #x642 #x64C) > > > (insert #x642 (propertize " " 'invisible t) #x64C) > > > This looks exactly identical to me (on MS-Windows), except that the > > second one causes annoying behavior of cursor motion around the > > inserted text. > > > Does it work for you on GNU/Linux? > > Yes. > > > If so, does it work for you to change > > quail-insert-kbd-layout to use this trick in order to > > separate the `lower' from the `upper' in the key cells? > > Yes. But it depends on the font selected for arabic and the > shaping engine for that font. Some shapers display a glyph > for an independent combining character with dotted circle (if > the width of the glyph is zero). It looks like Uniscribe on Windows, or at least its version supplied with XP, doesn't live in peace with zero-width combining characters, which is why I don't see the effect of inserting an invisible space. > The better result is done by this: > > (insert #x642 (compose-string "\x64C" 0 1 "\t\x64C\t")) Yes, that's what "C-u C-x =" does, and it works for me as well. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: bidi and shaping problems in describe-input-method 2012-03-06 22:17 bidi and shaping problems in describe-input-method Mohsen BANAN 2012-03-07 4:05 ` Eli Zaretskii @ 2012-03-08 4:30 ` Miles Bader 1 sibling, 0 replies; 41+ messages in thread From: Miles Bader @ 2012-03-08 4:30 UTC (permalink / raw) To: Mohsen BANAN; +Cc: emacs-devel Incidentally, how does one enable `quail-show-keyboard-layout' support for a given input method? I think it would be super useful for e.g. `korean-hangul', but (activate-input-method 'korean-hangul) (quail-show-keyboard-layout) yields nonsense... Thanks, -miles -- Sabbath, n. A weekly festival having its origin in the fact that God made the world in six days and was arrested on the seventh. ^ permalink raw reply [flat|nested] 41+ messages in thread