unofficial mirror of emacs-devel@gnu.org 
* RLM and LRM are composed?
From: Eli Zaretskii @ 2010-03-29 16:06 UTC
  To: Kenichi Handa; +Cc: emacs-devel

Evaluate this form:

  (aset standard-display-table ?‎ (vconcat "->"))

and then visit a file with this single line:

    Hebrew ‏(עברית)	שלום

The character being set up in the standard-display-table is RLM,
RIGHT-TO-LEFT MARK.  If you are reading this in a GUI session, chances
are it will be displayed as whitespace.  The same character is before
the left paren after "Hebrew".  However, Emacs does not display "->"
instead of it, as I'd expect.  It thinks it does (try "C-u C-x =" on
that character), but it doesn't.

If I step with a debugger through produce_glyphs (in the TTY case) or
through x_produce_glyphs (in the GUI case), I see that the glyph we
produce for displaying this character is not IT_CHARACTER, but
IT_COMPOSITION.

Questions:

1. Why do we display this character as composition?  It is not
   supposed to be composed with anything, AFAIK.

2. Is it a bug or a feature that composed characters don't go through
   the display table?  If it's a feature, what is its purpose?

TIA






* Re: RLM and LRM are composed?
From: Eli Zaretskii @ 2010-03-29 16:53 UTC
  To: Eli Zaretskii; +Cc: emacs-devel, handa

> Date: Mon, 29 Mar 2010 19:06:32 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: emacs-devel@gnu.org
> 
> Evaluate this form:
> 
>   (aset standard-display-table ?‎ (vconcat "->"))

Sorry, that was a mistake.  Please use this line instead:

  (aset standard-display-table ?‏ (vconcat "<-"))
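
A minimal sketch of the same setup, assuming only that
standard-display-table may still be nil in the session (it can be
until something creates it).  The guard is an addition for
illustration, and the character is written as #x200f to avoid pasting
an invisible character:

```elisp
;; Create the standard display table if this session has none yet.
(unless standard-display-table
  (setq standard-display-table (make-display-table)))

;; Display RLM (U+200F) as the two characters "<-".
(aset standard-display-table #x200f (vconcat "<-"))

;; Read the entry back: a vector holding the characters ?< and ?-.
(aref standard-display-table #x200f)
```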






* Re: RLM and LRM are composed?
From: Kenichi Handa @ 2010-04-01  6:39 UTC
  To: Eli Zaretskii; +Cc: emacs-devel

In article <837hov159z.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:

> Evaluate this form:
>   (aset standard-display-table ?‎ (vconcat "->"))

> and then visit a file with this single line:

>     Hebrew ‏(עברית)	שלום

> The character being set up in the standard-display-table is RLM,
> RIGHT-TO-LEFT MARK.  If you are reading this in a GUI session, chances
> are it will be displayed as whitespace.  The same character is before
> the left paren after "Hebrew".  However, Emacs does not display "->"
> instead of it, as I'd expect.  It thinks it does (try "C-u C-x =" on
> that character), but it doesn't.

> If I step with a debugger through produce_glyphs (in the TTY case) or
> through x_produce_glyphs (in the GUI case), I see that the glyph we
> produce for displaying this character is not IT_CHARACTER, but
> IT_COMPOSITION.

The current code tries to compose any non-spacing mark
character with the preceding spacing character.  However, a
non-spacing mark is detected by (= (aref char-width-table CH)
0).  This should be changed to check the char-code-property
`general-category'.  I'll fix the code soon, but at the
moment, you can work around the problem with this:

  (aset composition-function-table #x200f nil)
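
A small sketch of applying the workaround so that it can be undone
later.  The variable name my-saved-rlm-rules is hypothetical, and the
rule list stored for U+200F differs between Emacs versions, so it is
saved rather than assumed:

```elisp
;; Remember whatever composition rules are currently registered for RLM.
(defvar my-saved-rlm-rules (aref composition-function-table #x200f))

;; Workaround: register no composition rule for U+200F.
(aset composition-function-table #x200f nil)

;; To undo it later:
;; (aset composition-function-table #x200f my-saved-rlm-rules)
```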

---
Kenichi Handa
handa@m17n.org





* Re: RLM and LRM are composed?
From: Eli Zaretskii @ 2010-04-01  7:56 UTC
  To: Kenichi Handa; +Cc: emacs-devel

> From: Kenichi Handa <handa@m17n.org>
> Date: Thu, 01 Apr 2010 15:39:30 +0900
> Cc: emacs-devel@gnu.org
> 
> The current code tries to compose any non-spacing mark
> character with the preceding spacing character.

Which is probably not a bad idea.

> But, the detection of non-spacing mark is done by
> (= (aref char-width-table CH) 0).

Hmm.. and why is this wrong?

Anyway, (aref char-width-table #x200f) => 0, so it sounds like the
current detection should have worked.  What am I missing?

> at the moment, you can work around the problem with this:
> 
>   (aset composition-function-table #x200f nil)

Yes, this displays according to the display table as expected, thanks.





* Re: RLM and LRM are composed?
From: Kenichi Handa @ 2010-04-01 12:55 UTC
  To: Eli Zaretskii; +Cc: emacs-devel

In article <83eiizy5bg.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:
> > But, the detection of non-spacing mark is done by
> > (= (aref char-width-table CH) 0).

> Hmm.. and why is this wrong?

All formatting characters have width 0, but they are not
combining characters that Unicode expects to be combined
with a preceding base character.
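
To illustrate the difference, here is a sketch assuming an Emacs
that provides get-char-code-property.  RLM and a real combining mark
such as U+0301 are both zero-width, but only the latter has a mark
general category.  The helper name my-combining-mark-p and the choice
of categories (Mn, Mc, Me) are illustrative assumptions, not the
actual fix:

```elisp
(defun my-combining-mark-p (ch)
  "Return non-nil if CH has a Unicode mark general category."
  (memq (get-char-code-property ch 'general-category) '(Mn Mc Me)))

(char-width #x200f)            ; width as Emacs computes it for RLM
(char-width #x0301)            ; width for COMBINING ACUTE ACCENT
(my-combining-mark-p #x200f)   ; nil: RLM is a format character (Cf)
(my-combining-mark-p #x0301)   ; non-nil: U+0301 is a non-spacing mark (Mn)
```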

> Anyway, (aref char-width-table #x200f) => 0, so it sounds like the
> current detection should have worked.  What am I missing?

The situation is a little bit complicated.  For U+200F, we
set this list in the composition-function-table.

(["\\c.\\c^+" 1 compose-gstring-for-graphic]
 [nil 0 compose-gstring-for-graphic])

This should be read as follows (assuming that the buffer
position of U+200F is POS).

(cond
 ((save-excursion (goto-char (1- POS)) (looking-at "\\c.\\c^+"))
  (compose-gstring-for-graphic
   (composition-get-gstring (1- POS) (match-end 0) ...)))
 (t (compose-gstring-for-graphic
     (composition-get-gstring POS (1+ POS) ...))))

Here, as U+200F doesn't have category "^" (combining), the
first condition fails and the second one succeeds, so
compose-gstring-for-graphic tries to compose just the one
character U+200F.  The problem is that the second condition
was originally intended for an independent combining
character that does not follow a base character, not for a
non-combining character of zero width.
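
The category test that decides between the two rules can be tried
directly.  A minimal sketch; the helper name
my-has-combining-category-p is hypothetical, and the result for
U+200F depends on the category tables of the Emacs version in use:

```elisp
(defun my-has-combining-category-p (ch)
  "Return non-nil if CH belongs to category ?^ (combining)."
  (aref (char-category-set ch) ?^))

(my-has-combining-category-p #x0301)  ; COMBINING ACUTE ACCENT
(my-has-combining-category-p #x200f)  ; RLM; nil in the situation described above
```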

What compose-gstring-for-graphic does for a single character
is adjust the metrics of the glyph so that it is displayed
as if it were a spacing character, which lets a user edit
that character easily.

Please give me more time to consider the details of the
current situation.

For your 2nd question:

> 2. Is it a bug or a feature that composed characters don't go through
>    the display table?  If it's a feature, what is its purpose?

perhaps we should apply the display table at least to a
character that is composed only by itself (i.e. one-char
composition as in the above case).

---
Kenichi Handa
handa@m17n.org



