From: Kenichi Handa <handa@m17n.org>
To: Eli Zaretskii <eliz@gnu.org>
Cc: emacs-devel@gnu.org
Subject: Re: Compositions and bidi display
Date: Fri, 30 Apr 2010 21:12:04 +0900
Message-ID: <tl71vdxgmwb.fsf@m17n.org>
In-Reply-To: <83r5lxw8wi.fsf@gnu.org> (message from Eli Zaretskii on Fri, 30 Apr 2010 13:07:41 +0300)
I'll reply to this before replying to your previous mail.
In article <83r5lxw8wi.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:
> > Note that composition_compute_stop_pos just finds a stop
> > position to check, and the actual checking and composing is
> > done by composition_reseat_it which is called by
> > CHAR_COMPOSED_P.
> But it looks like composition_compute_stop_pos does use at least some
> validation for the candidate stop position. AFAIU, this fragment
> finds and validates a static composition:
>   if (find_composition (charpos, endpos, &start, &end, &prop, string)
>       && COMPOSITION_VALID_P (start, end, prop))
>     {
>       cmp_it->stop_pos = endpos = start;
>       cmp_it->ch = -1;
>     }
> So it looks like COMPOSITION_VALID_P is the proper way of validating a
> position that is a candidate for a static composition. Is that true?
Yes.
> If it is true, then the end point of the static composition is given
> by the `end' argument to find_composition,
Yes.
> and all we need is record it in cmp_it.
Record it for what purpose?
Anyway, we call COMPOSITION_VALID_P here so that we can
avoid calling it again in composition_reseat_it. But for
automatic composition, the checking and the actual composing
happen at the same time. So even if we do the check in
composition_compute_stop_pos, composition_reseat_it has to
do it again (for the actual composing).
> And the loop after that, conditioned on auto-composition-mode, seems
> to do a similar job for automatic compositions. Omitting some
> secondary details, that loop does this:
>   while (charpos < endpos)
>     {
>       [advance to the next character]
>       val = CHAR_TABLE_REF (Vcomposition_function_table, c);
>       if (! NILP (val))
>         {
>           Lisp_Object elt;
>           for (; CONSP (val); val = XCDR (val))
>             {
>               elt = XCAR (val);
>               if (VECTORP (elt) && ASIZE (elt) == 3 && NATNUMP (AREF (elt, 1))
>                   && charpos - 1 - XFASTINT (AREF (elt, 1)) >= start)
>                 break;
>             }
>           if (CONSP (val))
>             {
>               cmp_it->lookback = XFASTINT (AREF (elt, 1));
>               cmp_it->stop_pos = charpos - 1 - cmp_it->lookback;
>               cmp_it->ch = c;
>               return;
>             }
>         }
>     }
> This looks as if a position that is a candidate for starting a
> composition sequence should have a non-nil entry in
> composition-function-table for the character at that position, and
> that entry should specify the (relative) character position where the
> sequence might start. Is my understanding correct?
Mostly, but not accurately. The correct statement is "A
position that will be composed with the following and/or the
preceding characters should have a non-nil entry in ...".
We don't record every character that can start a
composition, for efficiency (for instance, it is enough to
record only the combining characters (U+0300...U+03FF) in
composition-function-table).
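As a rough illustration of the lookback idea, here is a minimal
sketch in plain C. It is not the Emacs code: toy_lookback is a
hypothetical stand-in for composition-function-table, and the
assumption that only U+0300..U+036F are registered, each with a
lookback of 1, is made up for the example. The scan stops at the
base character that precedes the first registered character.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for composition-function-table: return the
   lookback count registered for C, or -1 if C has no entry.
   Assumption for this sketch: only U+0300..U+036F are registered,
   each with a lookback of 1 (compose with the preceding character).  */
int
toy_lookback (unsigned int c)
{
  return (c >= 0x0300 && c <= 0x036F) ? 1 : -1;
}

/* Scan TEXT from CHARPOS up to (but not including) ENDPOS and return
   the first position worth checking for a composition, or ENDPOS if
   there is none.  START is the earliest position that the lookback
   is allowed to reach.  */
ptrdiff_t
toy_compute_stop_pos (const unsigned int *text, ptrdiff_t start,
                      ptrdiff_t charpos, ptrdiff_t endpos)
{
  assert (start <= charpos && charpos <= endpos);

  for (ptrdiff_t i = charpos; i < endpos; i++)
    {
      int lookback = toy_lookback (text[i]);

      /* A registered character only counts if its lookback does not
         reach before START.  */
      if (lookback >= 0 && i - lookback >= start)
        return i - lookback;
    }
  return endpos;
}
```

Note that the base character itself needs no table entry in this
sketch; only the character that triggers the composition does.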
> > To move from one composition position to the next, we must actually
> > call autocmp_chars and find where the current composition ends, then
> > start searching for the next composition.
> It is true that the code looking for stop position that might begin an
> automatic composition does not compute the end of the sequence. That
> end is computed by autocmp_chars. But what does this mean in
> practice? Suppose we have found a candidate stop_pos, marked by S
> below:
> abcdeSuvwxyz
> First, a composition sequence cannot be shorter than 2 characters,
> right?
No, a single character can be composed.
> So the next stop_pos cannot be before v. Now suppose that the
> actual composition sequence is "Suvw", and we issue the next call to
> composition_compute_stop_pos at v -- are you saying that it will
> suggest that v is also a possible stop_pos, even though it is in the
> middle of a composition sequence? --- (Q1)
Yes, that happens in Indic scripts. Actually a line
starting with "Suvw" and a line starting with "vw" can have
different compositions at BOL. But, AFAIK, no R2L script
(Arabic, Dhivehi, Hebrew) has such a characteristic. So,
in an ad hoc way, we can treat your (Q1) as false. So,
> If not, then repeated calls to
> composition_compute_stop_pos in the bidi case, without calling
> composition_reseat_it in between, will just be slightly
> more expensive because they will need to examine more positions. Is
> this analysis correct?
it is correct, but only empirically. There may be a script,
somewhere between the Indic and Arabic regions, that uses
the same writing system as Devanagari but is written R2L. I
have no idea.
> > But composition_reseat_it also needs ENDPOS
> We can use IT_CHARPOS + MAX_COMPOSITION_COMPONENTS as ENDPOS, if we
> call composition_reseat_it and composition_compute_stop_pos in the
> forward direction repeatedly, can't we? That's because, when the
> iterator is some position, we are only interested in compositions that
> cover that position.
No. That approach slows down the display of a buffer that
has no composition at all. For such a buffer,
composition_compute_stop_pos should set cmp_it->stop_pos to
the actual endpos so that CHAR_COMPOSED_P quickly returns
zero.
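The cost difference can be sketched with a toy model in plain C.
This is not the Emacs code: toy_count_scans and its WINDOW
parameter are hypothetical, and the model assumes a buffer with no
composition at all. It counts how often the expensive search must
be redone when each search looks ahead only WINDOW positions,
compared with a search that runs to the real end.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model: walk a buffer of N characters that contains no
   composition.  The cheap per-character test is "pos == stop_pos";
   only when it succeeds is the expensive search repeated.  Each
   search looks ahead at most WINDOW positions and, finding nothing,
   sets stop_pos to the limit it reached.  Return the number of
   searches performed.  */
long
toy_count_scans (ptrdiff_t n, ptrdiff_t window)
{
  assert (n >= 0 && window > 0);

  long scans = 0;
  ptrdiff_t stop_pos = 0;

  for (ptrdiff_t pos = 0; pos < n; pos++)
    if (pos == stop_pos)
      {
        ptrdiff_t limit = n - pos < window ? n : pos + window;

        stop_pos = limit;   /* nothing to compose before LIMIT */
        scans++;
      }
  return scans;
}
```

For a 1000-character buffer, toy_count_scans (1000, 1000) performs
one search, while toy_count_scans (1000, 16) performs 63; the
value 16 is only an illustrative window, not the value of
MAX_COMPOSITION_COMPONENTS.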
> > We don't have to re-calculate ENDPOS each time. It must be
> > updated only when we pass over bidi boundary.
> Btw, can we always assume that all the characters of a composition
> sequence are at the same embedding level? I guess IOW I'm asking what
> Emacs features are currently implemented based on compositions?
Yes. I can't think of any situation where characters must
be composed across a bidi boundary. First of all, to which
embedding level would such a composition belong?
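If a composition never crosses a bidi boundary, the search limit
can be clipped to the end of the current run of equal embedding
levels. A minimal sketch in plain C, assuming the levels are
already resolved into an array (toy_level_run_end and that array
are hypothetical; the Emacs bidi iterator does not expose levels
this way):

```c
#include <assert.h>
#include <stddef.h>

/* Return the first position after POS whose embedding level differs
   from the level at POS, or N if the run extends to the end.
   LEVELS holds one resolved embedding level per character.  */
ptrdiff_t
toy_level_run_end (const unsigned char *levels, ptrdiff_t n,
                   ptrdiff_t pos)
{
  assert (pos >= 0 && pos < n);

  ptrdiff_t end = pos;

  while (end < n && levels[end] == levels[pos])
    end++;
  return end;
}
```

The result would serve as the search limit for as long as the
iterator stays inside that run, and would be recomputed only when
the iterator moves into a run of a different level.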
> Obviously, all the characters in a sequence that produces a single
> grapheme must have the same level, but what about compositions that
> produce several grapheme clusters -- can each of the clusters have
> different bidirectional properties?
It is possible to set up a regular expression in an entry of
composition-function-table to do such a composition. But I
think we don't have to support such a thing until we are
faced with a concrete example that needs it (which is quite
doubtful).
---
Kenichi Handa
handa@m17n.org