unofficial mirror of emacs-devel@gnu.org 
From: Eli Zaretskii <eliz@gnu.org>
To: Kenichi Handa <handa@m17n.org>
Cc: emacs-devel@gnu.org
Subject: Re: Compositions and bidi display
Date: Fri, 30 Apr 2010 16:15:13 +0300
Message-ID: <83ljc5w07y.fsf@gnu.org>
In-Reply-To: <tl71vdxgmwb.fsf@m17n.org>

> From: Kenichi Handa <handa@m17n.org>
> Cc: emacs-devel@gnu.org
> Date: Fri, 30 Apr 2010 21:12:04 +0900
> 
> > So it looks like COMPOSITION_VALID_P is the proper way of validating a
> > position that is a candidate for a static composition.  Is that true?
> 
> Yes.
> 
> > If it is true, then the end point of the static composition is given
> > by the `end' argument to find_composition,
> 
> Yes.
> 
> > and all we need is record it in cmp_it.
> 
> Record it for what purpose?

For determining (1) whether the current iterator position is inside a
composition sequence, and (2) when to look for the next possible
composition sequence.

Consider a buffer with 3 composition sequences, indicated by Sn..En:

   S1..E1.......S2..E2.....|.....S3..E3

Suppose the iterator is at the position marked by |.  Then the
iterator does not need to consider composite characters as long as its
character position is between E2 and S3 (exclusively).  If it gets to
between S2 and E2, then it needs to produce the composite character
from S2..E2.  If it goes back beyond S2, it will need to find the
places S1 and E1, and if it gets beyond E3, it will need to find the
next sequence, S4..E4 (not shown above).

IOW, the idea is to keep track of 2 potential composition sequences,
one before and one after the current iterator position, and recompute
them when the iterator is placed outside the region between the start
of the leftmost and the end of the rightmost one.
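
The bookkeeping described above can be sketched roughly as follows.
This is only an illustration: the type and function names are
hypothetical (nothing here is taken from the Emacs sources), and it
assumes that both S and E are character positions that belong to the
sequence, i.e. the ranges are inclusive.

```c
/* Hypothetical names; this is not the cmp_it structure from Emacs.  */
typedef long charpos_t;

/* The two cached sequences around the iterator: S2..E2 and S3..E3.  */
struct cmp_window
{
  charpos_t before_start, before_end;	/* S2, E2 (inclusive, assumed) */
  charpos_t after_start, after_end;	/* S3, E3 (inclusive, assumed) */
};

enum cmp_where
{
  CMP_IN_BEFORE,	/* inside S2..E2: produce that composition */
  CMP_IN_GAP,		/* strictly between E2 and S3: nothing to do */
  CMP_IN_AFTER,		/* inside S3..E3: produce that composition */
  CMP_RECOMPUTE		/* before S2 or after E3: find new sequences */
};

/* Tell where POS lies relative to the two cached sequences in W.  */
enum cmp_where
cmp_window_classify (const struct cmp_window *w, charpos_t pos)
{
  if (pos < w->before_start || pos > w->after_end)
    return CMP_RECOMPUTE;
  if (pos <= w->before_end)
    return CMP_IN_BEFORE;
  if (pos < w->after_start)
    return CMP_IN_GAP;
  return CMP_IN_AFTER;
}
```

With S2..E2 = 10..13 and S3..E3 = 20..23, position 16 falls in the
gap, 12 is inside the first sequence, and 5 or 30 would force the two
sequences to be recomputed.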

But it looks like this idea is not going to work with automatic
compositions, see below.

> > This looks as if a position that is a candidate for starting a
> > composition sequence should have a non-nil entry in
> > composition-function-table for the character at that position, and
> > that entry should specify the (relative) character position where the
> > sequence might start.  Is my understanding correct?
> 
> Mostly, but not accurate.  The correct one is "A position
> that will be composed with the following and/or the
> preceding characters should have a non-nil entry in ...".

Yes, that's what I meant, but failed to express.  Thanks.

> > So the next stop_pos cannot be before v.  Now suppose that the
> > actual composition sequence is "Suvw", and we issue the next call to
> > composition_compute_stop_pos at v -- are you saying that it will
> > suggest that v is also a possible stop_pos, even though it is in the
> > middle of a composition sequence?  --- (Q1)
> 
> Yes, that happens in Indic scripts.  Actually both a line
> starting with "Suvw" and a line starting with "vw" can have
> different compositions at BOL.  But, AFAIK, all R2L scripts
> (Arabic, Dhivehi, Hebrew) don't have such a characteristic.  So,
> in an ad hoc way, we can say that your (Q1) is false.  So,
> 
> > If not, then repeated calls to
> > composition_compute_stop_pos in the bidi case, without calling
> > composition_reseat_it in between, will just be slightly
> > more expensive because they will need to examine more positions.  Is
> > this analysis correct?
> 
> it is correct, but only empirically.

Unfortunately, this means that Q1 must be considered to be true.  The
reason is the following subtlety of bidi reordering: in R2L
paragraphs, where the base embedding level is 1 (as opposed to zero in
L2R paragraphs), the bidi iterator delivers R2L characters in their
logical order, and reorders the L2R characters.  (We then reverse the
character order for display in append_glyph, which prepends each new
glyph instead of appending it, in such paragraphs.)  So, if an Indic
script is embedded in an R2L paragraph, it will hit this issue,
because the iterator will see Indic characters in reverse order.
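
To make the effect of prepending concrete, here is a toy model of a
glyph row.  It is a sketch only: struct glyph_row and row_add_glyph
are hypothetical stand-ins, one char stands for one glyph, and the
real append_glyph does considerably more than this.

```c
#include <string.h>

/* Hypothetical toy glyph row; one char models one glyph.  */
struct glyph_row
{
  char glyphs[64];	/* NUL-terminated so it can be inspected as a string */
  size_t used;
};

/* Add glyph G to ROW.  In an R2L paragraph the new glyph goes to the
   front of the row, which reverses the order in which the iterator
   delivered the characters; otherwise it goes to the end.
   Return 1 on success, 0 if the row is full.  */
int
row_add_glyph (struct glyph_row *row, char g, int r2l_paragraph)
{
  if (row->used + 1 >= sizeof row->glyphs)
    return 0;
  if (r2l_paragraph)
    {
      /* Shift the existing glyphs right by one and prepend.  */
      memmove (row->glyphs + 1, row->glyphs, row->used);
      row->glyphs[0] = g;
    }
  else
    row->glyphs[row->used] = g;
  row->used++;
  row->glyphs[row->used] = '\0';
  return 1;
}
```

Feeding a, b, c in that order with the R2L flag set leaves the row
reading "cba"; without the flag it reads "abc".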

Is there _any_ way to precompute the length of a composition sequence
when the entry is added to composition-function-table?  Or is it only
possible to compute the length given the text surrounding the
sequence, when it is actually encountered in a buffer or string?

If the latter, I see no other way except calling autocmp_chars inside
composition_compute_stop_pos.  This would slow down redisplay by a
factor of 2 at the worst.  If that turns out too expensive, we will
have to introduce some mechanism to avoid computing each composition
more than once.  What results of the call to autocmp_chars need to be
recorded in order to avoid calling it again in composition_reseat_it?
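
One possible shape for such a mechanism is a single-entry memo keyed
by the start position, sketched below.  Everything in it is an
assumption made for illustration: the names are hypothetical,
demo_shape merely stands in for the expensive call, and the sketch
pretends that the result worth keeping is just an end position and an
opaque identifier.  Which results really need to be recorded is
exactly the open question above.

```c
typedef long charpos_t;

/* Assumed summary of one shaping result (hypothetical).  */
struct cmp_result
{
  charpos_t end;	/* end of the composition sequence */
  int id;		/* opaque identifier of the composition */
};

typedef struct cmp_result (*shaper_fn) (charpos_t start);

/* Single-entry memo: remembers the last sequence that was shaped.  */
struct cmp_memo
{
  int valid;
  charpos_t start;
  struct cmp_result result;
};

/* Return the result for START, calling SHAPE only on a cache miss.  */
struct cmp_result
cmp_memo_lookup (struct cmp_memo *memo, charpos_t start, shaper_fn shape)
{
  if (!memo->valid || memo->start != start)
    {
      memo->result = shape (start);
      memo->start = start;
      memo->valid = 1;
    }
  return memo->result;
}

/* Stand-in for the expensive call; counts how often it runs.  */
int demo_shape_calls;

struct cmp_result
demo_shape (charpos_t start)
{
  struct cmp_result r;
  demo_shape_calls++;
  r.end = start + 3;	/* pretend every sequence spans 4 characters */
  r.id = 1;
  return r;
}
```

If the stop-position computation and the reseating both go through
cmp_memo_lookup for the same start position, the stand-in runs once
rather than twice.  A real version would also have to invalidate the
memo whenever the text or the font changes.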

> > We can use IT_CHARPOS + MAX_COMPOSITION_COMPONENTS as ENDPOS, if we
> > call composition_reseat_it and composition_compute_stop_pos in the
> > forward direction repeatedly, can't we?  That's because, when the
> > iterator is at some position, we are only interested in compositions that
> > cover that position.
> 
> No.  Such a way slows down the display of a buffer that has
> no composition at all.  For such a buffer,
> composition_compute_stop_pos should set cmp_it->stop_pos to
> the actual endpos so that CHAR_COMPOSED_P quickly returns
> zero.

It could be that having CHAR_COMPOSED_P return non-zero once every 16
characters in a buffer with no compositions at all is still the best
we can do, see above.
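
To put a rough number on that cost, the toy loop below counts how
often an iterator would hit its stop position while walking text that
has no compositions at all.  It is a sketch under stated assumptions:
count_rescans is a hypothetical helper, a rescan is modelled as
happening whenever the position reaches the recorded stop position,
and the lookahead of 16 is taken from the "once every 16 characters"
figure above.

```c
/* Walk positions BEGIN..END-1 and count how many times the stop
   position is reached and has to be recomputed, when each
   recomputation only looks LOOKAHEAD characters forward.
   Hypothetical model, not code from Emacs.  */
long
count_rescans (long begin, long end, long lookahead)
{
  long rescans = 0;
  long stop_pos = begin;	/* force a scan at the first position */
  long pos;

  for (pos = begin; pos < end; pos++)
    if (pos >= stop_pos)
      {
	rescans++;
	stop_pos = pos + lookahead;
	if (stop_pos > end)
	  stop_pos = end;
      }
  return rescans;
}
```

For 1600 characters, a lookahead of 16 gives 100 rescans, while a
lookahead that reaches the actual end position gives a single one.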



