From: handa@gnu.org (K. Handa)
To: Richard Wordingham <richard.wordingham@ntlworld.com>
Cc: 20140@debbugs.gnu.org
Subject: bug#20140: 24.4; M17n shaper output rejected
Date: Wed, 25 Mar 2015 23:25:54 +0900
Message-ID: <87mw31887h.fsf@gnu.org>
In-Reply-To: <20150321175818.1b125eba@JRWUBU2> (message from Richard Wordingham on Sat, 21 Mar 2015 17:58:18 +0000)
Hi, thank you for the detailed explanation.
In article <20150321175818.1b125eba@JRWUBU2>, Richard Wordingham <richard.wordingham@ntlworld.com> writes:
> What I ought to want is SIL's split cursor scheme, which indicated the
> next ('point') and previous characters, even in bidirectional text.
> Unfortunately, that's not compatible with m17n, which seems to assume
> that cursor position will be a single number. The Emacs functions
> forward-char-intrusive and backward-char-intrusive provided a pleasant,
> more intuitive, alternative, and I am sad to hear they are gone.
> Perhaps I'll have to start using toggle-auto-composition.
Those Emacs functions were just my idea for improving Emacs
for CTL users, and have never been included in an official
Emacs version. I checked the code and found two problems:
(1) When the command sets disable-point-adjustment to t,
command_loop_1 should force updating the display if point is
within a grapheme cluster. So we need this patch:
diff --git a/src/keyboard.c b/src/keyboard.c
index bf65df1..13125c1 100644
--- a/src/keyboard.c
+++ b/src/keyboard.c
@@ -1636,6 +1636,16 @@ command_loop_1 (void)
adjust_point_for_property (last_point_position,
MODIFF != prev_modiff);
}
+ else if (current_buffer == prev_buffer
+ && last_point_position != PT)
+ {
+ if (PT > BEGV && PT < ZV
+ && (composition_adjust_point (last_point_position, PT) != PT))
+ /* Now point is within a grapheme cluster. We must update
+ the display so that this cluster is decomposed on the
+ screen and the cursor is correctly placed at point. */
+ windows_or_buffers_changed = 22;
+ }
/* Install chars successfully executed in kbd macro. */
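As a standalone illustration of the condition in patch (1), and not part of the patch itself, the check can be modeled outside Emacs. Everything below is hypothetical: `struct cluster`, `toy_adjust_point` and `toy_needs_redisplay` are made-up names, the snapping direction in `toy_adjust_point` is only an assumption about how composition_adjust_point behaves, and the `current_buffer == prev_buffer` test is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a composed grapheme cluster covering the
   character positions [start, end).  */
struct cluster { long start, end; };

/* Toy analogue of composition_adjust_point (assumed behavior): if
   NEW_PT lies strictly inside a cluster, snap it to the cluster
   boundary in the direction of motion; otherwise return NEW_PT.  */
static long
toy_adjust_point (const struct cluster *c, size_t n,
                  long last_pt, long new_pt)
{
  for (size_t i = 0; i < n; i++)
    if (c[i].start < new_pt && new_pt < c[i].end)
      return new_pt < last_pt ? c[i].start : c[i].end;
  return new_pt;
}

/* Mirrors the condition added by the patch: a redisplay must be
   forced when point has moved and now sits inside a cluster.  */
static bool
toy_needs_redisplay (const struct cluster *c, size_t n,
                     long begv, long zv, long last_pt, long pt)
{
  assert (begv <= pt && pt <= zv);
  return (last_pt != pt
          && pt > begv && pt < zv
          && toy_adjust_point (c, n, last_pt, pt) != pt);
}
```

With clusters [1, 4) and [4, 6), moving point from 1 to 2 lands inside the first cluster and asks for a redisplay, while moving from 1 to 4 lands on a cluster boundary and does not.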
(2) We should break a grapheme cluster at point. So we need
this patch.
diff --git a/src/xdisp.c b/src/xdisp.c
index a17f5a9..0c56395 100644
--- a/src/xdisp.c
+++ b/src/xdisp.c
@@ -3408,6 +3408,9 @@ compute_stop_pos (struct it *it)
pos = next_overlay_change (charpos);
if (pos < it->stop_charpos)
it->stop_charpos = pos;
+ /* If point is in front of the current stop pos, stop there. */
+ if (charpos < PT && PT < it->stop_charpos)
+ it->stop_charpos = PT;
/* Set up variables for computing the stop position from text
property changes. */
@@ -8166,7 +8169,12 @@ next_element_from_buffer (struct it *it)
&& IT_CHARPOS (*it) >= it->redisplay_end_trigger_charpos)
run_redisplay_end_trigger_hook (it);
- stop = it->bidi_it.scan_dir < 0 ? -1 : it->end_charpos;
+ /* Set stop position considering the bidi direction and point. */
+ if (it->bidi_it.scan_dir < 0)
+ stop = (PT < IT_CHARPOS (*it)) ? PT : -1;
+ else
+ stop = ((IT_CHARPOS (*it) < PT && PT < it->end_charpos)
+ ? PT : it->end_charpos);
if (CHAR_COMPOSED_P (it, IT_CHARPOS (*it), IT_BYTEPOS (*it),
stop)
&& next_element_from_composition (it))
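The stop-position rule in the second hunk can also be read in isolation. The following is a minimal sketch, assuming plain `long` positions in place of the iterator fields; `toy_composition_stop` is a hypothetical name, and the function only restates the two branches of the patch.

```c
#include <assert.h>

/* Return the position at which the composition check should stop.
   SCAN_DIR < 0 means the bidi iterator is scanning backward.
   CHARPOS stands in for IT_CHARPOS (*it), PT for point, and
   END_CHARPOS for it->end_charpos.  */
static long
toy_composition_stop (int scan_dir, long charpos, long pt,
                      long end_charpos)
{
  assert (charpos <= end_charpos);
  if (scan_dir < 0)
    /* Backward scan: stop at point if it is behind us, else no limit.  */
    return pt < charpos ? pt : -1;
  /* Forward scan: stop at point only if it lies strictly between the
     current position and the end of the text being displayed.  */
  return (charpos < pt && pt < end_charpos) ? pt : end_charpos;
}
```

For a forward scan at position 10 with the end at 20, point at 12 gives a stop of 12, so a cluster spanning point is not composed across it; point at 5 or at 10 leaves the stop at 20.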
Could you try these patches and test the usability of
forward-char-intrusive and backward-char-intrusive?
> > Please try to move cursor over this Devanagari text "हिंदी" on
> > Emacs, gedit, and, for instance, firefox. They all treat
> > that text as 2 grapheme clusters "हिं" and "दी". The first
> > one corresponds to the character sequence U+939 U+93F U+902, and
> > U+93F (vowel I) is displayed before U+939 (base consonant).
> Note that those clusters are only 3 and 2 characters long. Retyping
> them is tolerable. Now consider the Sanskrit Devanagari text स्त्री,
> which contains two consonant-combining viramas. Emacs moves across it
> in 1 step, but Claws e-mail (GTK-based, I believe) and LibreOffice
> (HarfBuzz-based, at least for linux) both take 3 steps to move across
> it. Claws and LibreOffice use different algorithms to position the
> cursor. That of LibreOffice seems more reasonable, but that of
> Claws works better! The reason is that Unicode did not declare virama
> as forming grapheme clusters.
Ah, hmmm, that's a problem with DEVA-OTF.flt and DEV2-OTF.flt of
the m17n library. I'll try to fix them.
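For reference, the word स्त्री quoted above is six code points long, two of them U+094D DEVANAGARI SIGN VIRAMA. The sketch below only decodes the word and counts them; it says nothing about how the FLT files or any editor cluster it. The helper names are hypothetical, and the decoder assumes well-formed UTF-8 input.

```c
#include <assert.h>
#include <stddef.h>

#define DEVANAGARI_VIRAMA 0x094DUL

/* The word spelled out as UTF-8 bytes:
   U+0938 U+094D U+0924 U+094D U+0930 U+0940.  */
static const char stri[] =
  "\xE0\xA4\xB8" "\xE0\xA5\x8D" "\xE0\xA4\xA4"
  "\xE0\xA5\x8D" "\xE0\xA4\xB0" "\xE0\xA5\x80";

/* Decode one code point from well-formed UTF-8 at *S and advance *S
   past it.  */
static unsigned long
next_code_point (const unsigned char **s)
{
  const unsigned char *p = *s;
  unsigned long c = *p++;
  int extra = c < 0x80 ? 0 : c < 0xE0 ? 1 : c < 0xF0 ? 2 : 3;

  if (extra > 0)
    c &= 0x3F >> extra;         /* keep only the payload bits */
  while (extra-- > 0)
    {
      assert ((*p & 0xC0) == 0x80);   /* must be a continuation byte */
      c = (c << 6) | (*p++ & 0x3F);
    }
  *s = p;
  return c;
}

/* Count occurrences of TARGET in UTF8 and store the total number of
   code points in *TOTAL.  */
static size_t
count_code_point (const char *utf8, unsigned long target, size_t *total)
{
  const unsigned char *p = (const unsigned char *) utf8;
  size_t hits = 0;

  *total = 0;
  while (*p)
    {
      if (next_code_point (&p) == target)
        hits++;
      ++*total;
    }
  return hits;
}
```

Calling `count_code_point (stri, DEVANAGARI_VIRAMA, &total)` reports the two viramas and sets `total` to the six code points that Emacs currently crosses in a single step.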
> It seems to have solved all of them. When I reported the bug, I was
> having problems with my font because libotf was silently ignoring half
> the lookups in my font.
Could you please send me (not on this list) an appropriate
bug/problem report if libotf should be fixed?
> I thought I might have problems with U+1A58 TAI THAM SIGN MAI KANG LAI,
> which in Lao visually groups (usually) with the following base
> consonant and in Tai Khuen groups with the preceding base consonant. My
> clustering in Emacs follows the Tai Khuen scheme. (I compose two
> orthographic clusters together in Emacs, but declare two grapheme
> clusters in the FLT processing.) However, my font follows a major
> Northern Thai dictionary and places it on the following base consonant
> if there is nothing above it, but otherwise places it on the preceding
> base consonant. However, my implementation is too dirty to cause
> problems - the second cluster is not reported as deriving from the
> mai kang lai character.
> I wonder, though, what will happen if I manage to implement the
> Universal Shaping Engine's (USE) rphf feature. The author of a Lao-style
> Tai Tham font wanted this feature in HarfBuzz. The desired effect seems
> easy to achieve in m17n-flt, but placing it under font control is more
> difficult. I'm studying MLM2-OTF.flt to see how to do it.
I've just started to study the Universal Shaping Engine. It
seems that we can implement it with a suitable FLT file.
> > > However, it then makes editing of the 'clusters' more
> > > difficult. Note that there are examples above with 5
> > > characters in a cluster, and this is by no means the
> > > limit.
> >
> > But, it seems that the current behavior is accepted, at
> > least, by Indic people.
> Who do you mean by 'Indic people'?
I just mean that I have not heard any complaints about that
"too long cluster" problem in Emacs. Is no one using Emacs
for Indic scripts?
> New Tai Lue is an interesting case. Microsoft delayed support for this
> simple Indic script for so long that most apparently Unicode-encoded
> New Tai Lue text was actually encoded in visual order. With Unicode
> 8.0, New Tai Lue is changing from phonetic order to visual order, and
> it will no longer need any clusters at all!
Wow, I didn't know that.
> Emacs 23.3 (which is what is in long-term support Ubuntu
> 12.04) offers no support for New Tai Lue, so I am not sure
> that there is yet a New Tai Lue view on composition in
> Emacs.
We may be able to provide support for new scripts in ELPA.
---
K. Handa
handa@gnu.org