unofficial mirror of bug-gnu-emacs@gnu.org 
From: Visuwesh <visuweshm@gmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: 56237@debbugs.gnu.org
Subject: bug#56237: 29.0.50; delete-forward-char fails to delete character
Date: Mon, 27 Jun 2022 11:17:25 +0530
Message-ID: <87o7yeodxu.fsf@gmail.com>
In-Reply-To: <87sfnqoep4.fsf@gmail.com> (Visuwesh's message of "Mon, 27 Jun 2022 11:01:03 +0530")

[Monday, June 27, 2022] Visuwesh wrote:

> [Sunday, June 26, 2022] Eli Zaretskii wrote:
>
>>> From: Visuwesh <visuweshm@gmail.com>
>>> Cc: 56237@debbugs.gnu.org
>>> Date: Sun, 26 Jun 2022 22:36:31 +0530
>>> 
>>> > Invoke find-composition, and you will see that it returns a single
>>> > composition there.
>>> 
>>> If find-composition is indeed right, then the return value is very
>>> unintuitive to a native speaker: ப் and போ are two separate characters
>>> and combining them into a single cluster is weird...
>>
>> Maybe you are right, but then Someone(TM) will have to either modify
>> find-composition or explain how to interpret its return value
>> differently from what we do now.  What is now in delete-forward-char
>> expresses my level of knowledge in this area, which admittedly is
>> limited.
>>
>
> Turns out that Someone™ was closer to us than I thought: describe-char.
> With a bit of edebug and reading the code in composition.h (for the
> LGLYPH_* macros) and defsubst's in composite.el, I think I figured out
> the logic:
>
> We need to call find-composition with a non-nil DETAIL-P argument to get
> the gstring.  The gstring contains the glyphs that will be used to
> construct the grapheme cluster [1].  According to composition.h, glyphs
> that have the same FROM and TO indices belong to the same grapheme
> cluster, so to find out how many codepoints to delete, we need to count
> the glyphs that share the same FROM and TO indices.
>
> Understanding all this, I came up with the following code:
>
>     ;; Ask for the detailed composition so that the gstring is included.
>     (let* ((composition (find-composition 0 nil "ப்போ" t))
>            (gstring (nth 2 composition))
>            (num-glyphs (lgstring-glyph-len gstring))
>            (i 1)
>            ;; FROM and TO of the first glyph identify the first cluster.
>            (from (lglyph-from (lgstring-glyph gstring 0)))
>            (to (lglyph-to (lgstring-glyph gstring 0))))
>       ;; Count the following glyphs that share that FROM and TO.
>       (while (and (< i num-glyphs)
>                   (= from (lglyph-from (lgstring-glyph gstring i)))
>                   (= to (lglyph-to (lgstring-glyph gstring i))))
>         (setq i (1+ i)))
>       i)
>
> Here, i is the number of characters we need to delete using delete-char.
>
> [1] For the gstring format, see composition-get-gstring.
>
> But I think we should test this code in cases where a grapheme cluster
> contains more than two codepoints, since all the composed characters in
> Tamil are made up of two Unicode codepoints.  I can't test it on emojis
> since I don't know of an Emoji font that has enough coverage and won't
> potentially crash Xft.
>

I got my hopes up too high.  :(

This fails for the simple case of ரு (C-u C-x = also fails!) so I guess
we are back to square one.  Although ரு is composed from 0BB0 0BC1, the
gstring only has one glyph.
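
To make the failure concrete, here is a minimal sketch of the counting
logic as a function, run on two hand-made gstrings.  The function name
and the sample vectors are made up for illustration; they only follow
the [HEADER ID GLYPH ...] layout described in the docstring of
composition-get-gstring and are not real output from any font.  The
nil check for unused glyph slots is an extra guard that the code quoted
above does not have.

```elisp
;; Sketch only: hypothetical helper, not part of Emacs.
(defun my/leading-cluster-glyph-count (gstring)
  "Count the leading glyphs of GSTRING that share FROM and TO indices."
  (let* ((num-glyphs (lgstring-glyph-len gstring))
         (head (lgstring-glyph gstring 0))
         (i 0)
         glyph)
    (when head
      (while (and (< i num-glyphs)
                  ;; Stop at the first unused (nil) glyph slot.
                  (setq glyph (lgstring-glyph gstring i))
                  (= (lglyph-from head) (lglyph-from glyph))
                  (= (lglyph-to head) (lglyph-to glyph)))
        (setq i (1+ i))))
    i))

;; Hand-made sample: two glyphs that both cover characters 0..1.
(defvar my/two-glyph-sample
  (vector [nil ?a ?b] nil
          [0 1 ?a 1 10 0 10 8 2 nil]
          [0 1 ?b 2 10 0 10 8 2 nil]
          nil))

;; Hand-made sample: a single glyph covering two characters, which is
;; the shape reported above for 0BB0 0BC1.
(defvar my/one-glyph-sample
  (vector [nil #x0BB0 #x0BC1] nil
          [0 1 #x0BB0 7 12 0 12 8 2 nil]
          nil))

(my/leading-cluster-glyph-count my/two-glyph-sample) ; => 2
(my/leading-cluster-glyph-count my/one-glyph-sample) ; => 1
```

In the second sample the count is 1 even though the header lists two
codepoints, so counting glyphs cannot tell us how many characters to
delete when the font maps the whole sequence to one glyph.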





