From: Visuwesh
Newsgroups: gmane.emacs.bugs
Subject: bug#56237: 29.0.50; delete-forward-char fails to delete character
Date: Mon, 27 Jun 2022 11:17:25 +0530
Message-ID: <87o7yeodxu.fsf@gmail.com>
In-Reply-To: <87sfnqoep4.fsf@gmail.com> (Visuwesh's message of "Mon, 27 Jun 2022 11:01:03 +0530")
References: <87v8sn9zo4.fsf@gmail.com> <83zghz8kk3.fsf@gnu.org> <87mtdz9ysx.fsf@gmail.com> <83y1xj8jqb.fsf@gnu.org> <87fsjr9xs6.fsf@gmail.com> <83v8sn8ir9.fsf@gnu.org> <87bkuf9wx4.fsf@gmail.com> <83tu878hen.fsf@gnu.org> <87sfnqoep4.fsf@gmail.com>
To: Eli Zaretskii
Cc: 56237@debbugs.gnu.org
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/29.0.50 (gnu/linux)

[Monday June 27, 2022] Visuwesh wrote:

> [Sunday June 26, 2022] Eli Zaretskii wrote:
>
>>> From: Visuwesh
>>> Cc:
>>>  56237@debbugs.gnu.org
>>> Date: Sun, 26 Jun 2022 22:36:31 +0530
>>>
>>> > Invoke find-composition, and you will see that it returns a single
>>> > composition there.
>>>
>>> If find-composition is indeed right, then the return value is very
>>> unintuitive to me as a native speaker: ப் and போ are two separate
>>> characters and combining them into a single cluster is weird...
>>
>> Maybe you are right, but then Someone(TM) will have to either modify
>> find-composition or explain how to interpret its return value
>> differently from what we do now.  What is now in delete-forward-char
>> expresses my level of knowledge in this area, which admittedly is
>> limited.
>>
>
> Turns out that Someone™ was closer to us than I thought: describe-char.
> With a bit of edebug and reading the code in composition.h (for the
> LGLYPH_* macros) and the defsubsts in composite.el, I think I figured
> out the logic:
>
> We need to call find-composition with a non-nil DETAIL-P argument to get
> the gstring.  The gstring contains the glyphs that will be used to
> construct the grapheme cluster [1].  According to composition.h, glyphs
> that have the same FROM and TO indices are part of the same grapheme
> cluster, so to get the number of individual codepoints in the cluster,
> we need to count the glyphs that share the same FROM and TO indices.
>
> Understanding all this, I came up with the following code:
>
>     ;; A non-nil DETAIL-P makes find-composition include the gstring.
>     (let* ((composition (find-composition 0 nil "ப்போ" t))
>            (gstring (nth 2 composition))
>            (num-glyphs (lgstring-glyph-len gstring))
>            (i 1)
>            (from (lglyph-from (lgstring-glyph gstring 0)))
>            (to (lglyph-to (lgstring-glyph gstring 0))))
>       ;; Count the leading glyphs that share the first glyph's FROM
>       ;; and TO, i.e. that belong to the first grapheme cluster.
>       (while (and (< i num-glyphs)
>                   (= from (lglyph-from (lgstring-glyph gstring i)))
>                   (= to (lglyph-to (lgstring-glyph gstring i))))
>         (setq i (1+ i)))
>       i)
>
> Here i is the number of characters we need to delete using delete-char.
>
> [1] For the gstring format, see composition-get-gstring.
>
> But I think we should test this code in cases where a grapheme cluster
> contains more than two codepoints, since all the composed characters in
> Tamil are made up of two Unicode codepoints.  I can't test it on emojis
> since I don't know of an Emoji font that has enough coverage and won't
> potentially crash Xft.
>

I got my hopes up too high. :(

This fails for the simple case of ரு (C-u C-x = also fails!), so I
guess we are back to square one.  Although ரு is composed from U+0BB0
U+0BC1, the gstring only has one glyph.
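
To make the failure mode concrete without depending on any font, here
is a minimal sketch of the same counting loop run over mock data.  The
function name my/leading-cluster-glyph-count and the [FROM TO] vectors
are made up for illustration; they only mimic the first two slots of a
real glyph, and the FROM/TO values are assumptions, not values read
from an actual gstring.

```elisp
;; Hypothetical helper, not part of Emacs.  GLYPHS is a list of mock
;; glyphs, each a vector [FROM TO] standing in for the first two
;; slots of a real lglyph (what lglyph-from and lglyph-to read).
(defun my/leading-cluster-glyph-count (glyphs)
  "Return how many leading GLYPHS share the first glyph's FROM and TO."
  (let ((from (aref (car glyphs) 0))
        (to (aref (car glyphs) 1))
        (i 1))
    (while (and (< i (length glyphs))
                (= from (aref (nth i glyphs) 0))
                (= to (aref (nth i glyphs) 1)))
      (setq i (1+ i)))
    i))

;; Two glyphs covering the same two characters: the glyph count
;; happens to equal the number of codepoints.
(my/leading-cluster-glyph-count '([0 1] [0 1]))   ; => 2

;; One glyph standing for two characters (assumed FROM 0, TO 1): the
;; glyph count is 1 although two codepoints are involved.
(my/leading-cluster-glyph-count '([0 1]))         ; => 1
```

The second call is the shape of the ரு case: when the font supplies a
single glyph for the whole cluster, counting glyphs undercounts the
codepoints.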