From: Richard Wordingham
Newsgroups: gmane.emacs.help
Subject: Re: Composed Sequences
Date: Sat, 26 Feb 2022 15:11:44 +0000
Message-ID: <20220226151144.4c0b641e@JRWUBU2>
References: <20220220110926.25c675be@JRWUBU2> <835yp9ya4x.fsf@gnu.org> <20220226002837.699ae2b1@JRWUBU2> <83r17qp268.fsf@gnu.org>
In-Reply-To: <83r17qp268.fsf@gnu.org>
To: help-gnu-emacs@gnu.org

On Sat, 26 Feb 2022 08:33:35 +0200
Eli Zaretskii wrote:

> > Date: Sat, 26 Feb 2022 00:28:37 +0000
> > From: Richard Wordingham
> >
> > I still haven't found the code where the difference occurs, but I
> > now have a better idea of what is going on.  It seems that runs
> > with the same value of the composition property ('composed
> > sequences') are sequences of clusters for the font that match a
> > regular expression given in composition-function-table.
>
> (Please don't use "composition property" in this context, because it's
> confusing: the 'composition' text property does exist in Emacs (it's
> an old and now deprecated way of composing characters), but it is not
> relevant to this discussion, which instead focuses on what is known in
> Emacs as "automatic composition".)

Ah, I've misinterpreted some of the code.

> > Different renderers give different clusters, and thus, by default,
> > different cursor motion!

> Not "different renderers", but "different fonts".

I experimented with the Tai Tham composition-function-table entry

(list (vector "[\u1a20-\u1aad]+" 0 'font-shape-gstring))

For GNU Emacs 23.4.1 (i386-mingw-nt6.2.9200) using Uniscribe, the glyph
string for the word ᨠᩣ᩠ᨿ <1A20 HIGH KA, 1A63 AA, 1A60 SAKOT, 1A3F LOW YA>
in Version 0.8 of my font Da Lekh is divided into two clusters, as
identified by the 'glyph' values

[0 1 6688...] [0 1 6688...] [2 3 6752...]

and confirmed by ordinary cursor motion.  While this division into
<1A20, 1A63> and <1A60, 1A3F> is not the Unicode division into grapheme
clusters, it accords with what are natively namable clusters.

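To make the reading of those 'glyph' values explicit, here is a minimal sketch in Python, written outside Emacs. It assumes that the first two numbers of each glyph are the start and end character indices within the composed run and that the third is a character code (6688 = U+1A20, 6752 = U+1A60), which is how I understand the Emacs glyph-string layout. The names WORD, GLYPHS, clusters and cluster_code_points are made up for the illustration and are not Emacs functions.

```python
from itertools import groupby

# The word from the message: HIGH KA, AA, SAKOT, LOW YA.
WORD = "\u1a20\u1a63\u1a60\u1a3f"

# Leading fields of the three glyphs quoted above, read (by assumption)
# as (start index, end index, character code); the rest is elided there.
GLYPHS = [(0, 1, 6688), (0, 1, 6688), (2, 3, 6752)]


def clusters(glyphs):
    """Return the distinct (start, end) character ranges in glyph order.

    Consecutive glyphs that share the same range belong to one cluster.
    """
    return [span for span, _ in groupby((g[0], g[1]) for g in glyphs)]


def cluster_code_points(text, glyphs):
    """Map each cluster back to the code points it covers (end inclusive)."""
    return [
        [f"{ord(ch):04X}" for ch in text[start:end + 1]]
        for start, end in clusters(glyphs)
    ]


if __name__ == "__main__":
    for cluster in cluster_code_points(WORD, GLYPHS):
        print("<" + ", ".join(cluster) + ">")
```

Under that reading the three glyphs collapse to the two ranges 0-1 and 2-3, that is <1A20, 1A63> and <1A60, 1A3F>. If every glyph carried the range 0-3 instead, the same function would report a single cluster covering the whole word.
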
For GNU Emacs 27.1 (build1 i686-w64-mingw32) of 2020-08-21, which uses
HarfBuzz, the same word is one indivisible cluster (at least with
Version 0.13 of the same font).  I think this is a change in the
behaviour of HarfBuzz.  So cursor motion should also depend on the
clustering done by the rendering engine.

> > The reason Arabic seemed different is that when lam+hah appears to
> > ligate, what is happening (at least with Amiri) is that
> > substitutions are made which give the effect of a ligature, while
> > remaining two distinct glyphs.

> Yes, I see that as well.  "C-u C-x =" should tell you whether ligation
> happened or not.  What you see is normal, I think: Emacs obeys the
> decisions of the font designers.

Unless they recorded the positions of the boundaries between the parts
of a ligature!  (There is such a facility in the GDEF table, but it is
very widely ignored, and so a consumer would have to check its quality.)

Richard.