From: Richard Wordingham via "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Newsgroups: gmane.emacs.bugs
Subject: bug#67828: 27.1; Sinhala touching consonants
Date: Thu, 14 Dec 2023 17:08:02 +0000
Message-ID: <20231214170802.03b5ea4d@JRWUBU2>
Reply-To: Richard Wordingham
To: 67828@debbugs.gnu.org
X-Mailer: Claws Mail 4.0.0 (GTK+ 3.24.33; x86_64-pc-linux-gnu)
Xref: news.gmane.io gmane.emacs.bugs:276200

To reproduce: Paste the Sinhala script string "ගන‍්තවා"
(U+0D9C U+0DB1 U+200D U+0DCA U+0DAD U+0DC0 U+0DCF) into a buffer. The
string will then display with a dotted circle in the middle.

The problem is that this string is split into three clusters for
rendering. The dotted circle ought not to appear. With a suitable font
selected for Sinhala, e.g. Noto Sans Sinhala, the characters either
side of ් (U+0DCA) should abut or have very little separation. Without
a suitable font, the display should fall back to one similar to
"ගන්ත්‍වා" (U+0D9C U+0DB1 U+0DCA U+0DAD U+0DCA U+200D U+0DC0 U+0DCF).

The problem can be fixed by changing the file lisp/language/sinhala.el
as follows:

41c42
<        "[\u0D9A-\u0DC6]\\(?:\u0DCA\u200D[\u0D9A-\u0DC6]\\)*[\u0DCF-\u0DDF\u0DF2-\u0DF3]*\u0DCA?[\u0D82-\u0D83]?\\|"
---
>        "[\u0D9A-\u0DC6]\\(?:\\(\u0DCA\u200D\\|\u200D\u0DCA\\)[\u0D9A-\u0DC6]\\)*[\u0DCF-\u0DDF\u0DF2-\u0DF3]*\u0DCA?[\u0D82-\u0D83]?\\|"

There are three ways of suppressing the inherent vowel between
consonants in the Sinhala script:

1) Insert U+0DCA between them. This character displays as a mark or
modifies the preceding character, and there is otherwise no interaction
between them; Emacs therefore treats the characters after it as a
separate cluster.

2) Insert the sequence U+0DCA U+200D between them. Depending on font
design, the two characters will interact by one or both of them
changing shape or combining, and Indic rearrangement may occur across
the join. Alternatively, the first way may be used.

3) Insert the sequence U+200D U+0DCA between them. The space between
the consonants should then be removed. Indic rearrangement may occur
across the join. If a font does not support this, the first way may be
used as a fallback.

Emacs 27.1 supports only the first two methods.
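
As a rough check of the clustering described above, the consonant
alternative of the regexp can be translated from Emacs syntax into
Python's re syntax and run against the three sequences. This is only a
sketch: it models a single alternative of the pattern, not Emacs's
composition machinery, and the names OLD, NEW, SAMPLE, WAYS, clusters
and uncovered are illustrative and do not appear in sinhala.el. The
inner group is written as non-capturing here, which does not affect
what is matched.

```python
import re

# Building blocks copied from the regexp in the diff (Emacs "\\(" becomes "(").
CONSONANT = "[\u0D9A-\u0DC6]"
VOWEL_SIGNS = "[\u0DCF-\u0DDF\u0DF2-\u0DF3]*"
TAIL = "\u0DCA?[\u0D82-\u0D83]?"

# Old alternative: only U+0DCA U+200D may join two consonants.
OLD = re.compile(
    CONSONANT + "(?:\u0DCA\u200D" + CONSONANT + ")*" + VOWEL_SIGNS + TAIL
)
# Proposed alternative: U+200D U+0DCA is accepted as a joiner as well.
NEW = re.compile(
    CONSONANT
    + "(?:(?:\u0DCA\u200D|\u200D\u0DCA)" + CONSONANT + ")*"
    + VOWEL_SIGNS + TAIL
)

# The string from the report, written as escapes so the ZWJ stays visible.
SAMPLE = "\u0D9C\u0DB1\u200D\u0DCA\u0DAD\u0DC0\u0DCF"

# The three ways of joining U+0DB1 and U+0DAD listed in the report.
WAYS = {
    1: "\u0DB1\u0DCA\u0DAD",
    2: "\u0DB1\u0DCA\u200D\u0DAD",
    3: "\u0DB1\u200D\u0DCA\u0DAD",
}


def clusters(pattern, text):
    """Return the successive substrings of text matched by pattern."""
    return [m.group(0) for m in pattern.finditer(text)]


def uncovered(pattern, text):
    """Return the code points of text that fall outside every match."""
    covered = set()
    for m in pattern.finditer(text):
        covered.update(range(m.start(), m.end()))
    return ["U+%04X" % ord(ch) for i, ch in enumerate(text) if i not in covered]


def show(text):
    """Format a string as a space-separated list of code points."""
    return " ".join("U+%04X" % ord(ch) for ch in text)


if __name__ == "__main__":
    for name, pattern in (("old", OLD), ("new", NEW)):
        print(name, "clusters:", [show(c) for c in clusters(pattern, SAMPLE)])
        print(name, "uncovered:", uncovered(pattern, SAMPLE))
    for way, text in WAYS.items():
        print("way", way, "old:", len(clusters(OLD, text)),
              "new:", len(clusters(NEW, text)))
```

Under this translation the old pattern leaves U+200D and U+0DCA of the
sample string outside every match, whereas the proposed pattern covers
the whole string and keeps U+0DB1 U+200D U+0DCA U+0DAD together as one
match.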
The proposed change to the regexp enables Emacs to support all three
methods by also forming a character cluster for Way 3.


In GNU Emacs 27.1 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.33, cairo version 1.16.0)
 of 2023-08-16, modified by Debian built on lcy02-amd64-041
Windowing system distributor 'The X.Org Foundation', version 11.0.12201001
System Description: Ubuntu 22.04.3 LTS

Recent messages:
Wrote /home/richard/PIE/Pali/sinhala.el
Loading /home/richard/PIE/Pali/sinhala.el (source)...done
t
(No changes need to be saved)
Auto-saving...done
Saving file /home/richard/PIE/Pali/sinhala.el...
Wrote /home/richard/PIE/Pali/sinhala.el
Loading /home/richard/PIE/Pali/sinhala.el (source)...done
t
End of buffer

Configured using:
 'configure --build x86_64-linux-gnu --prefix=/usr
 --sharedstatedir=/var/lib --libexecdir=/usr/lib
 --localstatedir=/var/lib --infodir=/usr/share/info
 --mandir=/usr/share/man --enable-libsystemd --with-pop=yes
 --enable-locallisppath=/etc/emacs:/usr/local/share/emacs/27.1/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/27.1/site-lisp:/usr/share/emacs/site-lisp
 --with-sound=alsa --without-gconf --with-mailutils --build
 x86_64-linux-gnu --prefix=/usr --sharedstatedir=/var/lib
 --libexecdir=/usr/lib --localstatedir=/var/lib
 --infodir=/usr/share/info --mandir=/usr/share/man --enable-libsystemd
 --with-pop=yes
 --enable-locallisppath=/etc/emacs:/usr/local/share/emacs/27.1/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/27.1/site-lisp:/usr/share/emacs/site-lisp
 --with-sound=alsa --without-gconf --with-mailutils --with-cairo
 --with-x=yes --with-x-toolkit=gtk3 --with-toolkit-scroll-bars
 'CFLAGS=-g -O2 -ffile-prefix-map=/build/emacs-WL9mhG/emacs-27.1+1=.
 -fstack-protector-strong -Wformat -Werror=format-security -Wall'
 'CPPFLAGS=-Wdate-time -D_FORTIFY_SOURCE=2'
 'LDFLAGS=-Wl,-Bsymbolic-functions -Wl,-z,relro''

Configured features:
XPM JPEG TIFF GIF PNG RSVG CAIRO SOUND GPM DBUS GSETTINGS GLIB NOTIFY
INOTIFY ACL LIBSELINUX GNUTLS LIBXML2 FREETYPE HARFBUZZ M17N_FLT LIBOTF
ZLIB TOOLKIT_SCROLL_BARS GTK3 X11 XDBE XIM MODULES THREADS LIBSYSTEMD
JSON PDUMPER LCMS2 GMP

Important settings:
  value of $LANG: en_GB.utf8
  value of $XMODIFIERS: @im=ibus
  locale-coding-system: utf-8-unix

Major mode: Lisp Interaction

Minor modes in effect:
  show-paren-mode: t
  tpu-edt-mode: t
  tooltip-mode: t
  global-eldoc-mode: t
  eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Load-path shadows:
None found.

Features:
(shadow sort mail-extr emacsbug message rmc puny dired dired-loaddefs
format-spec rfc822 mml mml-sec epa derived epg epg-config gnus-util
rmail rmail-loaddefs text-property-search mm-decode mm-bodies mm-encode
mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail rfc2047
rfc2045 ietf-drums mm-util mail-prsvr mail-utils thai-util thai-word
mule-util time-date cus-edit cus-start cus-load wid-edit paren tpu-edt
picture quail help-mode edmacro kmacro finder-inf package easymenu
browse-url url-handlers url-parse auth-source cl-seq eieio eieio-core
cl-macs eieio-loaddefs password-cache json subr-x map url-vars seq
byte-opt gv bytecomp byte-compile cconv cl-loaddefs cl-lib tooltip eldoc
electric uniquify ediff-hook vc-hooks lisp-float-type mwheel term/x-win
x-win term/common-win x-dnd tool-bar dnd fontset image regexp-opt fringe
tabulated-list replace newcomment text-mode elisp-mode lisp-mode
prog-mode register page tab-bar menu-bar rfn-eshadow isearch timer
select scroll-bar mouse jit-lock font-lock syntax facemenu font-core
term/tty-colors frame minibuffer cl-generic cham georgian utf-8-lang
misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms
cp51932 hebrew greek romanian slovak czech european ethiopic indian
cyrillic chinese composite charscript charprop case-table epa-hook
jka-cmpr-hook help simple abbrev obarray cl-preloaded nadvice loaddefs
button faces cus-face macroexp files text-properties overlay sha1 md5
base64 format env code-pages mule custom widget hashtable-print-readable
backquote threads dbusbind inotify lcms2 dynamic-setting
system-font-setting font-render-setting cairo move-toolbar gtk x-toolkit
x multi-tty make-network-process emacs)

Memory information:
((conses 16 202496 17379)
 (symbols 48 10301 1)
 (strings 32 34875 1922)
 (string-bytes 1 891450)
 (vectors 16 28820)
 (vector-slots 8 939128 168466)
 (floats 8 41 19)
 (intervals 56 3515 0)
 (buffers 1000 20))