unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#67828: 27.1; Sinhala touching consonants
From: Richard Wordingham via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-12-14 17:08 UTC
  To: 67828

To reproduce:

Paste the Sinhala script string "ගන‍්තවා" into a buffer.

The string will then display with a dotted circle in the middle.  The
problem is that this string is split into three clusters for rendering.
The dotted circle ought not to appear.  With a suitable font being
selected for Sinhala, e.g. Noto Sans Sinhala, the characters either side
of ් should abut or have very little separation.  Without a suitable
font, the display should fall back to one similar to "ගන්ත්‍වා".
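
For reference, the reproduction string can be inspected code point by
code point.  The Python sketch below assumes the string is the sequence
spelled out with escapes in `s` (the ZWJ is invisible in most displays,
so it is written out rather than pasted); `describe` is a hypothetical
helper name used only for this illustration.

```python
import unicodedata

# Assumed code point sequence of the reproduction string: the join
# between the second and third consonants is ZWJ followed by AL-LAKUNA.
s = "\u0D9C\u0DB1\u200D\u0DCA\u0DAD\u0DC0\u0DCF"


def describe(text):
    """Return one 'U+XXXX NAME' entry per code point in text."""
    return [f"U+{ord(c):04X} {unicodedata.name(c)}" for c in text]


for entry in describe(s):
    print(entry)
```

The third and fourth entries should be ZERO WIDTH JOINER and SINHALA
SIGN AL-LAKUNA, in that order.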

The problem can be fixed by changing the file lisp/language/sinhala.el
as follows:

41c42
< 	 "[\u0D9A-\u0DC6]\\(?:\u0DCA\u200D[\u0D9A-\u0DC6]\\)*[\u0DCF-\u0DDF\u0DF2-\u0DF3]*\u0DCA?[\u0D82-\u0D83]?\\|"
---
> 	 "[\u0D9A-\u0DC6]\\(?:\\(\u0DCA\u200D\\|\u200D\u0DCA\\)[\u0D9A-\u0DC6]\\)*[\u0DCF-\u0DDF\u0DF2-\u0DF3]*\u0DCA?[\u0D82-\u0D83]?\\|"

There are three ways of suppressing the inherent vowel between
consonants in the Sinhala script:

1) Insert U+0DCA between them.  This character displays as a mark on, or
modifies, the preceding character; there is otherwise no interaction
between the consonants, and Emacs therefore treats the characters after
it as a separate cluster.

2) Insert the sequence U+0DCA U+200D between them.  Depending on font
design, the two characters will interact by one or both of them changing
shape or combining, and Indic rearrangement may occur across the join.
Alternatively, the first way may be used.

3) Insert the sequence U+200D U+0DCA between them.  The space between
the consonants should then be removed.  Indic rearrangement may occur
across the join.  If a font does not support this, the first way may be
used as a fallback.

Emacs 27.1 supports only the first two methods.  The change above
enables it to support all three methods by also forming a character
cluster for Way 3. 
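
The effect of the change on the three ways can be sketched outside
Emacs.  The Python snippet below is only a rough transliteration of the
two regexps into Python's `re` syntax (a non-capturing group instead of
the capturing group in the patch, and without the trailing alternation
that continues in sinhala.el); it does not reproduce Emacs's
composition machinery.  The names OLD, NEW and matched_length are made
up for this illustration.

```python
import re

CONS = "[\u0D9A-\u0DC6]"                                  # consonants
TAIL = "[\u0DCF-\u0DDF\u0DF2-\u0DF3]*\u0DCA?[\u0D82-\u0D83]?"

# Regexp before the change: only AL-LAKUNA + ZWJ joins consonants.
OLD = re.compile(CONS + "(?:\u0DCA\u200D" + CONS + ")*" + TAIL)
# Regexp after the change: ZWJ + AL-LAKUNA is accepted as well.
NEW = re.compile(CONS + "(?:(?:\u0DCA\u200D|\u200D\u0DCA)" + CONS + ")*" + TAIL)


def matched_length(pattern, text):
    """Number of code points matched at the start of text (0 if none)."""
    m = pattern.match(text)
    return len(m.group(0)) if m else 0


NA, TA, AL, ZWJ = "\u0DB1", "\u0DAD", "\u0DCA", "\u200D"
joins = {
    "way 1": NA + AL + TA,
    "way 2": NA + AL + ZWJ + TA,
    "way 3": NA + ZWJ + AL + TA,
}

for name, text in joins.items():
    print(name, matched_length(OLD, text), matched_length(NEW, text))
```

Under these assumptions only Way 3 differs: the old pattern stops after
the first consonant, while the new one spans the whole join.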


In GNU Emacs 27.1 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.33,
 cairo version 1.16.0) of 2023-08-16, modified by Debian built on
 lcy02-amd64-041
Windowing system distributor 'The X.Org Foundation', version 11.0.12201001
System Description: Ubuntu 22.04.3 LTS

Recent messages:
Wrote /home/richard/PIE/Pali/sinhala.el
Loading /home/richard/PIE/Pali/sinhala.el (source)...done
t
(No changes need to be saved)
Auto-saving...done
Saving file /home/richard/PIE/Pali/sinhala.el...
Wrote /home/richard/PIE/Pali/sinhala.el
Loading /home/richard/PIE/Pali/sinhala.el (source)...done
t
End of buffer

Configured using:
 'configure --build x86_64-linux-gnu --prefix=/usr
 --sharedstatedir=/var/lib --libexecdir=/usr/lib
 --localstatedir=/var/lib --infodir=/usr/share/info
 --mandir=/usr/share/man --enable-libsystemd --with-pop=yes
 --enable-locallisppath=/etc/emacs:/usr/local/share/emacs/27.1/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/27.1/site-lisp:/usr/share/emacs/site-lisp
 --with-sound=alsa --without-gconf --with-mailutils --build
 x86_64-linux-gnu --prefix=/usr --sharedstatedir=/var/lib
 --libexecdir=/usr/lib --localstatedir=/var/lib
 --infodir=/usr/share/info --mandir=/usr/share/man --enable-libsystemd
 --with-pop=yes
 --enable-locallisppath=/etc/emacs:/usr/local/share/emacs/27.1/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/27.1/site-lisp:/usr/share/emacs/site-lisp
 --with-sound=alsa --without-gconf --with-mailutils --with-cairo
 --with-x=yes --with-x-toolkit=gtk3 --with-toolkit-scroll-bars
 'CFLAGS=-g -O2
 -ffile-prefix-map=/build/emacs-WL9mhG/emacs-27.1+1=.
-fstack-protector-strong -Wformat -Werror=format-security -Wall'
'CPPFLAGS=-Wdate-time -D_FORTIFY_SOURCE=2'
'LDFLAGS=-Wl,-Bsymbolic-functions -Wl,-z,relro''

Configured features:
XPM JPEG TIFF GIF PNG RSVG CAIRO SOUND GPM DBUS GSETTINGS GLIB NOTIFY
INOTIFY ACL LIBSELINUX GNUTLS LIBXML2 FREETYPE HARFBUZZ M17N_FLT LIBOTF
ZLIB TOOLKIT_SCROLL_BARS GTK3 X11 XDBE XIM MODULES THREADS LIBSYSTEMD
JSON PDUMPER LCMS2 GMP

Important settings:
  value of $LANG: en_GB.utf8
  value of $XMODIFIERS: @im=ibus
  locale-coding-system: utf-8-unix

Major mode: Lisp Interaction

Minor modes in effect:
  show-paren-mode: t
  tpu-edt-mode: t
  tooltip-mode: t
  global-eldoc-mode: t
  eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Load-path shadows:
None found.

Features:
(shadow sort mail-extr emacsbug message rmc puny dired dired-loaddefs
format-spec rfc822 mml mml-sec epa derived epg epg-config gnus-util
rmail rmail-loaddefs text-property-search mm-decode mm-bodies mm-encode
mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail rfc2047
rfc2045 ietf-drums mm-util mail-prsvr mail-utils thai-util thai-word
mule-util time-date cus-edit cus-start cus-load wid-edit paren tpu-edt
picture quail help-mode edmacro kmacro finder-inf package easymenu
browse-url url-handlers url-parse auth-source cl-seq eieio eieio-core
cl-macs eieio-loaddefs password-cache json subr-x map url-vars seq
byte-opt gv bytecomp byte-compile cconv cl-loaddefs cl-lib tooltip eldoc
electric uniquify ediff-hook vc-hooks lisp-float-type mwheel term/x-win
x-win term/common-win x-dnd tool-bar dnd fontset image regexp-opt fringe
tabulated-list replace newcomment text-mode elisp-mode lisp-mode
prog-mode register page tab-bar menu-bar rfn-eshadow isearch timer
select scroll-bar mouse jit-lock font-lock syntax facemenu font-core
term/tty-colors frame minibuffer cl-generic cham georgian utf-8-lang
misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms
cp51932 hebrew greek romanian slovak czech european ethiopic indian
cyrillic chinese composite charscript charprop case-table epa-hook
jka-cmpr-hook help simple abbrev obarray cl-preloaded nadvice loaddefs
button faces cus-face macroexp files text-properties overlay sha1 md5
base64 format env code-pages mule custom widget hashtable-print-readable
backquote threads dbusbind inotify lcms2 dynamic-setting
system-font-setting font-render-setting cairo move-toolbar gtk x-toolkit
x multi-tty make-network-process emacs)

Memory information:
((conses 16 202496 17379)
 (symbols 48 10301 1)
 (strings 32 34875 1922)
 (string-bytes 1 891450)
 (vectors 16 28820)
 (vector-slots 8 939128 168466)
 (floats 8 41 19)
 (intervals 56 3515 0)
 (buffers 1000 20))






* bug#67828: 27.1; Sinhala touching consonants
From: Eli Zaretskii @ 2023-12-16 13:34 UTC
  To: Richard Wordingham; +Cc: 67828-done

> Date: Thu, 14 Dec 2023 17:08:02 +0000
> From:  Richard Wordingham via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org>
> 
> To reproduce:
> 
> Paste the Sinhala script string "ගන‍්තවා" into a buffer.
> 
> The string will then display with a dotted circle in the middle.  The
> problem is that this string is split into three clusters for rendering.
> The dotted circle ought not to appear.  With a suitable font being
> selected for Sinhala, e.g. Noto Sans Sinhala, the characters either side
> of ් should abut or have very little separation.  Without a suitable
> font, the display should fall back to one similar to "ගන්ත්‍වා".
> 
> The problem can be fixed by changing the file lisp/language/sinhala.el
> as follows:
> 
> 41c42
> < 	 "[\u0D9A-\u0DC6]\\(?:\u0DCA\u200D[\u0D9A-\u0DC6]\\)*[\u0DCF-\u0DDF\u0DF2-\u0DF3]*\u0DCA?[\u0D82-\u0D83]?\\|"
> ---
> > 	 "[\u0D9A-\u0DC6]\\(?:\\(\u0DCA\u200D\\|\u200D\u0DCA\\)[\u0D9A-\u0DC6]\\)*[\u0DCF-\u0DDF\u0DF2-\u0DF3]*\u0DCA?[\u0D82-\u0D83]?\\|"

Thanks, installed on the emacs-29 branch, and closing the bug.




