unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#14461: 24.3.50; bad display for 'space' + (U+0336) unicode combination
@ 2013-05-24 14:30 Cédric Chépied
  2019-08-15  4:50 ` Lars Ingebrigtsen
  0 siblings, 1 reply; 21+ messages in thread
From: Cédric Chépied @ 2013-05-24 14:30 UTC
  To: 14461




Start emacs -Q.
Use the *scratch* buffer, for example.
Type 'l', then M-x insert-char RET 336 RET.
The letter 'l' is displayed struck through.
Type ' ' (space), then M-x insert-char RET 336 RET.
The space is not struck through; the stroke is displayed after the space
character instead.
If you paste the entire line to someone using Emacs 23 (with ERC, for example),
their display is OK.
As far as I know, the space character is the only one with this behaviour.
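
The two sequences in the recipe can also be described outside Emacs. What
follows is a minimal sketch in Python using the standard `unicodedata`
module. It assumes that the `336` typed at the `insert-char` prompt is read
as a hexadecimal code point, that is, as U+0336. The names `STROKE`,
`struck_l`, `struck_space` and `describe` are made up for the illustration.

```python
import unicodedata

# Assumption: insert-char reads "336" as a hexadecimal code point.
STROKE = chr(int("336", 16))

struck_l = "l" + STROKE      # first step of the recipe
struck_space = " " + STROKE  # second step of the recipe


def describe(text):
    """Return (code point, general category, name) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))
        for ch in text
    ]


for sample in (struck_l, struck_space):
    for row in describe(sample):
        print(*row)
```

The only difference between the two samples is the General Category of the
first character: Ll for the letter and Zs for the space.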





In GNU Emacs 24.3.50.1 (x86_64-pc-linux-gnu, GTK+ Version 3.4.2)
 of 2013-05-10 on dex, modified by Debian
 (emacs-snapshot package, version 2:20130510-1)
Windowing system distributor `The X.Org Foundation', version 11.0.11204000
System Description:	Debian GNU/Linux testing (jessie)

Configured using:
 `configure --build x86_64-linux-gnu --host x86_64-linux-gnu --prefix=/usr
 --sharedstatedir=/var/lib --libexecdir=/usr/lib --localstatedir=/var
 --infodir=/usr/share/info --mandir=/usr/share/man --with-pop=yes
 --enable-locallisppath=/etc/emacs-snapshot:/etc/emacs:/usr/local/share/emacs/24.3.50/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/24.3.50/site-lisp:/usr/share/emacs/site-lisp
 --without-compress-info --with-crt-dir=/usr/lib/x86_64-linux-gnu/ --with-x=yes
 --with-x-toolkit=gtk3 --with-imagemagick=yes CFLAGS='-DDEBIAN
 -DSITELOAD_PURESIZE_EXTRA=5000 -g -O2' CPPFLAGS='-D_FORTIFY_SOURCE=2'
 LDFLAGS='-g -Wl,--as-needed -znocombreloc''

Important settings:
  value of $LANG: fr_FR.utf8
  locale-coding-system: utf-8-unix
  default enable-multibyte-characters: t

Major mode: Lisp Interaction

Minor modes in effect:
  erc-list-mode: t
  erc-menu-mode: t
  erc-autojoin-mode: t
  erc-ring-mode: t
  erc-networks-mode: t
  erc-pcomplete-mode: t
  erc-track-mode: t
  erc-track-minor-mode: t
  erc-match-mode: t
  erc-button-mode: t
  erc-stamp-mode: t
  erc-netsplit-mode: t
  erc-truncate-mode: t
  erc-log-mode: t
  diff-auto-refine-mode: t
  show-paren-mode: t
  erc-smiley-mode: t
  erc-irccontrols-mode: t
  erc-noncommands-mode: t
  erc-move-to-prompt-mode: t
  erc-readonly-mode: t
  shell-dirtrack-mode: t
  global-auto-complete-mode: t
  auto-complete-mode: t
  virtual-desktops-mode: t
  display-time-mode: t
  display-battery-mode: t
  tooltip-mode: t
  mouse-wheel-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  column-number-mode: t
  line-number-mode: t
  transient-mark-mode: t
  hs-minor-mode: t

Recent input:
c h a <tab> <return> <kp-3> <kp-3> <kp-6> <kp-enter> 
C-c <down> e t SPC l à SPC t o u t e SPC l a SPC b 
a r r e SPC <M-backspace> p h r a s e SPC e s t SPC 
b a r r é e SPC c o r r e c t e m e n t ? C-SPC C-a 
M-x <up> <up> <return> <return> C-> <down> C-c <down> 
<down> <up> <down> <down> <down> <down> <down> <down> 
<down> <down> <down> <down> <down> <down> <down> <down> 
<down> <down> <down> <down> <down> <down> <down> C-k 
C-k C-k C-_ <C-right> <C-right> <C-left> <left> <M-backspace> 
<backspace> <tab> C-e <down> <tab> C-e <backspace> 
C-k C-k C-k C-k C-k C-k C-k C-k <down> C-e <down> C-SPC 
<C-up> M-x <up> <up> <up> <up> <return> M-w <up> <up> 
<up> <up> <up> <up> <up> <up> <up> <up> <up> <up> <up> 
<up> <up> C-SPC <C-right> <C-right> <C-right> <C-right> 
<C-right> <C-right> <C-right> <C-right> <C-right> <C-right> 
M-x <up> <up> <return> C-_ C-< l o l <return> n o n 
SPC <backspace> <return> m a i s SPC t i <backspace> 
u SPC n o u s SPC d é p a r t a g e r a i s SPC = ) 
<return> o n SPC n ' a SPC p a s SPC l e SPC m a i 
n <backspace> <backspace> <backspace> ê m e SPC r é 
s u l t a t SPC s e l o n SPC l e s SPC l o g i c i 
e l s SPC u t i l i s é s <return> o u <backspace> 
<backspace> m e r c i <return> C-c <up> <M-backspace> 
<M-backspace> <help-echo> C-> M-x r e p <tab> o <tab> 
r <tab> b u <tab> <return>

Recent messages:
Mark set [3 times]
Undo!
Mark set
You can run the command `insert-char' with C-x 8 RET
Mark set
windmove-do-window-select: Minibuffer is inactive
Undo!
Mark set [2 times]
Undo!
Making completion list... [2 times]

Load-path shadows:
/usr/share/emacs/24.3.50/site-lisp/debian-startup hides /usr/share/emacs/site-lisp/debian-startup
/usr/share/emacs/24.3.50/site-lisp/cscope/xcscope hides /usr/share/emacs/site-lisp/xcscope
/usr/share/emacs-snapshot/site-lisp/flim/hex-util hides /usr/share/emacs/24.3.50/lisp/hex-util
/usr/share/emacs-snapshot/site-lisp/flim/md4 hides /usr/share/emacs/24.3.50/lisp/md4
/usr/share/emacs-snapshot/site-lisp/flim/hmac-md5 hides /usr/share/emacs/24.3.50/lisp/net/hmac-md5
/usr/share/emacs-snapshot/site-lisp/flim/sasl hides /usr/share/emacs/24.3.50/lisp/net/sasl
/usr/share/emacs-snapshot/site-lisp/flim/hmac-def hides /usr/share/emacs/24.3.50/lisp/net/hmac-def
/usr/share/emacs-snapshot/site-lisp/flim/ntlm hides /usr/share/emacs/24.3.50/lisp/net/ntlm
/usr/share/emacs-snapshot/site-lisp/flim/sasl-cram hides /usr/share/emacs/24.3.50/lisp/net/sasl-cram
/usr/share/emacs-snapshot/site-lisp/flim/sasl-digest hides /usr/share/emacs/24.3.50/lisp/net/sasl-digest
/usr/share/emacs-snapshot/site-lisp/flim/sasl-ntlm hides /usr/share/emacs/24.3.50/lisp/net/sasl-ntlm
/usr/share/emacs-snapshot/site-lisp/wl/rfc2368 hides /usr/share/emacs/24.3.50/lisp/mail/rfc2368
/usr/share/emacs-snapshot/site-lisp/semi/pgg-pgp hides /usr/share/emacs/24.3.50/lisp/obsolete/pgg-pgp
/usr/share/emacs-snapshot/site-lisp/semi/pgg-pgp5 hides /usr/share/emacs/24.3.50/lisp/obsolete/pgg-pgp5
/usr/share/emacs-snapshot/site-lisp/semi/pgg hides /usr/share/emacs/24.3.50/lisp/obsolete/pgg
/usr/share/emacs-snapshot/site-lisp/semi/pgg-parse hides /usr/share/emacs/24.3.50/lisp/obsolete/pgg-parse
/usr/share/emacs-snapshot/site-lisp/semi/pgg-gpg hides /usr/share/emacs/24.3.50/lisp/obsolete/pgg-gpg
/usr/share/emacs-snapshot/site-lisp/semi/pgg-def hides /usr/share/emacs/24.3.50/lisp/obsolete/pgg-def

Features:
(shadow emacsbug find-func wl-fldmgr w3m-form smtp sasl sasl-anonymous
sasl-login sasl-plain rect cal-iso cal-move man grep conf-mode novice make-mode
dired view tabify compile vc-git pcmpl-unix misearch multi-isearch help-mode
mel-q-ccl wl-score elmo-internal mule-util windmove smiley gnus-art mm-uu
mml2015 mm-view mml-smime smime dig mailcap gnus-sum nnoo gnus-group gnus-undo
nnmail mail-source gnus-start gnus-spec gnus-int gnus-range message rfc822 mml
mml-sec mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils
mailheader gnus-win gnus gnus-ems nnheader flyspell ispell erc-menu erc-join
erc-ring erc-networks erc-pcomplete erc-track erc-match erc-button erc-fill
erc-stamp erc-netsplit erc-truncate erc-log rot13 disp-table epa-file epa epg
elmo-dop elmo-maildir elmo-map modb-standard wl-mime mime-edit pgg-parse pccl
pccl-20 signature mime-setup mail-mime-setup semi-setup mime-pgp pgg-def
mime-image wl-demo wl-draft eword-encode wl-template sendmail rfc2047 rfc2045
ietf-drums mail-utils wl-news derived wl-address wl-thread wl-action wl-summary
ps-print ps-def lpr wl-refile wl-message elmo-mime mmelmo-buffer mmelmo-imap
mmimap mime-parse mmbuffer mmgeneric wl-highlight elmo-multi wl-folder wl wl-e21
wl-util elmo-flag elmo-localdir wl-vars epg-config wl-version elmo elmo-signal
elmo-msgdb modb modb-generic modb-entity elmo-util utf7 elmo-date elmo-vars
elmo-version luna hideshow magit-blame magit-bisect magit-key-mode assoc magit
diff-mode log-edit pcvs-util add-log php-mode cc-langs speedbar sb-image ezimage
dframe xcscope smart-tabs chep-tag-popup etags hideif cc-mode cc-fonts cc-guess
cc-menus cc-cmds cc-styles cc-align cc-engine cc-vars cc-defs paren uniquify
chep-couleur chep-retourne chep-pastebin google_search ifndef_fichier_h
mime-play filename emu invisible inv-23 poem poem-e20 poem-e20_3 mime-view
mime-conf calist semi-def mime eword-decode mel path-util mime-def mcharset
mcs-20 mcs-e20 pces pces-e20 pces-20 broken pcustom poe std11 alist pym static
apel-ver product mime-w3m warnings advice help-fns w3m browse-url timezone
w3m-hist w3m-e23 wid-edit w3m-ccl ccl w3m-fsf w3m-favicon w3m-image w3m-proc
w3m-util erc-goodies erc erc-backend erc-compat format-spec thingatpt pp netrc
chep-notification readline-complete shell pcomplete comint ansi-color ring
auto-complete-config auto-complete edmacro kmacro cl-macs gv popup cl nadvice
cl-lib cal-china lunar solar cal-dst cal-bahai cal-islam cal-hebrew holidays
hol-loaddefs appt diary-lib diary-loaddefs cal-menu calendar cal-loaddefs ampc
easymenu avl-tree network-stream auth-source eieio byte-opt bytecomp
byte-compile cconv gnus-util mm-util mail-prsvr password-cache starttls tls
chep-anchor easy-mmode virtual-desktops ido time-date time battery cus-start
cus-load server w3m-load magit-install tooltip ediff-hook vc-hooks
lisp-float-type mwheel x-win x-dnd tool-bar dnd fontset image regexp-opt fringe
tabulated-list newcomment lisp-mode register page menu-bar rfn-eshadow timer
select scroll-bar mouse jit-lock font-lock syntax facemenu font-core frame cham
georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean
japanese hebrew greek romanian slovak czech european ethiopic indian cyrillic
chinese case-table epa-hook jka-cmpr-hook help simple abbrev minibuffer loaddefs
button faces cus-face macroexp files text-properties overlay sha1 md5 base64
format env code-pages mule custom widget hashtable-print-readable backquote
make-network-process dbusbind inotify dynamic-setting system-font-setting
font-render-setting move-toolbar gtk x-toolkit x multi-tty emacs)






* bug#14461: 24.3.50; bad display for 'space' + (U+0336) unicode combination
  2013-05-24 14:30 bug#14461: 24.3.50; bad display for 'space' + (U+0336) unicode combination Cédric Chépied
@ 2019-08-15  4:50 ` Lars Ingebrigtsen
  2019-08-15  9:01   ` Stephen Berman
  2019-08-15 14:48   ` Eli Zaretskii
  0 siblings, 2 replies; 21+ messages in thread
From: Lars Ingebrigtsen @ 2019-08-15  4:50 UTC
  To: Cédric Chépied; +Cc: 14461

Cédric Chépied <cedric.chepied@gmail.com> writes:

> Start emacs -Q.
> Use the *scratch* buffer, for example.
> Type 'l', then M-x insert-char RET 336 RET.
> The letter 'l' is displayed struck through.
> Type ' ' (space), then M-x insert-char RET 336 RET.
> The space is not struck through; the stroke is displayed after the space
> character instead.
> If you paste the entire line to someone using Emacs 23 (with ERC, for example),
> their display is OK.
> As far as I know, the space character is the only one with this behaviour.

(I'm going through old bug reports that have unfortunately gotten no
responses yet.)

I'm not quite sure I understand the bug report, but I tried this recipe,
and I'm not able to reproduce any odd behaviour here in Emacs 27 (I
think).

If I type

l M-x insert-char RET 336 RET

I get

l̶

which is displayed here as an l with a dash after it -- no overstrikes
or anything.  The same happens with a space character instead of an l.

I'm guessing something has changed with combining characters here?  Or
do I need to be in a particular language environment for the l and the
dash to combine?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#14461: 24.3.50; bad display for 'space' + (U+0336) unicode combination
  2019-08-15  4:50 ` Lars Ingebrigtsen
@ 2019-08-15  9:01   ` Stephen Berman
  2019-08-15 10:02     ` Cédric Chépied
  2019-08-15 14:48   ` Eli Zaretskii
  1 sibling, 1 reply; 21+ messages in thread
From: Stephen Berman @ 2019-08-15  9:01 UTC
  To: Lars Ingebrigtsen; +Cc: 14461, Cédric Chépied

On Wed, 14 Aug 2019 21:50:33 -0700 Lars Ingebrigtsen <larsi@gnus.org> wrote:

> Cédric Chépied <cedric.chepied@gmail.com> writes:
>
>> Start emacs -Q.
>> Use the *scratch* buffer, for example.
>> Type 'l', then M-x insert-char RET 336 RET.
>> The letter 'l' is displayed struck through.
>> Type ' ' (space), then M-x insert-char RET 336 RET.
>> The space is not struck through; the stroke is displayed after the space
>> character instead.
>> If you paste the entire line to someone using Emacs 23 (with ERC, for example),
>> their display is OK.
>> As far as I know, the space character is the only one with this behaviour.
>
> (I'm going through old bug reports that have unfortunately gotten no
> responses yet.)
>
> I'm not quite sure I understand the bug report, but I tried this recipe,
> and I'm not able to reproduce any odd behaviour here in Emacs 27 (I
> think).
>
> If I type
>
> l M-x insert-char RET 336 RET
>
> I get
>
> l̶
>
> which is displayed here as an l with a dash after it -- no overstrikes
> or anything.  The same happens with a space character instead of an l.
>
> I'm guessing something has changed with combining characters here?  Or
> do I need to be in a particular language environment for the l and the
> dash to combine?

I also see l with a dash after it in Emacs started with -Q, in which the
font is DejaVu Sans Mono.  But when I then enable variable-pitch-mode,
which uses DejaVu Sans, I see l overlaid with a dash.

Steve Berman






* bug#14461: 24.3.50; bad display for 'space' + (U+0336) unicode combination
  2019-08-15  9:01   ` Stephen Berman
@ 2019-08-15 10:02     ` Cédric Chépied
  2019-08-15 12:29       ` Stephen Berman
  0 siblings, 1 reply; 21+ messages in thread
From: Cédric Chépied @ 2019-08-15 10:02 UTC
  To: Stephen Berman; +Cc: 14461, Lars Ingebrigtsen, Cédric Chépied

Stephen Berman wrote:
> I also see l with a dash after it in Emacs started with -Q, in which the
> font is DejaVu Sans Mono.  But when I then enable variable-pitch-mode,
> which uses DejaVu Sans, I see l overlaid with a dash.

You are right, it depends on the font. But with DejaVu Sans I still have the
problem with spaces.

@Lars Ingebrigtsen U+0336 is 'Combining Long Stroke Overlay' so I think it
should always be combined with the last character.
-- 
Cédric Chépied
<cedric.chepied@gmail.com>






* bug#14461: 24.3.50; bad display for 'space' + (U+0336) unicode combination
  2019-08-15 10:02     ` Cédric Chépied
@ 2019-08-15 12:29       ` Stephen Berman
  2019-08-16  1:03         ` Lars Ingebrigtsen
  2019-08-17 12:00         ` Eli Zaretskii
  0 siblings, 2 replies; 21+ messages in thread
From: Stephen Berman @ 2019-08-15 12:29 UTC
  To: Cédric Chépied; +Cc: 14461, Lars Ingebrigtsen

On Thu, 15 Aug 2019 12:02:21 +0200 Cédric Chépied <cedric.chepied@gmail.com> wrote:

> Stephen Berman wrote:
>> I also see l with a dash after it in Emacs started with -Q, in which the
>> font is DejaVu Sans Mono.  But when I then enable variable-pitch-mode,
>> which uses DejaVu Sans, I see l overlaid with a dash.
>
> You are right, it depends on the font. But with DejaVu Sans I still have the
> problem with spaces.

Do you mean that the dash is displayed after rather than over the space
character?  If so, I see this too, but...

> @Lars Ingebrigtsen U+0336 is 'Combining Long Stroke Overlay' so I think it
> should always be combined with the last character.

... I assume combining characters are always displayed after a space
instead of over it -- at least that's what I see with e.g. U+0301
(COMBINING ACUTE ACCENT) and U+0302 (COMBINING CIRCUMFLEX ACCENT).  That
makes sense to me (otherwise, you couldn't visually distinguish e.g. the
sequence 'aU+0301U+0302' from the sequence 'aU+0301 U+0302') and I would
guess some Unicode standard prescribes it.

Steve Berman






* bug#14461: 24.3.50; bad display for 'space' + (U+0336) unicode combination
  2019-08-15  4:50 ` Lars Ingebrigtsen
  2019-08-15  9:01   ` Stephen Berman
@ 2019-08-15 14:48   ` Eli Zaretskii
  1 sibling, 0 replies; 21+ messages in thread
From: Eli Zaretskii @ 2019-08-15 14:48 UTC
  To: Lars Ingebrigtsen; +Cc: 14461, cedric.chepied

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Date: Wed, 14 Aug 2019 21:50:33 -0700
> Cc: 14461@debbugs.gnu.org
> 
> If I type
> 
> l M-x insert-char RET 336 RET
> 
> I get
> 
> l̶
> 
> which is displayed here as an l with a dash after it -- no overstrikes
> or anything.

You need to do that with a font that has glyphs both for l and for
U+0336.  Emacs doesn't compose characters if their glyphs don't come
from the same font, for obvious reasons.

> do I need to be in a particular language environment for the l and the
> dash to combine?

No, composition of Latin combining accents is turned on by default.






* bug#14461: 24.3.50; bad display for 'space' + (U+0336) unicode combination
  2019-08-15 12:29       ` Stephen Berman
@ 2019-08-16  1:03         ` Lars Ingebrigtsen
  2019-08-16  6:55           ` Eli Zaretskii
  2019-08-17 12:00         ` Eli Zaretskii
  1 sibling, 1 reply; 21+ messages in thread
From: Lars Ingebrigtsen @ 2019-08-16  1:03 UTC
  To: Stephen Berman; +Cc: 14461, Cédric Chépied

Stephen Berman <stephen.berman@gmx.net> writes:

> ... I assume combining characters are always displayed after a space
> instead of over it -- at least that's what I see with e.g. U+0301
> (COMBINING ACUTE ACCENT) and U+0302 (COMBINING CIRCUMFLEX ACCENT).  That
> makes sense to me (otherwise, you couldn't visually distinguish e.g. the
> sequence 'aU+0301U+0302' from the sequence 'aU+0301 U+0302') and I would
> guess some Unicode standard prescribes it.

Sounds logical.  Anybody know for sure?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#14461: 24.3.50; bad display for 'space' + (U+0336) unicode combination
  2019-08-16  1:03         ` Lars Ingebrigtsen
@ 2019-08-16  6:55           ` Eli Zaretskii
  0 siblings, 0 replies; 21+ messages in thread
From: Eli Zaretskii @ 2019-08-16  6:55 UTC
  To: Lars Ingebrigtsen; +Cc: 14461, stephen.berman, cedric.chepied

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Date: Thu, 15 Aug 2019 18:03:31 -0700
> Cc: 14461@debbugs.gnu.org, Cédric Chépied
>  <cedric.chepied@gmail.com>
> 
> Stephen Berman <stephen.berman@gmx.net> writes:
> 
> > ... I assume combining characters are always displayed after a space
> > instead of over it -- at least that's what I see with e.g. U+0301
> > (COMBINING ACUTE ACCENT) and U+0302 (COMBINING CIRCUMFLEX ACCENT).  That
> > makes sense to me (otherwise, you couldn't visually distinguish e.g. the
> > sequence 'aU+0301U+0302' from the sequence 'aU+0301 U+0302') and I would
> > guess some Unicode standard prescribes it.
> 
> Sounds logical.  Anybody know for sure?

I don't know for sure, but I will look into this soon if no one beats
me to it.






* bug#14461: 24.3.50; bad display for 'space' + (U+0336) unicode combination
  2019-08-15 12:29       ` Stephen Berman
  2019-08-16  1:03         ` Lars Ingebrigtsen
@ 2019-08-17 12:00         ` Eli Zaretskii
  2019-08-17 13:50           ` Stephen Berman
  2019-09-07  9:21           ` Eli Zaretskii
  1 sibling, 2 replies; 21+ messages in thread
From: Eli Zaretskii @ 2019-08-17 12:00 UTC
  To: Stephen Berman, Kenichi Handa; +Cc: 14461, larsi, cedric.chepied

> From: Stephen Berman <stephen.berman@gmx.net>
> Date: Thu, 15 Aug 2019 14:29:08 +0200
> Cc: 14461@debbugs.gnu.org, Lars Ingebrigtsen <larsi@gnus.org>
> 
> On Thu, 15 Aug 2019 12:02:21 +0200 Cédric Chépied <cedric.chepied@gmail.com> wrote:
> 
> ... I assume combining characters are always displayed after a space
> instead of over it -- at least that's what I see with e.g. U+0301
> (COMBINING ACUTE ACCENT) and U+0302 (COMBINING CIRCUMFLEX ACCENT).

Indeed, we reject base characters of certain general categories,
including those whose general category is Zs (space separator).  In
composite.el:compose-gstring-for-graphic we have:

     ;; This sequence doesn't start with a proper base character.
     ((memq (get-char-code-property (lgstring-char gstring 0)
                                    'general-category)
            '(Mn Mc Me Zs Zl Zp Cc Cf Cs))
      nil)

> That makes sense to me (otherwise, you couldn't visually distinguish
> e.g. the sequence 'aU+0301U+0302' from the sequence 'aU+0301 U+0302')

I don't see why: the former should be displayed as a single grapheme
cluster, with both diacritics on top of a, whereas the latter should
be displayed as 2 grapheme clusters, with U+0302 on top of the SPC
character instead of on top of a.
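
The expected grouping can be sketched with a simplified clustering rule:
every character whose General Category starts with M joins the cluster
opened by the preceding non-mark character.  This is only an approximation
of the grapheme cluster rules in UAX #29, written in Python with the
standard `unicodedata` module; the function name `simple_clusters` is
hypothetical.

```python
import unicodedata


def simple_clusters(text):
    """Group each non-mark character with the combining marks that follow it.

    Simplification: only General Category M* is treated as "extending";
    the full UAX #29 rules cover more cases.
    """
    clusters = []
    for ch in text:
        is_mark = unicodedata.category(ch).startswith("M")
        if is_mark and clusters:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


one = simple_clusters("a\u0301\u0302")   # a + acute + circumflex
two = simple_clusters("a\u0301 \u0302")  # a + acute, then SPC + circumflex

print(len(one), len(two))
```

Under this rule the first sequence forms one cluster and the second forms
two, with the space acting as the base of the circumflex.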

> and I would guess some Unicode standard prescribes it.

Actually, the Unicode Standard prescribes the opposite.  It says
(Section 3.6):

  D50 Graphic character: A character with the General Category of
      Letter (L), Combining Mark (M), Number (N), Punctuation (P),
      Symbol (S), or Space Separator (Zs).
  ...
  D51 Base character: Any graphic character except for those with the
      General Category of Combining Mark (M).
       • Most Unicode characters are base characters. In terms of
         General Category values, a base character is any code point
         that has one of the following categories: Letter (L), Number
         (N), Punctuation (P), Symbol (S), or Space Separator (Zs).
  ...
  D52 Combining character: A character with the General Category of
      Combining Mark (M).

and (in 2.11)

      All combining characters can be applied to any base character and
      can, in principle, be used with any script.

So I don't think we are right when we exclude space separators from
base characters eligible for character composition, I think it's a
mistake.  Perhaps Handa-san (CC'ed) could comment on why we do that.
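
The mismatch can be made concrete with a small Python sketch.  It copies
the category list from the compose-gstring-for-graphic excerpt above and
compares it with a reading of D51 in which a base character is any
character whose General Category is L, N, P, S or Zs.  The names
`EMACS_REJECTED`, `is_base_per_d51` and `emacs_accepts_as_base` are
invented for the illustration and are not Emacs APIs.

```python
import unicodedata

# Categories rejected as a composition base in the composite.el excerpt.
EMACS_REJECTED = {"Mn", "Mc", "Me", "Zs", "Zl", "Zp", "Cc", "Cf", "Cs"}


def is_base_per_d51(ch):
    """D51 as quoted: Letter, Number, Punctuation, Symbol or Space Separator."""
    cat = unicodedata.category(ch)
    return cat[0] in "LNPS" or cat == "Zs"


def emacs_accepts_as_base(ch):
    """Mirror of the memq test: reject the listed general categories."""
    return unicodedata.category(ch) not in EMACS_REJECTED


for ch in ("l", " ", "-", "\u0336"):
    print(f"U+{ord(ch):04X}", unicodedata.category(ch),
          is_base_per_d51(ch), emacs_accepts_as_base(ch))
```

Of the four samples, only the space gets different answers from the two
predicates: it is a base character under D51 but is rejected by the
category list.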






* bug#14461: 24.3.50; bad display for 'space' + (U+0336) unicode combination
  2019-08-17 12:00         ` Eli Zaretskii
@ 2019-08-17 13:50           ` Stephen Berman
  2019-08-17 14:14             ` Eli Zaretskii
  2019-09-07  9:21           ` Eli Zaretskii
  1 sibling, 1 reply; 21+ messages in thread
From: Stephen Berman @ 2019-08-17 13:50 UTC
  To: Eli Zaretskii; +Cc: larsi, cedric.chepied, 14461

On Sat, 17 Aug 2019 15:00:18 +0300 Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Stephen Berman <stephen.berman@gmx.net>
>> Date: Thu, 15 Aug 2019 14:29:08 +0200
>> Cc: 14461@debbugs.gnu.org, Lars Ingebrigtsen <larsi@gnus.org>
>>
>> ... I assume combining characters are always displayed after a space
>> instead of over it -- at least that's what I see with e.g. U+0301
>> (COMBINING ACUTE ACCENT) and U+0302 (COMBINING CIRCUMFLEX ACCENT).
>> That makes sense to me (otherwise, you couldn't visually distinguish
>> e.g. the sequence 'aU+0301U+0302' from the sequence 'aU+0301 U+0302')
>
> I don't see why: the former should be displayed as a single grapheme
> cluster, with both diacritics on top of a, whereas the latter should
> be displayed as 2 grapheme clusters, with U+0302 on top of the SPC
> character instead of on top of a.

Hm, I chose COMBINING ACUTE ACCENT and COMBINING CIRCUMFLEX ACCENT more
or less at random, but I do indeed see the sequence 'aU+0301U+0302' as
two grapheme clusters (also with -Q): 'a' with an acute accent over it
followed by a circumflex.  In contrast, the sequences 'aU+0301U+0317'
and 'aU+0302U+0317' are displayed as single grapheme clusters (317 is
COMBINING ACUTE ACCENT BELOW).  I also noticed that the sequence
'-U+0301U+0302' is displayed as a dash followed by a single grapheme
cluster of an acute accent and a circumflex; this holds for all
nonalphabetic ASCII characters I tried and for some but not all
non-ASCII alphabetic characters.  So there seems to be some
inconsistency in the display of combining characters.

Steve Berman






* bug#14461: 24.3.50; bad display for 'space' + (U+0336) unicode combination
  2019-08-17 13:50           ` Stephen Berman
@ 2019-08-17 14:14             ` Eli Zaretskii
  2019-08-17 14:40               ` Stephen Berman
  0 siblings, 1 reply; 21+ messages in thread
From: Eli Zaretskii @ 2019-08-17 14:14 UTC
  To: Stephen Berman; +Cc: larsi, cedric.chepied, 14461

> From: Stephen Berman <stephen.berman@gmx.net>
> Cc: Kenichi Handa <handa@gnu.org>,  cedric.chepied@gmail.com,
>   14461@debbugs.gnu.org,  larsi@gnus.org
> Date: Sat, 17 Aug 2019 15:50:20 +0200
> 
> Hm, I chose COMBINING ACUTE ACCENT and COMBINING CIRCUMFLEX ACCENT more
> or less at random, but I do indeed see the sequence 'aU+0301U+0302' as
> two grapheme clusters (also with -Q): 'a' with an acute accent over it
> followed by a circumflex.  In contrast, the sequences 'aU+0301U+0317'
> and 'aU+0302U+0317' are displayed as single grapheme clusters (317 is
> COMBINING ACUTE ACCENT BELOW).  I also noticed that the sequence
> '-U+0301U+0302' is displayed as a dash followed by a single grapheme
> cluster of an acute accent and a circumflex; this holds for all
> nonalphabetic ASCII characters I tried and for some but not all
> non-ASCII alphabetic characters.  So there seems to be some
> inconsistency in the display of combining characters.

Is this in Emacs 27 built with HarfBuzz support?  If so, I think this
just means that the default font you use doesn't support these
combining accents, because on my system I see a single grapheme
cluster in both of the above cases, when I select a suitable font.

You can tell which font is used for each character by typing
"C-u C-x =" on each character/grapheme cluster.






* bug#14461: 24.3.50; bad display for 'space' + (U+0336) unicode combination
  2019-08-17 14:14             ` Eli Zaretskii
@ 2019-08-17 14:40               ` Stephen Berman
  2019-08-17 15:09                 ` Eli Zaretskii
  0 siblings, 1 reply; 21+ messages in thread
From: Stephen Berman @ 2019-08-17 14:40 UTC
  To: Eli Zaretskii; +Cc: larsi, cedric.chepied, 14461

On Sat, 17 Aug 2019 17:14:45 +0300 Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Stephen Berman <stephen.berman@gmx.net>
>> Cc: Kenichi Handa <handa@gnu.org>,  cedric.chepied@gmail.com,
>>   14461@debbugs.gnu.org,  larsi@gnus.org
>> Date: Sat, 17 Aug 2019 15:50:20 +0200
>> 
>> Hm, I chose COMBINING ACUTE ACCENT and COMBINING CIRCUMFLEX ACCENT more
>> or less at random, but I do indeed see the sequence 'aU+0301U+0302' as
>> two grapheme clusters (also with -Q): 'a' with an acute accent over it
>> followed by a circumflex.  In contrast, the sequences 'aU+0301U+0317'
>> and 'aU+0302U+0317' are displayed as single grapheme clusters (317 is
>> COMBINING ACUTE ACCENT BELOW).  I also noticed that the sequence
>> '-U+0301U+0302' is displayed as a dash followed by a single grapheme
>> cluster of an acute accent and a circumflex; this holds for all
>> nonalphabetic ASCII characters I tried and for some but not all
>> non-ASCII alphabetic characters.  So there seems to be some
>> inconsistency in the display of combining characters.
>
> Is this in Emacs 27 built with HarfBuzz support?

Yes (both --with-cairo and without).

>                                                   If so, I think this
> just means that the default font you use doesn't support these
> combining accents, because on my system I see a single grapheme
> cluster in both of the above cases, when I select a suitable font.

My default font is DejaVu Sans Mono, but it seems there's something else
at play here: in contrast to 'aU+0301U+0302', I do see the sequence
'bU+0301U+0302' as a single grapheme cluster.  Maybe the difference is
because there is a glyph for 'a' with an acute accent and it doesn't
support further combining.  (But I have no idea if that makes sense.)
Here's what describe-char shows on both:

________________________________________________________________________
             position: 1 of 7 (0%), column: 0
            character: a (displayed as a) (codepoint 97, #o141, #x61)
              charset: ascii (ASCII (ISO646 IRV))
code point in charset: 0x61
               script: latin
               syntax: w 	which means: word
             category: .:Base, L:Left-to-right (strong), a:ASCII, l:Latin, r:Roman
             to input: type "C-x 8 RET 61" or "C-x 8 RET LATIN SMALL LETTER A"
          buffer code: #x61
            file code: #x61 (encoded by coding system utf-8-unix)
              display: composed to form "á̂" (see below)

Composed with the following character(s) "́̂" using this font:
  xfthb:-PfEd-DejaVu Sans Mono-normal-normal-normal-*-15-*-*-*-m-0-iso10646-1
by these glyphs:
  [0 2 97 163 9 0 8 12 0 nil]
  [0 2 769 650 9 2 7 12 -9 [0 0 0]]

Character code properties: customize what to show
  name: LATIN SMALL LETTER A
  general-category: Ll (Letter, Lowercase)
  decomposition: (97) ('a')

________________________________________________________________________
             position: 5 of 7 (57%), column: 0
            character: b (displayed as b) (codepoint 98, #o142, #x62)
              charset: ascii (ASCII (ISO646 IRV))
code point in charset: 0x62
               script: latin
               syntax: w 	which means: word
             category: .:Base, L:Left-to-right (strong), a:ASCII, l:Latin, r:Roman
             to input: type "C-x 8 RET 62" or "C-x 8 RET LATIN SMALL LETTER B"
          buffer code: #x62
            file code: #x62 (encoded by coding system utf-8-unix)
              display: composed to form "b́̂" (see below)

Composed with the following character(s) "́̂" using this font:
  xfthb:-PfEd-DejaVu Sans Mono-normal-normal-normal-*-15-*-*-*-m-0-iso10646-1
by these glyphs:
  [0 2 98 69 9 1 9 11 0 nil]
  [0 2 769 649 9 3 7 12 -9 [-9 -3 0]]
  [0 2 770 650 9 2 7 12 -9 [-9 -3 0]]

Character code properties: customize what to show
  name: LATIN SMALL LETTER B
  general-category: Ll (Letter, Lowercase)
  decomposition: (98) ('b')
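
One way to probe the guess about a precomposed 'a' with an acute accent is
to check canonical composition.  The Python sketch below uses
`unicodedata.normalize`; it only shows which sequences have a precomposed
code point in Unicode and assumes nothing about how HarfBuzz or DejaVu Sans
Mono actually pick glyphs.  The helper name `nfc_code_points` is made up
for the illustration.

```python
import unicodedata

ACUTE, CIRCUMFLEX = "\u0301", "\u0302"


def nfc_code_points(text):
    """Return the code points of TEXT after NFC normalization."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFC", text)]


a_seq = nfc_code_points("a" + ACUTE + CIRCUMFLEX)
b_seq = nfc_code_points("b" + ACUTE + CIRCUMFLEX)

print(a_seq)  # 'a' + acute collapses into U+00E1
print(b_seq)  # no precomposed 'b' with acute exists
```

The 'a' sequence normalizes to two code points and the 'b' sequence to
three, which matches the number of glyph rows in the two listings above.
That is consistent with the guess, though it does not prove it.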







* bug#14461: 24.3.50; bad display for 'space' + (U+0336) unicode combination
  2019-08-17 14:40               ` Stephen Berman
@ 2019-08-17 15:09                 ` Eli Zaretskii
  2019-08-17 15:39                   ` Stephen Berman
  0 siblings, 1 reply; 21+ messages in thread
From: Eli Zaretskii @ 2019-08-17 15:09 UTC
  To: Stephen Berman; +Cc: larsi, cedric.chepied, 14461

> From: Stephen Berman <stephen.berman@gmx.net>
> Cc: handa@gnu.org,  cedric.chepied@gmail.com,  14461@debbugs.gnu.org,
>   larsi@gnus.org
> Date: Sat, 17 Aug 2019 16:40:44 +0200
> 
> >                                                   If so, I think this
> > just means that the default font you use doesn't support these
> > combining accents, because on my system I see a single grapheme
> > cluster in both of the above cases, when I select a suitable font.
> 
> My default font is DejaVu Sans Mono, but it seems there's something else
> at play here: in contrast to 'aU+0301U+0302', I do see the sequence
> 'bU+0301U+0302' as a single grapheme cluster.  Maybe the difference is
> because there is a glyph for 'a' with an acute accent and it doesn't
> support further combining.  (But I have no idea if that makes sense.)
> Here's what describe-char shows on both:

That says you have a single grapheme cluster in both cases.  Does the
cursor include all of the characters in both cases?  IOW, can you move
with C-f between these 3 characters, or do they behave as a single
character cell in both cases?  If the latter, then the composition was
done in both cases, and you simply should find a better font if you
want these displayed more nicely.






* bug#14461: 24.3.50; bad display for 'space' + (U+0336) unicode combination
  2019-08-17 15:09                 ` Eli Zaretskii
@ 2019-08-17 15:39                   ` Stephen Berman
  2019-08-17 15:44                     ` Eli Zaretskii
  0 siblings, 1 reply; 21+ messages in thread
From: Stephen Berman @ 2019-08-17 15:39 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: larsi, cedric.chepied, 14461

On Sat, 17 Aug 2019 18:09:17 +0300 Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Stephen Berman <stephen.berman@gmx.net>
>> Cc: handa@gnu.org,  cedric.chepied@gmail.com,  14461@debbugs.gnu.org,
>>   larsi@gnus.org
>> Date: Sat, 17 Aug 2019 16:40:44 +0200
>>
>> >                                                   If so, I think this
>> > just means that the default font you use doesn't support these
>> > combining accents, because on my system I see a single grapheme
>> > cluster in both of the above cases, when I select a suitable font.
>>
>> My default font is DejaVu Sans Mono, but it seems there's something else
>> at play here: in contrast to 'aU+0301U+0302', I do see the sequence
>> 'bU+0301U+0302' as a single grapheme cluster.  Maybe the difference is
>> because there is a glyph for 'a' with an acute accent and it doesn't
>> support further combining.  (But I have no idea if that makes sense.)
>> Here's what describe-char shows on both:
>
> That says you have a single grapheme cluster in both cases.  Does the
> cursor include all of the characters in both cases?

Visually, in the case of the 'a' sequence, the cursor does not cover the
circumflex, but...

>                                                      IOW, can you move
> with C-f between these 3 characters, or do they behave as a single
> character cell in both cases?

Yes, with the 'a' sequence, when I type C-f, the cursor now appears over
the circumflex, but describe-char says the character at that position is
C-j, and typing C-f indeed advances point to the next line.

>                                If the latter, then the composition was
> done in both cases, and you simply should find a better font if you
> want these displayed more nicely.

That indeed appears to be the case: when I change the font to DejaVu
Sans (i.e. not the monospace version), then the 'a' sequence is
displayed like the 'b' sequence, with both combining characters over the
alphabetic character.  This seems like a bug in the monospace font, but
it also seems unlikely such a bug wouldn't have been noticed and fixed
long ago, so I suspect there must be some other reason for the
difference.
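
For reference, a minimal Python sketch (not part of the original exchange)
of one way the two sequences differ at the Unicode level.  It uses only the
standard unicodedata module; the helper name nfc_codepoints is made up for
illustration.  Unicode has a precomposed character for 'a' + acute (U+00E1)
but none for 'b' + acute, so NFC normalization shortens only the 'a'
sequence.  This says nothing definite about what the font or the shaper
does; it only shows that the two inputs are not symmetric:

```python
import unicodedata

def nfc_codepoints(text):
    """Return the code points of TEXT after NFC normalization, as U+XXXX strings."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFC", text)]

# 'a' + COMBINING ACUTE ACCENT + COMBINING CIRCUMFLEX ACCENT
print(nfc_codepoints("a\u0301\u0302"))  # ['U+00E1', 'U+0302']
# 'b' + the same two combining marks: nothing to precompose
print(nfc_codepoints("b\u0301\u0302"))  # ['U+0062', 'U+0301', 'U+0302']
```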

Steve Berman






* bug#14461: 24.3.50; bad display for 'space' + (U+0336) unicode combination
  2019-08-17 15:39                   ` Stephen Berman
@ 2019-08-17 15:44                     ` Eli Zaretskii
  2019-08-17 17:05                       ` Stephen Berman
  0 siblings, 1 reply; 21+ messages in thread
From: Eli Zaretskii @ 2019-08-17 15:44 UTC (permalink / raw)
  To: Stephen Berman; +Cc: larsi, cedric.chepied, 14461

> From: Stephen Berman <stephen.berman@gmx.net>
> Cc: handa@gnu.org,  cedric.chepied@gmail.com,  14461@debbugs.gnu.org,
>   larsi@gnus.org
> Date: Sat, 17 Aug 2019 17:39:41 +0200
> 
> >                                If the latter, then the composition was
> > done in both cases, and you simply should find a better font if you
> > want these displayed more nicely.
> 
> That indeed appears to be the case: when I change the font to DejaVu
> Sans (i.e. not the monospace version), then the 'a' sequence is
> displayed like the 'b' sequence, with both combining characters over the
> alphabetic character.  This seems like a bug in the monospace font, but
> it also seems unlikely such a bug wouldn't have been noticed and fixed
> long ago, so I suspect there must be some other reason for the
> difference.

Does HarfBuzz's hb-view produce the same display with the monospaced
font?  If so, I'd bet it's a problem with the font.  You could ask
about this on the HarfBuzz mailing list.

If hb-view produces a different display, then it could be our problem.






* bug#14461: 24.3.50; bad display for 'space' + (U+0336) unicode combination
  2019-08-17 15:44                     ` Eli Zaretskii
@ 2019-08-17 17:05                       ` Stephen Berman
  2019-08-17 17:29                         ` Eli Zaretskii
  0 siblings, 1 reply; 21+ messages in thread
From: Stephen Berman @ 2019-08-17 17:05 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: larsi, cedric.chepied, 14461

On Sat, 17 Aug 2019 18:44:33 +0300 Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Stephen Berman <stephen.berman@gmx.net>
>> Cc: handa@gnu.org,  cedric.chepied@gmail.com,  14461@debbugs.gnu.org,
>>   larsi@gnus.org
>> Date: Sat, 17 Aug 2019 17:39:41 +0200
>>
>> >                                If the latter, then the composition was
>> > done in both cases, and you simply should find a better font if you
>> > want these displayed more nicely.
>>
>> That indeed appears to be the case: when I change the font to DejaVu
>> Sans (i.e. not the monospace version), then the 'a' sequence is
>> displayed like the 'b' sequence, with both combining characters over the
>> alphabetic character.  This seems like a bug in the monospace font, but
>> it also seems unlikely such a bug wouldn't have been noticed and fixed
>> long ago, so I suspect there must be some other reason for the
>> difference.
>
> Does HarfBuzz's hb-view produce the same display with the monospaced
> font?  If so, I'd bet it's a problem with the font.  You could ask
> about this on the HarfBuzz mailing list.
>
> If hb-view produces a different display, then it could be our problem.

Executing this:

$ hb-view /usr/share/fonts/dejavu/DejaVuSansMono.ttf -u 'U+061, U+301, U+302'

displays just 'a' with an acute accent over it, i.e. the circumflex is
not displayed at all (unlike Emacs, which displays the circumflex to the
right of the a + acute accent grapheme).

Steve Berman






* bug#14461: 24.3.50; bad display for 'space' + (U+0336) unicode combination
  2019-08-17 17:05                       ` Stephen Berman
@ 2019-08-17 17:29                         ` Eli Zaretskii
  2019-08-17 18:11                           ` Stephen Berman
  0 siblings, 1 reply; 21+ messages in thread
From: Eli Zaretskii @ 2019-08-17 17:29 UTC (permalink / raw)
  To: Stephen Berman; +Cc: larsi, cedric.chepied, 14461

> From: Stephen Berman <stephen.berman@gmx.net>
> Cc: handa@gnu.org,  cedric.chepied@gmail.com,  14461@debbugs.gnu.org,
>   larsi@gnus.org
> Date: Sat, 17 Aug 2019 19:05:20 +0200
> 
> Executing this:
> 
> $ hb-view /usr/share/fonts/dejavu/DejaVuSansMono.ttf -u 'U+061, U+301, U+302'
> 
> displays just 'a' with an acute accent over it, i.e. the circumflex is
> not displayed at all

I don't think this is true, I think the accents are overlaid in a way
that makes them hard to distinguish.  Try zooming in, if you can.

> (unlike Emacs, which displays the circumflex to the right of the a +
> acute accent grapheme).

Hmm... something strange happens with DejaVu Sans Mono.  I tried two
different font backends, and they both display the circumflex
incorrectly.  That doesn't happen with other monospaced fonts I tried.

I have no idea what's going on here, sorry.

In any case, the issue at hand is not about this particular display
with this particular font, it's more general.






* bug#14461: 24.3.50; bad display for 'space' + (U+0336) unicode combination
  2019-08-17 17:29                         ` Eli Zaretskii
@ 2019-08-17 18:11                           ` Stephen Berman
  2019-08-17 18:22                             ` Eli Zaretskii
  0 siblings, 1 reply; 21+ messages in thread
From: Stephen Berman @ 2019-08-17 18:11 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: larsi, cedric.chepied, 14461

[-- Attachment #1: Type: text/plain, Size: 698 bytes --]

On Sat, 17 Aug 2019 20:29:22 +0300 Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Stephen Berman <stephen.berman@gmx.net>
>> Cc: handa@gnu.org,  cedric.chepied@gmail.com,  14461@debbugs.gnu.org,
>>   larsi@gnus.org
>> Date: Sat, 17 Aug 2019 19:05:20 +0200
>>
>> Executing this:
>>
>> $ hb-view /usr/share/fonts/dejavu/DejaVuSansMono.ttf -u 'U+061, U+301, U+302'
>>
>> displays just 'a' with an acute accent over it, i.e. the circumflex is
>> not displayed at all
>
> I don't think this is true, I think the accents are overlaid in a way
> that makes them hard to distinguish.  Try zooming in, if you can.

I find the displays quite unambiguous; here is the output as SVG images:


[-- Attachment #2: a and b with combining accents --]
[-- Type: image/png, Size: 39395 bytes --]

[-- Attachment #3: Type: text/plain, Size: 511 bytes --]


>> (unlike Emacs, which displays the circumflex to the right of the a +
>> acute accent grapheme).
>
> Hmm... something strange happens with DejaVu Sans Mono.  I tried two
> different font backends, and they both display the circumflex
> incorrectly.  That doesn't happen with other monospaced fonts I tried.
>
> I have no idea what's going on here, sorry.
>
> In any case, the issue at hand is not about this particular display
> with this particular font, it's more general.

That I agree with.

Steve Berman


* bug#14461: 24.3.50; bad display for 'space' + (U+0336) unicode combination
  2019-08-17 18:11                           ` Stephen Berman
@ 2019-08-17 18:22                             ` Eli Zaretskii
  2019-08-17 18:58                               ` Stephen Berman
  0 siblings, 1 reply; 21+ messages in thread
From: Eli Zaretskii @ 2019-08-17 18:22 UTC (permalink / raw)
  To: Stephen Berman; +Cc: larsi, cedric.chepied, 14461

> From: Stephen Berman <stephen.berman@gmx.net>
> Cc: handa@gnu.org,  cedric.chepied@gmail.com,  14461@debbugs.gnu.org,
>   larsi@gnus.org
> Date: Sat, 17 Aug 2019 20:11:19 +0200
> 
> >> displays just 'a' with an acute accent over it, i.e. the circumflex is
> >> not displayed at all
> >
> > I don't think this is true, I think the accents are overlaid in a way
> > that makes them hard to distinguish.  Try zooming in, if you can.
> 
> I find the displays quite unambiguous; here is the output as SVG images:

Then I guess our display is just fine, and the problem is with the
font after all.






* bug#14461: 24.3.50; bad display for 'space' + (U+0336) unicode combination
  2019-08-17 18:22                             ` Eli Zaretskii
@ 2019-08-17 18:58                               ` Stephen Berman
  0 siblings, 0 replies; 21+ messages in thread
From: Stephen Berman @ 2019-08-17 18:58 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: larsi, cedric.chepied, 14461

On Sat, 17 Aug 2019 21:22:09 +0300 Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Stephen Berman <stephen.berman@gmx.net>
>> Cc: handa@gnu.org,  cedric.chepied@gmail.com,  14461@debbugs.gnu.org,
>>   larsi@gnus.org
>> Date: Sat, 17 Aug 2019 20:11:19 +0200
>>
>> >> displays just 'a' with an acute accent over it, i.e. the circumflex is
>> >> not displayed at all
>> >
>> > I don't think this is true, I think the accents are overlaid in a way
>> > that makes them hard to distinguish.  Try zooming in, if you can.
>>
>> I find the displays quite unambiguous; here is the output as SVG images:
>
> Then I guess our display is just fine, and the problem is with the
> font after all.

I was about to reply that the difference between the Emacs and the
hb-view display (displaying the circumflex as if it were in the next
column vs. not displaying it at all) is nevertheless striking, but then
it occurred to me to try this:

hb-view /usr/share/fonts/dejavu/DejaVuSansMono.ttf -u 'U+061, U+301, U+302, U+062'

(i.e. the sequence 'a' + COMBINING ACUTE ACCENT + COMBINING CIRCUMFLEX
ACCENT + 'b') and the display shows 'a' with an acute accent over it
followed by 'b' with a circumflex over it.  A slightly different display
is shown by this:

hb-view /usr/share/fonts/dejavu/DejaVuSansMono.ttf -u 'U+061, U+301, U+062, U+302'

(i.e. switching the order of 'b' and the circumflex): here the
circumflex is placed higher than the ascender of 'b', while with the
previous input the circumflex is next to the ascender.  I see exactly
the same display in Emacs with

M-: (insert ?a #x301 #x302 ?b) vs.
M-: (insert ?a #x301 ?b #x302)

I guess in the case of 'U+061, U+301, U+302' hb-view limits the width of
the display to the one alphabetic character.

In short, I agree with your conclusion.
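
As a rough cross-check of the two inputs above, here is a small Python
sketch (an illustration only, not how Emacs or HarfBuzz segment text) that
groups each combining mark with the character before it.  The function name
clusters is hypothetical, and the rule is a simplification of the real
grapheme-cluster rules in UAX #29.  Under this rule the circumflex in the
first input belongs logically to the 'a' group, which seems consistent with
attributing its visual placement to the font:

```python
import unicodedata

def clusters(text):
    """Split TEXT into base-plus-marks groups.

    Simplified rule: a character whose General Category starts with 'M'
    (a combining mark) is attached to the preceding group.
    """
    groups = []
    for ch in text:
        if groups and unicodedata.category(ch).startswith("M"):
            groups[-1] += ch
        else:
            groups.append(ch)
    return groups

# Same code points as (insert ?a #x301 #x302 ?b)
first = clusters("a\u0301\u0302b")
# Same code points as (insert ?a #x301 ?b #x302)
second = clusters("a\u0301b\u0302")

print([len(g) for g in first])   # [3, 1]
print([len(g) for g in second])  # [2, 2]
```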

Steve Berman






* bug#14461: 24.3.50; bad display for 'space' + (U+0336) unicode combination
  2019-08-17 12:00         ` Eli Zaretskii
  2019-08-17 13:50           ` Stephen Berman
@ 2019-09-07  9:21           ` Eli Zaretskii
  1 sibling, 0 replies; 21+ messages in thread
From: Eli Zaretskii @ 2019-09-07  9:21 UTC (permalink / raw)
  To: stephen.berman, handa; +Cc: larsi, 14461-done, cedric.chepied

> Date: Sat, 17 Aug 2019 15:00:18 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: 14461@debbugs.gnu.org, larsi@gnus.org, cedric.chepied@gmail.com
> 
> Actually, the Unicode Standard prescribes the opposite.  It says
> (paragraph 3.6):
> 
>   D50 Graphic character: A character with the General Category of
>       Letter (L), Combining Mark (M), Number (N), Punctuation (P),
>       Symbol (S), or Space Separator (Zs).
>   ...
>   D51 Base character: Any graphic character except for those with the
>       General Category of Combining Mark (M).
>        • Most Unicode characters are base characters. In terms of
>          General Category values, a base character is any code point
>          that has one of the following categories: Letter (L), Number
>          (N), Punctuation (P), Symbol (S), or Space Separator (Zs).
>   ...
>   D52 Combining character: A character with the General Category of
>       Combining Mark (M).
> 
> and (in 2.11)
> 
>       All combining characters can be applied to any base character and
>       can, in principle, be used with any script.
> 
> So I don't think we are right when we exclude space separators from
> base characters eligible for character composition, I think it's a
> mistake.  Perhaps Handa-san (CC'ed) could comment on why we do that.

No further comments, so I've installed changes to allow SPC and other
similar characters to be composed.

I'm therefore marking this bug done.
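
The definitions D50-D52 quoted above can be sketched in Python with the
standard unicodedata module.  This is only an illustration of the Unicode
definitions, not the change installed in Emacs; the function names are
invented for the example.  It shows that SPC (General Category Zs) counts
as a base character, while U+0336 COMBINING LONG STROKE OVERLAY is a
combining mark:

```python
import unicodedata

def is_graphic(ch):
    """D50: General Category L, M, N, P, S, or Zs."""
    cat = unicodedata.category(ch)
    return cat[0] in "LMNPS" or cat == "Zs"

def is_combining(ch):
    """D52: General Category Combining Mark (M)."""
    return unicodedata.category(ch).startswith("M")

def is_base(ch):
    """D51: any graphic character that is not a combining mark."""
    return is_graphic(ch) and not is_combining(ch)

print(unicodedata.category(" "), is_base(" "))            # Zs True
print(unicodedata.category("\u0336"), is_base("\u0336"))  # Mn False
print(unicodedata.category("\n"), is_base("\n"))          # Cc False
```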






end of thread, other threads:[~2019-09-07  9:21 UTC | newest]

Thread overview: 21+ messages
2013-05-24 14:30 bug#14461: 24.3.50; bad display for 'space' + (U+0336) unicode combination Cédric Chépied
2019-08-15  4:50 ` Lars Ingebrigtsen
2019-08-15  9:01   ` Stephen Berman
2019-08-15 10:02     ` Cédric Chépied
2019-08-15 12:29       ` Stephen Berman
2019-08-16  1:03         ` Lars Ingebrigtsen
2019-08-16  6:55           ` Eli Zaretskii
2019-08-17 12:00         ` Eli Zaretskii
2019-08-17 13:50           ` Stephen Berman
2019-08-17 14:14             ` Eli Zaretskii
2019-08-17 14:40               ` Stephen Berman
2019-08-17 15:09                 ` Eli Zaretskii
2019-08-17 15:39                   ` Stephen Berman
2019-08-17 15:44                     ` Eli Zaretskii
2019-08-17 17:05                       ` Stephen Berman
2019-08-17 17:29                         ` Eli Zaretskii
2019-08-17 18:11                           ` Stephen Berman
2019-08-17 18:22                             ` Eli Zaretskii
2019-08-17 18:58                               ` Stephen Berman
2019-09-07  9:21           ` Eli Zaretskii
2019-08-15 14:48   ` Eli Zaretskii

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git
