unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#55256: 27.1; Unicode noncharacters may change writing direction
@ 2022-05-04  7:06 frederik.fouvry--- via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-05-04  7:20 ` bug#55256: Writing direction frederik.fouvry--- via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-05-04  8:17 ` Eli Zaretskii
  0 siblings, 2 replies; 6+ messages in thread
From: frederik.fouvry--- via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-05-04  7:06 UTC (permalink / raw)
  To: 55256



M-x set-input-method RET
ucs RET
jufdd0$1 RET

If you type C-a and then step through the characters with the right
arrow, the direction is reversed. It seems like the entry of $ followed
by something else is triggering it, but that may just be an
impression/side effect.

I suspect that the reason is that most Unicode noncharacters
(https://www.unicode.org/faq/private_use.html#nonchar1) are in an Arabic
block (https://www.unicode.org/faq/private_use.html#nonchar4b) and that
the cause is an incorrect generalisation of the properties of the
characters in this block.
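
[Editor's note] For reference, the set of noncharacters mentioned above can be enumerated directly from its definition in the Unicode Standard: the contiguous range U+FDD0..U+FDEF (32 code points, inside the Arabic Presentation Forms-A block) plus the last two code points of each of the 17 planes (34 code points), 66 in total. The sketch below is only an illustration in Python; the names `is_noncharacter` and `NONCHARACTERS` are made up for this note and are not part of Emacs.

```python
def is_noncharacter(cp: int) -> bool:
    """Return True if cp is one of the 66 Unicode noncharacters."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError(f"not a Unicode code point: {cp:#x}")
    # The contiguous range of 32 noncharacters that sits inside the
    # Arabic Presentation Forms-A block (U+FB50..U+FDFF).
    if 0xFDD0 <= cp <= 0xFDEF:
        return True
    # The last two code points of every plane: U+nFFFE and U+nFFFF.
    return (cp & 0xFFFE) == 0xFFFE


# Hypothetical helper list, built once for inspection.
NONCHARACTERS = [cp for cp in range(0x110000) if is_noncharacter(cp)]

if __name__ == "__main__":
    in_arabic_block = [cp for cp in NONCHARACTERS if 0xFB50 <= cp <= 0xFDFF]
    print(len(NONCHARACTERS), len(in_arabic_block))
```

So 32 of the 66 noncharacters fall inside the Arabic block, including U+FDD0 from the recipe above.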

In GNU Emacs 27.1 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.20, cairo version 1.16.0)
 of 2020-09-19 built on lgw01-amd64-021
Windowing system distributor 'The X.Org Foundation', version 11.0.12013000
System Description: Ubuntu 20.04.4 LTS

Recent messages:
Char: ﷐‎ (64976, #o176720, #xfdd0, file ...) point=4361 of 7190 (61%) column=23
Mark set [2 times]
Auto-saving...done

Configured using:
 'configure --build=x86_64-linux-gnu --prefix=/usr
 '--includedir=${prefix}/include' '--mandir=${prefix}/share/man'
 '--infodir=${prefix}/share/info' --sysconfdir=/etc --localstatedir=/var
 --disable-silent-rules '--libdir=${prefix}/lib/x86_64-linux-gnu'
 '--libexecdir=${prefix}/lib/x86_64-linux-gnu' --disable-maintainer-mode
 --disable-dependency-tracking --prefix=/usr --sharedstatedir=/var/lib
 --libexecdir=/usr/lib --localstatedir=/var/lib
 --infodir=/usr/share/info --mandir=/usr/share/man
 --enable-locallisppath=/etc/emacs:/usr/local/share/emacs/27.1/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/27.1/site-lisp:/usr/share/emacs/site-lisp
 --program-suffix=27 --with-modules --with-file-notification=inotify
 --with-mailutils --with-harfbuzz --with-json --with-x=yes
 --with-x-toolkit=gtk3 --with-lcms2 --with-cairo --with-xpm=yes
 --with-gif=yes --with-gnutls=yes --with-jpeg=yes --with-png=yes
 --with-tiff=yes --with-xwidgets 'CFLAGS=-g -O2
 -fdebug-prefix-map=/build/emacs27-bifpWT/emacs27-27.1~1.git86d8d76aa3=. -fstack-protector-strong
 -Wformat -Werror=format-security -no-pie' 'CPPFLAGS=-Wdate-time
 -D_FORTIFY_SOURCE=2' 'LDFLAGS=-Wl,-Bsymbolic-functions -Wl,-z,relro
 -no-pie''

Configured features:
XPM JPEG TIFF GIF PNG RSVG CAIRO SOUND GPM DBUS GSETTINGS GLIB NOTIFY
INOTIFY ACL LIBSELINUX GNUTLS LIBXML2 FREETYPE HARFBUZZ M17N_FLT LIBOTF
ZLIB TOOLKIT_SCROLL_BARS GTK3 X11 XDBE XIM MODULES THREADS XWIDGETS
LIBSYSTEMD JSON PDUMPER LCMS2 GMP

Important settings:
  value of $LC_MONETARY: en_GB.UTF-8
  value of $LC_NUMERIC: en_GB.UTF-8
  value of $LC_TIME: en_GB.UTF-8
  value of $LANG: en_GB.UTF-8
  value of $XMODIFIERS: @im=ibus
  locale-coding-system: utf-8-unix

Major mode: Fundamental

Minor modes in effect:
  hexl-follow-ascii: t
  global-git-commit-mode: t
  magit-auto-revert-mode: t
  shell-dirtrack-mode: t
  global-activity-watch-mode: t
  activity-watch-mode: t
  auto-revert-mode: t
  show-paren-mode: t
  desktop-save-mode: t
  display-time-mode: t
  editorconfig-mode: t
  tooltip-mode: t
  global-eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Load-path shadows:

Features:
(shadow sort mail-extr emacsbug ruler-mode hexl uni-input quail misearch
multi-isearch magit-submodule magit-obsolete magit-blame magit-stash
magit-reflog magit-bisect magit-push magit-pull magit-fetch magit-clone
magit-remote magit-commit magit-sequence magit-notes magit-worktree
magit-tag magit-merge magit-branch magit-reset magit-files magit-refs
magit-status magit magit-repos magit-apply magit-wip magit-log
which-func magit-diff smerge-mode diff git-commit magit-core
magit-autorevert magit-margin magit-transient magit-process with-editor
shell magit-mode transient magit-git magit-base magit-section crm dash
compat-27 compat-26 compat flymake-shellcheck flymake-proc flymake
compile sh-script executable css-mode smie imenu rng-xsd xsd-regexp
rng-cmpct rng-nxml rng-valid rng-loc rng-uri rng-parse nxml-parse
rng-match rng-dt rng-util rng-pttrn nxml-ns nxml-mode nxml-outln
nxml-rap sgml-mode nxml-util nxml-enc xmltok ol-eww eww mm-url url-queue
ol-rmail ol-mhe ol-irc ol-info ol-gnus nnir ol-docview ol-bibtex ol-bbdb
ol-w3m ol-doi org-link-doi vc-git diff-mode markdown-mode edit-indirect
color dired-aux server activity-watch-mode request autorevert filenotify
ert pp ewoc debug backtrace paren desktop frameset cus-start cus-load
auto-dictionary flyspell ispell time editorconfig-core
editorconfig-core-handle editorconfig-fnmatch timeclock mu4e mu4e-org
mu4e-main mu4e-view mu4e-view-gnus gnus-art mm-uu mml2015 mm-view
mml-smime smime dig gnus-sum url url-proxy url-privacy url-expand
url-methods url-history gnus-group gnus-undo gnus-start gnus-cloud
nnimap nnmail mail-source utf7 netrc nnoo parse-time iso8601 gnus-spec
gnus-int gnus-range gnus-win gnus nnheader wid-edit mu4e-view-common
thingatpt mu4e-headers mu4e-compose mu4e-context mu4e-draft mu4e-actions
ido rfc2368 smtpmail sendmail mu4e-mark mu4e-proc mu4e-utils doc-view
jka-compr image-mode exif mu4e-lists mu4e-message shr url-cookie
url-domsuf url-util svg xml dom flow-fill mule-util mailcap hl-line
mu4e-vars mu4e-meta dired-x calfw-org org-capture org-element avl-tree
generator org-agenda org-refile calfw holidays hol-loaddefs cl
org-journal edmacro kmacro org-crypt org ob ob-tangle ob-ref ob-lob
ob-table ob-exp org-macro org-footnote org-src ob-comint org-pcomplete
pcomplete comint ansi-color org-list org-faces org-entities noutline
outline org-version ob-emacs-lisp ob-core ob-eval org-table oc-basic
bibtex ol rx org-keys oc org-compat advice org-macs org-loaddefs
find-func cal-iso cal-menu calendar cal-loaddefs vc-svn dsvn log-edit
easy-mmode message rmc puny dired dired-loaddefs format-spec rfc822 mml
mml-sec epa derived epg epg-config gnus-util rmail rmail-loaddefs
warnings text-property-search time-date mm-decode mm-bodies mm-encode
mail-parse rfc2231 rfc2047 rfc2045 mm-util ietf-drums mail-prsvr
mailabbrev mail-utils gmm-utils mailheader ring pcvs-util add-log vc
vc-dispatcher editorconfig cl-extra help-mode use-package-ensure
use-package-core helm-easymenu info package easymenu browse-url
url-handlers url-parse auth-source cl-seq eieio eieio-core cl-macs
eieio-loaddefs password-cache json subr-x map url-vars seq byte-opt gv
bytecomp byte-compile cconv cl-loaddefs cl-lib tooltip eldoc electric
uniquify ediff-hook vc-hooks lisp-float-type mwheel term/x-win x-win
term/common-win x-dnd tool-bar dnd fontset image regexp-opt fringe
tabulated-list replace newcomment text-mode elisp-mode lisp-mode
prog-mode register page tab-bar menu-bar rfn-eshadow isearch timer
select scroll-bar mouse jit-lock font-lock syntax facemenu font-core
term/tty-colors frame minibuffer cl-generic cham georgian utf-8-lang
misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms
cp51932 hebrew greek romanian slovak czech european ethiopic indian
cyrillic chinese composite charscript charprop case-table epa-hook
jka-cmpr-hook help simple abbrev obarray cl-preloaded nadvice loaddefs
button faces cus-face macroexp files text-properties overlay sha1 md5
base64 format env code-pages mule custom widget hashtable-print-readable
backquote threads dbusbind inotify lcms2 dynamic-setting
system-font-setting font-render-setting xwidget-internal cairo
move-toolbar gtk x-toolkit x multi-tty make-network-process emacs)

Memory information:
((conses 16 398834 33888)
 (symbols 48 37214 1)
 (strings 32 137584 6008)
 (string-bytes 1 4673915)
 (vectors 16 73119)
 (vector-slots 8 1675119 195896)
 (floats 8 531 329)
 (intervals 56 7085 0)
 (buffers 1000 76))






* bug#55256: Writing direction
From: frederik.fouvry--- via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-05-04  7:20 UTC (permalink / raw)
  To: 55256


I forgot to add:

The Unicode noncharacters should not cause a change in writing
direction, since they are not Arabic characters, but a set of characters
for internal use only (no exchange between different parties).






* bug#55256: 27.1; Unicode noncharacters may change writing direction
From: Eli Zaretskii @ 2022-05-04  8:17 UTC (permalink / raw)
  To: frederik.fouvry; +Cc: 55256

> Date: Wed, 04 May 2022 09:06:37 +0200
> From: frederik.fouvry--- via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org>
> 
> 
> M-x set-input-method RET
> ucs RET
> jufdd0$1 RET
> 
> If you type C-a and then step through the characters with the right
> arrow, the direction is reversed. It seems like the entry of $ followed
> by something else is triggering it, but that may just be an
> impression/side effect.
> 
> I suspect that the reason is that most Unicode noncharacters
> (https://www.unicode.org/faq/private_use.html#nonchar1) are in an Arabic
> block (https://www.unicode.org/faq/private_use.html#nonchar4b) and that
> the cause is an incorrect generalisation of the properties of the
> characters in this block.

Correct.  We failed to be in sync with the Unicode Standard in this
regard.  Should be fixed now on the master branch.

Thanks.






* bug#55256: Writing direction
From: Eli Zaretskii @ 2022-05-04  8:24 UTC (permalink / raw)
  To: frederik.fouvry; +Cc: 55256

> Date: Wed, 04 May 2022 09:20:05 +0200
> From: frederik.fouvry--- via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org>
> 
> 
> I forgot to add:
> 
> The Unicode noncharacters should not cause a change in writing
> direction, since they are not Arabic characters, but a set of characters
> for internal use only (no exchange between different parties).

That is not entirely true, because Unicode assigns default Bidi Class
properties to some unassigned codepoints, and Emacs obeys that.  So an
unassigned codepoint (which is AFAIU what "noncharacter" stands for in
your terminology) for which Unicode says that its Bidi Class should
be, for example, AL, _will_ cause change of text directionality.

If you use those unassigned codepoints for private use, you will have
to override the default properties by manually modifying the relevant
Emacs char-tables at run time.
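
[Editor's note] The distinction discussed in this message can be modelled in a few lines. As far as I can tell from the Unicode Character Database (DerivedBidiClass.txt), unassigned code points in U+FB50..U+FDCF and U+FDF0..U+FDFF default to Bidi Class AL, while noncharacters, including U+FDD0..U+FDEF, default to BN. The Python sketch below is a toy model under that assumption, not Emacs's implementation: `default_bidi_class`, `overrides` and `effective_bidi_class` are hypothetical names, and only the Arabic Presentation Forms-A block and the noncharacters are covered.

```python
from typing import Dict, Optional


def is_noncharacter(cp: int) -> bool:
    """True for U+FDD0..U+FDEF and for U+nFFFE / U+nFFFF in each plane."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def default_bidi_class(cp: int) -> Optional[str]:
    """Default Bidi Class of an *unassigned* code point.

    Partial model: handles noncharacters and the Arabic Presentation
    Forms-A block only, and returns None for everything else.
    """
    if is_noncharacter(cp):
        return "BN"  # boundary neutral: does not start a right-to-left run
    if 0xFB50 <= cp <= 0xFDFF:
        return "AL"  # Arabic letter: forces right-to-left ordering
    return None


# Hypothetical per-user override table, standing in for a modified
# char-table: entries here take precedence over the defaults.
overrides: Dict[int, str] = {}


def effective_bidi_class(cp: int) -> Optional[str]:
    """Look up the override first, then fall back to the default."""
    return overrides.get(cp, default_bidi_class(cp))
```

In this model a noncharacter such as U+FDD0 never changes direction, whereas an unassigned neighbour in the same block does unless an entry is added to `overrides`.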






* bug#55256: Writing direction
From: Frederik Fouvry via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-05-04  9:02 UTC (permalink / raw)
  To: 55256


On Wed, 4 May 2022 at 10:23, Eli Zaretskii <eliz@gnu.org> wrote:

> > Date: Wed, 04 May 2022 09:20:05 +0200
> > From: frederik.fouvry--- via "Bug reports for GNU Emacs,
> >  the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org>
> >
> >
> > I forgot to add:
> >
> > The Unicode noncharacters should not cause a change in writing
> > direction, since they are not Arabic characters, but a set of characters
> > for internal use only (no exchange between different parties).
>
> That is not entirely true, because Unicode assigns default Bidi Class
> properties to some unassigned codepoints, and Emacs obeys that.  So an
> unassigned codepoint (which is AFAIU what "noncharacter" stands for in
> your terminology) for which Unicode says that its Bidi Class should
> be, for example, AL, _will_ cause change of text directionality.
>
> If you use those unassigned codepoints for private use, you will have
> to override the default properties by manually modifying the relevant
> Emacs char-tables at run time.
>

That sounds fair enough. I admit that my statement was generalising too
much.

The odd name "noncharacter" is Unicode terminology, not mine (see e.g. Spec
v14, Ch. 2, p.30).


* bug#55256: 27.1; Unicode noncharacters may change writing direction
From: Lars Ingebrigtsen @ 2022-06-02 13:09 UTC (permalink / raw)
  To: Frederik Fouvry; +Cc: 55256

Frederik Fouvry <frederik.fouvry@acrolinx.com> writes:

>  If you use those unassigned codepoints for private use, you will have
>  to override the default properties by manually modifying the relevant
>  Emacs char-tables at run time.
>
> That sounds fair enough. I admit that my statement was generalising too much.

Eli fixed some bits here, and the rest is up to the users of these
unassigned codepoints, if I understand correctly, so I'm closing this
bug report.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





