* bug#68751: 29.1; "\x0e0" is a multibyte string
@ 2024-01-27 6:23 Christopher Yeleighton
[not found] ` <handler.68751.B.17063370555677.ack@debbugs.gnu.org>
2024-01-27 7:38 ` Eli Zaretskii
0 siblings, 2 replies; 5+ messages in thread
From: Christopher Yeleighton @ 2024-01-27 6:23 UTC
To: 68751
M-: (multibyte-string-p "\x0e0") RET
> t
In GNU Emacs 29.1 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.38,
cairo version 1.17.8)
Windowing system distributor 'The X.Org Foundation', version 11.0.12101010
System Description: Arch Linux
Configured using:
'configure --sysconfdir=/etc --prefix=/usr --libexecdir=/usr/lib
--with-tree-sitter --localstatedir=/var --with-cairo
--disable-build-details --with-harfbuzz --with-libsystemd
--with-modules --with-x-toolkit=gtk3 'CFLAGS=-march=x86-64
-mtune=generic -O2 -pipe -fno-plt -fexceptions -Wp,-D_FORTIFY_SOURCE=2
-Wformat -Werror=format-security -fstack-clash-protection
-fcf-protection -g
-ffile-prefix-map=/build/emacs/src=/usr/src/debug/emacs -flto=auto'
'LDFLAGS=-Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now -flto=auto''
Configured features:
ACL CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GPM GSETTINGS HARFBUZZ JPEG
JSON LCMS2 LIBOTF LIBSYSTEMD LIBXML2 M17N_FLT MODULES NOTIFY INOTIFY
PDUMPER PNG RSVG SECCOMP SOUND SQLITE3 THREADS TIFF TOOLKIT_SCROLL_BARS
TREE_SITTER WEBP X11 XDBE XIM XINPUT2 XPM GTK3 ZLIB
Important settings:
value of $LANG: pl_PL.UTF-8
locale-coding-system: utf-8-unix
Major mode: Info
Minor modes in effect:
shell-dirtrack-mode: t
tooltip-mode: t
global-eldoc-mode: t
show-paren-mode: t
electric-indent-mode: t
mouse-wheel-mode: t
tool-bar-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
isearch-fold-quotes-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
buffer-read-only: t
line-number-mode: t
indent-tabs-mode: t
transient-mark-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
Load-path shadows:
None found.
Features:
(xref project lpr thai-util thai-word repeat mailalias mailclient
textsec uni-scripts idna-mapping ucs-normalize uni-confusable
textsec-check facemenu shadow sort mail-extr emacsbug message yank-media
puny rfc822 mml mml-sec epa derived epg rfc6068 epg-config gnus-util
mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils
mailheader sendmail rfc2047 rfc2045 ietf-drums mm-util mail-prsvr
mail-utils browse-url url url-proxy url-privacy url-expand url-methods
url-history url-cookie generate-lisp-file url-domsuf url-util url-parse
url-vars mailcap mule-util info tar-mode arc-mode archive-mode sh-script
rx smie treesit executable files-x conf-mode shell pcomplete comint
ansi-osc ansi-color ring dired-aux dired dired-loaddefs noutline outline
icons two-column kmacro debug backtrace find-func face-remap shortdoc
text-property-search cl-extra cl-print erc-lang erc-goodies erc iso8601
auth-source cl-seq eieio eieio-core cl-macs password-cache json map pp
format-spec erc-backend erc-networks byte-opt gv bytecomp byte-compile
erc-common erc-compat erc-loaddefs thingatpt help-fns radix-tree
jka-compr misearch multi-isearch time-date subr-x rfc1345 quail
help-mode cl-loaddefs cl-lib rmc iso-transl tooltip cconv eldoc paren
electric uniquify ediff-hook vc-hooks lisp-float-type elisp-mode mwheel
term/x-win x-win term/common-win x-dnd tool-bar dnd fontset image
regexp-opt fringe tabulated-list replace newcomment text-mode lisp-mode
prog-mode register page tab-bar menu-bar rfn-eshadow isearch easymenu
timer select scroll-bar mouse jit-lock font-lock syntax font-core
term/tty-colors frame minibuffer nadvice seq simple cl-generic
indonesian philippine cham georgian utf-8-lang misc-lang vietnamese
tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek
romanian slovak czech european ethiopic indian cyrillic chinese
composite emoji-zwj charscript charprop case-table epa-hook
jka-cmpr-hook help abbrev obarray oclosure cl-preloaded button loaddefs
theme-loaddefs faces cus-face macroexp files window text-properties
overlay sha1 md5 base64 format env code-pages mule custom widget keymap
hashtable-print-readable backquote threads dbusbind inotify lcms2
dynamic-setting system-font-setting font-render-setting cairo
move-toolbar gtk x-toolkit xinput2 x multi-tty make-network-process
emacs)
Memory information:
((conses 16 514130 81836)
(symbols 48 26162 44)
(strings 32 129312 5724)
(string-bytes 1 2808230)
(vectors 16 59943)
(vector-slots 8 1889204 140248)
(floats 8 953 203)
(intervals 56 27501 657)
(buffers 984 37))
* bug#68751: Acknowledgement (29.1; "\x0e0" is a multibyte string)
[not found] ` <handler.68751.B.17063370555677.ack@debbugs.gnu.org>
@ 2024-01-27 6:46 ` Christopher Yeleighton
2024-01-27 8:18 ` bug#68751: 29.1; "\x0e0" is a multibyte string Eli Zaretskii
0 siblings, 1 reply; 5+ messages in thread
From: Christopher Yeleighton @ 2024-01-27 6:46 UTC
To: 68751
Info (elisp) Non-ASCII in Strings says:
> If a string constant contains hexadecimal or octal escape sequences,
> and these escape sequences all specify unibyte characters (i.e., less
> than 256), and there are no other literal non-ASCII characters or
> Unicode-style escape sequences in the string, then Emacs automatically
> assumes that it is a unibyte string.
I believe it should say:
| (i.e., less than 256 and octal or written with 2 hexadecimal digits),
and additionally
| Unibyte characters embedded in multibyte string constants evaluate to
| private character codes,
| e.g. "\x0a0\xa0" equals "\x0a0\x3fffa0".
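A minimal sketch of the rule the proposed wording describes, assuming the reader behaves as observed in this report (Emacs 29.1): an escape below 256 yields a raw byte only when it is octal or written with at most two hexadecimal digits. The function name `demo-escape-multibyte-flags` is made up for illustration.

```elisp
;; Sketch only: `demo-escape-multibyte-flags' is a hypothetical helper,
;; and the results are those expected for Emacs 29.1 per this report.
(defun demo-escape-multibyte-flags ()
  "Return `multibyte-string-p' for three spellings of the value #xE0."
  (list (multibyte-string-p "\xe0")    ; two hex digits: raw byte, unibyte
        (multibyte-string-p "\x0e0")   ; three hex digits: character, multibyte
        (multibyte-string-p "\340")))  ; octal: raw byte, unibyte

(demo-escape-multibyte-flags)          ; => (nil t nil)
```

Only the number of hex digits differs between the first two strings, which is the detail the current manual text leaves out.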
* bug#68751: 29.1; "\x0e0" is a multibyte string
2024-01-27 6:23 bug#68751: 29.1; "\x0e0" is a multibyte string Christopher Yeleighton
[not found] ` <handler.68751.B.17063370555677.ack@debbugs.gnu.org>
@ 2024-01-27 7:38 ` Eli Zaretskii
2024-01-27 7:53 ` Krzysztof Żelechowski
1 sibling, 1 reply; 5+ messages in thread
From: Eli Zaretskii @ 2024-01-27 7:38 UTC
To: Christopher Yeleighton; +Cc: 68751
> Date: Sat, 27 Jan 2024 06:23:45 +0000
> From: Christopher Yeleighton <giecrilj@stegny.2a.pl>
>
> M-: (multibyte-string-p "\x0e0") RET
>
> > t
Why do you think this is a problem? U+00E0 is à, a non-ASCII
character, so it has a multibyte representation.
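For illustration, a small sketch of what the two spellings contain, assuming Emacs 29.1 as in the report. The helper `demo-string-shape` is a hypothetical name, not part of Emacs.

```elisp
;; Sketch only: `demo-string-shape' is a made-up helper for this thread.
(defun demo-string-shape (s)
  "Return (LENGTH FIRST-CODE BYTES MULTIBYTE-P) for the string S."
  (list (length s)              ; number of characters (or bytes if unibyte)
        (aref s 0)              ; code of the first element
        (string-bytes s)        ; size of the internal representation
        (multibyte-string-p s)))

(demo-string-shape "\x0e0")     ; => (1 224 2 t)    one character, two bytes
(demo-string-shape "\xe0")      ; => (1 224 1 nil)  one raw byte
```

Both strings hold a single element with code 224 (#xe0); the three-digit form stores it as a character in two bytes, while the two-digit form stores one raw byte.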
* bug#68751: 29.1; "\x0e0" is a multibyte string
2024-01-27 7:38 ` Eli Zaretskii
@ 2024-01-27 7:53 ` Krzysztof Żelechowski
0 siblings, 0 replies; 5+ messages in thread
From: Krzysztof Żelechowski @ 2024-01-27 7:53 UTC
To: Eli Zaretskii; +Cc: 68751
[-- Attachment #1: Type: text/html, Size: 947 bytes --]
* bug#68751: 29.1; "\x0e0" is a multibyte string
2024-01-27 6:46 ` bug#68751: Acknowledgement (29.1; "\x0e0" is a multibyte string) Christopher Yeleighton
@ 2024-01-27 8:18 ` Eli Zaretskii
0 siblings, 0 replies; 5+ messages in thread
From: Eli Zaretskii @ 2024-01-27 8:18 UTC
To: Christopher Yeleighton; +Cc: 68751
> Date: Sat, 27 Jan 2024 06:46:36 +0000
> From: Christopher Yeleighton <giecrilj@stegny.2a.pl>
>
> Info (elisp) Non-ASCII in Strings says:
>
> > If a string constant contains hexadecimal or octal escape sequences,
> > and these escape sequences all specify unibyte characters (i.e., less
> > than 256), and there are no other literal non-ASCII characters or
> > Unicode-style escape sequences in the string, then Emacs automatically
> > assumes that it is a unibyte string.
>
> I believe it should say:
>
> | (i.e., less than 256 and octal or written with 2 hexadecimal digits),
Right. I modified the text to that effect.
> and additionally
>
> | Unibyte characters embedded in multibyte string constants evaluate to
> | private character codes,
> | e.g. "\x0a0\xa0" equals "\x0a0\x3fffa0".
I didn't make this change because I don't see how it is useful.
First, "evaluate" is confusing here. Also, "private character codes"
is confusing or incorrect, as it could be read to mean that Emacs
somehow uses the PUA (Private Use Area) of the Unicode codespace,
which it doesn't. Finally, when Emacs converts a raw byte from its
single-byte representation to its multibyte representation is an
obscure matter, largely defined by ad-hoc compatibility
considerations, and it doesn't belong in the ELisp manual.
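As a rough sketch of the raw-byte point, the following shows where the multibyte form of a raw byte lands relative to the Unicode codespace. `demo-raw-byte-info` is a hypothetical helper; the functions it calls are standard Emacs primitives, and the results are what I would expect on Emacs 29.1.

```elisp
;; Sketch only: `demo-raw-byte-info' is a made-up name for illustration.
(defun demo-raw-byte-info (byte)
  "Return facts about the multibyte form of raw BYTE (128..255)."
  (let ((ch (unibyte-char-to-multibyte byte)))
    (list ch                              ; code used inside multibyte strings
          (char-charset ch)               ; charset Emacs assigns to it
          (> ch #x10ffff)                 ; non-nil: beyond the Unicode codespace
          (multibyte-char-to-unibyte ch)))) ; converts back to the byte

(demo-raw-byte-info #xa0)   ; => (4194208 eight-bit t 160), 4194208 = #x3fffa0

;; The equality stated earlier in this thread:
(equal "\x0a0\xa0" "\x0a0\x3fffa0")
```

The code #x3fffa0 is above #x10ffff, so it is not in any Unicode Private Use Area, which fits the objection to calling these "private character codes".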
I think this bug can be closed now.