* bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t
@ 2021-12-12 20:13 Daniel Mendler
2021-12-12 20:42 ` Eli Zaretskii
0 siblings, 1 reply; 28+ messages in thread
From: Daniel Mendler @ 2021-12-12 20:13 UTC (permalink / raw)
To: 52459
1. Start emacs -Q
2. Enter the following in the scratch buffer:
(let ((print-escape-control-characters t))
  (prin1-to-string bidi-directional-controls-chars))
3. Evaluate. The bidi control characters are not escaped despite
print-escape-control-characters=t.
The bidi characters should probably be treated as control characters
since they have the Bidi_Control property according to the Unicode
standard.
If it is undesirable to treat bidi control characters like other control
characters it may make sense to introduce another print configuration
variable, print-escape-all-control-characters or
print-escape-bidi-control-characters?
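
The recipe above can be condensed into a small sketch that contrasts an ASCII control character with a bidi formatting control. The helper name `my-show-escaping` is hypothetical and exists only for illustration; the behavior shown for U+202E is the one reported here for Emacs 28.0.90 and may differ in other versions.

```elisp
;; Sketch of the report's recipe.  `my-show-escaping' is a hypothetical
;; name; only `print-escape-control-characters' and `prin1-to-string'
;; come from the report itself.
(defun my-show-escaping ()
  "Return printed forms of an ASCII-control and a bidi-control string."
  (let ((print-escape-control-characters t))
    (list
     ;; TAB is a single-byte control character, so it is printed as an
     ;; octal escape rather than literally.
     (prin1-to-string (string ?a ?\t ?b))
     ;; U+202E (RIGHT-TO-LEFT OVERRIDE) is a multibyte formatting
     ;; control; as reported, it is printed literally.
     (prin1-to-string (string ?a #x202e ?b)))))

(my-show-escaping)
```

The strings are built from code points with `string` so that the example itself contains no raw control characters.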
In GNU Emacs 28.0.90 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.5,
cairo version 1.16.0)
of 2021-12-05 built on projects
Repository revision: 34f56561372d83b71dcaff1cdf5d9264ba38fa0e
Repository branch: emacs-28
Windowing system distributor 'The X.Org Foundation', version 11.0.12004000
System Description: Debian GNU/Linux 10 (buster)
Configured using:
'configure --prefix=/home/user/emacs/install --with-cairo'
Configured features:
CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GSETTINGS HARFBUZZ JPEG
LIBSELINUX LIBXML2 MODULES NOTIFY INOTIFY PDUMPER PNG SECCOMP SOUND
THREADS TIFF TOOLKIT_SCROLL_BARS X11 XDBE XIM XPM GTK3 ZLIB
Important settings:
value of $LANG: en_US.UTF-8
locale-coding-system: utf-8-unix
Major mode: Lisp Interaction
Minor modes in effect:
tooltip-mode: t
global-eldoc-mode: t
eldoc-mode: t
show-paren-mode: t
electric-indent-mode: t
mouse-wheel-mode: t
tool-bar-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
line-number-mode: t
indent-tabs-mode: t
transient-mark-mode: t
Load-path shadows:
None found.
Features:
(shadow sort mail-extr emacsbug message rmc puny dired dired-loaddefs
rfc822 mml mml-sec epa derived epg rfc6068 epg-config gnus-util rmail
rmail-loaddefs auth-source cl-seq eieio eieio-core cl-macs
eieio-loaddefs password-cache json map text-property-search time-date
subr-x seq byte-opt gv bytecomp byte-compile cconv mm-decode mm-bodies
mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail
rfc2047 rfc2045 ietf-drums mm-util mail-prsvr mail-utils thingatpt
help-fns radix-tree help-mode cl-loaddefs cl-lib two-column iso-transl
tooltip eldoc paren electric uniquify ediff-hook vc-hooks
lisp-float-type elisp-mode mwheel term/x-win x-win term/common-win x-dnd
tool-bar dnd fontset image regexp-opt fringe tabulated-list replace
newcomment text-mode lisp-mode prog-mode register page tab-bar menu-bar
rfn-eshadow isearch easymenu timer select scroll-bar mouse jit-lock
font-lock syntax font-core term/tty-colors frame minibuffer cl-generic
cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao
korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech
european ethiopic indian cyrillic chinese composite emoji-zwj charscript
charprop case-table epa-hook jka-cmpr-hook help simple abbrev obarray
cl-preloaded nadvice button loaddefs faces cus-face macroexp files
window text-properties overlay sha1 md5 base64 format env code-pages
mule custom widget hashtable-print-readable backquote threads dbusbind
inotify dynamic-setting system-font-setting font-render-setting cairo
move-toolbar gtk x-toolkit x multi-tty make-network-process emacs)
Memory information:
((conses 16 57250 10751)
(symbols 48 6852 1)
(strings 32 21147 1607)
(string-bytes 1 651816)
(vectors 16 13342)
(vector-slots 8 183349 13347)
(floats 8 22 48)
(intervals 56 260 1)
(buffers 992 13))
* bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t
2021-12-12 20:13 bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t Daniel Mendler
@ 2021-12-12 20:42 ` Eli Zaretskii
2021-12-12 21:11 ` Daniel Mendler
0 siblings, 1 reply; 28+ messages in thread
From: Eli Zaretskii @ 2021-12-12 20:42 UTC (permalink / raw)
To: Daniel Mendler; +Cc: 52459
> From: Daniel Mendler <mail@daniel-mendler.de>
> Date: Sun, 12 Dec 2021 21:13:12 +0100
>
> 1. Start emacs -Q
> 2. Enter the following in the scratch buffer:
> (let ((print-escape-control-characters t))
>   (prin1-to-string bidi-directional-controls-chars))
> 3. Evaluate. The bidi control characters are not escaped despite
> print-escape-control-characters=t.
>
> The bidi characters should probably be treated as control characters
> since they have the Bidi_Control property according to the Unicode
> standard.
print-escape-control-characters is about ASCII control characters, not
about Unicode formatting controls.
> If it is undesirable to treat bidi control characters like other control
> characters it may make sense to introduce another print configuration
> variable, print-escape-all-control-characters or
> print-escape-bidi-control-characters?
I don't think it's desirable. Those formatting controls have starkly
different roles than ASCII control characters, and we already have
features to make them stand out on display. Moreover, escape
sequences are not well-defined for codepoints beyond a single byte.
So I don't think we should do anything here, and we should close the
bug.
* bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t
2021-12-12 20:42 ` Eli Zaretskii
@ 2021-12-12 21:11 ` Daniel Mendler
2021-12-12 21:33 ` Daniel Mendler
0 siblings, 1 reply; 28+ messages in thread
From: Daniel Mendler @ 2021-12-12 21:11 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 52459
>> 1. Start emacs -Q
>> 2. Enter the following in the scratch buffer:
>> (let ((print-escape-control-characters t))
>>   (prin1-to-string bidi-directional-controls-chars))
>> 3. Evaluate. The bidi control characters are not escaped despite
>> print-escape-control-characters=t.
>>
>> The bidi characters should probably be treated as control characters
>> since they have the Bidi_Control property according to the Unicode
>> standard.
>
> print-escape-control-characters is about ASCII control characters, not
> about Unicode formatting controls.
I see, but this is not a satisfactory answer.
>> If it is undesirable to treat bidi control characters like other control
>> characters it may make sense to introduce another print configuration
>> variable, print-escape-all-control-characters or
>> print-escape-bidi-control-characters?
>
> I don't think it's desirable. Those formatting controls have starkly
> different roles than ASCII control characters, and we already have
> features to make them stand out on display. Moreover, escape
> sequences are not well-defined for codepoints beyond a single byte.
>
> So I don't think we should do anything here, and we should close the
> bug.
I disagree. I would like to turn this report into a feature request
then. I've observed that multiple packages which escape strings and
print them for debugging or help purposes have difficulties with bidi
characters. All I would like to have is a way to print strings safely,
such that control characters (any kind of control characters, ascii or
unicode) do not affect the output.
Packages affected by this issue include, for example, the
Helpful package, which provides an enhanced Help buffer. Another package
affected by the issue is the Marginalia package which adds annotations
to `describe-variable` in the minibuffer. The annotations show the
variable value. I would like to print the variable values in a safe way
which does not mess up the display. Instead of "string" these packages
show "str"gnandallthatfollowsisgarbage. How is this supposed to be done?
I propose adding a variable which configures
prin1-to-string such that all control characters which affect the display
in special ways are escaped. Is there an alternative approach to achieve
this goal?
* bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t
2021-12-12 21:11 ` Daniel Mendler
@ 2021-12-12 21:33 ` Daniel Mendler
2021-12-13 12:22 ` Eli Zaretskii
0 siblings, 1 reply; 28+ messages in thread
From: Daniel Mendler @ 2021-12-12 21:33 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 52459
On 12/12/21 10:11 PM, Daniel Mendler wrote:
> Example packages which are affected by this issue are for example the
> Helpful package, which provides an enhanced Help buffer. Another package
> affected by the issue is the Marginalia package which adds annotations
> to `describe-variable` in the minibuffer. The annotations show the
> variable value. I would like to print the variable values in a safe way
> which does not mess up the display. Instead of "string" these packages
> show "str"gnandallthatfollowsisgarbage. How is this supposed to be done?
>
> I propose adding a variable which configures
> prin1-to-string such that all control characters which affect the display
> in special ways are escaped. Is there an alternative approach to achieve
> this goal?
There is actually one function which comes close in functionality -
`bidi-string-mark-left-to-right`. However this function is not really a
pure string manipulation function since it adds display properties. So
this function can only be used if the string is directly displayed as
is. The function is not a good fit if the resulting string is
manipulated further, truncated, etc.
* bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t
2021-12-12 21:33 ` Daniel Mendler
@ 2021-12-13 12:22 ` Eli Zaretskii
2021-12-13 13:19 ` Daniel Mendler
0 siblings, 1 reply; 28+ messages in thread
From: Eli Zaretskii @ 2021-12-13 12:22 UTC (permalink / raw)
To: Daniel Mendler; +Cc: 52459
> From: Daniel Mendler <mail@daniel-mendler.de>
> Cc: 52459@debbugs.gnu.org
> Date: Sun, 12 Dec 2021 22:33:50 +0100
>
> On 12/12/21 10:11 PM, Daniel Mendler wrote:
> > Example packages which are affected by this issue are for example the
> > Helpful package, which provides an enhanced Help buffer. Another package
> > affected by the issue is the Marginalia package which adds annotations
> > to `describe-variable` in the minibuffer. The annotations show the
> > variable value. I would like to print the variable values in a safe way
> > which does not mess up the display. Instead of "string" these packages
> > show "str"gnandallthatfollowsisgarbage. How is this supposed to be done?
> >
> > I propose adding a variable which configures
> > prin1-to-string such that all control characters which affect the display
> > in special ways are escaped. Is there an alternative approach to achieve
> > this goal?
>
> There is actually one function which comes close in functionality -
> `bidi-string-mark-left-to-right`. However this function is not really a
> pure string manipulation function since it adds display properties. So
> this function can only be used if the string is directly displayed as
> is. The function is not a good fit if the resulting string is
> manipulated further, truncated, etc.
Sorry, I don't understand. The examples you provided are of text
being displayed. Which is expected, since these controls have no
other effect _except_ when the text is displayed. So why isn't the
existing function bidi-string-mark-left-to-right (which was introduced
precisely for the situations like you describe, and is actually used
in Emacs for those purposes) the solution for the class of problems
that you described?
And can we agree that displaying these characters as escapes would not
solve the problems you had in mind, so it is off the table for the
rest of this discussion?
* bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t
2021-12-13 12:22 ` Eli Zaretskii
@ 2021-12-13 13:19 ` Daniel Mendler
2021-12-13 13:30 ` Daniel Mendler
0 siblings, 1 reply; 28+ messages in thread
From: Daniel Mendler @ 2021-12-13 13:19 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 52459
> Sorry, I don't understand. The examples you provided are of text
> being displayed. Which is expected, since these controls have no
> other effect _except_ when the text is displayed. So why isn't the
> existing function bidi-string-mark-left-to-right (which was introduced
> precisely for the situations like you describe, and is actually used
> in Emacs for those purposes) the solution for the class of problems
> that you described?
`bidi-string-mark-left-to-right` is an insufficient solution since it
manipulates the string on the level of display properties. It appends an
invisible character. If I take the string returned by
`bidi-string-mark-left-to-right` I cannot manipulate it freely
afterwards. In particular if I truncate the string, the invisible
character will be lost again.
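
The truncation concern can be illustrated with a minimal sketch. It assumes that `bidi-string-mark-left-to-right` appends an invisible U+200E (LEFT-TO-RIGHT MARK) when the string contains a strong right-to-left character, which matches the description above; the helper name `my-mark-and-truncate` is hypothetical.

```elisp
(defun my-mark-and-truncate (string)
  "Mark STRING with `bidi-string-mark-left-to-right', then truncate it.
Hypothetical helper, for illustration only.
Return a list (ADDED INVISIBLE MARK-SURVIVES)."
  (let* ((marked (bidi-string-mark-left-to-right string))
         (end (1- (length marked))))
    (list
     ;; Number of characters added by the marking function.
     (- (length marked) (length string))
     ;; Value of the `invisible' property on the last character.
     (get-text-property end 'invisible marked)
     ;; Non-nil only if the mark is still present after truncating the
     ;; marked string by one character.
     (string-suffix-p (string #x200e) (substring marked 0 end)))))

;; Hebrew alef, bet, gimel, written as escapes so that this source
;; contains no right-to-left text itself.
(my-mark-and-truncate "\u05d0\u05d1\u05d2")
```

Under the stated assumption, the call should report one added character that carries the `invisible` property and is gone after truncation, which is the loss described above.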
I need a function which sanitizes a string completely, such that after
sanitization I can use and manipulate it without having to worry about
display properties or other peculiarities.
> And can we agree that displaying these characters as escapes would not
> solve the problems you had in mind, so it is off the table for the
> rest of this discussion?
No, I disagree. This should not be off the table. I don't understand why
you want to close this issue so quickly. The problem I described is an
actual problem, which I've observed in multiple packages. Ideally Emacs
would offer a solution on the API level such that package authors and
users can sanitize strings in a robust way. Such an API does not exist
currently.
Escaping all control characters is my preferred solution. What about
adding two variables: `print-escape-unicode-control-characters` and
`print-escape-ascii-control-characters`, such that it is explicit what
is going on? The variable `print-escape-control-characters` could be
deprecated or aliased to `print-escape-ascii-control-characters`.
* bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t
2021-12-13 13:19 ` Daniel Mendler
@ 2021-12-13 13:30 ` Daniel Mendler
2021-12-13 15:24 ` Eli Zaretskii
0 siblings, 1 reply; 28+ messages in thread
From: Daniel Mendler @ 2021-12-13 13:30 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 52459
In other words, there is a need for a functionality which makes it
possible to turn a string into a string literal in a form which could be
used in source code for example.
If you look at the definition of bidi-directional-controls-chars in
simple.el, the bidi characters are escaped there. Why is this? Why did
you write this definition in this form and not with unescaped bidi
characters?
(defvar bidi-directional-controls-chars "\x202a-\x202e\x2066-\x2069"
  "Character set that matches bidirectional formatting control characters.")
If I set `print-escape-control-characters` or some hypothetical
`print-escape-unicode-control-characters` to t, I expect a string
literal of this form. The current behavior of Emacs is counterintuitive
since it does not escape all control characters but singles out only the
ASCII control characters. This behavior does not seem correct given full
Unicode support in Emacs.
Therefore my proposal to add two variables
`print-escape-ascii-control-characters` and
`print-escape-unicode-control-characters`.
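
As a rough sketch of what such escaping could look like today without a new print variable, the printed representation can be post-processed. The function `my-prin1-escape-bidi` is hypothetical and not part of Emacs; it reuses the character ranges from the `bidi-directional-controls-chars` definition quoted above and is meant for strings only.

```elisp
(defun my-prin1-escape-bidi (object)
  "Return the printed form of OBJECT with bidi controls as \\uNNNN escapes.
Hypothetical helper, not part of Emacs.  Intended for strings; it does
not treat symbol names or other objects specially."
  (let ((print-escape-control-characters t))
    (replace-regexp-in-string
     ;; Same ranges as in `bidi-directional-controls-chars'.
     "[\x202a-\x202e\x2066-\x2069]"
     ;; \u takes exactly four hex digits, so a following hex digit in
     ;; the string cannot be absorbed into the escape when read back.
     (lambda (match) (format "\\u%04X" (aref match 0)))
     (prin1-to-string object)
     ;; FIXEDCASE and LITERAL: insert the replacement text verbatim.
     t t)))

;; A string with RIGHT-TO-LEFT OVERRIDE, TAB and LEFT-TO-RIGHT ISOLATE.
(my-prin1-escape-bidi (string ?a #x202e ?b ?\t ?c #x2066))
```

The sketch uses `\u` rather than `\x` escapes because `\x` accepts a variable number of hex digits, so an escape followed by a character such as `b` would be misread.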
* bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t
2021-12-13 13:30 ` Daniel Mendler
@ 2021-12-13 15:24 ` Eli Zaretskii
2021-12-13 16:32 ` Daniel Mendler
0 siblings, 1 reply; 28+ messages in thread
From: Eli Zaretskii @ 2021-12-13 15:24 UTC (permalink / raw)
To: Daniel Mendler; +Cc: 52459
> From: Daniel Mendler <mail@daniel-mendler.de>
> Cc: 52459@debbugs.gnu.org
> Date: Mon, 13 Dec 2021 14:30:13 +0100
>
> In other words, there is a need for a functionality which makes it
> possible to turn a string into a string literal in a form which could be
> used in source code for example.
>
> If you look at the definition of bidi-directional-controls-chars in
> simple.el, the bidi characters are escaped there. Why is this? Why did
> you write this definition in this form and not with unescaped bidi
> characters?
>
> (defvar bidi-directional-controls-chars "\x202a-\x202e\x2066-\x2069"
>   "Character set that matches bidirectional formatting control characters.")
So you want a feature that would produce strings suitable for using in
program source files, like we did in the above example? Is that the
meaning of "sanitize" you have been using?
Are there other use cases for those "escaped" or "sanitized" strings?
If so, please describe them as well.
Or if that's not the correct meaning of "sanitized", please define it
more accurately.
You see, this discussion is hard because I still don't understand what
it is that you want Emacs to provide, and for what purposes. Please
try to clarify that, to make the discussion more efficient and avoid
misunderstandings.
For now, I understand that those strings are not necessarily required
to be readable on the Emacs display, at least not in all cases,
because some of the reordering that these controls produce will be
disabled when they are represented by ASCII escapes, and the character
order on display will change as a result. If the string includes RTL
characters, the result might not be easily readable. But AFAIU, this
is not a problem for the use cases you have in mind?
> Therefore my proposal to add two variables
> `print-escape-ascii-control-characters` and
> `print-escape-unicode-control-characters`.
I'd prefer to wait with concrete proposals until the requirements are
clear. It seems like a variable like those you mention, which only
affect the Emacs display, but not the string contents, is not what you
need. You need to actually produce the ASCII characters \x202a, so
that you could put them in a string like you show above. Or am I
misunderstanding again?
* bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t
2021-12-13 15:24 ` Eli Zaretskii
@ 2021-12-13 16:32 ` Daniel Mendler
2021-12-13 17:07 ` Eli Zaretskii
0 siblings, 1 reply; 28+ messages in thread
From: Daniel Mendler @ 2021-12-13 16:32 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 52459
On 12/13/21 4:24 PM, Eli Zaretskii wrote:
> So you want a feature that would produce strings suitable for using in
> program source files, like we did in the above example? Is that the
> meaning of "sanitize" you have been using?
>
> Are there other use cases for those "escaped" or "sanitized" strings?
> If so, please describe them as well.
As you said the use case is to produce strings suitable in source files
or strings in a form which looks like strings occurring in source files.
This use case appears in debuggers and other UIs which inspect variable
values at Emacs runtime. Furthermore code editing and refactoring tools
produce strings which are supposed to be used directly in source files.
For the usage of strings in source files one could simply use
`print-escape-multibyte=t`. However, particularly in debugger UIs, this
leads to a severe obfuscation of the output, which especially hurts
users who use Emacs in a setup with a multi-byte bidi language, Hebrew,
Arabic, Chinese, etc. Therefore in debugger UIs I only want to escape
control characters but not other multi-byte display characters.
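
The obfuscation mentioned here can be seen in a short sketch. The helper `my-print-all-escaped` is a hypothetical name used only for illustration; it merely binds `print-escape-multibyte` around `prin1-to-string`.

```elisp
(defun my-print-all-escaped (string)
  "Print STRING with `print-escape-multibyte' bound to t.
Hypothetical helper, for illustration only."
  (let ((print-escape-multibyte t))
    (prin1-to-string string)))

;; Two Cyrillic letters (U+0434, U+0430), built from code points.
;; Neither is a control character, yet both are printed as hex
;; escapes, so the text is no longer readable at a glance.
(my-print-all-escaped (string #x434 #x430))
```

The result is still a valid string literal that reads back to the original, which is why this setting suits source files better than a debugging UI.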
There are packages which provide such debugging or live inspection of
values, I already mentioned the Helpful (MELPA) package and the
Marginalia (GNU ELPA) package. Both of these variable inspection
utilities are affected by the problem of how to escape string literals
or print Emacs values properly, transforming the value to a printable
representation as it would appear in source.
> For now, I understand that those strings are not necessarily required
> to be readable on the Emacs display, at least not in all cases,
> because some of the reordering that these controls produce will be
> disabled when they are represented by ASCII escapes, and the character
> order on display will change as a result. If the string includes RTL
> characters, the result might not be easily readable. But AFAIU, this
> is not a problem for the use cases you have in mind?
Yes, if one escapes bidi control characters the readability of the
result is affected, but not as severely as with
`print-escape-multibyte=t`. The goal is to produce strings which don't
contain hidden characters for debugging, so even if readability is
affected it is still not as bad as with `print-escape-multibyte=t`.
Note that I am not a native speaker and only an occasional user of
multi-byte bidi languages, e.g., for educational purposes. Therefore I
cannot tell how such bidi strings are usually written in source code.
But I suspect that in source code literals ideally all non-visible
control characters are written in escaped form. However visible
multi-byte characters may be written literally in source code. At least
this is the practice I am using in source code regarding other Unicode
characters, I may write displayable characters as is but I will always
escape control characters (no difference between ASCII or Unicode).
* bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t
2021-12-13 16:32 ` Daniel Mendler
@ 2021-12-13 17:07 ` Eli Zaretskii
2021-12-13 18:13 ` Daniel Mendler
0 siblings, 1 reply; 28+ messages in thread
From: Eli Zaretskii @ 2021-12-13 17:07 UTC (permalink / raw)
To: Daniel Mendler; +Cc: 52459
> Cc: 52459@debbugs.gnu.org
> From: Daniel Mendler <mail@daniel-mendler.de>
> Date: Mon, 13 Dec 2021 17:32:40 +0100
>
> On 12/13/21 4:24 PM, Eli Zaretskii wrote:
> > So you want a feature that would produce strings suitable for using in
> > program source files, like we did in the above example? Is that the
> > meaning of "sanitize" you have been using?
> >
> > Are there other use cases for those "escaped" or "sanitized" strings?
> > If so, please describe them as well.
>
> As you said the use case is to produce strings suitable in source files
> or strings in a form which looks like strings occurring in source files.
> This use case appears in debuggers and other UIs which inspect variable
> values at Emacs runtime. Furthermore code editing and refactoring tools
> produce strings which are supposed to be used directly in source files.
>
> For the usage of strings in source files one could simply use
> `print-escape-multibyte=t`. However, particularly in debugger UIs, this
> leads to a severe obfuscation of the output, which especially hurts
> users who use Emacs in a setup with a multi-byte bidi language, Hebrew,
> Arabic, Chinese, etc. Therefore in debugger UIs I only want to escape
> control characters but not other multi-byte display characters.
So there are two different use cases:
1) produce strings for using in program source files.
2) produce strings for display in various UIs
The solutions should IMO be different, because the first is not about
displaying these characters, while the second is about displaying
them.
For 1), is print-escape-multibyte satisfactory? If not, why not?
For 2), we now have in Emacs 29 the glyphless-display-mode, whereby
the bidi control characters are shown as small boxes with their
acronyms (RLE, FSI, PDI, etc.). Is that satisfactory? If not, why
not?
* bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t
2021-12-13 17:07 ` Eli Zaretskii
@ 2021-12-13 18:13 ` Daniel Mendler
2021-12-13 18:28 ` Eli Zaretskii
0 siblings, 1 reply; 28+ messages in thread
From: Daniel Mendler @ 2021-12-13 18:13 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 52459
> 1) produce strings for using in program source files.
> 2) produce strings for display in various UIs
>
> The solutions should IMO be different, because the first is not about
> displaying these characters, while the second is about displaying
> them.
No, they are not different for my purposes since I want to have the
ability to copy strings from the UI to a source file. Working around the
problem on the display level (glyphless-display-mode) will preclude this
use case.
> For 1), is print-escape-multibyte satisfactory? If not, why not?
I already explained this. `print-escape-multibyte` obfuscates the string
too much, which is undesirable for a debugging UI. Note that I am
passing on this experience report from a Russian user who observed that
Marginalia (which currently uses `print-escape-multibyte=t`) produces
output which is not as helpful as it could be thanks to the escaping of
all multi byte characters. The escaping hurts users of multi-byte languages.
> For 2), we now have in Emacs 29 the glyphless-display-mode, whereby
> the bidi control characters are shown as small boxes with their
> acronyms (RLE, FSI, PDI, etc.). Is that satisfactory? If not, why
> not?
The `glyphless-display-mode` would be a possible workaround if I just
pass on the characters unescaped. However I want to produce strings
which I can possibly copy to source code buffers. This is not possible
if the strings are not escaped and contain the problematic control
characters in literal form.
Once again - I propose the addition of configuration variables which
configure `prin1-to-string` to produce output where all control characters
are escaped. I would even argue that the current variable
`print-escape-control-characters` is misleading since it only encodes
ASCII control characters. Is there anything which prevents the addition
of a configuration variable `print-escape-unicode-control-characters`,
which ensures full escaping of *all control characters*? Or we could go
even further and add `print-escape-glyphless-characters`, which would treat
the same characters as `glyphless-display-mode`.
* bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t
2021-12-13 18:13 ` Daniel Mendler
@ 2021-12-13 18:28 ` Eli Zaretskii
2021-12-13 18:35 ` Daniel Mendler
0 siblings, 1 reply; 28+ messages in thread
From: Eli Zaretskii @ 2021-12-13 18:28 UTC (permalink / raw)
To: Daniel Mendler; +Cc: 52459
> Cc: 52459@debbugs.gnu.org
> From: Daniel Mendler <mail@daniel-mendler.de>
> Date: Mon, 13 Dec 2021 19:13:32 +0100
>
> > 1) produce strings for using in program source files.
> > 2) produce strings for display in various UIs
> >
> > The solutions should IMO be different, because the first is not about
> > displaying these characters, while the second is about displaying
> > them.
>
> No, they are not different for my purposes since I want to have the
> ability to copy strings from the UI to a source file. Working around the
> problem on the display level (glyphless-display-mode) will preclude this
> use case.
But in the debug UI use-case, you do want to see the text and be able
to read it, don't you? Which is why you said you don't want to have
escapes there, you want to see characters. Which means that use-case
is about _displaying_ these characters, not _replacing_ them with
something else. And you _cannot_ copy anything that only exists on
display, because that copies the actual codepoints, not their visual
representation.
> > For 1), is print-escape-multibyte satisfactory? If not, why not?
>
> I already explained this. `print-escape-multibyte` obfuscates the string
> too much, which is undesirable for a debugging UI. Note that I am
> passing on this experience report from a Russian user who observed that
> Marginalia (which currently uses `print-escape-multibyte=t`) produces
> output which is not as helpful as it could be thanks to the escaping of
> all multi byte characters. The escaping hurts users of multi-byte languages.
The use case you describe with Marginalia is of a different kind --
why do they use print-escape-multibyte in that case? Cyrillic text
doesn't use bidi controls, so what does that use case have to do with
your request?
What I'm suggesting is to use print-escape-multibyte when producing
strings for inclusion in the source code, and only for that purpose.
You, OTOH, are talking about case 2), where these strings are
presented in a UI. Then of course print-escape-multibyte is
inappropriate for that.
> Once again - I propose the addition of configuration variables which
> configure `prin1-to-string` to produce output where all control characters
> are escaped.
That could help in case 1), but not in case 2), because there prin1 is
not used, or not necessarily used.
> `print-escape-control-characters` is misleading since it only encodes
> ASCII control characters. Is there anything which prevents the addition
> of a configuration variable `print-escape-unicode-control-characters`,
> which ensures full escaping of *all control characters*
Escaping where? on display? That won't help you to write strings as
in bidi-directional-controls-chars. I thought this was one part of
your request?
But I already said that, and so it sounds like we have some grave
misunderstanding that I'm unable to resolve. So maybe it's time for
someone else to try. Sorry I couldn't be of more help.
* bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t
2021-12-13 18:28 ` Eli Zaretskii
@ 2021-12-13 18:35 ` Daniel Mendler
2021-12-13 18:52 ` Eli Zaretskii
0 siblings, 1 reply; 28+ messages in thread
From: Daniel Mendler @ 2021-12-13 18:35 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 52459
On 12/13/21 7:28 PM, Eli Zaretskii wrote:
> But in the debug UI use-case, you do want to see the text and be able
> to read it, don't you? Which is why you said you don't want to have
> escapes there, you want to see characters. Which means that use-case
> is about _displaying_ these characters, not _replacing_ them with
> something else. And you _cannot_ copy anything that only exists on
> display, because that copies the actual codepoints, not their visual
> representation.
I want to escape *only control characters* not all multi byte characters.
> The use case you describe with Marginalia is of a different kind --
> why do they use print-escape-multibyte in that case? Cyrillic text
> doesn't use bidi controls, so what does that use case have to do with
> your request?
Of course Cyrillic text is not using bidi but it is still affected by
print-escape-multibyte. If I have a debugging UI I want it to work with
all kinds of languages, rtl and ltr. Multi-byte and single-byte.
> What I'm suggesting is to use print-escape-multibyte when producing
> strings for inclusion in the source code, and only for that purpose.
> You, OTOH, are talking about case 2), where these strings are
> presented in a UI. Then of course print-escape-multibyte is
> inappropriate for that.
This is not good enough, I want to produce strings which can be copied
to the source and presented in the UI in the same form. I argue that
this is not an unreasonable requirement.
>> Once again - I propose the addition of configuration variables which
>> configure `prin1-to-string` to produce output where all control characters
>> are escaped.
>
> That could help in case 1), but not in case 2), because there prin1 is
> not used, or not necessarily used.
I am only talking about prin1. The issue is about prin1. My goal is to
produce safely escaped string representations of Elisp values, including
strings and other values.
> But I already said that, and so it sounds like we have some grave
> misunderstanding that I'm unable to resolve. So maybe it's time for
> someone else to try. Sorry I couldn't be of more help.
Yes. Maybe someone else can chime in with their opinion.
* bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t
2021-12-13 18:35 ` Daniel Mendler
@ 2021-12-13 18:52 ` Eli Zaretskii
2021-12-13 18:57 ` Daniel Mendler
0 siblings, 1 reply; 28+ messages in thread
From: Eli Zaretskii @ 2021-12-13 18:52 UTC (permalink / raw)
To: Daniel Mendler; +Cc: 52459
> Cc: 52459@debbugs.gnu.org
> From: Daniel Mendler <mail@daniel-mendler.de>
> Date: Mon, 13 Dec 2021 19:35:37 +0100
>
> > What I'm suggesting is to use print-escape-multibyte when producing
> > strings for inclusion in the source code, and only for that purpose.
> > You, OTOH, are talking about case 2), where these strings are
> > presented in a UI. Then of course print-escape-multibyte is
> > inappropriate for that.
>
> This is not good enough, I want to produce strings which can be copied
> to the source and presented in the UI in the same form. I argue that
> this is not an unreasonable requirement.
Not only is it unreasonable, it is simply impossible. Representing
characters _on_display_ and writing such a representation into a file,
as in simple.el, are two different and incompatible goals. The
solutions for them must be separate. I already explained why, and if
my explanations still don't convince you, then I'm sorry, but I cannot
help you more than that, because it means we don't have a common
language and understanding to discuss this stuff.
> >> Once again - I propose the addition of configuration variables which
> >> configure `prin1-to-string` to produce output where all control characters
> >> are escaped.
> >
> > That could help in case 1), but not in case 2), because there prin1 is
> > not used, or not necessarily used.
>
> I am only talking about prin1. The issue is about prin1. My goal is to
> produce safely escaped string representations of Elisp values, including
> strings and other values.
Once again: prin1 will not help with displaying these characters.
Emacs doesn't use prin1 to display text.
> > But I already said that, and so it sounds like we have some grave
> > misunderstanding that I'm unable to resolve. So maybe it's time for
> > someone else to try. Sorry I couldn't be of more help.
>
> Yes. Maybe someone else can chime in with their opinion.
Yes, please.
* bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t
2021-12-13 18:52 ` Eli Zaretskii
@ 2021-12-13 18:57 ` Daniel Mendler
2021-12-13 19:08 ` Eli Zaretskii
0 siblings, 1 reply; 28+ messages in thread
From: Daniel Mendler @ 2021-12-13 18:57 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 52459
On 12/13/21 7:52 PM, Eli Zaretskii wrote:
> Not only is it unreasonable, it is simply impossible. Representing
> characters _on_display_ and writing such a representation into a file,
> as in simple.el, are two different and incompatible goals. The
> solutions for them must be separate. I already explained why, and if
> my explanations still don't convince you, then I'm sorry, but I cannot
> help you more than that, because it means we don't have a common
> language and understanding to discuss this stuff.
I produce strings from Elisp values using `prin1-to-string`. These
strings should be escaped such that I can copy them to source files
as is. Furthermore when I display the string in the UI, the string
should not mess up the display. This requires the string to not have
control characters.
You are seriously misunderstanding what I am proposing.
Please consider my proposal: I propose the addition of a variable
`print-escape-unicode-control-characters` which ensures that
`prin1-to-string` returns a string where all control characters are
escaped. This proposal is certainly not impossible.
Currently `prin1-to-string` produces a string which contains bidi
control characters despite `print-escape-control-characters=t`.
>>>> Once again - I propose the addition of configuration variables which
>>>> configure `prin1-to-string` to produce output where all control characters
>>>> are escaped.
>>>
>>> That could help in case 1), but not in case 2), because there prin1 is
>>> not used, or not necessarily used.
>>
>> I am only talking about prin1. The issue is about prin1. My goal is to
>> produce safely escaped string representations of Elisp values, including
>> strings and other values.
>
> Once again: prin1 will not help with displaying these characters.
> Emacs doesn't use prin1 to display text.
Of course not. I am using prin1 to create a string from a value which I
can then copy to a source file or display somewhere in the UI.
* bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t
2021-12-13 18:57 ` Daniel Mendler
@ 2021-12-13 19:08 ` Eli Zaretskii
2021-12-13 19:16 ` Daniel Mendler
0 siblings, 1 reply; 28+ messages in thread
From: Eli Zaretskii @ 2021-12-13 19:08 UTC (permalink / raw)
To: Daniel Mendler; +Cc: 52459
> Cc: 52459@debbugs.gnu.org
> From: Daniel Mendler <mail@daniel-mendler.de>
> Date: Mon, 13 Dec 2021 19:57:54 +0100
>
> > Once again: prin1 will not help with displaying these characters.
> > Emacs doesn't use prin1 to display text.
>
> Of course not. I am using prin1 to create a string from a value which I
> can then copy to a source file or display somewhere in the UI.
That will solve only a small fraction of situations where these
characters are displayed, because the vast majority of them don't use
prin1 to produce a string to display. Most of the stuff displayed by
Emacs doesn't come from strings produced by prin1, it comes from
displaying the text of some buffer.
So your proposal is incomplete at best. Again, I already tried to
explain that several times.
* bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t
2021-12-13 19:08 ` Eli Zaretskii
@ 2021-12-13 19:16 ` Daniel Mendler
2021-12-13 19:38 ` Eli Zaretskii
0 siblings, 1 reply; 28+ messages in thread
From: Daniel Mendler @ 2021-12-13 19:16 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 52459
On 12/13/21 8:08 PM, Eli Zaretskii wrote:
>> Cc: 52459@debbugs.gnu.org
>> From: Daniel Mendler <mail@daniel-mendler.de>
>> Date: Mon, 13 Dec 2021 19:57:54 +0100
>>
>>> Once again: prin1 will not help with displaying these characters.
>>> Emacs doesn't use prin1 to display text.
>>
>> Of course not. I am using prin1 to create a string from a value which I
>> can then copy to a source file or display somewhere in the UI.
>
> That will solve only a small fraction of situations where these
> characters are displayed, because the vast majority of them don't use
> prin1 to produce a string to display. Most of the stuff displayed by
> Emacs doesn't come from strings produced by prin1, it comes from
> displaying the text of some buffer.
Yes, it solves a small fraction of situations. This issue does not
address a generic setting, it only addresses the behavior of `prin1`. As
I described at length there are packages which are affected by this and
which could benefit from an improvement of `prin1`.
To break it down once more:
1. We have the function `prin1-to-string` which can be used to produce a
string representation for an Emacs lisp value.
2. The behavior of the function can be adjusted via configuration
variables, in particular `print-escape-multibyte` and
`print-escape-control-characters`. `print-escape-multibyte` is very
aggressive, it escapes every multibyte character.
`print-escape-control-characters` only escapes ASCII control characters.
3. I am asking for a way to configure `prin1-to-string` such that it
escapes control and other glyphless characters but not all multibyte
characters, such that text still stays somewhat readable. I want less
aggressive escaping than `print-escape-multibyte`. Setting only
`print-escape-control-characters=t` is not sufficient since it escapes
only ASCII control characters.
* bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t
2021-12-13 19:16 ` Daniel Mendler
@ 2021-12-13 19:38 ` Eli Zaretskii
2021-12-14 18:23 ` Dmitry Gutov
0 siblings, 1 reply; 28+ messages in thread
From: Eli Zaretskii @ 2021-12-13 19:38 UTC (permalink / raw)
To: Daniel Mendler; +Cc: 52459
> Cc: 52459@debbugs.gnu.org
> From: Daniel Mendler <mail@daniel-mendler.de>
> Date: Mon, 13 Dec 2021 20:16:54 +0100
>
> > That will solve only a small fraction of situations where these
> > characters are displayed, because the vast majority of them don't use
> > prin1 to produce a string to display. Most of the stuff displayed by
> > Emacs doesn't come from strings produced by prin1, it comes from
> > displaying the text of some buffer.
>
> Yes, it solves a small fraction of situations. This issue does not
> address a generic setting, it only addresses the behavior of `prin1`. As
> I described at length there are packages which are affected by this and
> which could benefit from an improvement of `prin1`.
>
> To break it down once more:
>
> 1. We have the function `prin1-to-string` which can be used to produce a
> string representation for an Emacs lisp value.
>
> 2. The behavior of the function can be adjusted via configuration
> variables, in particular `print-escape-multibyte` and
> `print-escape-control-characters`. `print-escape-multibyte` is very
> aggressive, it escapes every multibyte character.
> `print-escape-control-characters` only escapes ASCII control characters.
>
> 3. I am asking for a way to configure `prin1-to-string` such that it
> escapes control and other glyphless characters but not all multibyte
> characters, such that text still stays somewhat readable. I want less
> aggressive escaping than `print-escape-multibyte`. Setting only
> `print-escape-control-characters=t` is not sufficient since it escapes
> only ASCII control characters.
And, to reiterate once more, I'm against partial solutions that affect
only some functions that produce strings, and don't affect at all any
text displayed from a buffer. It would be a broken solution, because
we will never be able to explain why 'prin1' produces escapes whereas
'format' and 'message' don't. It is unreasonable to require Lisp
programs which will want to use something like that to use only
'prin1' and not the other functions routinely used for producing text
for display.
* bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t
2021-12-13 19:38 ` Eli Zaretskii
@ 2021-12-14 18:23 ` Dmitry Gutov
2021-12-14 18:32 ` Daniel Mendler
2021-12-14 18:39 ` Eli Zaretskii
0 siblings, 2 replies; 28+ messages in thread
From: Dmitry Gutov @ 2021-12-14 18:23 UTC (permalink / raw)
To: Eli Zaretskii, Daniel Mendler; +Cc: 52459
On 13.12.2021 22:38, Eli Zaretskii wrote:
>> To break it down once more:
>>
>> 1. We have the function `prin1-to-string` which can be used to produce a
>> string representation for an Emacs lisp value.
>>
>> 2. The behavior of the function can be adjusted via configuration
>> variables, in particular `print-escape-multibyte` and
>> `print-escape-control-characters`. `print-escape-multibyte` is very
>> aggressive, it escapes every multibyte character.
>> `print-escape-control-characters` only escapes ASCII control characters.
>>
>> 3. I am asking for a way to configure `prin1-to-string` such that it
>> escapes control and other glyphless characters but not all multibyte
>> characters, such that text still stays somewhat readable. I want less
>> aggressive escaping than `print-escape-multibyte`. Setting only
>> `print-escape-control-characters=t` is not sufficient since it escapes
>> only ASCII control characters.
> And, to reiterate once more, I'm against partial solutions that affect
> only some functions that produce strings, and don't affect at all any
> text displayed from a buffer. It would be a broken solution, because
> we will never be able to explain why 'prin1' produces escapes whereas
> 'format' and 'message' don't.
I just did a little testing, and it seems
'print-escape-control-characters' only affects 'prin1-to-string' and
'prin1' but not 'message' or 'format'.
Is that a problem?
If not, adding a new variable which makes the same distinction seems
consistent with the current design.
* bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t
2021-12-14 18:23 ` Dmitry Gutov
@ 2021-12-14 18:32 ` Daniel Mendler
2021-12-14 18:40 ` Dmitry Gutov
` (2 more replies)
2021-12-14 18:39 ` Eli Zaretskii
1 sibling, 3 replies; 28+ messages in thread
From: Daniel Mendler @ 2021-12-14 18:32 UTC (permalink / raw)
To: Dmitry Gutov, Eli Zaretskii; +Cc: 52459
On 12/14/21 7:23 PM, Dmitry Gutov wrote:
> I just did a little testing, and it seems
> 'print-escape-control-characters' only affects 'prin1-to-string' and
> 'prin1' but not 'message' or 'format'.
No, `print-escape-multibyte` also applies to `format` and `message`. Try
the following:
(let ((print-escape-multibyte t))
  (format "%S" bidi-directional-controls-chars)
  (message "%S" bidi-directional-controls-chars))
Anyway Eli's criticism does not apply to my proposal of the addition of
such a configuration variable. The variable would have the same scope as
`print-escape-multibyte` and the other `print-escape-*` variables. My
proposal does not introduce an inconsistency or any other kind of
incoherence.
> Is that a problem?
>
> If not, adding a new variable which makes the same distinction seems
> consistent with the current design.
Exactly. My proposal is consistent with the current design.
Daniel
* bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t
2021-12-14 18:23 ` Dmitry Gutov
2021-12-14 18:32 ` Daniel Mendler
@ 2021-12-14 18:39 ` Eli Zaretskii
2021-12-14 18:56 ` Dmitry Gutov
1 sibling, 1 reply; 28+ messages in thread
From: Eli Zaretskii @ 2021-12-14 18:39 UTC (permalink / raw)
To: Dmitry Gutov; +Cc: mail, 52459
> Cc: 52459@debbugs.gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Tue, 14 Dec 2021 21:23:27 +0300
>
> > And, to reiterate once more, I'm against partial solutions that affect
> > only some functions that produce strings, and don't affect at all any
> > text displayed from a buffer. It would be a broken solution, because
> > we will never be able to explain why 'prin1' produces escapes whereas
> > 'format' and 'message' don't.
>
> I just did a little testing, and it seems
> 'print-escape-control-characters' only affects 'prin1-to-string' and
> 'prin1' but not 'message' or 'format'.
>
> Is that a problem?
It could be. I guess the only reason no one complained about it is
that those print functions are used in very specialized cases. But in
this case, the requirement was to use it for displaying human-readable
text in a UI, and I think in that context it would be highly
surprising, to say the least, to have it supported by prin1, but not
by formatted printing APIs.
> If not, adding a new variable which makes the same distinction seems
> consistent with the current design.
The goal is explicitly different, and specifically targets the display
of text to users. And IMO that is not consistent with the current
design, because these variables definitely weren't meant to affect how
text is presented to users. We have other similar features for that,
like nobreak-char-display etc.
* bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t
2021-12-14 18:32 ` Daniel Mendler
@ 2021-12-14 18:40 ` Dmitry Gutov
2021-12-14 18:47 ` Eli Zaretskii
2021-12-14 19:22 ` Andreas Schwab
2 siblings, 0 replies; 28+ messages in thread
From: Dmitry Gutov @ 2021-12-14 18:40 UTC (permalink / raw)
To: Daniel Mendler, Eli Zaretskii; +Cc: 52459
On 14.12.2021 21:32, Daniel Mendler wrote:
> On 12/14/21 7:23 PM, Dmitry Gutov wrote:
>> I just did a little testing, and it seems
>> 'print-escape-control-characters' only affects 'prin1-to-string' and
>> 'prin1' but not 'message' or 'format'.
> No, `print-escape-multibyte` also applies to `format` and `message`. Try
> the following:
>
> (let ((print-escape-multibyte t))
> (format "%S" bidi-directional-controls-chars)
> (message "%S" bidi-directional-controls-chars))
This is interesting, because print-escape-control-characters (which I
mentioned) does not:
ELISP> (let ((print-escape-control-characters t)) (prin1 "\b"))
"\10"
"\b"
ELISP> (let ((print-escape-control-characters t)) (prin1-to-string "\b"))
"\"\\10\""
ELISP> (let ((print-escape-control-characters t)) (format "\b"))
"\b"
ELISP> (let ((print-escape-control-characters t)) (message "\b"))
"\b"
>> Is that a problem?
>>
>> If not, adding a new variable which makes the same distinction seems
>> consistent with the current design.
> Exactly. My proposal is consistent with the current design.
...but indeed if the new variable has the same scope as either of the
existing ones, it seems easy to justify.
Maybe reconcile the scopes of the existing vars, too.
'print-escape-multibyte' is documented as "This affects only ‘prin1’",
but it is the other var which makes the distinction.
* bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t
2021-12-14 18:32 ` Daniel Mendler
2021-12-14 18:40 ` Dmitry Gutov
@ 2021-12-14 18:47 ` Eli Zaretskii
2021-12-14 18:51 ` Daniel Mendler
2021-12-14 19:22 ` Andreas Schwab
2 siblings, 1 reply; 28+ messages in thread
From: Eli Zaretskii @ 2021-12-14 18:47 UTC (permalink / raw)
To: Daniel Mendler; +Cc: 52459, dgutov
> Cc: 52459@debbugs.gnu.org
> From: Daniel Mendler <mail@daniel-mendler.de>
> Date: Tue, 14 Dec 2021 19:32:47 +0100
>
> On 12/14/21 7:23 PM, Dmitry Gutov wrote:
> > I just did a little testing, and it seems
> > 'print-escape-control-characters' only affects 'prin1-to-string' and
> > 'prin1' but not 'message' or 'format'.
>
> No, `print-escape-multibyte` also applies to `format` and `message`. Try
> the following:
>
> (let ((print-escape-multibyte t))
> (format "%S" bidi-directional-controls-chars)
> (message "%S" bidi-directional-controls-chars))
Strings are not usually formatted using %S, they are formatted using
%s. It would be unreasonable to expect Lisp programs to use %S for
formatting strings.
* bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t
2021-12-14 18:47 ` Eli Zaretskii
@ 2021-12-14 18:51 ` Daniel Mendler
2021-12-14 19:41 ` Dmitry Gutov
0 siblings, 1 reply; 28+ messages in thread
From: Daniel Mendler @ 2021-12-14 18:51 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 52459, dgutov
On 12/14/21 7:47 PM, Eli Zaretskii wrote:
>> Cc: 52459@debbugs.gnu.org
>> From: Daniel Mendler <mail@daniel-mendler.de>
>> Date: Tue, 14 Dec 2021 19:32:47 +0100
>>
>> On 12/14/21 7:23 PM, Dmitry Gutov wrote:
>>> I just did a little testing, and it seems
>>> 'print-escape-control-characters' only affects 'prin1-to-string' and
>>> 'prin1' but not 'message' or 'format'.
>>
>> No, `print-escape-multibyte` also applies to `format` and `message`. Try
>> the following:
>>
>> (let ((print-escape-multibyte t))
>> (format "%S" bidi-directional-controls-chars)
>> (message "%S" bidi-directional-controls-chars))
>
> Strings are not usually formatted using %S, they are formatted using
> %s. It would be unreasonable to expect Lisp programs to use %S for
> formatting strings.
I am talking about creating string representations of Elisp values. I
did not exclusively talk about strings. Strings are of course also Elisp
values, but for a debugger UI it is useful to produce string
representations of general Elisp values.
* bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t
2021-12-14 18:39 ` Eli Zaretskii
@ 2021-12-14 18:56 ` Dmitry Gutov
2021-12-14 19:20 ` Eli Zaretskii
0 siblings, 1 reply; 28+ messages in thread
From: Dmitry Gutov @ 2021-12-14 18:56 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: mail, 52459
On 14.12.2021 21:39, Eli Zaretskii wrote:
>> Cc: 52459@debbugs.gnu.org
>> From: Dmitry Gutov <dgutov@yandex.ru>
>> Date: Tue, 14 Dec 2021 21:23:27 +0300
>>
>>> And, to reiterate once more, I'm against partial solutions that affect
>>> only some functions that produce strings, and don't affect at all any
>>> text displayed from a buffer. It would be a broken solution, because
>>> we will never be able to explain why 'prin1' produces escapes whereas
>>> 'format' and 'message' don't.
>> I just did a little testing, and it seems
>> 'print-escape-control-characters' only affects 'prin1-to-string' and
>> 'prin1' but not 'message' or 'format'.
>>
>> Is that a problem?
> It could be. I guess the only reason no one complained about it is
> that those print functions are used in very specialized cases. But in
> this case, the requirement was to use it for displaying human-readable
> text in a UI, and I think in that context it would be highly
> surprising, to say the least, to have it supported by prin1, but not
> by formatted printing APIs.
I'm not sure it would be a problem if 'message' and 'format' also honor
that new variable. Aside from inconsistency with existing vars, that is.
>> If not, adding a new variable which makes the same distinction seems
>> consistent with the current design.
> The goal is explicitly different, and specifically targets the display
> of text to users. And IMO that is not consistent with the current
> design, because these variables definitely weren't meant to affect how
> text is presented to users. We have other similar features for that,
> like nobreak-char-display etc.
It seems to me that you misunderstand the use case, or at least the
approach that Daniel wants to take:
Helpful, or Help buffers used by commands like 'describe-variable', use
prin1 to output values which are not known in advance (like the value of
the described variable).
And the dynamic vars under discussion can make those printed values more
predictable and easier to grok (and copy-paste, etc.).
* bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t
2021-12-14 18:56 ` Dmitry Gutov
@ 2021-12-14 19:20 ` Eli Zaretskii
0 siblings, 0 replies; 28+ messages in thread
From: Eli Zaretskii @ 2021-12-14 19:20 UTC (permalink / raw)
To: Dmitry Gutov; +Cc: mail, 52459
> Cc: mail@daniel-mendler.de, 52459@debbugs.gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Tue, 14 Dec 2021 21:56:16 +0300
>
> > It could be. I guess the only reason no one complained about it is
> > that those print functions are used in very specialized cases. But in
> > this case, the requirement was to use it for displaying human-readable
> > text in a UI, and I think in that context it would be highly
> > surprising, to say the least, to have it supported by prin1, but not
> > by formatted printing APIs.
>
> I'm not sure it would be a problem if 'message' and 'format' also honor
> that new variable. Aside from inconsistency with existing vars, that is.
We agree. But that is not what is being requested by Daniel.
> It seems to me that you misunderstand the use case, or at least the
> approach that Daniel wants to take:
>
> Helpful, or Help buffers used by commands like 'describe-variable', use
> prin1 to output values which are not known in advance (like the value of
> the described variable).
>
> And the dynamic vars under discussion can make those printed values more
> predictable and easier to grok (and copy-paste, etc.).
The set of use cases that was described included more than just
Helpful. It explicitly included displaying text not known in advance,
such as in debugging UI, where one needs to display strings that come
from the program being debugged.
However, you may be right that I have no good understanding of the use
cases. I did try to understand them, but found it to be impossible,
because every question I tried to ask, every concept of a solution I
tried to propose was inevitably met with the equivalent of "why won't
you give me my variable". I actually think that there are two
distinct classes of use cases involved, but Daniel rejected that,
insisting, AFAIU, that all the use cases are the same and need a
single solution. Which I think is based on the wrong mental model of
how the related stuff works in Emacs.
So I'm sorry, but I cannot be of help in this discussion under those
terms. That is not the way I'm used to approaching a problem.
* bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t
2021-12-14 18:32 ` Daniel Mendler
2021-12-14 18:40 ` Dmitry Gutov
2021-12-14 18:47 ` Eli Zaretskii
@ 2021-12-14 19:22 ` Andreas Schwab
2 siblings, 0 replies; 28+ messages in thread
From: Andreas Schwab @ 2021-12-14 19:22 UTC (permalink / raw)
To: Daniel Mendler; +Cc: 52459, Dmitry Gutov
On Dec 14 2021, Daniel Mendler wrote:
> No, `print-escape-multibyte` also applies to `format` and `message`. Try
> the following:
>
> (let ((print-escape-multibyte t))
> (format "%S" bidi-directional-controls-chars)
> (message "%S" bidi-directional-controls-chars))
%S is, in essence, prin1-to-string.
--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
* bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t
2021-12-14 18:51 ` Daniel Mendler
@ 2021-12-14 19:41 ` Dmitry Gutov
0 siblings, 0 replies; 28+ messages in thread
From: Dmitry Gutov @ 2021-12-14 19:41 UTC (permalink / raw)
To: Daniel Mendler, Eli Zaretskii; +Cc: 52459
On 14.12.2021 21:51, Daniel Mendler wrote:
> I am talking about creating string representations of Elisp values. I
> did not exclusively talk about strings. Strings are of course also Elisp
> values, but for a debugger UI it is useful to produce string
> representations of general Elisp values.
Then, also with clarification from A. Schwab, we could say that perhaps
we'd be fine with having all print-escape-* variables (the existing ones
and the one under proposal) only affecting 'prin1', but not 'message'
and 'format' in general (used without '%S').
end of thread, other threads:[~2021-12-14 19:41 UTC | newest]
Thread overview: 28+ messages
2021-12-12 20:13 bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t Daniel Mendler
2021-12-12 20:42 ` Eli Zaretskii
2021-12-12 21:11 ` Daniel Mendler
2021-12-12 21:33 ` Daniel Mendler
2021-12-13 12:22 ` Eli Zaretskii
2021-12-13 13:19 ` Daniel Mendler
2021-12-13 13:30 ` Daniel Mendler
2021-12-13 15:24 ` Eli Zaretskii
2021-12-13 16:32 ` Daniel Mendler
2021-12-13 17:07 ` Eli Zaretskii
2021-12-13 18:13 ` Daniel Mendler
2021-12-13 18:28 ` Eli Zaretskii
2021-12-13 18:35 ` Daniel Mendler
2021-12-13 18:52 ` Eli Zaretskii
2021-12-13 18:57 ` Daniel Mendler
2021-12-13 19:08 ` Eli Zaretskii
2021-12-13 19:16 ` Daniel Mendler
2021-12-13 19:38 ` Eli Zaretskii
2021-12-14 18:23 ` Dmitry Gutov
2021-12-14 18:32 ` Daniel Mendler
2021-12-14 18:40 ` Dmitry Gutov
2021-12-14 18:47 ` Eli Zaretskii
2021-12-14 18:51 ` Daniel Mendler
2021-12-14 19:41 ` Dmitry Gutov
2021-12-14 19:22 ` Andreas Schwab
2021-12-14 18:39 ` Eli Zaretskii
2021-12-14 18:56 ` Dmitry Gutov
2021-12-14 19:20 ` Eli Zaretskii