bug#43866: 26.3; italian postfix additions


* bug#43866: 26.3; italian postfix additions
@ 2020-10-08 12:05 Francesco Potortì
  2020-10-08 12:26 ` Eli Zaretskii
  2020-10-08 15:23 ` Mattias Engdegård
  0 siblings, 2 replies; 109+ messages in thread
From: Francesco Potortì @ 2020-10-08 12:05 UTC (permalink / raw)
  To: 43866

Since the inception of mule, many years ago, I have set up an environment
where I switch between italian-postfix and american input methods.

Now I realise that a long time ago I made an addition to italian-postfix
that has never gone into emacs.

The rationale is that in Italy latin-9 should be used instead of
latin-1, which does not contain the euro symbol.  And that
italian-postfix should allow entering the euro symbol.

Here is what I use on all machines where I have emacs:

================ start ================
;; Add the Euro symbol, use Latin-9 rather than Latin-1
(quail-define-package
 "italian-postfix" "Latin-9" "IT<" t
 "Italian (Italiano) input method with postfix modifiers

a` -> à    A` -> À    e' -> é    << -> «
e` -> è    E` -> È    E' -> É    >> -> »
i` -> ì    I` -> Ì    E= -> €    o_ -> º
o` -> ò    O` -> Ò               a_ -> ª
u` -> ù    U` -> Ù

Typewriter-style italian characters.

Doubling the postfix separates the letter and postfix: e.g. a`` -> a`
" nil t nil nil nil nil nil nil nil nil t)

(quail-define-rules
 ("A`" ?À) ("a`" ?à) ("E`" ?È) ("E'" ?É) ("E=" ?€) ("e`" ?è) ("e'" ?é)
 ("I`" ?Ì) ("i`" ?ì) ("O`" ?Ò) ("o`" ?ò) ("U`" ?Ù)
 ("u`" ?ù) ("<<" ?«) (">>" ?») ("o_" ?º) ("a_" ?ª)
 ("A``" ["A`"]) ("a``" ["a`"]) ("E``" ["E`"]) ("E''" ["E'"]) ("e``" ["e`"])
 ("e''" ["e'"]) ("I``" ["I`"]) ("i``" ["i`"]) ("O``" ["O`"]) ("o``" ["o`"])
 ("U``" ["U`"]) ("u``" ["u`"])
 ("<<<" ["<<"]) (">>>" [">>"]) ("o__" ["o_"]) ("a__" ["a_"])
 )
================ end ================

In GNU Emacs 26.3 (build 1, x86_64-pc-linux-gnu, X toolkit, Xaw3d scroll bars)
 of 2020-05-17, modified by Debian built on x86-csail-01
Windowing system distributor 'The X.Org Foundation', version 11.0.12008000
System Description:	Debian GNU/Linux bullseye/sid

Important settings:
  value of $LC_COLLATE: it_IT.UTF-8
  value of $LC_CTYPE: it_IT.UTF-8
  value of $LC_NUMERIC: C
  value of $LANG: C.UTF-8
  locale-coding-system: utf-8-unix

Major mode: Mail

Minor modes in effect:
  filladapt-mode: t
  diff-auto-refine-mode: t
  desktop-save-mode: t
  epa-global-mail-mode: t
  epa-mail-mode: t
  shell-dirtrack-mode: t
  openwith-mode: t
  xterm-mouse-mode: t
  display-time-mode: t
  tooltip-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  column-number-mode: t
  line-number-mode: t
  abbrev-mode: t

Load-path shadows:
~/elisp/bhl hides /usr/share/emacs/site-lisp/bhl
/usr/share/emacs/site-lisp/elpa/debian-el-37/debian-autoloads hides /usr/share/emacs/site-lisp/elpa/gnuplot-mode-20141231/debian-autoloads
/usr/share/emacs/site-lisp/elpa/csv-mode-1.12/csv-mode-pkg hides /usr/share/emacs/site-lisp/elpa-src/csv-mode-1.12/csv-mode-pkg
/usr/share/emacs/site-lisp/elpa/csv-mode-1.12/csv-mode hides /usr/share/emacs/site-lisp/elpa-src/csv-mode-1.12/csv-mode
/usr/share/emacs/site-lisp/elpa/csv-mode-1.12/csv-mode-tests hides /usr/share/emacs/site-lisp/elpa-src/csv-mode-1.12/csv-mode-tests
/usr/share/emacs/site-lisp/elpa/csv-mode-1.12/csv-mode-autoloads hides /usr/share/emacs/site-lisp/elpa-src/csv-mode-1.12/csv-mode-autoloads
/usr/share/emacs/site-lisp/elpa/debian-el-37/debian-el hides /usr/share/emacs/site-lisp/elpa-src/debian-el-37/debian-el
/usr/share/emacs/site-lisp/elpa/debian-el-37/gnus-BTS hides /usr/share/emacs/site-lisp/elpa-src/debian-el-37/gnus-BTS
/usr/share/emacs/site-lisp/elpa/debian-el-37/preseed hides /usr/share/emacs/site-lisp/elpa-src/debian-el-37/preseed
/usr/share/emacs/site-lisp/elpa/debian-el-37/deb-view hides /usr/share/emacs/site-lisp/elpa-src/debian-el-37/deb-view
/usr/share/emacs/site-lisp/elpa/debian-el-37/debian-el-autoloads hides /usr/share/emacs/site-lisp/elpa-src/debian-el-37/debian-el-autoloads
/usr/share/emacs/site-lisp/elpa/debian-el-37/apt-utils hides /usr/share/emacs/site-lisp/elpa-src/debian-el-37/apt-utils
/usr/share/emacs/site-lisp/elpa/debian-el-37/debian-bug hides /usr/share/emacs/site-lisp/elpa-src/debian-el-37/debian-bug
/usr/share/emacs/site-lisp/elpa/debian-el-37/debian-el-pkg hides /usr/share/emacs/site-lisp/elpa-src/debian-el-37/debian-el-pkg
/usr/share/emacs/site-lisp/elpa/debian-el-37/apt-sources hides /usr/share/emacs/site-lisp/elpa-src/debian-el-37/apt-sources
/usr/share/emacs/site-lisp/elpa/debian-el-37/debian-autoloads hides /usr/share/emacs/site-lisp/elpa-src/debian-el-37/debian-autoloads
/usr/share/emacs/site-lisp/elpa/dictionary-1.10/dictionary hides /usr/share/emacs/site-lisp/elpa-src/dictionary-1.10/dictionary
/usr/share/emacs/site-lisp/elpa/dictionary-1.10/link hides /usr/share/emacs/site-lisp/elpa-src/dictionary-1.10/link
/usr/share/emacs/site-lisp/elpa/dictionary-1.10/dictionary-pkg hides /usr/share/emacs/site-lisp/elpa-src/dictionary-1.10/dictionary-pkg
/usr/share/emacs/site-lisp/elpa/dictionary-1.10/dictionary-autoloads hides /usr/share/emacs/site-lisp/elpa-src/dictionary-1.10/dictionary-autoloads
/usr/share/emacs/site-lisp/elpa/dictionary-1.10/connection hides /usr/share/emacs/site-lisp/elpa-src/dictionary-1.10/connection
/usr/share/emacs/site-lisp/elpa/gnuplot-mode-20141231/gnuplot hides /usr/share/emacs/site-lisp/elpa-src/gnuplot-mode-20141231/gnuplot
/usr/share/emacs/site-lisp/elpa/gnuplot-mode-20141231/gnuplot-mode-pkg hides /usr/share/emacs/site-lisp/elpa-src/gnuplot-mode-20141231/gnuplot-mode-pkg
/usr/share/emacs/site-lisp/elpa/debian-el-37/debian-autoloads hides /usr/share/emacs/site-lisp/elpa-src/gnuplot-mode-20141231/debian-autoloads
/usr/share/emacs/site-lisp/elpa/gnuplot-mode-20141231/gnuplot-context hides /usr/share/emacs/site-lisp/elpa-src/gnuplot-mode-20141231/gnuplot-context
/usr/share/emacs/site-lisp/elpa/gnuplot-mode-20141231/gnuplot-gui hides /usr/share/emacs/site-lisp/elpa-src/gnuplot-mode-20141231/gnuplot-gui
/usr/share/emacs/site-lisp/elpa/gnuplot-mode-20141231/gnuplot-mode-autoloads hides /usr/share/emacs/site-lisp/elpa-src/gnuplot-mode-20141231/gnuplot-mode-autoloads
/usr/share/emacs/site-lisp/elpa/markdown-mode-2.4/markdown-mode-autoloads hides /usr/share/emacs/site-lisp/elpa-src/markdown-mode-2.4/markdown-mode-autoloads
/usr/share/emacs/site-lisp/elpa/markdown-mode-2.4/markdown-mode hides /usr/share/emacs/site-lisp/elpa-src/markdown-mode-2.4/markdown-mode
/usr/share/emacs/site-lisp/elpa/markdown-mode-2.4/markdown-mode-pkg hides /usr/share/emacs/site-lisp/elpa-src/markdown-mode-2.4/markdown-mode-pkg
/usr/share/emacs/site-lisp/flim/md4 hides /usr/share/emacs/26.3/lisp/md4
/usr/share/emacs/site-lisp/flim/hex-util hides /usr/share/emacs/26.3/lisp/hex-util
~/elisp/octave hides /usr/share/emacs/26.3/lisp/progmodes/octave
/usr/share/emacs/site-lisp/flim/ntlm hides /usr/share/emacs/26.3/lisp/net/ntlm
/usr/share/emacs/site-lisp/flim/hmac-md5 hides /usr/share/emacs/26.3/lisp/net/hmac-md5
/usr/share/emacs/site-lisp/flim/sasl-ntlm hides /usr/share/emacs/26.3/lisp/net/sasl-ntlm
/usr/share/emacs/site-lisp/flim/sasl-digest hides /usr/share/emacs/26.3/lisp/net/sasl-digest
/usr/share/emacs/site-lisp/flim/sasl hides /usr/share/emacs/26.3/lisp/net/sasl
/usr/share/emacs/site-lisp/flim/sasl-cram hides /usr/share/emacs/26.3/lisp/net/sasl-cram
/usr/share/emacs/site-lisp/flim/hmac-def hides /usr/share/emacs/26.3/lisp/net/hmac-def

Features:
(shadow emacsbug apropos vc-bzr mode-local calccomp calc-map calc-alg
calc-vec calc-aent calc-menu calc-yank calc-ext reporter debian-bug
anything-config anything woman cl etags two-column iso-transl org-rmail
org-mhe org-irc org-info org-gnus nnir gnus-sum gnus-group gnus-undo
gnus-start gnus-cloud nnimap nnmail mail-source utf7 netrc nnoo
gnus-spec gnus-int gnus-range gnus-win gnus nnheader org-docview
org-bibtex org-bbdb org-w3m org-element avl-tree generator org org-macro
org-footnote org-pcomplete org-list org-faces org-entities org-version
ob-emacs-lisp ob ob-tangle org-src ob-ref ob-lob ob-table ob-keys ob-exp
ob-comint ob-core ob-eval org-compat org-macs org-loaddefs ispell xref
project eieio-opt speedbar sb-image ezimage dframe completion dos-w32
find-cmd grep find-dired find-func pp cl-print help-fns radix-tree
unrmail calc calc-loaddefs calc-macs deb-view network-stream starttls
url-http tls gnutls url-gw nsm url-cache url-auth url url-proxy
url-privacy url-expand url-methods url-history url-cookie w3m-filter
w3m-form w3m-cookie url-domsuf w3m-bookmark w3m-tabmenu w3m-session w3m
mailcap doc-view image-mode w3m-hist w3m-fb bookmark-w3m w3m-ems
wid-edit w3m-ccl ccl w3m-favicon w3m-image w3m-proc w3m-util cal-move
cal-x dabbrev arc-mode archive-mode macros locate edmacro kmacro rect
tabify man shr-color timezone rmailsort rmailedit url-util shr svg xml
browse-url add-log mailalias rmailout rmailkwd time-stamp cl-extra
dired-aux wdired misearch multi-isearch make-mode jka-compr vc-git
diff-mode markdown-mode subr-x noutline outline easy-mmode generic
sh-script executable tex-mode compile vc-dir ewoc vc vc-dispatcher
vc-svn json-mode rx bibtex-style vc-filewise vc-rcs octave texinfo pcase
bibtex mhtml-mode css-mode smie color js json map imenu cc-mode cc-fonts
cc-guess cc-menus cc-cmds cc-styles cc-align cc-engine cc-vars cc-defs
sgml-mode dom qp rmailmm message rmc puny rfc822 mml mml-sec gnus-util
mm-decode mm-bodies mm-encode mailabbrev gmm-utils mailheader mail-parse
rfc2231 desktop frameset elec-pair cal-julian solar cal-dst pot skeleton
warnings rmailsum rmail rmail-loaddefs sendmail rfc2047 rfc2045
ietf-drums mm-util mail-prsvr mime-compose epa-mail mail-utils epa
derived epg view holidays hol-loaddefs appt diary-lib diary-loaddefs
cal-menu calendar cal-loaddefs tramp tramp-compat tramp-loaddefs
trampver ucs-normalize shell pcomplete comint ring parse-time
format-spec advice bhl visual-fill-column switch-to-shell openwith
hi-lock xt-mouse time-date ffap thingatpt scroll-in-place filladapt
ansi-color time quail help-mode dired-x dired dired-loaddefs generic-x
disp-table finder-inf info debian-el package easymenu epg-config
url-handlers url-parse auth-source cl-seq eieio eieio-core cl-macs
eieio-loaddefs password-cache url-vars seq byte-opt gv bytecomp
byte-compile cconv cl-loaddefs cl-lib w3m-load mule-util tooltip eldoc
electric uniquify ediff-hook vc-hooks lisp-float-type mwheel term/x-win
x-win term/common-win x-dnd tool-bar dnd fontset image regexp-opt fringe
tabulated-list replace newcomment text-mode elisp-mode lisp-mode
prog-mode register page menu-bar rfn-eshadow isearch timer select
scroll-bar mouse jit-lock font-lock syntax facemenu font-core
term/tty-colors frame cl-generic cham georgian utf-8-lang misc-lang
vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932
hebrew greek romanian slovak czech european ethiopic indian cyrillic
chinese composite charscript charprop case-table epa-hook jka-cmpr-hook
help simple abbrev obarray minibuffer cl-preloaded nadvice loaddefs
button faces cus-face macroexp files text-properties overlay sha1 md5
base64 format env code-pages mule custom widget hashtable-print-readable
backquote threads dbusbind inotify lcms2 dynamic-setting
font-render-setting x-toolkit x multi-tty make-network-process emacs)

Memory information:
((conses 16 859419 104132)
 (symbols 48 61004 1)
 (miscs 40 1838 1970)
 (strings 32 207996 12197)
 (string-bytes 1 6110429)
 (vectors 16 83287)
 (vector-slots 8 2187469 111622)
 (floats 8 914 1312)
 (intervals 56 48134 1037)
 (buffers 992 180))






* bug#43866: 26.3; italian postfix additions
  2020-10-08 12:05 bug#43866: 26.3; italian postfix additions Francesco Potortì
@ 2020-10-08 12:26 ` Eli Zaretskii
  2020-10-08 12:34   ` Francesco Potortì
  2020-10-08 12:39   ` Robert Pluim
  2020-10-08 15:23 ` Mattias Engdegård
  1 sibling, 2 replies; 109+ messages in thread
From: Eli Zaretskii @ 2020-10-08 12:26 UTC (permalink / raw)
  To: Francesco Potortì; +Cc: 43866

> From: Francesco Potortì <pot@gnu.org>
> Date: Thu, 08 Oct 2020 14:05:55 +0200
> 
> Since the inception of mule, many years ago, I have set up an environment
> where I switch between italian-postfix and american input methods.
> 
> Now I realise that I have made long time ago an addition to italian that
> has never gone into emacs.
> 
> The rationale is that in Italy latin-9 should be used instead of
> latin1, which does not contain the euro symbol.  And that
> italian-postfix should allow introducing the euro symbol.

The Latin-1 vs Latin-9 part is not important nowadays, since Emacs
uses Unicode internally.

As for the Euro symbol, I guess we need to add it to all the Latin
input methods, not just the Italian one?

Thanks.






* bug#43866: 26.3; italian postfix additions
  2020-10-08 12:26 ` Eli Zaretskii
@ 2020-10-08 12:34   ` Francesco Potortì
  2020-10-08 12:39   ` Robert Pluim
  1 sibling, 0 replies; 109+ messages in thread
From: Francesco Potortì @ 2020-10-08 12:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 43866

>> From: Francesco Potortì <pot@gnu.org>
>> Date: Thu, 08 Oct 2020 14:05:55 +0200
>> 
>> Since the inception of mule, many years ago, I have set up an environment
>> where I switch between italian-postfix and american input methods.
>> 
>> Now I realise that I have made long time ago an addition to italian that
>> has never gone into emacs.
>> 
>> The rationale is that in Italy latin-9 should be used instead of
>> latin1, which does not contain the euro symbol.  And that
>> italian-postfix should allow introducing the euro symbol.
>
>The Latin-1 vs Latin-9 part is not important nowadays, since Emacs
>uses Unicode internally.

I don't know if that's ever used; maybe changing it is worthwhile anyway.

>As for the Euro symbol, I guess we need to add it to all the Latin
>input methods, not just the Italian one?

I guess yes.






* bug#43866: 26.3; italian postfix additions
  2020-10-08 12:26 ` Eli Zaretskii
  2020-10-08 12:34   ` Francesco Potortì
@ 2020-10-08 12:39   ` Robert Pluim
  2020-10-08 12:57     ` Eli Zaretskii
                       ` (2 more replies)
  1 sibling, 3 replies; 109+ messages in thread
From: Robert Pluim @ 2020-10-08 12:39 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 43866

>>>>> On Thu, 08 Oct 2020 15:26:16 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Francesco Potortì <pot@gnu.org>
    >> Date: Thu, 08 Oct 2020 14:05:55 +0200
    >> 
    >> Since the inception of mule, many years ago, I have set up an environment
    >> where I switch between italian-postfix and american input methods.
    >> 
    >> Now I realise that I have made long time ago an addition to italian that
    >> has never gone into emacs.
    >> 
    >> The rationale is that in Italy latin-9 should be used instead of
    >> latin1, which does not contain the euro symbol.  And that
    >> italian-postfix should allow introducing the euro symbol.

    Eli> The Latin-1 vs Latin-9 part is not important nowadays, since Emacs
    Eli> uses Unicode internally.

    Eli> As for the Euro symbol, I guess we need to add it to all the Latin
    Eli> input methods, not just the Italian one?

Itʼs already in latin-postfix and on C-x 8 * E, is that really
necessary?

Robert
-- 






* bug#43866: 26.3; italian postfix additions
  2020-10-08 12:39   ` Robert Pluim
@ 2020-10-08 12:57     ` Eli Zaretskii
  2020-10-08 13:54       ` Robert Pluim
  2020-10-08 13:26     ` Francesco Potortì
  2020-10-13 20:07     ` Juri Linkov
  2 siblings, 1 reply; 109+ messages in thread
From: Eli Zaretskii @ 2020-10-08 12:57 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 43866

> From: Robert Pluim <rpluim@gmail.com>
> Cc: Francesco Potortì <pot@gnu.org>,  43866@debbugs.gnu.org
> Date: Thu, 08 Oct 2020 14:39:15 +0200
> 
>     Eli> As for the Euro symbol, I guess we need to add it to all the Latin
>     Eli> input methods, not just the Italian one?
> 
> Itʼs already in latin-postfix and on C-x 8 * E, is that really
> necessary?

You mean, do people (besides Francesco) still use italian-postfix?  I
don't know.






* bug#43866: 26.3; italian postfix additions
  2020-10-08 12:39   ` Robert Pluim
  2020-10-08 12:57     ` Eli Zaretskii
@ 2020-10-08 13:26     ` Francesco Potortì
  2020-10-08 14:00       ` Robert Pluim
  2020-10-13 20:07     ` Juri Linkov
  2 siblings, 1 reply; 109+ messages in thread
From: Francesco Potortì @ 2020-10-08 13:26 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 43866

>    >> The rationale is that in Italy latin-9 should be used instead of
>    >> latin1, which does not contain the euro symbol.  And that
>    >> italian-postfix should allow introducing the euro symbol.
>
>    Eli> As for the Euro symbol, I guess we need to add it to all the Latin
>    Eli> input methods, not just the Italian one?
>
>Itʼs already in latin-postfix and on C-x 8 * E, is that really
>necessary?

I don't use latin-postfix, because it gets in the way: there are many
more combinations than in italian-postfix which change what I am
writing - this is an added burden for a minuscule added benefit.

This is why italian-postfix exists, in spite of the existence of
latin-postfix.

I don't use C-x 8 either, because I'd have to learn a lot of complex
bindings: when I need some exotic char I use the terminal or the X
combinations, which are easier and not specific to Emacs, unless I am
forced to.

This has little to do with italian-postfix, in fact.

All in all, I don't get your objection.  What would be the drawback of
adding the E = keybinding?
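
For illustration, a minimal sketch of what adding only the E= rule could
look like from a user's init file, without redefining the whole package.
It assumes that the stock italian-postfix package is defined in
quail/latin-post (an assumption about the Emacs source layout) and uses
quail-defrule with an explicit package name.  The E== rule follows the
"doubling" convention of the rules quoted earlier in this thread.

```elisp
;; Sketch only: append the euro rule to the stock italian-postfix method.
;; Assumption: the package is defined in leim/quail/latin-post.el, so the
;; body runs once that file has been loaded.
(with-eval-after-load "quail/latin-post"
  ;; E= inserts the euro sign.
  (quail-defrule "E=" ?€ "italian-postfix")
  ;; E== yields the literal two characters "E=".
  (quail-defrule "E==" ["E="] "italian-postfix"))
```

Only the capital-E sequence is touched here, so text containing a
lowercase "e=" is left alone.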


* bug#43866: 26.3; italian postfix additions
  2020-10-08 12:57     ` Eli Zaretskii
@ 2020-10-08 13:54       ` Robert Pluim
  2020-10-08 14:24         ` Robert Pluim
  0 siblings, 1 reply; 109+ messages in thread
From: Robert Pluim @ 2020-10-08 13:54 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 43866

>>>>> On Thu, 08 Oct 2020 15:57:14 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Robert Pluim <rpluim@gmail.com>
    >> Cc: Francesco Potortì <pot@gnu.org>,  43866@debbugs.gnu.org
    >> Date: Thu, 08 Oct 2020 14:39:15 +0200
    >> 
    Eli> As for the Euro symbol, I guess we need to add it to all the Latin
    Eli> input methods, not just the Italian one?
    >> 
    >> Itʼs already in latin-postfix and on C-x 8 * E, is that really
    >> necessary?

    Eli> You mean, do people (besides Francesco) still use italian-postfix?  I
    Eli> don't know.

What I meant is: everything is Unicode, and latin-postfix subsumes all
<language>-postfix, as far as I can tell, so people could just use latin-postfix.

Robert
-- 






* bug#43866: 26.3; italian postfix additions
  2020-10-08 13:26     ` Francesco Potortì
@ 2020-10-08 14:00       ` Robert Pluim
  0 siblings, 0 replies; 109+ messages in thread
From: Robert Pluim @ 2020-10-08 14:00 UTC (permalink / raw)
  To: Francesco Potortì; +Cc: 43866

>>>>> On Thu, 08 Oct 2020 15:26:07 +0200, Francesco Potortì <pot@gnu.org> said:

    >> >> The rationale is that in Italy latin-9 should be used instead of
    >> >> latin1, which does not contain the euro symbol.  And that
    >> >> italian-postfix should allow introducing the euro symbol.
    >> 
    Eli> As for the Euro symbol, I guess we need to add it to all the Latin
    Eli> input methods, not just the Italian one?
    >> 
    >> Itʼs already in latin-postfix and on C-x 8 * E, is that really
    >> necessary?

    Francesco> I don't use latin-postfix, because it gets in the way: there are many
    Francesco> more combinations than in italian-postfix which change what I am
    Francesco> writing - this is an added burden for a minuscule added benefit.

OK

    Francesco> This is why italian-postfix exists, in spite of the existence of
    Francesco> latin-postfix.

    Francesco> I don't use C-x 8 either, because I'd have to learn a lot of complex
    Francesco> bindings: when I need some exotic char I use the terminal or the X
    Francesco> combinations, which are easier and not specific to Emacs, unless I am
    Francesco> forced to.

This would not be for you: you already have your modified
italian-postfix.

    Francesco> This has little to do with italian-postfix, in fact.

    Francesco> All in all, I don't get your objection.  What would be the drawback of
    Francesco> adding the E = keybinding?

Think of it more as 'do we really need to change X number of input
methods, is there a simpler way'.

And the answer appears to be 'no such way exists'.

Robert
-- 






* bug#43866: 26.3; italian postfix additions
  2020-10-08 13:54       ` Robert Pluim
@ 2020-10-08 14:24         ` Robert Pluim
  2020-10-08 14:32           ` Eli Zaretskii
  0 siblings, 1 reply; 109+ messages in thread
From: Robert Pluim @ 2020-10-08 14:24 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 43866

>>>>> On Thu, 08 Oct 2020 15:54:37 +0200, Robert Pluim <rpluim@gmail.com> said:

>>>>> On Thu, 08 Oct 2020 15:57:14 +0300, Eli Zaretskii <eliz@gnu.org> said:
    >>> From: Robert Pluim <rpluim@gmail.com>
    >>> Cc: Francesco Potortì <pot@gnu.org>,  43866@debbugs.gnu.org
    >>> Date: Thu, 08 Oct 2020 14:39:15 +0200
    >>> 
    Eli> As for the Euro symbol, I guess we need to add it to all the Latin
    Eli> input methods, not just the Italian one?
    >>> 
    >>> Itʼs already in latin-postfix and on C-x 8 * E, is that really
    >>> necessary?

    Eli> You mean, do people (besides Francesco) still use italian-postfix?  I
    Eli> don't know.

    Robert> What I meant is: everything is Unicode, and latin-postfix subsumes all
    Robert> <language>-postfix, as far as I can tell, so people could just use latin-postfix.

As a practical question, if we do decide to add the euro to a bunch of
latin input methods, where do we stop? Should we add it to all the
greek-* methods on the grounds that Greece uses the Euro, but not to
"british" since the UK uses the pound?

(this is why I use C-x 8 * E to type €)

Robert
-- 






* bug#43866: 26.3; italian postfix additions
  2020-10-08 14:24         ` Robert Pluim
@ 2020-10-08 14:32           ` Eli Zaretskii
  0 siblings, 0 replies; 109+ messages in thread
From: Eli Zaretskii @ 2020-10-08 14:32 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 43866

> From: Robert Pluim <rpluim@gmail.com>
> Cc: 43866@debbugs.gnu.org
> Date: Thu, 08 Oct 2020 16:24:05 +0200
> 
> As a practical question, if we do decide to add the euro to a bunch of
> latin input methods, where do we stop? Should we add it to all the
> greek-* methods on the grounds that Greece uses the Euro, but not to
> "british" since the UK uses the pound?

I only thought about the Latin-N ones, mainly because all the legacy
encodings which didn't have a Euro sign at some point got upgraded to
newer Latin-N encodings which do.






* bug#43866: 26.3; italian postfix additions
  2020-10-08 12:05 bug#43866: 26.3; italian postfix additions Francesco Potortì
  2020-10-08 12:26 ` Eli Zaretskii
@ 2020-10-08 15:23 ` Mattias Engdegård
  2020-10-08 15:35   ` Robert Pluim
                     ` (2 more replies)
  1 sibling, 3 replies; 109+ messages in thread
From: Mattias Engdegård @ 2020-10-08 15:23 UTC (permalink / raw)
  To: Francesco Potortì, Eli Zaretskii, Robert Pluim; +Cc: 43866

> E= -> €
> 
> Typewriter-style italian characters. 

If they really are typewriter-style, wouldn't C= make more sense? E overstruck with = would just be a smudgy mess even if typed on a beautiful Olivetti.

Both C= and E= seem to work as X11 compose pairs. We could include both.
That is, c=, C=, e=, E=, and c==, C==, e==, E== as the literal cases.
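
The full set listed above could be sketched as follows.  This is an
illustration only: it assumes the rules are appended to the existing
italian-postfix package (assumed to be defined in quail/latin-post)
rather than written into its quail-define-rules form, and the loop
variable names are made up for the example.

```elisp
;; Sketch only: define c=, C=, e=, E= as the euro sign, and the doubled
;; forms c==, C==, e==, E== as the literal two-character sequences.
;; Assumption: italian-postfix lives in leim/quail/latin-post.el.
(with-eval-after-load "quail/latin-post"
  (dolist (letter '("c" "C" "e" "E"))
    (let ((seq (concat letter "=")))
      ;; letter followed by = gives the euro sign.
      (quail-defrule seq ?€ "italian-postfix")
      ;; letter followed by == gives back the plain text, e.g. "C=".
      (quail-defrule (concat seq "=") (vector seq) "italian-postfix"))))
```

Dropping letters from the list narrows the set; '("E") alone would give
just E= and E==.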







* bug#43866: 26.3; italian postfix additions
  2020-10-08 15:23 ` Mattias Engdegård
@ 2020-10-08 15:35   ` Robert Pluim
  2020-10-08 16:22     ` Francesco Potortì
  2020-10-08 15:42   ` Eli Zaretskii
  2020-10-08 16:10   ` Francesco Potortì
  2 siblings, 1 reply; 109+ messages in thread
From: Robert Pluim @ 2020-10-08 15:35 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: 43866

>>>>> On Thu, 8 Oct 2020 17:23:35 +0200, Mattias Engdegård <mattiase@acm.org> said:

    >> E= -> €
    >> 
    >> Typewriter-style italian characters. 

    Mattias> If they really are typewriter-style, wouldn't C= make more sense? E overstruck with = would just be a smudgy mess even if typed on a beautiful Olivetti.

    Mattias> Both C= and E= seem to work as X11 compose pairs. We could include both.
    Mattias> That is, c=, C=, e=, E=, and c==, C==, e==, E== as the literal cases.

C= would make more sense, but E= (and of course =E for prefix input
methods) is more mnemonic. I can never remember how to type € on a mac
because itʼs not on something obvious like Option-E.

Thereʼs precedent from C-x 8 * as well, which uses E (though perhaps
it should have 'e' as well).

And for total coverage, we *must* add 'C-x 8 2 0 a c' ;-)

Robert
-- 






* bug#43866: 26.3; italian postfix additions
  2020-10-08 15:23 ` Mattias Engdegård
  2020-10-08 15:35   ` Robert Pluim
@ 2020-10-08 15:42   ` Eli Zaretskii
  2020-10-08 16:10   ` Francesco Potortì
  2 siblings, 0 replies; 109+ messages in thread
From: Eli Zaretskii @ 2020-10-08 15:42 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: 43866, rpluim

> From: Mattias Engdegård <mattiase@acm.org>
> Date: Thu, 8 Oct 2020 17:23:35 +0200
> Cc: 43866@debbugs.gnu.org
> 
> Both C= and E= seem to work as X11 compose pairs. We could include both.

I think you are right.






* bug#43866: 26.3; italian postfix additions
  2020-10-08 15:23 ` Mattias Engdegård
  2020-10-08 15:35   ` Robert Pluim
  2020-10-08 15:42   ` Eli Zaretskii
@ 2020-10-08 16:10   ` Francesco Potortì
  2020-10-08 17:18     ` Robert Pluim
  2 siblings, 1 reply; 109+ messages in thread
From: Francesco Potortì @ 2020-10-08 16:10 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: 43866, Robert Pluim

>> E= -> €
>> 
>> Typewriter-style italian characters. 
>
>If they really are typewriter-style, wouldn't C= make more sense? E
>overstruck with = would just be a smudgy mess even if typed on a
>beautiful Olivetti.

The euro sign under E is on many Italian keyboards, so that sounds
natural to me.  Never seen C= as a shortcut, my X understands e= and E=
but not c= or C=.
>
>Both C= and E= seem to work as X11 compose pairs. We could include both.
>That is, c=, C=, e=, E=, and c==, C==, e==, E== as the literal cases.

I would not include c= and C=.  I would not include e= either.  These
things are very annoying unless they are really useful, so better just
include what's really needed and nothing more.






* bug#43866: 26.3; italian postfix additions
  2020-10-08 15:35   ` Robert Pluim
@ 2020-10-08 16:22     ` Francesco Potortì
  0 siblings, 0 replies; 109+ messages in thread
From: Francesco Potortì @ 2020-10-08 16:22 UTC (permalink / raw)
  To: Robert Pluim; +Cc: Mattias Engdegård, 43866

>And for total coverage, we *must* add 'C-x 8 2 0 a c' ;-)

Wow!  But I can't imagine what that could produce :)
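
For reference, 20AC is the hexadecimal code point of the euro sign,
U+20AC, so the key sequence quoted above simply spells out the
character's number.  A small sketch that checks this in Emacs Lisp,
using only built-in functions:

```elisp
;; The euro sign is the character with code point U+20AC.
(= ?€ #x20AC)      ; non-nil: both denote the same character
(string #x20AC)    ; a one-character string holding the euro sign
(format "%X" ?€)   ; the code point printed in hexadecimal
```

The existing command for entering a character by code point is
C-x 8 RET (insert-char), which accepts 20ac as input.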






* bug#43866: 26.3; italian postfix additions
  2020-10-08 16:10   ` Francesco Potortì
@ 2020-10-08 17:18     ` Robert Pluim
  2020-10-08 17:28       ` Francesco Potortì
  2020-10-08 17:59       ` Mattias Engdegård
  0 siblings, 2 replies; 109+ messages in thread
From: Robert Pluim @ 2020-10-08 17:18 UTC (permalink / raw)
  To: Francesco Potortì; +Cc: Mattias Engdegård, 43866

>>>>> On Thu, 08 Oct 2020 18:10:34 +0200, Francesco Potortì <pot@gnu.org> said:

    >> Both C= and E= seem to work as X11 compose pairs. We could include both.
    >> That is, c=, C=, e=, E=, and c==, C==, e==, E== as the literal cases.

    Francesco> I would not include c= and C=.  I would not include e= either.  These
    Francesco> things are very annoying unless they are really useful, so better just
    Francesco> include what's really needed and nothing more.

The advantage of e= is that (on a US-type keyboard at least) it
doesnʼt involve any modifiers. But then again adding it would increase
the chances of someone typing it by mistake.

Robert
-- 






* bug#43866: 26.3; italian postfix additions
  2020-10-08 17:18     ` Robert Pluim
@ 2020-10-08 17:28       ` Francesco Potortì
  2020-10-08 17:59       ` Mattias Engdegård
  1 sibling, 0 replies; 109+ messages in thread
From: Francesco Potortì @ 2020-10-08 17:28 UTC (permalink / raw)
  To: Robert Pluim; +Cc: Mattias Engdegård, 43866

>>>>>> On Thu, 08 Oct 2020 18:10:34 +0200, Francesco Potortì <pot@gnu.org> said:
>
>    >> Both C= and E= seem to work as X11 compose pairs. We could include both.
>    >> That is, c=, C=, e=, E=, and c==, C==, e==, E== as the literal cases.
>
>    Francesco> I would not include c= and C=.  I would not include e= either.  These
>    Francesco> things are very annoying unless they are really useful, so better just
>    Francesco> include what's really needed and nothing more.
>
>The advantage of e= is that (on a US-type keyboard at least) it
>doesnʼt involve any modifiers. But then again adding it would increase
>the chances of someone typing it by mistake.

Yes.  That's exactly the point.






* bug#43866: 26.3; italian postfix additions
  2020-10-08 17:18     ` Robert Pluim
  2020-10-08 17:28       ` Francesco Potortì
@ 2020-10-08 17:59       ` Mattias Engdegård
  2020-10-08 19:55         ` Francesco Potortì
  2020-10-09  4:42         ` Lars Ingebrigtsen
  1 sibling, 2 replies; 109+ messages in thread
From: Mattias Engdegård @ 2020-10-08 17:59 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 43866

8 okt. 2020 kl. 19.18 skrev Robert Pluim <rpluim@gmail.com>:

>    Francesco> I would not include c= and C=.  I would not include e= either.  These
>    Francesco> things are very annoying unless they are really useful, so better just
>    Francesco> include what's really needed and nothing more.
> 
> The advantage of e= is that (on a US-type keyboard at least) it
> doesnʼt involve any modifiers. But then again adding it would increase
> the chances of someone typing it by mistake.

As you noted, e= is already in latin-postfix. It seems odd to require E= in one mode and e= in another. It makes more sense for italian-postfix to be a subset of latin-postfix, so that an Italian user who needs foreign letters can switch without relearning composition pairs.







* bug#43866: 26.3; italian postfix additions
  2020-10-08 17:59       ` Mattias Engdegård
@ 2020-10-08 19:55         ` Francesco Potortì
  2020-10-09  4:42         ` Lars Ingebrigtsen
  1 sibling, 0 replies; 109+ messages in thread
From: Francesco Potortì @ 2020-10-08 19:55 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: 43866, Robert Pluim

Francesco> I would not include c= and C=.  I would not include e= either.  These
Francesco> things are very annoying unless they are really useful, so better just
Francesco> include what's really needed and nothing more.

Robert:
>> The advantage of e= is that (on a US-type keyboard at least) it
>> doesnʼt involve any modifiers. But then again adding it would increase
>> the chances of someone typing it by mistake.

Mattias:
>As you noted, e= is already in latin-postfix. It seems odd to require
>E= in one mode and e= in another. It makes more sense for
>italian-postfix to be a subset of latin-postfix, so that an Italian
>user who needs foreign letters can switch without relearning
>composition pairs. 

I once tried using latin-postfix and I soon stopped, as it creates many
more artifacts that you don't want than ones you want.  Having a lowercase
e for making the euro sign is just one more reason why I wouldn't use
latin-postfix.

If the agreed-upon solution is to put 'e =' on all European postfix
languages, I think it's better to leave them all as they are now.

If the only blocking issue here is inconsistency with latin-postfix,
then it would be better to use 'E =' in place of 'e =' in latin-postfix
and add the same to all European languages.  The only drawback would be
for those using latin-postfix, but after my experience using it, I
don't think it is really used in practice.


* bug#43866: 26.3; italian postfix additions
  2020-10-08 17:59       ` Mattias Engdegård
  2020-10-08 19:55         ` Francesco Potortì
@ 2020-10-09  4:42         ` Lars Ingebrigtsen
  2020-10-09 11:26           ` Mattias Engdegård
  1 sibling, 1 reply; 109+ messages in thread
From: Lars Ingebrigtsen @ 2020-10-09  4:42 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: 43866, Robert Pluim

Mattias Engdegård <mattiase@acm.org> writes:

> As you noted, e= is already in latin-postfix. It seems odd to require
> E= in one mode and e= in another. It makes more sense for
> italian-postfix to be a subset of latin-postfix, so that an Italian
> user who needs foreign letters can switch without relearning
> composition pairs.

I'm not sure I agree.  An input method specialised to a specific
language doesn't have to be a superset of the larger group -- a
less-specific input method may be intended for users that have less
experience with the language, and therefore be "sloppier"; i.e., have
more input sequences so that the user will find the character easier (but
have more false positives that then will have to be fixed manually).

So I think Francesco is right here -- just add E=, and do nothing else
here (for the Euro).

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no


* bug#43866: 26.3; italian postfix additions
  2020-10-09  4:42         ` Lars Ingebrigtsen
@ 2020-10-09 11:26           ` Mattias Engdegård
  2020-10-09 11:53             ` Thien-Thi Nguyen
  0 siblings, 1 reply; 109+ messages in thread
From: Mattias Engdegård @ 2020-10-09 11:26 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 43866, Robert Pluim

9 okt. 2020 kl. 06.42 skrev Lars Ingebrigtsen <larsi@gnus.org>:

> I'm not sure I agree.  An input method specialised to a specific
> language doesn't have to be a superset of the larger group -- a
> less-specific input method may be intended for users that have less
> experience with the language, and therefore be "sloppier"; i.e., have
> more input sequences so that the user will find the character easier (but
> have more false positives that then will have to be fixed manually).

The choice of input method is not about proficiency in the language but rather that a more constrained method is more efficient for monolingual use. The many input sequences of 'latin-postfix' are not there to help the learner stumble upon the right one by luck, but to allow the entry of more characters.

However, Francesco is right in that latin-postfix is too heavily loaded for smooth use, and I certainly understand why he prefers italian-postfix. 'latin-alt-postfix' is somewhat more practical (and uses e=).

> So I think Francesco is right here -- just add E=, and do nothing else
> here (for the Euro).

I wouldn't mind, although we may be straying a bit into tailoring parts of Emacs to a single user.


* bug#43866: 26.3; italian postfix additions
  2020-10-09 11:26           ` Mattias Engdegård
@ 2020-10-09 11:53             ` Thien-Thi Nguyen
  2020-10-09 12:45               ` Robert Pluim
  0 siblings, 1 reply; 109+ messages in thread
From: Thien-Thi Nguyen @ 2020-10-09 11:53 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: Lars Ingebrigtsen, 43866, Robert Pluim

() Mattias Engdegård <mattiase@acm.org>
() Fri, 9 Oct 2020 13:26:12 +0200

   > So I think Francesco is right here -- just add E=, and do
   > nothing else here (for the Euro).

   I wouldn't mind, although we may be straying a bit into
   tailoring parts of Emacs to a single user.

FWIW, i use italian-postfix, too, and would welcome this (E=
only) change.

-- 
Thien-Thi Nguyen -----------------------------------------------
 (defun responsep (query)               ; (2020) Software Libero
   (pcase (context query)               ;       = Dissenso Etico
     (`(technical ,ml) (correctp ml))
     ...))                              748E A0E8 1CB8 A748 9BFA
--------------------------------------- 6CE4 6703 2224 4C80 7502




* bug#43866: 26.3; italian postfix additions
  2020-10-09 11:53             ` Thien-Thi Nguyen
@ 2020-10-09 12:45               ` Robert Pluim
  2020-10-09 14:31                 ` Eli Zaretskii
  0 siblings, 1 reply; 109+ messages in thread
From: Robert Pluim @ 2020-10-09 12:45 UTC (permalink / raw)
  To: Thien-Thi Nguyen; +Cc: Mattias Engdegård, Lars Ingebrigtsen, 43866

>>>>> On Fri, 09 Oct 2020 07:53:58 -0400, Thien-Thi Nguyen <ttn@gnuvola.org> said:

    Thien-Thi> () Mattias Engdegård <mattiase@acm.org>
    Thien-Thi> () Fri, 9 Oct 2020 13:26:12 +0200

    >> So I think Francesco is right here -- just add E=, and do
    >> nothing else here (for the Euro).

    Thien-Thi>    I wouldn't mind, although we may be straying a bit into
    Thien-Thi>    tailoring parts of Emacs to a single user.

    Thien-Thi> FWIW, i use italian-postfix, too, and would welcome this (E=
    Thien-Thi> only) change.

I guess Iʼm the weirdo here: I use latin-prefix :-)

(it has € on ~e, which is not a great choice: various other
latin-prefix methods use ~e and ~E for other codepoints. Perhaps we
should add =E (or =e) to latin-prefix and maybe the other
latin-N-prefix methods)

Robert
-- 






* bug#43866: 26.3; italian postfix additions
  2020-10-09 12:45               ` Robert Pluim
@ 2020-10-09 14:31                 ` Eli Zaretskii
  2020-10-09 14:48                   ` Robert Pluim
  2020-10-09 15:05                   ` Mattias Engdegård
  0 siblings, 2 replies; 109+ messages in thread
From: Eli Zaretskii @ 2020-10-09 14:31 UTC (permalink / raw)
  To: Robert Pluim; +Cc: mattiase, larsi, 43866, ttn

> From: Robert Pluim <rpluim@gmail.com>
> Date: Fri, 09 Oct 2020 14:45:34 +0200
> Cc: Mattias Engdegård <mattiase@acm.org>,
>  Lars Ingebrigtsen <larsi@gnus.org>, 43866@debbugs.gnu.org
> 
>     Thien-Thi> FWIW, i use italian-postfix, too, and would welcome this (E=
>     Thien-Thi> only) change.
> 
> I guess Iʼm the weirdo here: I use latin-prefix :-)
> 
> (it has € on ~e, which is not a great choice: various other
> latin-prefix methods use ~e and ~E for other codepoints. Perhaps we
> should add =E (or =e) to latin-prefix and maybe the other
> latin-N-prefix methods)

Based on the discussion, I've decided to make a minimal change, so I
added E= as a sequence for the Euro sign to Latin-1 language input
methods (on the master branch).
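
For anyone on an Emacs that predates this change, the same sequence can
probably be added from an init file.  The following is only a sketch
under stated assumptions: that the "italian-postfix" package is defined
in quail/latin-post, and that a doubled rule (E== for a literal "E=")
is wanted by analogy with the existing a`` -> a` rules.  It is not
necessarily what was committed.

```elisp
;; Sketch only: add E= -> € to italian-postfix from an init file.
;; Assumes the package is defined in quail/latin-post.
(with-eval-after-load "quail/latin-post"
  (quail-defrule "E=" ?€ "italian-postfix")
  ;; Assumed by analogy with the existing doubled rules (a`` -> a`):
  ;; typing E== yields a literal "E=".
  (quail-defrule "E==" ["E="] "italian-postfix"))
```

The rules are installed only once the file defining the input method
has been loaded, so nothing happens until italian-postfix is first
activated.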

Any reason not to close this bug report now?






* bug#43866: 26.3; italian postfix additions
  2020-10-09 14:31                 ` Eli Zaretskii
@ 2020-10-09 14:48                   ` Robert Pluim
  2020-10-09 15:04                     ` Eli Zaretskii
  2020-10-09 15:05                   ` Mattias Engdegård
  1 sibling, 1 reply; 109+ messages in thread
From: Robert Pluim @ 2020-10-09 14:48 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: mattiase, larsi, 43866, ttn

>>>>> On Fri, 09 Oct 2020 17:31:17 +0300, Eli Zaretskii <eliz@gnu.org> said:

    Eli> Based on the discussion, I've decided to make a minimal change, so I
    Eli> added E= as a sequence for the Euro sign to Latin-1 language input
    Eli> methods (on the master branch).

    Eli> Any reason not to close this bug report now?

If you've decided that the prefix methods donʼt get a similar
treatment then we can close it.

Robert
-- 






* bug#43866: 26.3; italian postfix additions
  2020-10-09 14:48                   ` Robert Pluim
@ 2020-10-09 15:04                     ` Eli Zaretskii
  2020-10-10 20:54                       ` Lars Ingebrigtsen
  0 siblings, 1 reply; 109+ messages in thread
From: Eli Zaretskii @ 2020-10-09 15:04 UTC (permalink / raw)
  To: Robert Pluim; +Cc: mattiase, larsi, 43866, ttn

> From: Robert Pluim <rpluim@gmail.com>
> Cc: ttn@gnuvola.org,  mattiase@acm.org,  larsi@gnus.org,  43866@debbugs.gnu.org
> Date: Fri, 09 Oct 2020 16:48:07 +0200
> 
>     Eli> Any reason not to close this bug report now?
> 
> If you've decided that the prefix methods donʼt get a similar
> treatment then we can close it.

You mean, use =E for the Euro?  I don't mind if there are no
objections.






* bug#43866: 26.3; italian postfix additions
  2020-10-09 14:31                 ` Eli Zaretskii
  2020-10-09 14:48                   ` Robert Pluim
@ 2020-10-09 15:05                   ` Mattias Engdegård
  2020-10-09 15:08                     ` Robert Pluim
  2020-10-09 15:10                     ` Eli Zaretskii
  1 sibling, 2 replies; 109+ messages in thread
From: Mattias Engdegård @ 2020-10-09 15:05 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 43866, larsi, Robert Pluim, ttn

9 okt. 2020 kl. 16.31 skrev Eli Zaretskii <eliz@gnu.org>:

> Based on the discussion, I've decided to make a minimal change, so I
> added E= as a sequence for the Euro sign to Latin-1 language input
> methods (on the master branch).

The minimal change would be to do it for italian-postfix only but perhaps it doesn't hurt too much elsewhere.
(I don't think prefix methods need it.)

> Any reason not to close this bug report now?

Maybe it merits a NEWS entry?







* bug#43866: 26.3; italian postfix additions
  2020-10-09 15:05                   ` Mattias Engdegård
@ 2020-10-09 15:08                     ` Robert Pluim
  2020-10-09 15:28                       ` Mattias Engdegård
  2020-10-09 15:10                     ` Eli Zaretskii
  1 sibling, 1 reply; 109+ messages in thread
From: Robert Pluim @ 2020-10-09 15:08 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: 43866, larsi, ttn

>>>>> On Fri, 9 Oct 2020 17:05:23 +0200, Mattias Engdegård <mattiase@acm.org> said:

    Mattias> 9 okt. 2020 kl. 16.31 skrev Eli Zaretskii <eliz@gnu.org>:
    >> Based on the discussion, I've decided to make a minimal change, so I
    >> added E= as a sequence for the Euro sign to Latin-1 language input
    >> methods (on the master branch).

    Mattias> The minimal change would be to do it for italian-postfix only but perhaps it doesn't hurt too much elsewhere.
    Mattias> (I don't think prefix methods need it.)

Why? If eg french-postfix has it, why not french-prefix?

    >> Any reason not to close this bug report now?

    Mattias> Maybe it merits a NEWS entry?

Itʼs a user-visible change, so I guess so.

Robert
-- 






* bug#43866: 26.3; italian postfix additions
  2020-10-09 15:05                   ` Mattias Engdegård
  2020-10-09 15:08                     ` Robert Pluim
@ 2020-10-09 15:10                     ` Eli Zaretskii
  2020-10-09 15:21                       ` Robert Pluim
  1 sibling, 1 reply; 109+ messages in thread
From: Eli Zaretskii @ 2020-10-09 15:10 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: 43866, larsi, rpluim, ttn

> From: Mattias Engdegård <mattiase@acm.org>
> Date: Fri, 9 Oct 2020 17:05:23 +0200
> Cc: Robert Pluim <rpluim@gmail.com>, ttn@gnuvola.org, larsi@gnus.org,
>         43866@debbugs.gnu.org
> 
> 9 okt. 2020 kl. 16.31 skrev Eli Zaretskii <eliz@gnu.org>:
> 
> > Based on the discussion, I've decided to make a minimal change, so I
> > added E= as a sequence for the Euro sign to Latin-1 language input
> > methods (on the master branch).
> 
> The minimal change would be to do it for italian-postfix only but perhaps it doesn't hurt too much elsewhere.

I couldn't explain to myself why Italian should have it, but, say,
German or French shouldn't.

> (I don't think prefix methods need it.)

OK.

> > Any reason not to close this bug report now?
> 
> Maybe it merits a NEWS entry?

Sounds too small to announce, but if others think it should be in
NEWS, I won't object.






* bug#43866: 26.3; italian postfix additions
  2020-10-09 15:10                     ` Eli Zaretskii
@ 2020-10-09 15:21                       ` Robert Pluim
  0 siblings, 0 replies; 109+ messages in thread
From: Robert Pluim @ 2020-10-09 15:21 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Mattias Engdegård, larsi, 43866, ttn

>>>>> On Fri, 09 Oct 2020 18:10:48 +0300, Eli Zaretskii <eliz@gnu.org> said:

    Eli> Sounds too small to announce, but if others think it should be in
    Eli> NEWS, I won't object.

git never forgets :-)

    commit 3409fe0362c52127c52f854a7300f4dde4b8fffe
    Author: Eli Zaretskii <eliz@gnu.org>
    Date:   Thu Mar 29 19:45:13 2018 +0300

        Support Capital sharp S in German input methods

        * lisp/leim/quail/latin-post.el ("german-postfix"):
        * lisp/leim/quail/latin-pre.el ("german-prefix"): Add Capital
        sharp S.  (Bug#30988)

        * etc/NEWS: Mention the support of Capital sharp S.

Robert
-- 






* bug#43866: 26.3; italian postfix additions
  2020-10-09 15:08                     ` Robert Pluim
@ 2020-10-09 15:28                       ` Mattias Engdegård
  0 siblings, 0 replies; 109+ messages in thread
From: Mattias Engdegård @ 2020-10-09 15:28 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 43866, larsi, ttn

9 okt. 2020 kl. 17.08 skrev Robert Pluim <rpluim@gmail.com>:

> Why? If eg french-postfix has it, why not french-prefix?

We would have to be rather sure about what it should be; each new sequence will be a potential point of annoyance.
The prefix methods that define € use ~e, but our users apparently didn't want us to follow existing practice for the postfix methods (e=).


* bug#43866: 26.3; italian postfix additions
  2020-10-09 15:04                     ` Eli Zaretskii
@ 2020-10-10 20:54                       ` Lars Ingebrigtsen
  2020-10-12  9:26                         ` Robert Pluim
  0 siblings, 1 reply; 109+ messages in thread
From: Lars Ingebrigtsen @ 2020-10-10 20:54 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: mattiase, Robert Pluim, 43866, ttn

Eli Zaretskii <eliz@gnu.org> writes:

>> If you've decided that the prefix methods donʼt get a similar
>> treatment then we can close it.
>
> You mean, use =E for the Euro?  I don't mind if there are no
> objections.

I think it sounds logical, but I don't think we should make such a
change without it being requested by somebody using those input
methods.  Perhaps =E would be annoying for them?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#43866: 26.3; italian postfix additions
  2020-10-10 20:54                       ` Lars Ingebrigtsen
@ 2020-10-12  9:26                         ` Robert Pluim
  0 siblings, 0 replies; 109+ messages in thread
From: Robert Pluim @ 2020-10-12  9:26 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: mattiase, 43866, ttn

>>>>> On Sat, 10 Oct 2020 22:54:14 +0200, Lars Ingebrigtsen <larsi@gnus.org> said:

    Lars> Eli Zaretskii <eliz@gnu.org> writes:
    >>> If you've decided that the prefix methods donʼt get a similar
    >>> treatment then we can close it.
    >> 
    >> You mean, use =E for the Euro?  I don't mind if there are no
    >> objections.

    Lars> I think it sounds logical, but I don't think we should make such a
    Lars> change without it being requested by somebody using those input
    Lars> methods.  Perhaps =E would be annoying for them?

I donʼt think it would be annoying, but I agree thereʼs probably no
need to start adding things people haven't requested (and ~e already
exists).

Robert
-- 






* bug#43866: 26.3; italian postfix additions
  2020-10-08 12:39   ` Robert Pluim
  2020-10-08 12:57     ` Eli Zaretskii
  2020-10-08 13:26     ` Francesco Potortì
@ 2020-10-13 20:07     ` Juri Linkov
  2020-10-14  2:31       ` Eli Zaretskii
  2020-10-14  4:38       ` Richard Stallman
  2 siblings, 2 replies; 109+ messages in thread
From: Juri Linkov @ 2020-10-13 20:07 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 43866

> Itʼs already in latin-postfix and on C-x 8 * E, is that really
> necessary?

I wonder why C-x 8 provides key sequences that are not mnemonic
and so hard to remember?

Would it make sense to support exactly the same keys that are
provided by the X11 compose method?  I mean that are in the file
/usr/share/X11/locale/en_US.UTF-8/Compose
also available at
https://help.ubuntu.com/community/ComposeKey
and
https://cgit.freedesktop.org/xorg/lib/libX11/plain/nls/en_US.UTF-8/Compose.pre

For example, for every such line:

  <Multi_key> <equal> <E>      : "€"   EuroSign # EURO SIGN

replace <Multi_key> with C-x 8, and bind such key sequences:

  C-x 8 = E   => "€"

and for all other keys as well, e.g.

  C-x 8 . . . => "…" (HORIZONTAL ELLIPSIS)
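
Turning such lines into key sequences looks mostly mechanical.  A rough
sketch in Emacs Lisp follows; `my-compose-keysyms' and
`my-compose-parse-line' are made-up names, the keysym table is
deliberately tiny, and results containing escaped quotes are not
handled.

```elisp
(defvar my-compose-keysyms
  '(("equal" . "=") ("period" . ".") ("minus" . "-"))
  "Hypothetical, partial table from X11 keysym names to characters.")

(defun my-compose-parse-line (line)
  "Parse one Compose LINE into (KEYS . RESULT), or nil if not understood.
KEYS is the string typed after the prefix; RESULT is the inserted text."
  (when (string-match
         "\\`[ \t]*<Multi_key>\\(\\(?:[ \t]*<[^>]+>\\)+\\)[ \t]*:[ \t]*\"\\([^\"]+\\)\""
         line)
    (let ((syms (match-string 1 line))
          (result (match-string 2 line))
          (pos 0)
          (keys '()))
      ;; Walk over each <keysym> following <Multi_key>.
      (while (string-match "<\\([^>]+\\)>" syms pos)
        (let ((name (match-string 1 syms)))
          ;; Named keysyms come from the table; single-character
          ;; keysyms such as <E> stand for themselves.
          (push (or (cdr (assoc name my-compose-keysyms))
                    (and (= (length name) 1) name))
                keys))
        (setq pos (match-end 0)))
      ;; Give up on the whole line if any keysym was unknown.
      (unless (memq nil keys)
        (cons (mapconcat #'identity (nreverse keys) "") result)))))
```

Each resulting pair could then be bound under whatever prefix is
chosen; lines with keysyms missing from the table are simply skipped.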


* bug#43866: 26.3; italian postfix additions
  2020-10-13 20:07     ` Juri Linkov
@ 2020-10-14  2:31       ` Eli Zaretskii
  2020-10-14  8:07         ` Juri Linkov
  2020-10-15  3:52         ` Richard Stallman
  2020-10-14  4:38       ` Richard Stallman
  1 sibling, 2 replies; 109+ messages in thread
From: Eli Zaretskii @ 2020-10-14  2:31 UTC (permalink / raw)
  To: Juri Linkov; +Cc: rpluim, 43866

> From: Juri Linkov <juri@linkov.net>
> Cc: Eli Zaretskii <eliz@gnu.org>,  43866@debbugs.gnu.org
> Date: Tue, 13 Oct 2020 23:07:13 +0300
> 
> I wonder why C-x 8 provides key sequences that are not mnemonic
> and so hard to remember?
> 
> Would it make sense to support exactly the same keys that are
> provided by the X11 compose method?  I mean that are in the file
> /usr/share/X11/locale/en_US.UTF-8/Compose
> also available at
> https://help.ubuntu.com/community/ComposeKey
> and
> https://cgit.freedesktop.org/xorg/lib/libX11/plain/nls/en_US.UTF-8/Compose.pre

How about making a new input method for those?  It seems to me that
C-x 8 is already too "fat".
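
Roughly, such a method could start out like the sketch below.  The
name "compose-sketch", its title and the two rules are purely
illustrative, and the trailing arguments are copied from the
italian-postfix definition at the start of this report.

```elisp
(require 'quail)

;; Hypothetical input method carrying two X11 Compose sequences.
;; The name, the "Cmp" title and the rule selection are illustrative.
(quail-define-package
 "compose-sketch" "UTF-8" "Cmp" t
 "Sketch of an input method using X11 Compose sequences.

=E  -> €
... -> …
" nil t nil nil nil nil nil nil nil nil t)

(quail-define-rules
 ("=E" ?€)
 ("..." ?…))
```

After evaluating this, 'C-u C-\ compose-sketch RET' should select the
method, and 'C-\' toggles it afterwards.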






* bug#43866: 26.3; italian postfix additions
  2020-10-13 20:07     ` Juri Linkov
  2020-10-14  2:31       ` Eli Zaretskii
@ 2020-10-14  4:38       ` Richard Stallman
  2020-10-14  8:11         ` Juri Linkov
                           ` (2 more replies)
  1 sibling, 3 replies; 109+ messages in thread
From: Richard Stallman @ 2020-10-14  4:38 UTC (permalink / raw)
  To: Juri Linkov; +Cc: rpluim, 43866

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > Would it make sense to support exactly the same keys that are
  > provided by the X11 compose method?

That might be a good idea.

Also, I wonder if we could make that command more self-documenting.
Maybe C-h in the argument for C-x 8 could display a buffer
which displays characters you can choose.
Each character would be followed by the sequence to type to choose that
character.

This should include all the characters Emacs supports, divided clearly
into Unicode code blocks, with their unicode names.  Not just the ones
that have specific short C-x 8 sequences defined in Emacs.

It would be nice to have a prefix more mnemonic than C-x 8.
But I have nothing to suggest.

It would be good to shorten C-x 8 RET.  That is my go-to method
of inserting characters for which I don't know a sequence.

Currently, 8 upper-case letters are valid after C-h 8, and 6
lower-case.  Suppose we free up one case -- either the upper-case
letters or the lower-case letters.  Then we could make typing
a letter of that case throw you into the minibuffer.

In this way, we could replace C-x 8 RET UNICODE-NAME RET with
C-x 8 UNICODE-NAME RET.

Also, why not change the Unicode character names to lower-case?
They would look nicer that way, I think.

-- 
Dr Richard Stallman
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)


* bug#43866: 26.3; italian postfix additions
  2020-10-14  2:31       ` Eli Zaretskii
@ 2020-10-14  8:07         ` Juri Linkov
  2020-10-14 15:07           ` Eli Zaretskii
  2020-10-15  3:52         ` Richard Stallman
  1 sibling, 1 reply; 109+ messages in thread
From: Juri Linkov @ 2020-10-14  8:07 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rpluim, 43866

>> I wonder why C-x 8 provides key sequences that are not mnemonic
>> and so hard to remember?
>>
>> Would it make sense to support exactly the same keys that are
>> provided by the X11 compose method?  I mean that are in the file
>> /usr/share/X11/locale/en_US.UTF-8/Compose
>> also available at
>> https://help.ubuntu.com/community/ComposeKey
>> and
>> https://cgit.freedesktop.org/xorg/lib/libX11/plain/nls/en_US.UTF-8/Compose.pre
>
> How about making a new input method for those?  It seems to me that
> C-x 8 is already too "fat".

Yes, a new method might be useful as well.  But such a method can't be
enabled all the time, because sequences such as "= E" should be inserted
literally in normal circumstances.  So it would have to be enabled
temporarily, and enabling and disabling it takes more time.  Since it's
useful only for occasionally inserting a single special character, it
would be much easier to type some prefix key before typing "= E" to
insert €.


* bug#43866: 26.3; italian postfix additions
  2020-10-14  4:38       ` Richard Stallman
@ 2020-10-14  8:11         ` Juri Linkov
  2020-10-14 10:43         ` Robert Pluim
  2020-10-14 14:56         ` Eli Zaretskii
  2 siblings, 0 replies; 109+ messages in thread
From: Juri Linkov @ 2020-10-14  8:11 UTC (permalink / raw)
  To: Richard Stallman; +Cc: rpluim, 43866

>   > Would it make sense to support exactly the same keys that are
>   > provided by the X11 compose method?
>
> That might be a good idea.

Do we have the rights to copy all key definitions from the X11 compose method?
I guess there are no licensing restrictions?

> Also, I wonder if we could make that command more self-documenting.
> Maybe C-h in the argument for C-x 8 could display a buffer
> which displays characters you can choose.
> Each character would be followed by the sequence to type to choose that
> character.

Yes, displaying a separate buffer would be useful.  Then maybe these
keys could be moved out of the Help buffer of 'C-h b', which currently
begins with a very long list of 'C-x 8' keys and so makes it very
difficult to see the keys of the current mode at the end of that
buffer.

> This should include all the characters Emacs supports, divided clearly
> into Unicode code blocks, with their unicode names.  Not just the ones
> that have specific short C-x 8 sequences defined in Emacs.

Maybe also 'C-u C-x =' could suggest how to input characters
using C-x 8 mnemonics.

> It would be nice to have a prefix more mnemonic than C-x 8.
> But I have nothing to suggest.

Yes, to find a more mnemonic and shorter key would be useful.
Maybe this question could be asked on emacs-devel
where someone might have ideas for such a key.

> It would be good to shorten C-x 8 RET.  That is my go-to method
> of inserting characters for which I don't know a sequence.
>
> Currently, 8 upper-case letters are valid after C-h 8, and 6
> lower-case.  Suppose we free up one case -- either the upper-case
> letters or the lower-case letters.  Then we could make typing
> a letter of that case throw you into the minibuffer.

Sorry, I don't understand.  I tried to type 'C-h 8', and it's undefined.

> In this way, we could replace C-x 8 RET UNICODE-NAME RET with
> C-x 8 UNICODE-NAME RET.
>
> Also, why not change the Unicode character names to lower-case?
> They would look nicer that way, I think.

I don't know why the Unicode standard uses upper-case, but I see no problem
in Emacs with upper-case letters when case-fold is non-nil, so you can type
lower-case letters in completions.


* bug#43866: 26.3; italian postfix additions
  2020-10-14  4:38       ` Richard Stallman
  2020-10-14  8:11         ` Juri Linkov
@ 2020-10-14 10:43         ` Robert Pluim
  2020-10-15  3:54           ` Richard Stallman
  2020-10-14 14:56         ` Eli Zaretskii
  2 siblings, 1 reply; 109+ messages in thread
From: Robert Pluim @ 2020-10-14 10:43 UTC (permalink / raw)
  To: Richard Stallman; +Cc: 43866, Juri Linkov

>>>>> On Wed, 14 Oct 2020 00:38:48 -0400, Richard Stallman <rms@gnu.org> said:

    >> Would it make sense to support exactly the same keys that are
    >> provided by the X11 compose method?

    Richard> That might be a good idea.

Could we provide this as an input method?

    Richard> Also, I wonder if we could make that command more self-documenting.
    Richard> Maybe C-h in the argument for C-x 8 could display a buffer
    Richard> which displays characters you can choose.
    Richard> Each character would be followed by the sequence to type to choose that
    Richard> character.

The problem is that such a list is very long. 'C-h b' after 'C-x 8
RET' will display the bindings, but it does not currently contain the
character names, and TAB after 'C-x 8 RET' will list all the names but
not the sequences for entering them.

There are completion frameworks that have solved this, eg with helm
you can start typing right after 'C-x 8 RET' and it will narrow the
list down automatically. Iʼm sure we could do something similar.

    Richard> This should include all the characters Emacs supports, divided clearly
    Richard> into Unicode code blocks, with their unicode names.  Not just the ones
    Richard> that have specific short C-x 8 sequences defined in Emacs.

Why does it matter which code block a character is in?

    Richard> It would be nice to have a prefix more mnemonic than C-x 8.
    Richard> But I have nothing to suggest.

    Richard> It would be good to shorten C-x 8 RET.  That is my go-to method
    Richard> of inserting characters for which I don't know a sequence.

Where would you put it? Note that if you do know the sequence you can
use Alt or a dead accent key instead of 'C-x 8' (someone did suggest
freeing up F2 recently)

    Richard> Currently, 8 upper-case letters are valid after C-h 8, and 6
    Richard> lower-case.  Suppose we free up one case -- either the upper-case
    Richard> letters or the lower-case letters.  Then we could make typing
    Richard> a letter of that case throw you into the minibuffer.

I think itʼs a tossup as to which of them would be easier to free
up. The lower case bindings have one fewer prefix key, so perhaps
lower case. Or perhaps a completely different binding.

    Richard> In this way, we could replace C-x 8 RET UNICODE-NAME RET with
    Richard> C-x 8 UNICODE-NAME RET.

    Richard> Also, why not change the Unicode character names to lower-case?
    Richard> They would look nicer that way, I think.

The Unicode character names are always described in upper case, but I
guess we could add a configuration option so that 'ucs-names'
downcased them (the completion in C-x 8 RET is case insensitive)
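
A rough sketch of what such downcasing could look like, assuming
'ucs-names' returns a hash table mapping names to characters (as it
does in Emacs 26 and later); `my-downcased-ucs-names' is a made-up
name, not an existing function or option.

```elisp
(defun my-downcased-ucs-names ()
  "Return a copy of the `ucs-names' table with lower-case names.
Hypothetical helper; assumes `ucs-names' returns a hash table."
  (let ((table (make-hash-table :test #'equal)))
    (maphash (lambda (name char)
               (puthash (downcase name) char table))
             (ucs-names))
    table))

;; Example lookup: (gethash "euro sign" (my-downcased-ucs-names))
```

This only changes how the names are stored and displayed; lookup
through the original upper-case names would need the original table.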

Robert
-- 






* bug#43866: 26.3; italian postfix additions
  2020-10-14  4:38       ` Richard Stallman
  2020-10-14  8:11         ` Juri Linkov
  2020-10-14 10:43         ` Robert Pluim
@ 2020-10-14 14:56         ` Eli Zaretskii
  2 siblings, 0 replies; 109+ messages in thread
From: Eli Zaretskii @ 2020-10-14 14:56 UTC (permalink / raw)
  To: rms; +Cc: rpluim, 43866, juri

> From: Richard Stallman <rms@gnu.org>
> Date: Wed, 14 Oct 2020 00:38:48 -0400
> Cc: rpluim@gmail.com, 43866@debbugs.gnu.org
> 
> Also, I wonder if we could make that command more self-documenting.
> Maybe C-h in the argument for C-x 8 could display a buffer
> which displays characters you can choose.

I don't understand: "C-x 8 C-h" already shows such a buffer.

> Also, why not change the Unicode character names to lower-case?
> They would look nicer that way, I think.

You can type in lower-case, then TAB will upcase them for you.






* bug#43866: 26.3; italian postfix additions
  2020-10-14  8:07         ` Juri Linkov
@ 2020-10-14 15:07           ` Eli Zaretskii
  2020-10-14 19:40             ` Juri Linkov
  0 siblings, 1 reply; 109+ messages in thread
From: Eli Zaretskii @ 2020-10-14 15:07 UTC (permalink / raw)
  To: Juri Linkov; +Cc: rpluim, 43866

> From: Juri Linkov <juri@linkov.net>
> Cc: rpluim@gmail.com,  43866@debbugs.gnu.org
> Date: Wed, 14 Oct 2020 11:07:41 +0300
> 
> > How about making a new input method for those?  It seems to me that
> > C-x 8 is already too "fat".
> 
> Yes, a new method might be useful as well.  But such a method can't be
> enabled all the time, because sequences such as "= E" should be inserted
> literally in normal circumstances.  So it would have to be enabled
> temporarily, and enabling and disabling it takes more time.  Since it's
> useful only for occasionally inserting a single special character, it
> would be much easier to type some prefix key before typing "= E" to
> insert €.

But turning an input method on and off is just 1 key, C-\, whereas
C-x 8 is 2 keys, and not very convenient sequence to type, at least on
QWERTY keyboards.  So it looks like a dedicated input method will
still be a win.  I don't think it's right that the only Unicode input
method we have is TeX -- that is great for TeX users, but many people
don't use (La)TeX, and will find it unintuitive to type the TeX
sequences.






* bug#43866: 26.3; italian postfix additions
  2020-10-14 15:07           ` Eli Zaretskii
@ 2020-10-14 19:40             ` Juri Linkov
  2020-10-15  2:34               ` Eli Zaretskii
  0 siblings, 1 reply; 109+ messages in thread
From: Juri Linkov @ 2020-10-14 19:40 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rpluim, 43866

>> > How about making a new input method for those?  It seems to me that
>> > C-x 8 is already too "fat".
>>
>> Yes, a new method might be useful as well.  But since such method can't
>> be enabled all the time because such sequences as "= E" should be inserted
>> literally in normal circumstances.  So such method needs to be enabled
>> temporarily, and it takes more time to enable/disable it, while it's
>> useful only to insert a single special character sometimes, it would be
>> much easier to type some prefix key before typing "= E" to insert €
>> when such a need arises occasionally.
>
> But turning an input method on and off is just 1 key, C-\, whereas

1 key C-\ to enable, and 1 key C-\ to disable.  Also might need to select
another input method name from 'C-u C-\' when also using other input methods.

> C-x 8 is 2 keys, and not a very convenient sequence to type, at least on
> QWERTY keyboards.

I agree, C-x 8 is not easy to type.

> So it looks like a dedicated input method will still be a win.

A win for some users, not a win for other users, so adding both
(an input method and a prefix key) would be fine for all.

> I don't think it's right that the only Unicode input method we have is
> TeX -- that is great for TeX users, but many people don't use (La)TeX,
> and will find it unintuitive to type the TeX sequences.

It seems the TeX input method requires typing whole Unicode names,
or at least unambiguous parts of names, e.g. '\euro' inserts €,
'\smile' inserts ⌣, but can't type '\smiling face with sunglasses'.
Also I see a hex Unicode input method in uni-input.el that supports
e.g. U<hex> or u<hex>, RFC1345 mnemonics in rfc1345.el,
SGML entities in sgml-input.el.  So adding an X11 Compose method would be handy.


* bug#43866: 26.3; italian postfix additions
  2020-10-14 19:40             ` Juri Linkov
@ 2020-10-15  2:34               ` Eli Zaretskii
  2020-10-19 20:45                 ` Juri Linkov
  0 siblings, 1 reply; 109+ messages in thread
From: Eli Zaretskii @ 2020-10-15  2:34 UTC (permalink / raw)
  To: Juri Linkov; +Cc: rpluim, 43866

> From: Juri Linkov <juri@linkov.net>
> Cc: rpluim@gmail.com,  43866@debbugs.gnu.org
> Date: Wed, 14 Oct 2020 22:40:48 +0300
> 
> >> Yes, a new method might be useful as well.  But since such method can't
> >> be enabled all the time because such sequences as "= E" should be inserted
> >> literally in normal circumstances.  So such method needs to be enabled
> >> temporarily, and it takes more time to enable/disable it, while it's
> >> useful only to insert a single special character sometimes, it would be
> >> much easier to type some prefix key before typing "= E" to insert €
> >> when such a need arises occasionally.
> >
> > But turning an input method on and off is just 1 key, C-\, whereas
> 
> 1 key C-\ to enable, and 1 key C-\ to disable.  Also might need to select
> another input method name from 'C-u C-\' when also using other input methods.

Btw, input methods that use =E or E= could (and in many cases do) have
==E and E== to insert just "=E" and "E=", so no toggling is needed.
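
As a minimal sketch of that doubling convention, the postfix case could
look like the rules below.  The package name "my-euro-postfix" and its
title are hypothetical, chosen only for illustration; the argument
layout mirrors the italian-postfix definition posted at the start of
this thread.

```elisp
(require 'quail)

;; Hypothetical package; name, title and docstring are illustrative.
(quail-define-package
 "my-euro-postfix" "UTF-8" "€<" t
 "Illustrative postfix input method.

E= -> €

Doubling the postfix separates the letter and postfix: E== -> E=
" nil t nil nil nil nil nil nil nil nil t)

(quail-define-rules
 ("E=" ?€)          ; postfix sequence produces the euro sign
 ("E==" ["E="]))    ; doubled postfix yields the literal text "E="
```

With such a method active, the literal text stays reachable through the
doubled sequence, so the method would not need to be switched off first.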

> > I don't think it's right that the only Unicode input method we have is
> > TeX -- that is great for TeX users, but many people don't use (La)TeX,
> > and will find it unintuitive to type the TeX sequences.
> 
> It seems the TeX input method requires typing whole Unicode names,
> or at least unambiguous parts of names, e.g. '\euro' inserts €,
> '\smile' inserts ⌣, but can't type '\smiling face with sunglasses'.
> Also I see a hex Unicode input method in uni-input.el that supports
> e.g. U<hex> or u<hex>, RFC1345 mnemonics in rfc1345.el,
> SGML entities in sgml-input.el.  So adding an X11 Compose method would be handy.

Agreed.






* bug#43866: 26.3; italian postfix additions
  2020-10-14  2:31       ` Eli Zaretskii
  2020-10-14  8:07         ` Juri Linkov
@ 2020-10-15  3:52         ` Richard Stallman
  1 sibling, 0 replies; 109+ messages in thread
From: Richard Stallman @ 2020-10-15  3:52 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rpluim, 43866, juri

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > > Would it make sense to support exactly the same keys that are
  > > provided by the X11 compose method?  I mean that are in the file
  > > /usr/share/X11/locale/en_US.UTF-8/Compose
  > > also available at
  > > https://help.ubuntu.com/community/ComposeKey
  > > and
  > > https://cgit.freedesktop.org/xorg/lib/libX11/plain/nls/en_US.UTF-8/Compose.pre

  > How about making a new input method for those?  It seems to me that
  > C-x 8 is already too "fat".

That may be useful, but it has a drawback compared with C-x 8.

It is inconvenient to change input methods just for one character and
then change back.  C-x 8 avoids that inconvenience; you can use it to
enter one character, any one character, without changing the current
input method.

-- 
Dr Richard Stallman
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)


* bug#43866: 26.3; italian postfix additions
  2020-10-14 10:43         ` Robert Pluim
@ 2020-10-15  3:54           ` Richard Stallman
  0 siblings, 0 replies; 109+ messages in thread
From: Richard Stallman @ 2020-10-15  3:54 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 43866, juri

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  >     Richard> Maybe C-h in the argument for C-x 8 could display a buffer
  >     Richard> which displays characters you can choose.
  >     Richard> Each character would be followed by the sequence to type to choose that
  >     Richard> character.

  > The problem is that such a list is very long. 'C-h b' after 'C-x 8
  > RET' will display the bindings, but it does not currently contain the
  > character names, and TAB after 'C-x 8 RET' will list all the names but
  > not the sequences for entering them.

It would be a problem if they are displayed in an inconvenient
way, not designed specifically for this purpose.  My idea is to display
them in a buffer which is divided into pages, so you could use C-x ]
and C-x [ to move around in it, as well as search commands.

  > Why does it matter which code block a character is in?

Organizing the buffer by code blocks makes it feasible to navigate
through the long list of all the Unicode characters and find 
the one you want, without knowing its name in advance.

-- 
Dr Richard Stallman
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)


* bug#43866: 26.3; italian postfix additions
  2020-10-15  2:34               ` Eli Zaretskii
@ 2020-10-19 20:45                 ` Juri Linkov
  2020-10-19 23:12                   ` Stefan Kangas
  2020-10-20 14:12                   ` Eli Zaretskii
  0 siblings, 2 replies; 109+ messages in thread
From: Juri Linkov @ 2020-10-19 20:45 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rpluim, 43866

[-- Attachment #1: Type: text/plain, Size: 804 bytes --]

>> It seems the TeX input method requires typing whole Unicode names,
>> or at least unambiguous parts of names, e.g. '\euro' inserts €,
>> '\smile' inserts ⌣, but can't type '\smiling face with sunglasses'.
>> Also I see a hex Unicode input method in uni-input.el that supports
>> e.g. U<hex> or u<hex>, RFC1345 mnemonics in rfc1345.el,
>> SGML entities in sgml-input.el.  So adding an X11 Compose method would be handy.
>
> Agreed.

Here is a working implementation.  It binds all key sequences to the key
'C-+' that has the mnemonics of adding a character.  'C-+' is free because
it can't be used to zoom text since its counterpart key 'C--' is already
taken to input numeric arguments.  'C-+ C-+' is bound to 'insert-char'
like the current longer key sequence 'C-x 8 RET' that is hard to type.


[-- Attachment #2: x-compose.el --]
[-- Type: application/emacs-lisp, Size: 1474 bytes --]


* bug#43866: 26.3; italian postfix additions
  2020-10-19 20:45                 ` Juri Linkov
@ 2020-10-19 23:12                   ` Stefan Kangas
  2020-10-20 18:42                     ` Juri Linkov
  2020-10-20 14:12                   ` Eli Zaretskii
  1 sibling, 1 reply; 109+ messages in thread
From: Stefan Kangas @ 2020-10-19 23:12 UTC (permalink / raw)
  To: Juri Linkov, Eli Zaretskii; +Cc: rpluim, 43866

Juri Linkov <juri@linkov.net> writes:

> Here is a working implementation.  It binds all key sequences to the key
> 'C-+' that has the mnemonics of adding a character.  'C-+' is free because
> it can't be used to zoom text since its counterpart key 'C--' is already
> taken to input numeric arguments.

Right, but the idea of using it still makes me feel a bit uneasy.

May I suggest that we use a different key for this?

A while back, RMS suggested that we could bind `C-+' to
text-scale-adjust even if we can't bind `C--'.  I was not super
enthusiastic about this at the time, but perhaps that idea is the least
bad option.

One could imagine that in combination with, for example, optionally
binding the numerical prefix argument only to `M--'.  We could perhaps
then consider enabling that in the "beginner friendly profile" we have
been discussing on emacs-devel (but that no one has yet seriously worked
on).


* bug#43866: 26.3; italian postfix additions
  2020-10-19 20:45                 ` Juri Linkov
  2020-10-19 23:12                   ` Stefan Kangas
@ 2020-10-20 14:12                   ` Eli Zaretskii
  2020-10-20 14:47                     ` Robert Pluim
                                       ` (2 more replies)
  1 sibling, 3 replies; 109+ messages in thread
From: Eli Zaretskii @ 2020-10-20 14:12 UTC (permalink / raw)
  To: Juri Linkov; +Cc: rpluim, 43866

> From: Juri Linkov <juri@linkov.net>
> Cc: rpluim@gmail.com,  43866@debbugs.gnu.org
> Date: Mon, 19 Oct 2020 23:45:48 +0300
> 
> Here is a working implementation.  It binds all key sequences to the key
> 'C-+' that has the mnemonics of adding a character.  'C-+' is free because
> it can't be used to zoom text since its counterpart key 'C--' is already
> taken to input numeric arguments.  'C-+ C-+' is bound to 'insert-char'
> like the current longer key sequence 'C-x 8 RET' that is hard to type.

The implementation seems to rely on a file in the /usr/include tree
that might not be there.  This is a significant disadvantage, IMO.  It
means that, unlike all other similar facilities in Emacs, this one is
not self-contained.

Is it possible to lift this limitation?






* bug#43866: 26.3; italian postfix additions
  2020-10-20 14:12                   ` Eli Zaretskii
@ 2020-10-20 14:47                     ` Robert Pluim
  2020-10-20 15:50                       ` Eli Zaretskii
  2020-10-20 18:44                       ` Juri Linkov
  2020-10-20 19:05                     ` Juri Linkov
  2020-10-20 19:56                     ` Juri Linkov
  2 siblings, 2 replies; 109+ messages in thread
From: Robert Pluim @ 2020-10-20 14:47 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 43866, Juri Linkov

>>>>> On Tue, 20 Oct 2020 17:12:02 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Juri Linkov <juri@linkov.net>
    >> Cc: rpluim@gmail.com,  43866@debbugs.gnu.org
    >> Date: Mon, 19 Oct 2020 23:45:48 +0300
    >> 
    >> Here is a working implementation.  It binds all key sequences to the key
    >> 'C-+' that has the mnemonics of adding a character.  'C-+' is free because
    >> it can't be used to zoom text since its counterpart key 'C--' is already
    >> taken to input numeric arguments.  'C-+ C-+' is bound to 'insert-char'
    >> like the current longer key sequence 'C-x 8 RET' that is hard to type.

    Eli> The implementation seems to rely on a file in the /usr/include tree
    Eli> that might not be there.  This is a significant disadvantage, IMO.  It
    Eli> means that, unlike all other similar facilities in Emacs, this one is
    Eli> not self-contained.

    Eli> Is it possible to lift this limitation?

Aren't all those definitions in lisp/term/x-win.el anyway?

Robert
-- 






* bug#43866: 26.3; italian postfix additions
  2020-10-20 14:47                     ` Robert Pluim
@ 2020-10-20 15:50                       ` Eli Zaretskii
  2020-10-20 18:44                       ` Juri Linkov
  1 sibling, 0 replies; 109+ messages in thread
From: Eli Zaretskii @ 2020-10-20 15:50 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 43866, juri

> From: Robert Pluim <rpluim@gmail.com>
> Cc: Juri Linkov <juri@linkov.net>,  43866@debbugs.gnu.org
> Date: Tue, 20 Oct 2020 16:47:12 +0200
> 
>     Eli> The implementation seems to rely on a file in the /usr/include tree
>     Eli> that might not be there.  This is a significant disadvantage, IMO.  It
>     Eli> means that, unlike all other similar facilities in Emacs, this one is
>     Eli> not self-contained.
> 
>     Eli> Is it possible to lift this limitation?
> 
> Aren't all those definitions in lisp/term/x-win.el anyway?

Probably.  But even that is sub-optimal (though better than reading a
/usr/include file): it is only available on X.  What about TTY
sessions, what about w32?






* bug#43866: 26.3; italian postfix additions
  2020-10-19 23:12                   ` Stefan Kangas
@ 2020-10-20 18:42                     ` Juri Linkov
  0 siblings, 0 replies; 109+ messages in thread
From: Juri Linkov @ 2020-10-20 18:42 UTC (permalink / raw)
  To: Stefan Kangas; +Cc: 43866, rpluim

> May I suggest that we use a different key for this?
>
> A while back, RMS suggested that we could bind `C-+' to
> text-scale-adjust even if we can't bind `C--'.  I was not super
> enthusiastic about this at the time, but perhaps that idea is the least
> bad option.
>
> One could imagine that in combination with, for example, optionally
> binding the numerical prefix argument only to `M--'.  We could perhaps
> then consider enabling that in the "beginner friendly profile" we have
> been discussing on emacs-devel (but that no one has yet seriously worked
> on).

Ah, C-+ could be suitable for a beginner profile indeed.

What key do other programs use for a Compose-like Multi_key?

https://help.ubuntu.com/community/ComposeKey
says that the default Compose Multi_key is Shift+AltGr,
and the Unicode composition key is Shift+Ctrl+U.
And indeed Shift+Ctrl+U works in all applications and xterm,
but not in Emacs on X (in Emacs on tty Shift+Ctrl+U works).

And here is more information for different systems:
https://en.wikipedia.org/wiki/Compose_key
https://en.wikipedia.org/wiki/Unicode_input






* bug#43866: 26.3; italian postfix additions
  2020-10-20 14:47                     ` Robert Pluim
  2020-10-20 15:50                       ` Eli Zaretskii
@ 2020-10-20 18:44                       ` Juri Linkov
  1 sibling, 0 replies; 109+ messages in thread
From: Juri Linkov @ 2020-10-20 18:44 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 43866

>     Eli> The implementation seems to rely on a file in the /usr/include tree
>     Eli> that might not be there.  This is a significant disadvantage, IMO.  It
>     Eli> means that, unlike all other similar facilities in Emacs, this one is
>     Eli> not self-contained.
>
>     Eli> Is it possible to lift this limitation?
>
> Aren't all those definitions in lisp/term/x-win.el anyway?

It seems the list in lisp/term/x-win.el is not needed at run-time,
since Eli wants to pre-generate these key mappings.  At generation
time, keysymdef.h can be used, because we need mappings such as
from XK_Aogonek to U+0104, not from 0x01a1 to U+0104 as in x-win.el.

The only remaining problem with keysymdef.h is how to process such
definitions in keysymdef.h:

#define XK_KP_Enter                      0xff8d  /* Enter */

There is no Unicode character, even in x-win.el.
I guess it should be hard-coded to map directly to [kp-enter].


* bug#43866: 26.3; italian postfix additions
  2020-10-20 14:12                   ` Eli Zaretskii
  2020-10-20 14:47                     ` Robert Pluim
@ 2020-10-20 19:05                     ` Juri Linkov
  2020-10-21  8:11                       ` Robert Pluim
  2020-10-20 19:56                     ` Juri Linkov
  2 siblings, 1 reply; 109+ messages in thread
From: Juri Linkov @ 2020-10-20 19:05 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rpluim, 43866

> The implementation seems to rely on a file in the /usr/include tree
> that might not be there.  This is a significant disadvantage, IMO.  It
> means that, unlike all other similar facilities in Emacs, this one is
> not self-contained.
>
> Is it possible to lift this limitation?

Yes, this is easy to do.  But I have one problem:
/usr/share/X11/locale/en_US.UTF-8/Compose contains 83 lines
where a key sequence maps to 2 characters, not to 1 character, e.g.

  <Multi_key> <acute> <Cyrillic_u> : "у́" # CYRILLIC SMALL LETTER U WITH COMBINING ACUTE ACCENT

where "у́" is 2 characters: CYRILLIC SMALL LETTER U and COMBINING ACUTE ACCENT.

iso-transl.el maps a key sequence to a single character only using

  (define-key map (apply 'vector '(?' ?у)) (vector ?у))

I don't know how to map a key sequence to 2 characters.
When trying to map to 2 characters ?у and ?́ :

  (define-key map (apply 'vector '(?' ?у)) (vector ?у ?́ ))

typing 'y inserts only the last character ?́ , not both ?у and ?́ .
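
One possible workaround, sketched here under assumptions, is to bind the
sequence to an ordinary command that inserts both code points, instead
of a translation vector.  This is not what iso-transl.el does: it uses a
regular prefix keymap rather than a key translation, and the names
my-compose-map and my-compose-insert-cyrillic-u-acute are hypothetical.

```elisp
;; Hypothetical keymap and command names, for illustration only.
(defvar my-compose-map (make-sparse-keymap)
  "Hypothetical prefix keymap for Compose-like sequences.")

(defun my-compose-insert-cyrillic-u-acute ()
  "Insert CYRILLIC SMALL LETTER U followed by COMBINING ACUTE ACCENT."
  (interactive)
  ;; Two code points: U+0443 and U+0301.
  (insert "\u0443\u0301"))

;; ' followed by Cyrillic у runs the command above.
(define-key my-compose-map (vector ?' ?\u0443)
  #'my-compose-insert-cyrillic-u-acute)

;; Hypothetical prefix key; C-+ is the key proposed in this thread.
(global-set-key (kbd "C-+") my-compose-map)
```

The cost of this design is one command per multi-character sequence, so
for the 83 such lines a generator would have to emit commands or
closures rather than plain vectors.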






* bug#43866: 26.3; italian postfix additions
  2020-10-20 14:12                   ` Eli Zaretskii
  2020-10-20 14:47                     ` Robert Pluim
  2020-10-20 19:05                     ` Juri Linkov
@ 2020-10-20 19:56                     ` Juri Linkov
  2020-10-21 14:02                       ` Eli Zaretskii
  2 siblings, 1 reply; 109+ messages in thread
From: Juri Linkov @ 2020-10-20 19:56 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rpluim, 43866

> The implementation seems to rely on a file in the /usr/include tree
> that might not be there.  This is a significant disadvantage, IMO.  It
> means that, unlike all other similar facilities in Emacs, this one is
> not self-contained.
>
> Is it possible to lift this limitation?

I tried to generate an output with a list of characters,
but can't find a print-related variable that would
print a number as a character.

For example, currently

  (prin1 ?⌘ (current-buffer)) => 8984

prints the number 8984, but I need to print the character, i.e.

  (prin1 ?⌘ (current-buffer)) => ?⌘

There is the variable 'float-output-format' that affects the output
of floating-point numbers, e.g.

  (let ((float-output-format "%.2f")) (prin1 12.345 (current-buffer)))

but I can't find a variable to print characters instead of integers.






* bug#43866: 26.3; italian postfix additions
  2020-10-20 19:05                     ` Juri Linkov
@ 2020-10-21  8:11                       ` Robert Pluim
  2020-10-21 14:29                         ` Eli Zaretskii
  2020-10-21 17:30                         ` Juri Linkov
  0 siblings, 2 replies; 109+ messages in thread
From: Robert Pluim @ 2020-10-21  8:11 UTC (permalink / raw)
  To: Juri Linkov; +Cc: 43866

>>>>> On Tue, 20 Oct 2020 22:05:31 +0300, Juri Linkov <juri@linkov.net> said:

    >> The implementation seems to rely on a file in the /usr/include tree
    >> that might not be there.  This is a significant disadvantage, IMO.  It
    >> means that, unlike all other similar facilities in Emacs, this one is
    >> not self-contained.
    >> 
    >> Is it possible to lift this limitation?

    Juri> Yes, this is easy to do.  But I have one problem:
    Juri> /usr/share/X11/locale/en_US.UTF-8/Compose contains 83 lines
    Juri> where a key sequence maps to 2 characters, not to 1 character, e.g.

    Juri>   <Multi_key> <acute> <Cyrillic_u> : "у́" # CYRILLIC SMALL LETTER U WITH COMBINING ACUTE ACCENT

    Juri> where "у́" is 2 characters: CYRILLIC SMALL LETTER U and COMBINING ACUTE ACCENT.

    Juri> iso-transl.el maps a key sequence to a single character only using

    Juri>   (define-key map (apply 'vector '(?' ?у)) (vector ?у))

    Juri> I don't know how to map a key sequence to 2 characters.
    Juri> When trying to map to 2 characters ?у and ?́ :

    Juri>   (define-key map (apply 'vector '(?' ?у)) (vector ?у ?́ ))

    Juri> typing 'y inserts only the last character ?́ , not both ?у and ?́ .

Canʼt you pass a string containing ?y and ?́ as the last argument to
define-key? (although you might want to use the ?\N{NAME} or ?\uXXXX
syntax to stop Emacs combining that U+0301 with the question mark)

Robert
-- 






* bug#43866: 26.3; italian postfix additions
  2020-10-20 19:56                     ` Juri Linkov
@ 2020-10-21 14:02                       ` Eli Zaretskii
  2020-10-21 17:23                         ` Juri Linkov
  0 siblings, 1 reply; 109+ messages in thread
From: Eli Zaretskii @ 2020-10-21 14:02 UTC (permalink / raw)
  To: Juri Linkov; +Cc: rpluim, 43866

> From: Juri Linkov <juri@linkov.net>
> Cc: rpluim@gmail.com,  43866@debbugs.gnu.org
> Date: Tue, 20 Oct 2020 22:56:07 +0300
> 
> I tried to generate an output with a list of characters,
> but can't find a print-related variable that would
> print a number as a character.
> 
> For example, currently
> 
>   (prin1 ?⌘ (current-buffer)) => 8984
> 
> prints the number 8984, but I need to print the character, i.e.
> 
>   (prin1 ?⌘ (current-buffer)) => ?⌘

I don't think I understand what you are looking for.  Would using the
%c format in a call to 'format' be okay?  If not, why not?






* bug#43866: 26.3; italian postfix additions
  2020-10-21  8:11                       ` Robert Pluim
@ 2020-10-21 14:29                         ` Eli Zaretskii
  2020-10-21 14:40                           ` Robert Pluim
  2020-10-21 17:30                         ` Juri Linkov
  1 sibling, 1 reply; 109+ messages in thread
From: Eli Zaretskii @ 2020-10-21 14:29 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 43866, juri

> From: Robert Pluim <rpluim@gmail.com>
> Cc: Eli Zaretskii <eliz@gnu.org>,  43866@debbugs.gnu.org
> Date: Wed, 21 Oct 2020 10:11:55 +0200
> 
> Canʼt you pass a string containing ?y and ?́ as the last argument to
> define-key? (although you might want to use the ?\N{NAME} or ?\uXXXX
> syntax to stop Emacs combining that U+0301 with the question mark)

The character composition happens only on display, the buffer or
string still have two codepoints.






* bug#43866: 26.3; italian postfix additions
  2020-10-21 14:29                         ` Eli Zaretskii
@ 2020-10-21 14:40                           ` Robert Pluim
  2020-10-21 15:23                             ` Eli Zaretskii
  0 siblings, 1 reply; 109+ messages in thread
From: Robert Pluim @ 2020-10-21 14:40 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 43866, juri

>>>>> On Wed, 21 Oct 2020 17:29:58 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Robert Pluim <rpluim@gmail.com>
    >> Cc: Eli Zaretskii <eliz@gnu.org>,  43866@debbugs.gnu.org
    >> Date: Wed, 21 Oct 2020 10:11:55 +0200
    >> 
    >> Canʼt you pass a string containing ?y and ?́ as the last argument to
    >> define-key? (although you might want to use the ?\N{NAME} or ?\uXXXX
    >> syntax to stop Emacs combining that U+0301 with the question mark)

    Eli> The character composition happens only on display, the buffer or
    Eli> string still have two codepoints.

Yes. But when looking at the code it would look like a single glyph,
which would be confusing.

Robert
-- 






* bug#43866: 26.3; italian postfix additions
  2020-10-21 14:40                           ` Robert Pluim
@ 2020-10-21 15:23                             ` Eli Zaretskii
  0 siblings, 0 replies; 109+ messages in thread
From: Eli Zaretskii @ 2020-10-21 15:23 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 43866, juri

> From: Robert Pluim <rpluim@gmail.com>
> Cc: juri@linkov.net,  43866@debbugs.gnu.org
> Date: Wed, 21 Oct 2020 16:40:47 +0200
> 
>     Eli> The character composition happens only on display, the buffer or
>     Eli> string still have two codepoints.
> 
> Yes. But when looking at the code it would look like a single glyph,
> which would be confusing.

We could have a comment about that.  IMO, using the ?\N{NAME} for that
is gross.






* bug#43866: 26.3; italian postfix additions
  2020-10-21 14:02                       ` Eli Zaretskii
@ 2020-10-21 17:23                         ` Juri Linkov
  2020-10-21 18:16                           ` Eli Zaretskii
  0 siblings, 1 reply; 109+ messages in thread
From: Juri Linkov @ 2020-10-21 17:23 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rpluim, 43866

>> I tried to generate an output with a list of characters,
>> but can't find a print-related variable that would
>> print a number as a character.
>>
>> For example, currently
>>
>>   (prin1 ?⌘ (current-buffer)) => 8984
>>
>> prints the number 8984, but I need to print the character, i.e.
>>
>>   (prin1 ?⌘ (current-buffer)) => ?⌘
>
> I don't think I understand what you are looking for.  Would using the
> %c format in a call to 'format' be okay?  If not, why not?

The problem is that it's necessary to print a long list with vectors
that contain characters.  For example:

(prin1 '(("'A" . [?Á])
         ("'E" . [?É])
         ("'I" . [?Í])
         ("'O" . [?Ó])
         ("'U" . [?Ú])
         ("'Y" . [?Ý]))
       (current-buffer))

currently prints:

(("'A" . [193])
 ("'E" . [201])
 ("'I" . [205])
 ("'O" . [211])
 ("'U" . [218])
 ("'Y" . [221]))

whereas it would be nicer to print characters as characters,
not as integers:

(("'A" . [?Á])
 ("'E" . [?É])
 ("'I" . [?Í])
 ("'O" . [?Ó])
 ("'U" . [?Ú])
 ("'Y" . [?Ý]))

I can't find a variable that could change the output format
of integers to print them as characters.






* bug#43866: 26.3; italian postfix additions
  2020-10-21  8:11                       ` Robert Pluim
  2020-10-21 14:29                         ` Eli Zaretskii
@ 2020-10-21 17:30                         ` Juri Linkov
  1 sibling, 0 replies; 109+ messages in thread
From: Juri Linkov @ 2020-10-21 17:30 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 43866

>> I don't know how to map a key sequence to 2 characters.
>> When trying to map to 2 characters ?у and ?́ :
>
>>   (define-key map (apply 'vector '(?' ?у)) (vector ?у ?́ ))
>
>> typing 'y inserts only the last character ?́ , not both ?у and ?́ .
>
> Canʼt you pass a string containing ?y and ?́ as the last argument to
> define-key? (although you might want to use the ?\N{NAME} or ?\uXXXX
> syntax to stop Emacs combining that U+0301 with the question mark)

I tried to use a string as the last argument to define-key,
and the result is weird: the Help buffer says that the key binding
is actually a keyboard macro, and invoking it does strange things.
For example, when a binding is a string "ö", then typing its keys
calls the command 'upcase-word'.  No idea why it works this way.






* bug#43866: 26.3; italian postfix additions
  2020-10-21 17:23                         ` Juri Linkov
@ 2020-10-21 18:16                           ` Eli Zaretskii
  2020-10-21 18:27                             ` Juri Linkov
  0 siblings, 1 reply; 109+ messages in thread
From: Eli Zaretskii @ 2020-10-21 18:16 UTC (permalink / raw)
  To: Juri Linkov; +Cc: rpluim, 43866

> From: Juri Linkov <juri@linkov.net>
> Cc: rpluim@gmail.com,  43866@debbugs.gnu.org
> Date: Wed, 21 Oct 2020 20:23:51 +0300
> 
> > I don't think I understand what you are looking for.  Would using the
> > %c format in a call to 'format' be okay?  If not, why not?
> 
> The problem is that it's necessary to print a long list with vectors
> that contain characters.  For example:
> 
> (prin1 '(("'A" . [?Á])
>          ("'E" . [?É])
>          ("'I" . [?Í])
>          ("'O" . [?Ó])
>          ("'U" . [?Ú])
>          ("'Y" . [?Ý]))
>        (current-buffer))

Why do you have to use prin1?






* bug#43866: 26.3; italian postfix additions
  2020-10-21 18:16                           ` Eli Zaretskii
@ 2020-10-21 18:27                             ` Juri Linkov
  2020-10-21 18:35                               ` Eli Zaretskii
  0 siblings, 1 reply; 109+ messages in thread
From: Juri Linkov @ 2020-10-21 18:27 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rpluim, 43866

>> The problem is that it's necessary to print a long list with vectors
>> that contain characters.  For example:
>> 
>> (prin1 '(("'A" . [?Á])
>>          ("'E" . [?É])
>>          ("'I" . [?Í])
>>          ("'O" . [?Ó])
>>          ("'U" . [?Ú])
>>          ("'Y" . [?Ý]))
>>        (current-buffer))
>
> Why do you have to use prin1?

Actually I need to use pp-to-string to pretty-print the list,
but pp-to-string calls '(prin1 object (current-buffer))'.






* bug#43866: 26.3; italian postfix additions
  2020-10-21 18:27                             ` Juri Linkov
@ 2020-10-21 18:35                               ` Eli Zaretskii
  2020-10-21 19:39                                 ` Juri Linkov
  0 siblings, 1 reply; 109+ messages in thread
From: Eli Zaretskii @ 2020-10-21 18:35 UTC (permalink / raw)
  To: Juri Linkov; +Cc: rpluim, 43866

> From: Juri Linkov <juri@linkov.net>
> Cc: rpluim@gmail.com,  43866@debbugs.gnu.org
> Date: Wed, 21 Oct 2020 21:27:16 +0300
> 
> >> (prin1 '(("'A" . [?Á])
> >>          ("'E" . [?É])
> >>          ("'I" . [?Í])
> >>          ("'O" . [?Ó])
> >>          ("'U" . [?Ú])
> >>          ("'Y" . [?Ý]))
> >>        (current-buffer))
> >
> > Why do you have to use prin1?
> 
> Actually I need to use pp-to-string to pretty-print the list,
> but pp-to-string calls '(prin1 object (current-buffer))'.

prin1 accepts a function as its 2nd argument; can you use that?






* bug#43866: 26.3; italian postfix additions
  2020-10-21 18:35                               ` Eli Zaretskii
@ 2020-10-21 19:39                                 ` Juri Linkov
  2020-10-22 12:59                                   ` Eli Zaretskii
  0 siblings, 1 reply; 109+ messages in thread
From: Juri Linkov @ 2020-10-21 19:39 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rpluim, 43866

[-- Attachment #1: Type: text/plain, Size: 1084 bytes --]

>> >> (prin1 '(("'A" . [?Á])
>> >>          ("'E" . [?É])
>> >>          ("'I" . [?Í])
>> >>          ("'O" . [?Ó])
>> >>          ("'U" . [?Ú])
>> >>          ("'Y" . [?Ý]))
>> >>        (current-buffer))
>> >
>> > Why do you have to use prin1?
>>
>> Actually I need to use pp-to-string to pretty-print the list,
>> but pp-to-string calls '(prin1 object (current-buffer))'.
>
> prin1 accepts a function as its 2nd argument; can you use that?

I tried using a function as the 2nd argument, but it's called
for every digit of the integer that represents a character,
so I don't know what to do with these digits.
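
To illustrate the difficulty, here is a minimal standalone C sketch of a
printer that hands its output to a callback one character at a time.  The
names emit_fn, print_int_via and collect are made up for this example and
are not Emacs internals; the point is only that the callback sees separate
digits and has no way to tell that they came from a single integer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-character sink, standing in for a function passed
   as the 2nd argument of prin1.  */
typedef void (*emit_fn) (char c, void *ctx);

/* Print N in decimal by calling EMIT once per output character.  */
void
print_int_via (long n, emit_fn emit, void *ctx)
{
  char buf[32];
  int len = snprintf (buf, sizeof buf, "%ld", n);
  assert (len > 0 && len < (int) sizeof buf);
  for (int i = 0; i < len; i++)
    emit (buf[i], ctx);
}

/* Example sink: count the calls and collect the characters seen.  */
struct collector
{
  char text[64];
  int calls;
};

void
collect (char c, void *ctx)
{
  struct collector *col = ctx;
  if (col->calls < (int) sizeof col->text - 1)
    {
      col->text[col->calls] = c;
      col->text[col->calls + 1] = '\0';
    }
  col->calls++;
}
```

Printing 193 (the code of ?Á) through this sink results in three separate
calls with '1', '9' and '3', which mirrors the situation described above.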

However, do you think something like the following is a good idea?

Let-binding a new variable 'print-integers-as-chars' to t:

(let ((print-integers-as-chars t))
  (pp '(("'A" . [?Á])
        ("'E" . [?É])
        ("'I" . [?Í])
        ("'O" . [?Ó])
        ("'U" . [?Ú])
        ("'Y" . [?Ý]))
      (current-buffer)))

prints integers as characters:

(("'A" .  [?Á])
 ("'E" .  [?É])
 ("'I" .  [?Í])
 ("'O" .  [?Ó])
 ("'U" .  [?Ú])
 ("'Y" .  [?Ý]))

with this patch:


[-- Attachment #2: print-integers-as-chars.patch --]
[-- Type: text/x-diff, Size: 1244 bytes --]

diff --git a/src/print.c b/src/print.c
index dca095f281..1755eea738 100644
--- a/src/print.c
+++ b/src/print.c
@@ -1908,8 +1908,16 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
     {
     case_Lisp_Int:
       {
-	int len = sprintf (buf, "%"pI"d", XFIXNUM (obj));
-	strout (buf, len, len, printcharfun);
+        if (!NILP (Vprint_integers_as_chars) && CHARACTERP (obj))
+          {
+            int len = sprintf (buf, "%s", SDATA (call1 (intern ("prin1-char"), obj)));
+            strout (buf, len, len, printcharfun);
+          }
+        else
+          {
+            int len = sprintf (buf, "%"pI"d", XFIXNUM (obj));
+            strout (buf, len, len, printcharfun);
+          }
       }
       break;
 
@@ -2247,6 +2255,10 @@ syms_of_print (void)
 that represents the number without losing information.  */);
   Vfloat_output_format = Qnil;
 
+  DEFVAR_LISP ("print-integers-as-chars", Vprint_integers_as_chars,
+	       doc: /* Print integers as characters.  */);
+  Vprint_integers_as_chars = Qnil;
+
   DEFVAR_LISP ("print-length", Vprint_length,
 	       doc: /* Maximum length of list to print before abbreviating.
 A value of nil means no limit.  See also `eval-expression-print-length'.  */);


* bug#43866: 26.3; italian postfix additions
  2020-10-21 19:39                                 ` Juri Linkov
@ 2020-10-22 12:59                                   ` Eli Zaretskii
  2020-10-22 20:56                                     ` bug#44155: Print integers as characters Juri Linkov
  2022-04-30 12:19                                     ` bug#43866: 26.3; italian postfix additions Lars Ingebrigtsen
  0 siblings, 2 replies; 109+ messages in thread
From: Eli Zaretskii @ 2020-10-22 12:59 UTC (permalink / raw)
  To: Juri Linkov; +Cc: rpluim, 43866

> From: Juri Linkov <juri@linkov.net>
> Cc: rpluim@gmail.com,  43866@debbugs.gnu.org
> Date: Wed, 21 Oct 2020 22:39:08 +0300
> 
> However, do you think something like the following is a good idea?
> 
> Let-binding a new variable 'print-integers-as-chars' to t:
> 
> (let ((print-integers-as-chars t))
>   (pp '(("'A" . [?Á])
>         ("'E" . [?É])
>         ("'I" . [?Í])
>         ("'O" . [?Ó])
>         ("'U" . [?Ú])
>         ("'Y" . [?Ý]))
>       (current-buffer)))
> 
> prints integers as characters:
> 
> (("'A" .  [?Á])
>  ("'E" .  [?É])
>  ("'I" .  [?Í])
>  ("'O" .  [?Ó])
>  ("'U" .  [?Ú])
>  ("'Y" .  [?Ý]))
> 
> with this patch:

The idea is fine, but I have a few comments about implementation:

>      case_Lisp_Int:
>        {
> -	int len = sprintf (buf, "%"pI"d", XFIXNUM (obj));
> -	strout (buf, len, len, printcharfun);
> +        if (!NILP (Vprint_integers_as_chars) && CHARACTERP (obj))
                      ^^^^^^^^^^^^^^^^^^^^^^^^
If this is supposed to be a boolean variable, please use DEFVAR_BOOL,
with all the consequences.

> +            int len = sprintf (buf, "%s", SDATA (call1 (intern ("prin1-char"), obj)));

Do we really need to call Lisp?  I thought we were quite capable of
printing characters from C, aren't we?

> @@ -2247,6 +2255,10 @@ syms_of_print (void)
>  that represents the number without losing information.  */);
>    Vfloat_output_format = Qnil;
>  
> +  DEFVAR_LISP ("print-integers-as-chars", Vprint_integers_as_chars,
> +	       doc: /* Print integers as characters.  */);
> +  Vprint_integers_as_chars = Qnil;

I wonder whether it wouldn't be cleaner to add another optional
argument to prin1, and let it bind some internal variable so that
print_object does this, instead of exposing this knob to Lisp.
Because print_object is used all over the place, and who knows what
this will do to other callers?

Thanks.






* bug#44155: Print integers as characters
  2020-10-22 12:59                                   ` Eli Zaretskii
@ 2020-10-22 20:56                                     ` Juri Linkov
  2020-10-22 22:39                                       ` Andreas Schwab
  2020-11-01 12:03                                       ` Mattias Engdegård
  2022-04-30 12:19                                     ` bug#43866: 26.3; italian postfix additions Lars Ingebrigtsen
  1 sibling, 2 replies; 109+ messages in thread
From: Juri Linkov @ 2020-10-22 20:56 UTC (permalink / raw)
  To: 44155

[-- Attachment #1: Type: text/plain, Size: 2932 bytes --]

Tags: patch

[Creating a separate feature request from bug#43866]

>> Let-binding a new variable 'print-integers-as-chars' to t:
>> 
>> (let ((print-integers-as-chars t))
>>   (pp '(("'A" . [?Á])
>>         ("'E" . [?É])
>>         ("'I" . [?Í])
>>         ("'O" . [?Ó])
>>         ("'U" . [?Ú])
>>         ("'Y" . [?Ý]))
>>       (current-buffer)))
>> 
>> prints integers as characters:
>> 
>> (("'A" .  [?Á])
>>  ("'E" .  [?É])
>>  ("'I" .  [?Í])
>>  ("'O" .  [?Ó])
>>  ("'U" .  [?Ú])
>>  ("'Y" .  [?Ý]))
>> 
>> with this patch:
>
> The idea is fine, but I have a few comments about implementation:
>
>>      case_Lisp_Int:
>>        {
>> -	int len = sprintf (buf, "%"pI"d", XFIXNUM (obj));
>> -	strout (buf, len, len, printcharfun);
>> +        if (!NILP (Vprint_integers_as_chars) && CHARACTERP (obj))
>                       ^^^^^^^^^^^^^^^^^^^^^^^^
> If this is supposed to be a boolean variable, please use DEFVAR_BOOL,
> with all the consequences.

Fixed in the next patch.

>> +            int len = sprintf (buf, "%s", SDATA (call1 (intern ("prin1-char"), obj)));
>
> Do we really need to call Lisp?  I thought we were quite capable of
> printing characters from C, aren't we?

Thanks for the hint.  Now the patch uses only C functions.
(My initial idea was to use eval-expression-print-format as a base that has

    (let ((char-string
           (and (characterp value)
                (<= value eval-expression-print-maximum-character)
                (char-displayable-p value)
                (prin1-char value))))

but it seems only the condition 'characterp' is needed in the C implementation.)

>> @@ -2247,6 +2255,10 @@ syms_of_print (void)
>>  that represents the number without losing information.  */);
>>    Vfloat_output_format = Qnil;
>>  
>> +  DEFVAR_LISP ("print-integers-as-chars", Vprint_integers_as_chars,
>> +	       doc: /* Print integers as characters.  */);
>> +  Vprint_integers_as_chars = Qnil;
>
> I wonder whether it wouldn't be cleaner to add another optional
> argument to prin1, and let it bind some internal variable so that
> print_object does this, instead of exposing this knob to Lisp.
> Because print_object is used all over the place, and who knows what
> this will do to other callers?

The variable 'print-integers-as-chars' is modeled after many similar
variables that affect the prin1 output:

- print-escape-control-characters
- print-escape-newlines
- print-escape-nonascii
- print-escape-multibyte
- print-length
- print-level
- print-quoted
- print-circle
- float-output-format

But now this leads me to think that maybe the new variable should be
like 'float-output-format', so it could be named 'integer-output-format'
and support options for different integer formats:

- 'character': print integers as characters;
- 'decimal': the default format;
- 'binary': print integers as e.g. #b010101;
- 'octal': print integers as e.g. #o777;
- 'hex': print integers as e.g. #x00ff;
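
A rough standalone C sketch of what the radix options could produce, under
the assumption that negative numbers fall back to decimal and that no
leading zeros are emitted (unlike the #b010101 and #x00ff examples above).
The enum and the function format_integer are hypothetical names, not part
of any patch; the 'character' option is left out of this sketch.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical format selector mirroring the radix options listed above.  */
enum int_format { FMT_DECIMAL, FMT_BINARY, FMT_OCTAL, FMT_HEX };

/* Write N into OUT (of size OUTSIZE) using Lisp read syntax for the
   chosen radix.  Negative numbers are always written in decimal.
   Return the number of characters written, or -1 if OUT is too small.  */
int
format_integer (long n, enum int_format fmt, char *out, size_t outsize)
{
  int len;

  assert (out != NULL && outsize > 0);
  if (n < 0 || fmt == FMT_DECIMAL)
    len = snprintf (out, outsize, "%ld", n);
  else if (fmt == FMT_HEX)
    len = snprintf (out, outsize, "#x%lx", (unsigned long) n);
  else if (fmt == FMT_OCTAL)
    len = snprintf (out, outsize, "#o%lo", (unsigned long) n);
  else
    {
      /* printf has no portable binary conversion, so build the digits
         by hand, filling the buffer from the least significant bit.  */
      char digits[sizeof (long) * CHAR_BIT + 1];
      int i = (int) sizeof digits - 1;
      unsigned long u = (unsigned long) n;

      digits[i] = '\0';
      do
        {
          digits[--i] = (char) ('0' + (u & 1));
          u >>= 1;
        }
      while (u != 0);
      len = snprintf (out, outsize, "#b%s", digits + i);
    }
  return (len >= 0 && (size_t) len < outsize) ? len : -1;
}
```

With a large enough buffer, 255 in hex gives "#xff", 511 in octal gives
"#o777", and 21 in binary gives "#b10101".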


[-- Attachment #2: print-integers-as-characters.patch --]
[-- Type: text/x-diff, Size: 1219 bytes --]

diff --git a/src/print.c b/src/print.c
index dca095f281..909c55efed 100644
--- a/src/print.c
+++ b/src/print.c
@@ -1908,8 +1908,16 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
     {
     case_Lisp_Int:
       {
-	int len = sprintf (buf, "%"pI"d", XFIXNUM (obj));
-	strout (buf, len, len, printcharfun);
+        if (print_integers_as_characters && CHARACTERP (obj))
+          {
+            printchar ('?', printcharfun);
+            print_string (CALLN (Fstring, obj), printcharfun);
+          }
+        else
+          {
+            int len = sprintf (buf, "%"pI"d", XFIXNUM (obj));
+            strout (buf, len, len, printcharfun);
+          }
       }
       break;
 
@@ -2247,6 +2255,10 @@ syms_of_print (void)
 that represents the number without losing information.  */);
   Vfloat_output_format = Qnil;
 
+  DEFVAR_BOOL ("print-integers-as-characters", print_integers_as_characters,
+	       doc: /* Print integers as characters.  */);
+  print_integers_as_characters = 0;
+
   DEFVAR_LISP ("print-length", Vprint_length,
 	       doc: /* Maximum length of list to print before abbreviating.
 A value of nil means no limit.  See also `eval-expression-print-length'.  */);


* bug#44155: Print integers as characters
  2020-10-22 20:56                                     ` bug#44155: Print integers as characters Juri Linkov
@ 2020-10-22 22:39                                       ` Andreas Schwab
  2020-10-23  8:16                                         ` Juri Linkov
  2020-10-23  8:32                                         ` Juri Linkov
  2020-11-01 12:03                                       ` Mattias Engdegård
  1 sibling, 2 replies; 109+ messages in thread
From: Andreas Schwab @ 2020-10-22 22:39 UTC (permalink / raw)
  To: Juri Linkov; +Cc: 44155

On Oct 22 2020, Juri Linkov wrote:

> diff --git a/src/print.c b/src/print.c
> index dca095f281..909c55efed 100644
> --- a/src/print.c
> +++ b/src/print.c
> @@ -1908,8 +1908,16 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
>      {
>      case_Lisp_Int:
>        {
> -	int len = sprintf (buf, "%"pI"d", XFIXNUM (obj));
> -	strout (buf, len, len, printcharfun);
> +        if (print_integers_as_characters && CHARACTERP (obj))
> +          {
> +            printchar ('?', printcharfun);
> +            print_string (CALLN (Fstring, obj), printcharfun);

That will create ambiguous output.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."






* bug#44155: Print integers as characters
  2020-10-22 22:39                                       ` Andreas Schwab
@ 2020-10-23  8:16                                         ` Juri Linkov
  2020-10-23  8:32                                         ` Juri Linkov
  1 sibling, 0 replies; 109+ messages in thread
From: Juri Linkov @ 2020-10-23  8:16 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: 44155

>> +        if (print_integers_as_characters && CHARACTERP (obj))
>> +          {
>> +            printchar ('?', printcharfun);
>> +            print_string (CALLN (Fstring, obj), printcharfun);
>
> That will create ambiguous output.

No ambiguities found:

(let ((strings (make-hash-table :test 'equal)))
  (dotimes (i (max-char))
    (let ((s (string i)))
      (if (gethash s strings)
          (message "! %S %S" s (gethash s strings))
        (puthash s i strings)))))






* bug#44155: Print integers as characters
  2020-10-22 22:39                                       ` Andreas Schwab
  2020-10-23  8:16                                         ` Juri Linkov
@ 2020-10-23  8:32                                         ` Juri Linkov
  2020-10-24 19:53                                           ` Juri Linkov
  1 sibling, 1 reply; 109+ messages in thread
From: Juri Linkov @ 2020-10-23  8:32 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: 44155

>> +        if (print_integers_as_characters && CHARACTERP (obj))
>> +          {
>> +            printchar ('?', printcharfun);
>> +            print_string (CALLN (Fstring, obj), printcharfun);
>
> That will create ambiguous output.

Or do you mean:

(dotimes (i (max-char))
  (condition-case err
      (unless (eq i (read (concat "?" (string i))))
        (message "%d ?%s" i (string i)))
    (error (message "%d ?%s ;; %s" i (string i) (error-message-string err)))))

92 ?\ ;; End of file during parsing
4194176 ?\200
...
4194302 ?\376






* bug#44155: Print integers as characters
  2020-10-23  8:32                                         ` Juri Linkov
@ 2020-10-24 19:53                                           ` Juri Linkov
  2020-10-25 17:22                                             ` Eli Zaretskii
  0 siblings, 1 reply; 109+ messages in thread
From: Juri Linkov @ 2020-10-24 19:53 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: 44155

[-- Attachment #1: Type: text/plain, Size: 1457 bytes --]

>>> +        if (print_integers_as_characters && CHARACTERP (obj))
>>> +          {
>>> +            printchar ('?', printcharfun);
>>> +            print_string (CALLN (Fstring, obj), printcharfun);
>>
>> That will create ambiguous output.
>
> Or do you mean:
>
> (dotimes (i (max-char))
>   (condition-case err
>       (unless (eq i (read (concat "?" (string i))))
>         (message "%d ?%s" i (string i)))
>     (error (message "%d ?%s ;; %s" i (string i) (error-message-string err)))))
>
> 92 ?\ ;; End of file during parsing
> 4194176 ?\200
> ...
> 4194302 ?\376

With the following patch, this code

(let ((integer-output-format t))
  (pp '(?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ 4194176)
      (current-buffer)))

outputs

(?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ 4194176)

and no ambiguities found with

(let ((integer-output-format t))
  (dotimes (i (+ (max-char) 2))
    (condition-case err
	(unless (eq i (read (format "%S" i)))
          (message "%d ?%s" i (string i)))
      (error (message "%d ?%s ;; %s" i (string i) (error-message-string err))))))

The list of escaped characters was taken from 'prin1-char',
not from a similar list in 'print_object' in the 'case Lisp_Symbol' branch.
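
The escaping rule can be sketched in standalone C as follows.  This is
only an illustration restricted to graphic ASCII characters; the function
names needs_escape and format_char_literal are hypothetical, and the
escaped set is the ten characters the patch tests for.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if the ASCII character C needs a backslash after `?' so
   that the result reads back as the same character, else 0.  The
   range check matters: strchr converts its argument to char, so a
   larger code could otherwise match by accident.  */
int
needs_escape (int c)
{
  return c > 0 && c < 128 && strchr (";(){}[]\"'\\", c) != NULL;
}

/* Write the `?c' syntax for C into OUT (of size OUTSIZE).  Return the
   length written, or -1 if C is outside the graphic ASCII range this
   sketch handles or if OUT is too small.  */
int
format_char_literal (int c, char *out, size_t outsize)
{
  size_t need, i = 0;

  assert (out != NULL);
  if (c < 0x21 || c > 0x7e)
    return -1;
  need = needs_escape (c) ? 3 : 2;
  if (outsize < need + 1)
    return -1;
  out[i++] = '?';
  if (needs_escape (c))
    out[i++] = '\\';
  out[i++] = (char) c;
  out[i] = '\0';
  return (int) i;
}
```

A caller that gets -1 would fall back to printing the integer in decimal,
which is what the patch does for values it does not print as characters.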

Also 'integer-output-format' prints integers in hex format when set to 16.

(let ((integer-output-format 16))
  (pp '(?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ 4194176)
      (current-buffer)))
=>
(#x3b #x28 #x29 #x7b #x7d #x5b #x5d #x22 #x27 #x5c #x3fff80)

[-- Attachment #2: integer-output-format.patch --]
[-- Type: text/x-diff, Size: 1897 bytes --]

diff --git a/src/print.c b/src/print.c
index 53aa353769..53c8c4c91a 100644
--- a/src/print.c
+++ b/src/print.c
@@ -1908,8 +1908,29 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
     {
     case_Lisp_Int:
       {
-	int len = sprintf (buf, "%"pI"d", XFIXNUM (obj));
-	strout (buf, len, len, printcharfun);
+        EMACS_INT c = XFIXNUM (obj);
+
+        if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj) && c < 4194176)
+          {
+            printchar ('?', printcharfun);
+
+            if (escapeflag
+                && (c == ';' || c == '(' || c == ')' || c == '{' || c == '}'
+                    || c == '[' || c == ']' || c == '\"' || c == '\'' || c == '\\'))
+              printchar ('\\', printcharfun);
+            print_string (Fchar_to_string (obj), printcharfun);
+          }
+        else if (INTEGERP (Vinteger_output_format)
+                 && XFIXNUM (Vinteger_output_format) == 16 && c >= 0)
+          {
+            int len = sprintf (buf, "#x%"pI"x", (EMACS_UINT) c);
+            strout (buf, len, len, printcharfun);
+          }
+        else
+          {
+            int len = sprintf (buf, "%"pI"d", c);
+            strout (buf, len, len, printcharfun);
+          }
       }
       break;
 
@@ -2247,6 +2268,13 @@ syms_of_print (void)
 that represents the number without losing information.  */);
   Vfloat_output_format = Qnil;
 
+  DEFVAR_LISP ("integer-output-format", Vinteger_output_format,
+	       doc: /* The format used to print integers.
+When 't', print integers as characters.
+When a number 16, print numbers in hex format.
+Otherwise, print integers in decimal format.  */);
+  Vinteger_output_format = Qnil;
+
   DEFVAR_LISP ("print-length", Vprint_length,
 	       doc: /* Maximum length of list to print before abbreviating.
 A value of nil means no limit.  See also `eval-expression-print-length'.  */);


* bug#44155: Print integers as characters
  2020-10-24 19:53                                           ` Juri Linkov
@ 2020-10-25 17:22                                             ` Eli Zaretskii
  2020-10-25 19:09                                               ` Juri Linkov
  0 siblings, 1 reply; 109+ messages in thread
From: Eli Zaretskii @ 2020-10-25 17:22 UTC (permalink / raw)
  To: Juri Linkov; +Cc: 44155, schwab

> From: Juri Linkov <juri@linkov.net>
> Date: Sat, 24 Oct 2020 22:53:44 +0300
> Cc: 44155@debbugs.gnu.org
> 
> +        EMACS_INT c = XFIXNUM (obj);

There's no need to use EMACS_INT, a character code is at most 22 bits,
so it always fits into an 'int'.

> +        if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj) && c < 4194176)
                                                                          ^^^^^^^

Please use MAX_5_BYTE_CHAR here.  Or, better yet, CHAR_BYTE8_P.

And, btw, why not allow raw bytes here as well? is there some problem?

> +          {
> +            printchar ('?', printcharfun);
> +
> +            if (escapeflag
> +                && (c == ';' || c == '(' || c == ')' || c == '{' || c == '}'
> +                    || c == '[' || c == ']' || c == '\"' || c == '\'' || c == '\\'))
> +              printchar ('\\', printcharfun);
> +            print_string (Fchar_to_string (obj), printcharfun);

Why are you using print_string here instead of printchar?  IOW, what
is the difference between printing a backslash and printing any other
character, that you can use printchar for the former, but not for the
latter?

> +        else if (INTEGERP (Vinteger_output_format)
> +                 && XFIXNUM (Vinteger_output_format) == 16 && c >= 0)

If you really want to allow Vinteger_output_format to be a bignum, you
cannot use XFIXNUM with it, you need to use integer_to_intmax or
somesuch.  Otherwise, you should use FIXNUMP instead of INTEGERP.

> +  DEFVAR_LISP ("integer-output-format", Vinteger_output_format,
> +	       doc: /* The format used to print integers.
> +When 't', print integers as characters.
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
But only integers that are small enough, yes?

> +When a number 16, print numbers in hex format.

This immediately raises the question: why can't the value be 8 or 2?

Thanks.

P.S. This will eventually need a NEWS entry.


* bug#44155: Print integers as characters
  2020-10-25 17:22                                             ` Eli Zaretskii
@ 2020-10-25 19:09                                               ` Juri Linkov
  2020-10-25 19:53                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 109+ messages in thread
From: Juri Linkov @ 2020-10-25 19:09 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 44155, schwab

[-- Attachment #1: Type: text/plain, Size: 1914 bytes --]

>> +        if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj) && c < 4194176)
>                                                                           ^^^^^^^
>
> Please use MAX_5_BYTE_CHAR here.  Or, better yet, CHAR_BYTE8_P.

Thanks, fixed.

> And, btw, why not allow raw bytes here as well? is there some problem?

Because of ambiguity, both these return the same value:

(read (concat "?" (string 128)))     => 128
(read (concat "?" (string 4194176))) => 128
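
A small standalone C sketch of the raw-byte range may make this clearer.
The constants are my understanding of the values in src/character.h and
should be treated as assumptions; the sketch_ names are made up, and unlike
the real CHAR_BYTE8_P macro the predicate here also checks the upper bound.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to match src/character.h; not copied from it.  */
enum
{
  SKETCH_MAX_CHAR = 0x3FFFFF,        /* largest character code, 22 bits */
  SKETCH_MAX_5_BYTE_CHAR = 0x3FFF7F, /* last code below the raw bytes */
  SKETCH_BYTE8_OFFSET = 0x3FFF00     /* raw byte B has code B + offset */
};

/* True if C is in the range of valid character codes.  */
bool
sketch_characterp (long c)
{
  return 0 <= c && c <= SKETCH_MAX_CHAR;
}

/* True if C is a character code that stands for a raw 8-bit byte.  */
bool
sketch_char_byte8_p (long c)
{
  return sketch_characterp (c) && c > SKETCH_MAX_5_BYTE_CHAR;
}

/* Return the byte (0x80..0xFF) that the raw-byte character C stands for.  */
int
sketch_char_to_byte8 (int c)
{
  assert (sketch_char_byte8_p (c));
  return c - SKETCH_BYTE8_OFFSET;
}
```

Under these assumptions 4194176 maps to byte 128 and 4194302 to byte 254,
while 4194175 is still an ordinary character code.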

>> +            print_string (Fchar_to_string (obj), printcharfun);
>
> Why are you using print_string here instead of printchar?  IOW, what
> is the difference between printing a backslash and printing any other
> character, that you can use printchar for the former, but not for the
> latter?

It was needed in earlier versions, but not now; fixed.

>> +        else if (INTEGERP (Vinteger_output_format)
>> +                 && XFIXNUM (Vinteger_output_format) == 16 && c >= 0)
>
> If you really want to allow Vinteger_output_format to be a bignum, you
> cannot use XFIXNUM with it, you need to use integer_to_intmax or
> somesuch.  Otherwise, you should use FIXNUMP instead of INTEGERP.

Fixed.

>> +  DEFVAR_LISP ("integer-output-format", Vinteger_output_format,
>> +	       doc: /* The format used to print integers.
>> +When 't', print integers as characters.
>              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> But only integers that are small enough, yes?

Fixed the docstring as well.

>> +When a number 16, print numbers in hex format.
>
> This immediately raises the question: why can't the value be 8 or 2?

Because octal and binary are not as widely used as hex.
But the variable leaves room for further improvements to later support
octal and binary too, and maybe string formats like in float-output-format.

> P.S. This will eventually need a NEWS entry.

Updates to the Info manual will also be in the final version of the patch.


[-- Attachment #2: integer-output-format-2.patch --]
[-- Type: text/x-diff, Size: 2008 bytes --]

diff --git a/src/print.c b/src/print.c
index 53aa353769..b04d5023f8 100644
--- a/src/print.c
+++ b/src/print.c
@@ -1908,8 +1908,30 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
     {
     case_Lisp_Int:
       {
-	int len = sprintf (buf, "%"pI"d", XFIXNUM (obj));
-	strout (buf, len, len, printcharfun);
+        int c = XFIXNUM (obj);
+        intmax_t i;
+
+        if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj) && ! CHAR_BYTE8_P (c))
+          {
+            printchar ('?', printcharfun);
+            if (escapeflag
+                && (c == ';' || c == '(' || c == ')' || c == '{' || c == '}'
+                    || c == '[' || c == ']' || c == '\"' || c == '\'' || c == '\\'))
+              printchar ('\\', printcharfun);
+            printchar (c, printcharfun);
+          }
+        else if (INTEGERP (Vinteger_output_format)
+                 && integer_to_intmax (Vinteger_output_format, &i)
+                 && i == 16 && XFIXNUM (obj) >= 0)
+          {
+            int len = sprintf (buf, "#x%"pI"x", (EMACS_UINT) XFIXNUM (obj));
+            strout (buf, len, len, printcharfun);
+          }
+        else
+          {
+            int len = sprintf (buf, "%"pI"d", XFIXNUM (obj));
+            strout (buf, len, len, printcharfun);
+          }
       }
       break;
 
@@ -2247,6 +2269,13 @@ syms_of_print (void)
 that represents the number without losing information.  */);
   Vfloat_output_format = Qnil;
 
+  DEFVAR_LISP ("integer-output-format", Vinteger_output_format,
+	       doc: /* The format used to print integers.
+When 't', print characters from integers that represent characters.
+When a number 16, print non-negative numbers in hex format.
+Otherwise, print integers in decimal format.  */);
+  Vinteger_output_format = Qnil;
+
   DEFVAR_LISP ("print-length", Vprint_length,
 	       doc: /* Maximum length of list to print before abbreviating.
 A value of nil means no limit.  See also `eval-expression-print-length'.  */);


* bug#44155: Print integers as characters
  2020-10-25 19:09                                               ` Juri Linkov
@ 2020-10-25 19:53                                                 ` Eli Zaretskii
  2020-10-27 20:08                                                   ` Juri Linkov
  0 siblings, 1 reply; 109+ messages in thread
From: Eli Zaretskii @ 2020-10-25 19:53 UTC (permalink / raw)
  To: Juri Linkov; +Cc: 44155, schwab

> From: Juri Linkov <juri@linkov.net>
> Cc: schwab@linux-m68k.org,  44155@debbugs.gnu.org
> Date: Sun, 25 Oct 2020 21:09:07 +0200
> 
> > And, btw, why not allow raw bytes here as well? is there some problem?
> 
> Because of ambiguity, both these return the same value:
> 
> (read (concat "?" (string 128)))     => 128
> (read (concat "?" (string 4194176))) => 128

And why is that a problem?

Alternatively, we could print raw bytes in some special way.  But not
treating them as characters sounds like a subtlety that will be hard to
explain.






* bug#44155: Print integers as characters
  2020-10-25 19:53                                                 ` Eli Zaretskii
@ 2020-10-27 20:08                                                   ` Juri Linkov
  2020-10-28 15:51                                                     ` Eli Zaretskii
  0 siblings, 1 reply; 109+ messages in thread
From: Juri Linkov @ 2020-10-27 20:08 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 44155, schwab

[-- Attachment #1: Type: text/plain, Size: 728 bytes --]

>> > And, btw, why not allow raw bytes here as well? is there some problem?
>>
>> Because of ambiguity, both these return the same value:
>>
>> (read (concat "?" (string 128)))     => 128
>> (read (concat "?" (string 4194176))) => 128
>
> And why is that a problem?

I don't know; Andreas remarked that it creates ambiguous output,
and I fixed the reported problem.

> Alternatively, we could print raw bytes in some special way.  But not
> treating them as characters sounds like a subtlety that will be hard to
> explain.

The existing 'prin1-char' used as a reference implementation
doesn't print integers like 4194176 as characters, so the patch
does the same.

Anyway, here is a complete patch with tests and documentation:


[-- Attachment #2: integer-output-format-3.patch --]
[-- Type: text/x-diff, Size: 4855 bytes --]

diff --git a/doc/lispref/streams.texi b/doc/lispref/streams.texi
index 2cd61ad04f..f171f13779 100644
--- a/doc/lispref/streams.texi
+++ b/doc/lispref/streams.texi
@@ -902,3 +902,11 @@ Output Variables
 in the C function @code{sprintf}.  For further restrictions on what
 you can use, see the variable's documentation string.
 @end defvar
+
+@defvar integer-output-format
+This variable specifies how to print integer numbers.  The default is
+@code{nil}, meaning use the decimal format.  When bound to @code{t},
+print integers as characters when an integer represents a character
+(@pxref{Basic Char Syntax}).  When bound to the number @code{16},
+print non-negative integers in the hexadecimal format.
+@end defvar
diff --git a/etc/NEWS b/etc/NEWS
index a77c1c883e..2f7d08ad08 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1631,6 +1631,12 @@ ledit.el, lmenu.el, lucid.el and old-whitespace.el.
 \f
 * Lisp Changes in Emacs 28.1
 
+** New variable 'integer-output-format' defines the format of integers.
+When this variable is bound to the value 't', integers are printed by
+printing functions as characters when an integer represents a character.
+When bound to the number 16, non-negative integers are printed in the
+hexadecimal format.
+
 +++
 ** 'define-globalized-minor-mode' now takes a :predicate parameter.
 This can be used to control which major modes the minor mode should be
diff --git a/src/print.c b/src/print.c
index 53aa353769..a5c56c6b48 100644
--- a/src/print.c
+++ b/src/print.c
@@ -1908,8 +1908,31 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
     {
     case_Lisp_Int:
       {
-	int len = sprintf (buf, "%"pI"d", XFIXNUM (obj));
-	strout (buf, len, len, printcharfun);
+        int c;
+        intmax_t i;
+
+        if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj)
+            && (c = XFIXNUM (obj)) && ! CHAR_BYTE8_P (c))
+          {
+            printchar ('?', printcharfun);
+            if (escapeflag
+                && (c == ';' || c == '(' || c == ')' || c == '{' || c == '}'
+                    || c == '[' || c == ']' || c == '\"' || c == '\'' || c == '\\'))
+              printchar ('\\', printcharfun);
+            printchar (c, printcharfun);
+          }
+        else if (INTEGERP (Vinteger_output_format)
+                 && integer_to_intmax (Vinteger_output_format, &i)
+                 && i == 16 && Fnatnump (obj))
+          {
+            int len = sprintf (buf, "#x%"pI"x", (EMACS_UINT) XFIXNUM (obj));
+            strout (buf, len, len, printcharfun);
+          }
+        else
+          {
+            int len = sprintf (buf, "%"pI"d", XFIXNUM (obj));
+            strout (buf, len, len, printcharfun);
+          }
       }
       break;
 
@@ -2247,6 +2270,13 @@ syms_of_print (void)
 that represents the number without losing information.  */);
   Vfloat_output_format = Qnil;
 
+  DEFVAR_LISP ("integer-output-format", Vinteger_output_format,
+	       doc: /* The format used to print integers.
+When t, print characters from integers that represent a character.
+When a number 16, print non-negative integers in the hexadecimal format.
+Otherwise, by default print integers in the decimal format.  */);
+  Vinteger_output_format = Qnil;
+
   DEFVAR_LISP ("print-length", Vprint_length,
 	       doc: /* Maximum length of list to print before abbreviating.
 A value of nil means no limit.  See also `eval-expression-print-length'.  */);
diff --git a/test/src/print-tests.el b/test/src/print-tests.el
index eb9572dbdf..7b026b6b21 100644
--- a/test/src/print-tests.el
+++ b/test/src/print-tests.el
@@ -383,5 +383,25 @@ print-hash-table-test
       (let ((print-length 1))
         (format "%S" h))))))
 
+(print-tests--deftest print-integer-output-format ()
+  ;; Bug#44155.
+  (let ((integer-output-format t)
+        (syms (list ?? ?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ ?Á)))
+    (should (equal (read (print-tests--prin1-to-string syms)) syms))
+    (should (equal (print-tests--prin1-to-string syms)
+                   (concat "(" (mapconcat #'prin1-char syms " ") ")"))))
+  (let ((integer-output-format t)
+        (syms (list -1 0 1 ?\120 4194175 4194176 (max-char) (1+ (max-char)))))
+    (should (equal (read (print-tests--prin1-to-string syms)) syms)))
+  (let ((integer-output-format 16)
+        (syms (list -1 0 1 most-positive-fixnum (1+ most-positive-fixnum))))
+    (should (equal (read (print-tests--prin1-to-string syms)) syms))
+    (should (equal (print-tests--prin1-to-string syms)
+                   (concat "(" (mapconcat
+                                (lambda (i)
+                                  (if (and (>= i 0) (<= i most-positive-fixnum))
+                                      (format "#x%x" i) (format "%d" i)))
+                                syms " ") ")")))))
+
 (provide 'print-tests)
 ;;; print-tests.el ends here


* bug#44155: Print integers as characters
  2020-10-27 20:08                                                   ` Juri Linkov
@ 2020-10-28 15:51                                                     ` Eli Zaretskii
  2020-10-28 19:41                                                       ` Juri Linkov
  0 siblings, 1 reply; 109+ messages in thread
From: Eli Zaretskii @ 2020-10-28 15:51 UTC (permalink / raw)
  To: Juri Linkov; +Cc: 44155, schwab

> From: Juri Linkov <juri@linkov.net>
> Cc: schwab@linux-m68k.org,  44155@debbugs.gnu.org
> Date: Tue, 27 Oct 2020 22:08:12 +0200
> 
> > Alternatively, we could print raw bytes in some special way.  But not
> > treating them as characters sounds some subtlety that will be hard to
> > explain.
> 
> The existing 'prin1-char' used as a reference implementation
> doesn't print integers like 4194176 as characters, so the patch
> does the same.

I don't think it's right, FWIW.  Displaying something like \100 would
be better, IMO.

> +@defvar integer-output-format
> +This variable specifies how to print integer numbers.  The default is
> +@code{nil}, meaning use the decimal format.  When bound to @code{t},
> +print integers as characters when an integer represents a character
> +(@pxref{Basic Char Syntax}).  When bound to the number @code{16},
> +print non-negative integers in the hexadecimal format.

This should mention the functions affected by the variable.

> +** New variable 'integer-output-format' defines the format of integers.
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
"determines how to print integer values"

> +When this variable is bound to the value 't', integers are printed by
> +printing functions as characters when an integer represents a character.

Please give at least one example of a function affected by this.

Thanks.






* bug#44155: Print integers as characters
  2020-10-28 15:51                                                     ` Eli Zaretskii
@ 2020-10-28 19:41                                                       ` Juri Linkov
  2020-10-29 14:20                                                         ` Eli Zaretskii
  0 siblings, 1 reply; 109+ messages in thread
From: Juri Linkov @ 2020-10-28 19:41 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 44155, schwab

[-- Attachment #1: Type: text/plain, Size: 1353 bytes --]

>> > Alternatively, we could print raw bytes in some special way.  But not
>> > treating them as characters sounds some subtlety that will be hard to
>> > explain.
>> 
>> The existing 'prin1-char' used as a reference implementation
>> doesn't print integers like 4194176 as characters, so the patch
>> does the same.
>
> I don't think it's right, FWIW.  Displaying something like \100 would
> be better, IMO.

Sorry, I don't understand why 4194176 could be printed as \100.

>> +@defvar integer-output-format
>> +This variable specifies how to print integer numbers.  The default is
>> +@code{nil}, meaning use the decimal format.  When bound to @code{t},
>> +print integers as characters when an integer represents a character
>> +(@pxref{Basic Char Syntax}).  When bound to the number @code{16},
>> +print non-negative integers in the hexadecimal format.
>
> This should mention the functions affected by the variable.
>
>> +** New variable 'integer-output-format' defines the format of integers.
>                                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> "determines how to print integer values"
>
>> +When this variable is bound to the value 't', integers are printed by
>> +printing functions as characters when an integer represents a character.
>
> Please give at least one example of a function affected by this.

Ok, fixed:


[-- Attachment #2: integer-output-format-4.patch --]
[-- Type: text/x-diff, Size: 4991 bytes --]

diff --git a/doc/lispref/streams.texi b/doc/lispref/streams.texi
index 2cd61ad04f..08d8032e6f 100644
--- a/doc/lispref/streams.texi
+++ b/doc/lispref/streams.texi
@@ -902,3 +902,12 @@ Output Variables
 in the C function @code{sprintf}.  For further restrictions on what
 you can use, see the variable's documentation string.
 @end defvar
+
+@defvar integer-output-format
+This variable specifies how to print integer numbers.  The default is
+@code{nil}, meaning use the decimal format.  When bound to @code{t},
+print integers as characters when an integer represents a character
+(@pxref{Basic Char Syntax}).  When bound to the number @code{16},
+print non-negative integers in the hexadecimal format.
+This variable affects all print functions.
+@end defvar
diff --git a/etc/NEWS b/etc/NEWS
index 5e159480e0..202e449b16 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1641,6 +1641,12 @@ ledit.el, lmenu.el, lucid.el and old-whitespace.el.
 \f
 * Lisp Changes in Emacs 28.1
 
+** New variable 'integer-output-format' determines how to print integer values.
+When this variable is bound to the value 't', integers are printed by
+printing functions as characters when an integer represents a character.
+When bound to the number 16, non-negative integers are printed in the
+hexadecimal format.
+
 +++
 ** 'define-globalized-minor-mode' now takes a :predicate parameter.
 This can be used to control which major modes the minor mode should be
diff --git a/src/print.c b/src/print.c
index 53aa353769..7b3dc61065 100644
--- a/src/print.c
+++ b/src/print.c
@@ -1908,8 +1908,31 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
     {
     case_Lisp_Int:
       {
-	int len = sprintf (buf, "%"pI"d", XFIXNUM (obj));
-	strout (buf, len, len, printcharfun);
+        int c;
+        intmax_t i;
+
+        if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj)
+            && (c = XFIXNUM (obj)) && ! CHAR_BYTE8_P (c))
+          {
+            printchar ('?', printcharfun);
+            if (escapeflag
+                && (c == ';' || c == '(' || c == ')' || c == '{' || c == '}'
+                    || c == '[' || c == ']' || c == '\"' || c == '\'' || c == '\\'))
+              printchar ('\\', printcharfun);
+            printchar (c, printcharfun);
+          }
+        else if (INTEGERP (Vinteger_output_format)
+                 && integer_to_intmax (Vinteger_output_format, &i)
+                 && i == 16 && Fnatnump (obj))
+          {
+            int len = sprintf (buf, "#x%"pI"x", (EMACS_UINT) XFIXNUM (obj));
+            strout (buf, len, len, printcharfun);
+          }
+        else
+          {
+            int len = sprintf (buf, "%"pI"d", XFIXNUM (obj));
+            strout (buf, len, len, printcharfun);
+          }
       }
       break;
 
@@ -2247,6 +2270,15 @@ syms_of_print (void)
 that represents the number without losing information.  */);
   Vfloat_output_format = Qnil;
 
+  DEFVAR_LISP ("integer-output-format", Vinteger_output_format,
+	       doc: /* The format used to print integers.
+When t, print characters from integers that represent a character.
+When a number 16, print non-negative integers in the hexadecimal format.
+Otherwise, by default print integers in the decimal format.
+This variable affects all print functions, for example, such function
+as `print'.  */);
+  Vinteger_output_format = Qnil;
+
   DEFVAR_LISP ("print-length", Vprint_length,
 	       doc: /* Maximum length of list to print before abbreviating.
 A value of nil means no limit.  See also `eval-expression-print-length'.  */);
diff --git a/test/src/print-tests.el b/test/src/print-tests.el
index eb9572dbdf..7b026b6b21 100644
--- a/test/src/print-tests.el
+++ b/test/src/print-tests.el
@@ -383,5 +383,25 @@ print-hash-table-test
       (let ((print-length 1))
         (format "%S" h))))))
 
+(print-tests--deftest print-integer-output-format ()
+  ;; Bug#44155.
+  (let ((integer-output-format t)
+        (syms (list ?? ?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ ?Á)))
+    (should (equal (read (print-tests--prin1-to-string syms)) syms))
+    (should (equal (print-tests--prin1-to-string syms)
+                   (concat "(" (mapconcat #'prin1-char syms " ") ")"))))
+  (let ((integer-output-format t)
+        (syms (list -1 0 1 ?\120 4194175 4194176 (max-char) (1+ (max-char)))))
+    (should (equal (read (print-tests--prin1-to-string syms)) syms)))
+  (let ((integer-output-format 16)
+        (syms (list -1 0 1 most-positive-fixnum (1+ most-positive-fixnum))))
+    (should (equal (read (print-tests--prin1-to-string syms)) syms))
+    (should (equal (print-tests--prin1-to-string syms)
+                   (concat "(" (mapconcat
+                                (lambda (i)
+                                  (if (and (>= i 0) (<= i most-positive-fixnum))
+                                      (format "#x%x" i) (format "%d" i)))
+                                syms " ") ")")))))
+
 (provide 'print-tests)
 ;;; print-tests.el ends here


* bug#44155: Print integers as characters
  2020-10-28 19:41                                                       ` Juri Linkov
@ 2020-10-29 14:20                                                         ` Eli Zaretskii
  2020-10-29 21:00                                                           ` Juri Linkov
  0 siblings, 1 reply; 109+ messages in thread
From: Eli Zaretskii @ 2020-10-29 14:20 UTC (permalink / raw)
  To: Juri Linkov; +Cc: 44155, schwab

> From: Juri Linkov <juri@linkov.net>
> Cc: schwab@linux-m68k.org,  44155@debbugs.gnu.org
> Date: Wed, 28 Oct 2020 21:41:46 +0200
> 
> >> The existing 'prin1-char' used as a reference implementation
> >> doesn't print integers like 4194176 as characters, so the patch
> >> does the same.
> >
> > I don't think it's right, FWIW.  Displaying something like \100 would
> > be better, IMO.
> 
> Sorry, I don't understand why 4194176 could be printed as \100.

I meant \200, sorry.  That's the raw byte that 4194176 stands for.
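
For readers following the numbers: the relation between a raw byte and its
character code can be sketched in C as below.  This is only an illustration.
The constants are how I recall them from Emacs's src/character.h and should
be treated as assumptions, although they agree with the values quoted in this
thread (4194175, 4194176 and 4194303).  The lower-case function names and
BYTE8_OFFSET are hypothetical.

```c
#include <stdbool.h>

/* Assumed constants; they match the numbers quoted in the thread. */
enum {
  MAX_5_BYTE_CHAR = 0x3FFF7F, /* 4194175: last character that is not a raw byte */
  MAX_CHAR = 0x3FFFFF,        /* 4194303: the value of (max-char) */
  BYTE8_OFFSET = 0x3FFF00     /* hypothetical name for the shift */
};

/* True if C is a character code that stands for a raw 8-bit byte. */
bool char_byte8_p(int c)
{
  return c > MAX_5_BYTE_CHAR && c <= MAX_CHAR;
}

/* Raw byte 0x80..0xFF -> its character code. */
int byte8_to_char(int byte)
{
  return byte + BYTE8_OFFSET;
}

/* Character code of a raw byte -> the byte itself. */
int char_to_byte8(int c)
{
  return c - BYTE8_OFFSET;
}
```

With these definitions, byte8_to_char (0200) is 4194176 and
char_to_byte8 (4194176) is 0200, i.e. 128.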






* bug#44155: Print integers as characters
  2020-10-29 14:20                                                         ` Eli Zaretskii
@ 2020-10-29 21:00                                                           ` Juri Linkov
  2020-10-30  7:35                                                             ` Eli Zaretskii
  0 siblings, 1 reply; 109+ messages in thread
From: Juri Linkov @ 2020-10-29 21:00 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 44155, schwab

[-- Attachment #1: Type: text/plain, Size: 482 bytes --]

>> >> The existing 'prin1-char' used as a reference implementation
>> >> doesn't print integers like 4194176 as characters, so the patch
>> >> does the same.
>> >
>> > I don't think it's right, FWIW.  Displaying something like \100 would
>> > be better, IMO.
>>
>> Sorry, I don't understand why 4194176 could be printed as \100.
>
> I meant \200, sorry.  That's the raw byte that 4194176 stands for.

OK, in this patch the condition !CHAR_BYTE8_P(c) is removed, so
it prints \200:


[-- Attachment #2: integer-output-format-4.patch --]
[-- Type: text/x-diff, Size: 2037 bytes --]

diff --git a/src/print.c b/src/print.c
index 53aa353769..20841eba61 100644
--- a/src/print.c
+++ b/src/print.c
@@ -1908,8 +1908,31 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
     {
     case_Lisp_Int:
       {
-	int len = sprintf (buf, "%"pI"d", XFIXNUM (obj));
-	strout (buf, len, len, printcharfun);
+        int c;
+        intmax_t i;
+
+        if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj)
+            && (c = XFIXNUM (obj)))
+          {
+            printchar ('?', printcharfun);
+            if (escapeflag
+                && (c == ';' || c == '(' || c == ')' || c == '{' || c == '}'
+                    || c == '[' || c == ']' || c == '\"' || c == '\'' || c == '\\'))
+              printchar ('\\', printcharfun);
+            printchar (c, printcharfun);
+          }
+        else if (INTEGERP (Vinteger_output_format)
+                 && integer_to_intmax (Vinteger_output_format, &i)
+                 && i == 16 && !NILP (Fnatnump (obj)))
+          {
+            int len = sprintf (buf, "#x%"pI"x", (EMACS_UINT) XFIXNUM (obj));
+            strout (buf, len, len, printcharfun);
+          }
+        else
+          {
+            int len = sprintf (buf, "%"pI"d", XFIXNUM (obj));
+            strout (buf, len, len, printcharfun);
+          }
       }
       break;
 
@@ -2247,6 +2270,13 @@ syms_of_print (void)
 that represents the number without losing information.  */);
   Vfloat_output_format = Qnil;
 
+  DEFVAR_LISP ("integer-output-format", Vinteger_output_format,
+	       doc: /* The format used to print integers.
+When t, print characters from integers that represent a character.
+When a number 16, print non-negative integers in the hexadecimal format.
+Otherwise, by default print integers in the decimal format.  */);
+  Vinteger_output_format = Qnil;
+
   DEFVAR_LISP ("print-length", Vprint_length,
 	       doc: /* Maximum length of list to print before abbreviating.
 A value of nil means no limit.  See also `eval-expression-print-length'.  */);


* bug#44155: Print integers as characters
  2020-10-29 21:00                                                           ` Juri Linkov
@ 2020-10-30  7:35                                                             ` Eli Zaretskii
  2020-10-31 20:11                                                               ` Juri Linkov
  0 siblings, 1 reply; 109+ messages in thread
From: Eli Zaretskii @ 2020-10-30  7:35 UTC (permalink / raw)
  To: Juri Linkov; +Cc: 44155, schwab

> From: Juri Linkov <juri@linkov.net>
> Cc: schwab@linux-m68k.org,  44155@debbugs.gnu.org
> Date: Thu, 29 Oct 2020 23:00:48 +0200
> 
> > I meant \200, sorry.  That's the raw byte that 4194176 stands for.
> 
> OK, in this patch the condition !CHAR_BYTE8_P(c) is removed, so
> it prints \200:

Thanks.






* bug#44155: Print integers as characters
  2020-10-30  7:35                                                             ` Eli Zaretskii
@ 2020-10-31 20:11                                                               ` Juri Linkov
  2020-10-31 23:27                                                                 ` Glenn Morris
  0 siblings, 1 reply; 109+ messages in thread
From: Juri Linkov @ 2020-10-31 20:11 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 44155, schwab

tags 44155 fixed
close 44155 28.0.50
quit

>> > I meant \200, sorry.  That's the raw byte that 4194176 stands for.
>> 
>> OK, in this patch the condition !CHAR_BYTE8_P(c) is removed, so
>> it prints \200:
>
> Thanks.

Now pushed to master and closed.






* bug#44155: Print integers as characters
  2020-10-31 20:11                                                               ` Juri Linkov
@ 2020-10-31 23:27                                                                 ` Glenn Morris
  2020-11-01  7:58                                                                   ` Juri Linkov
  0 siblings, 1 reply; 109+ messages in thread
From: Glenn Morris @ 2020-10-31 23:27 UTC (permalink / raw)
  To: Juri Linkov; +Cc: 44155, schwab


New test fails on some systems.
Ref: https://hydra.nixos.org/build/129474379
Reproduced on CentOS 8.2.

Test print-integer-output-format condition:
    (ert-test-failed
     ((should
       (equal
        (read ...)
        syms))
      :form
      (equal
       (-1 0 1 80 4194175 128 255 4194304)
       (-1 0 1 80 4194175 4194176 4194303 4194304))
      :value nil :explanation
      (list-elt 5
                (different-atoms
                 (128 "#x80" "?")
                 (4194176 "#x3fff80" "?\200")))))
   FAILED  19/39  print-integer-output-format (0.002202 sec)






* bug#44155: Print integers as characters
  2020-10-31 23:27                                                                 ` Glenn Morris
@ 2020-11-01  7:58                                                                   ` Juri Linkov
  2020-11-01 15:13                                                                     ` Eli Zaretskii
  0 siblings, 1 reply; 109+ messages in thread
From: Juri Linkov @ 2020-11-01  7:58 UTC (permalink / raw)
  To: Glenn Morris; +Cc: 44155, schwab

> New test fails on some systems.
>
>       (equal
>        (-1 0 1 80 4194175 128 255 4194304)
>        (-1 0 1 80 4194175 4194176 4194303 4194304))
>       :value nil :explanation
>       (list-elt 5
>                 (different-atoms
>                  (128 "#x80" "?")
>                  (4194176 "#x3fff80" "?\200")))))

This is because 4194176 is printed as ?\200 that is parsed as 128.

This patch should fix test failures by printing integers
for ambiguous characters.  I'm sure no user would complain
that numbers between 4194176 and 4194303 are printed as integers.
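
The round-trip failure can be reproduced in isolation with a toy model.
The sketch below is not the Emacs printer or reader: toy_print and toy_read
are hypothetical stand-ins, and the raw-byte range is the one implied by the
numbers above.  The only behaviour it assumes from the reader is that an
octal escape yields the octal value itself, which matches the test output.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layout: raw bytes 0x80..0xFF occupy 0x3FFF80..0x3FFFFF. */
enum {
  RAW_BYTE_FIRST = 0x3FFF80, /* 4194176 */
  RAW_BYTE_LAST = 0x3FFFFF,  /* 4194303 */
  RAW_BYTE_OFFSET = 0x3FFF00
};

/* Toy printer: write N into BUF.  Raw-byte characters become "?\ooo"
   unless SKIP_RAW_BYTES is set; everything else is printed in decimal. */
void toy_print(int n, bool skip_raw_bytes, char *buf, size_t size)
{
  if (!skip_raw_bytes && n >= RAW_BYTE_FIRST && n <= RAW_BYTE_LAST)
    snprintf(buf, size, "?\\%o", n - RAW_BYTE_OFFSET);
  else
    snprintf(buf, size, "%d", n);
}

/* Toy reader: "?\ooo" yields the octal value; anything else is decimal. */
int toy_read(const char *s)
{
  if (strncmp(s, "?\\", 2) == 0)
    return (int) strtol(s + 2, NULL, 8);
  return (int) strtol(s, NULL, 10);
}

/* Does N survive a print followed by a read? */
bool round_trips(int n, bool skip_raw_bytes)
{
  char buf[32];
  toy_print(n, skip_raw_bytes, buf, sizeof buf);
  return toy_read(buf) == n;
}
```

In this model round_trips (4194176, false) is false, because the text "?\200"
reads back as 128, while round_trips (4194176, true) is true.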

diff --git a/src/print.c b/src/print.c
index fa65a3cb26..49daf753bd 100644
--- a/src/print.c
+++ b/src/print.c
@@ -1912,7 +1912,7 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
 	intmax_t i;
 
 	if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj)
-	    && (c = XFIXNUM (obj)))
+	    && (c = XFIXNUM (obj)) && ! CHAR_BYTE8_P (c))
 	  {
 	    printchar ('?', printcharfun);
 	    if (escapeflag






* bug#44155: Print integers as characters
  2020-10-22 20:56                                     ` bug#44155: Print integers as characters Juri Linkov
  2020-10-22 22:39                                       ` Andreas Schwab
@ 2020-11-01 12:03                                       ` Mattias Engdegård
  2020-11-01 18:35                                         ` Juri Linkov
  1 sibling, 1 reply; 109+ messages in thread
From: Mattias Engdegård @ 2020-11-01 12:03 UTC (permalink / raw)
  To: Juri Linkov, Eli Zaretskii, Andreas Schwab; +Cc: 44155

reopen 44155
stop

I don't mind the basic idea, but I'm reopening the bug since it looks like there is some unfinished business. Hope you don't mind.

> When t, print characters from integers that represent a character.

In what way does 't' suggest a character? Wouldn't something like 'character' be more suggestive?
The variable isn't named 'print-integers-as-chars'.

> When a number 16, print non-negative integers in the hexadecimal format.

Doesn't work for bignums:

(let ((integer-output-format 16))
  (print 394583945873948753948539845))

394583945873948753948539845

This must be a bug since there is no reason why bignums should be treated specially. In general we try hard not to.

Since there is a read syntax for binary and octal numbers as well, why not permit 2 and 8?
(And why not print negative numbers in the selected radix?)

And C0/C1 controls aren't printed well:

(let ((integer-output-format t))
  (print 10)
  (print 127))

?

?\x7f^?

I strongly suggest that the controls that have special escapes, like \n, use them. What to use for the rest depends on the user's preference really -- for example, 31 might be printed as 31, ?\037, #o37 or #x1f.

Whether to print 32 as ?‹SPACE› or ?\s is a matter of taste.

For that matter, the variable name should perhaps start with 'print-' like other variables that control printing. Maybe we should separate the default radix and print integers as characters? Thus, we'd have:

print-integer-radix -- 2, 8, 16, 10 or nil (which means 10)

print-integers-as-characters -- nil or t
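
As a rough illustration of what a radix setting could produce, here is a
minimal C sketch that formats an integer with the #b, #o and #x prefixes.
The function name is hypothetical, it handles only fixed-width integers (so
it says nothing about bignums), and it assumes the Lisp reader accepts a sign
after the radix prefix, as in #x-1f, which I believe it does.

```c
#include <stddef.h>
#include <stdio.h>

/* Write N into BUF using Lisp read syntax for RADIX (2, 8, 10 or 16).
   Return 0 on success, -1 for an unsupported radix or a short buffer. */
int format_integer_radix(long long n, int radix, char *buf, size_t size)
{
  const char *prefix;
  switch (radix) {
    case 2: prefix = "#b"; break;
    case 8: prefix = "#o"; break;
    case 10: prefix = ""; break;
    case 16: prefix = "#x"; break;
    default: return -1;
  }

  /* Take the magnitude in unsigned arithmetic so that the most
     negative value does not overflow. */
  unsigned long long mag =
    n < 0 ? 0ULL - (unsigned long long) n : (unsigned long long) n;

  /* Build the digits from the least significant end. */
  char digits[65]; /* up to 64 binary digits plus the terminator */
  size_t i = sizeof digits - 1;
  digits[i] = '\0';
  do {
    digits[--i] = "0123456789abcdef"[mag % (unsigned) radix];
    mag /= (unsigned) radix;
  } while (mag != 0);

  int len = snprintf(buf, size, "%s%s%s", prefix, n < 0 ? "-" : "", digits + i);
  return (len < 0 || (size_t) len >= size) ? -1 : 0;
}
```

For example, 31 with radix 16 gives "#x1f", 31 with radix 8 gives "#o37", and
-31 with radix 16 gives "#x-1f".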


* bug#44155: Print integers as characters
  2020-11-01  7:58                                                                   ` Juri Linkov
@ 2020-11-01 15:13                                                                     ` Eli Zaretskii
  2020-11-01 18:39                                                                       ` Juri Linkov
  0 siblings, 1 reply; 109+ messages in thread
From: Eli Zaretskii @ 2020-11-01 15:13 UTC (permalink / raw)
  To: Juri Linkov; +Cc: 44155, rgm, schwab

> From: Juri Linkov <juri@linkov.net>
> Cc: Eli Zaretskii <eliz@gnu.org>,  44155@debbugs.gnu.org,
>   schwab@linux-m68k.org
> Date: Sun, 01 Nov 2020 09:58:25 +0200
> 
> > New test fails on some systems.
> >
> >       (equal
> >        (-1 0 1 80 4194175 128 255 4194304)
> >        (-1 0 1 80 4194175 4194176 4194303 4194304))
> >       :value nil :explanation
> >       (list-elt 5
> >                 (different-atoms
> >                  (128 "#x80" "?")
> >                  (4194176 "#x3fff80" "?\200")))))
> 
> This is because 4194176 is printed as ?\200 that is parsed as 128.
> 
> This patch should fix test failures by printing integers
> for ambiguous characters.  I'm sure no user would complain
> that numbers between 4194176 and 4194303 are printed as integers.
> 
> diff --git a/src/print.c b/src/print.c
> index fa65a3cb26..49daf753bd 100644
> --- a/src/print.c
> +++ b/src/print.c
> @@ -1912,7 +1912,7 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
>  	intmax_t i;
>  
>  	if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj)
> -	    && (c = XFIXNUM (obj)))
> +	    && (c = XFIXNUM (obj)) && ! CHAR_BYTE8_P (c))
>  	  {
>  	    printchar ('?', printcharfun);
>  	    if (escapeflag

If a test fails, it is better to fix the test and not make the code
less powerful, don't you agree?

To produce 4194176 from ?\200, one way is this:

  (decode-char 'eight-bit ?\200)

Can't this be used in the test?






* bug#44155: Print integers as characters
  2020-11-01 12:03                                       ` Mattias Engdegård
@ 2020-11-01 18:35                                         ` Juri Linkov
  2020-11-01 20:52                                           ` Mattias Engdegård
  0 siblings, 1 reply; 109+ messages in thread
From: Juri Linkov @ 2020-11-01 18:35 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: 44155, Andreas Schwab

> reopen 44155
> stop
>
> I don't mind the basic idea, but I'm reopening the bug since it looks
> like there is some unfinished business.  Hope you don't mind.

Thanks for bringing a fresh perspective to this feature request.

>> When t, print characters from integers that represent a character.
>
> In what way does 't' suggest a character? Wouldn't something like 'character' be more suggestive?
> The variable isn't named 'print-integers-as-chars'.

As the most frequent usage pattern, 't' is more convenient to use in code:

  (let ((integer-output-format t))

whereas this would be uglier and harder to type:

  (let ((integer-output-format 'character))

>> When a number 16, print non-negative integers in the hexadecimal format.
>
> Doesn't work for bignums:
>
> (let ((integer-output-format 16))
>   (print 394583945873948753948539845))
>
> 394583945873948753948539845

Yes, this is a known current limitation.

> This must be a bug since there is no reason why bignums should be treated specially.
> In general we try hard not to.

I agree, support for big numbers should be added as well.

> Since there is a read syntax for binary and octal numbers as well, why not permit 2 and 8?
> (And why not print negative numbers in the selected radix?)

2 and 8 could be added as well.

> And C0/C1 controls aren't printed well:
>
> (let ((integer-output-format t))
>   (print 10)
>   (print 127))
>
> ?
>
>
> ?\x7f^?
>
> I strongly suggest that the controls that have special escapes, like
> \n, use them.

prin1-char uses a more readable format; is this better?

(prin1-char 10) ?\C-j
(prin1-char 127) ?\C-?

Or should 10 be printed as '?\n'?

> What to use for the rest depends on the user's preference really --
> for example, 31 might be printed as 31, ?\037, #o37 or #x1f.

Maybe more user choices should be supported by the variable?

> Whether to print 32 as ?‹SPACE› or ?\s is a matter of taste.

?\s is less error-prone.

> For that matter, the variable name should perhaps start with 'print-'
> like other variables that control printing.  Maybe we should separate
> the default radix and print integers as characters?  Thus, we'd have:

The variable name was modeled after the similar variable float-output-format.

> print-integer-radix -- 2, 8, 16, 10 or nil (which means 10)
>
> print-integers-as-characters -- nil or t

What should be printed when both variables are bound to non-default values,
e.g. print-integers-as-characters to t, and print-integer-radix to 16?
Maybe print with character syntax and the given radix, e.g. '?\x1f'.






* bug#44155: Print integers as characters
  2020-11-01 15:13                                                                     ` Eli Zaretskii
@ 2020-11-01 18:39                                                                       ` Juri Linkov
  2020-11-01 18:51                                                                         ` Eli Zaretskii
  0 siblings, 1 reply; 109+ messages in thread
From: Juri Linkov @ 2020-11-01 18:39 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 44155, rgm, schwab

>> This is because 4194176 is printed as ?\200 that is parsed as 128.
>> 
>> This patch should fix test failures by printing integers
>> for ambiguous characters.  I'm sure no user would complain
>> that numbers between 4194176 and 4194303 are printed as integers.
>> 
>>  	if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj)
>> -	    && (c = XFIXNUM (obj)))
>> +	    && (c = XFIXNUM (obj)) && ! CHAR_BYTE8_P (c))
>
> If a test fails, it is better to fix the test and not make the code
> less powerful, don't you agree?

This means sweeping the problems under the carpet.

> To produce 4194176 from ?\200, one way is this:
>
>   (decode-char 'eight-bit ?\200)
>
> Can't this be used in the test?

Using this code in tests means that the users should use the same code
in their programs.  Thus 'print' should print '(33 4194176) as such ugly code:
`(?! ,(decode-char 'eight-bit ?\200))






* bug#44155: Print integers as characters
  2020-11-01 18:39                                                                       ` Juri Linkov
@ 2020-11-01 18:51                                                                         ` Eli Zaretskii
  2020-11-01 19:13                                                                           ` Juri Linkov
  0 siblings, 1 reply; 109+ messages in thread
From: Eli Zaretskii @ 2020-11-01 18:51 UTC (permalink / raw)
  To: Juri Linkov; +Cc: 44155, rgm, schwab

> From: Juri Linkov <juri@linkov.net>
> Cc: rgm@gnu.org,  44155@debbugs.gnu.org,  schwab@linux-m68k.org
> Date: Sun, 01 Nov 2020 20:39:48 +0200
> 
> >>  	if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj)
> >> -	    && (c = XFIXNUM (obj)))
> >> +	    && (c = XFIXNUM (obj)) && ! CHAR_BYTE8_P (c))
> >
> > If a test fails, it is better to fix the test and not make the code
> > less powerful, don't you agree?
> 
> This means sweeping the problems under the carpet.

Which problem?

> >   (decode-char 'eight-bit ?\200)
> >
> > Can't this be used in the test?
> 
> Using this code in tests means that the users should use the same code
> in their programs.

Why would they need to do that?  The test needs it because it wants to
verify the result, but "normal" programs don't need to read back the
values they printed.

> Thus 'print' should print '(33 4194176) as such ugly code:
> `(?! ,(decode-char 'eight-bit ?\200))

I don't see why.  ?\200 and 4194176 are two forms of the same
character.






* bug#44155: Print integers as characters
  2020-11-01 18:51                                                                         ` Eli Zaretskii
@ 2020-11-01 19:13                                                                           ` Juri Linkov
  2020-11-01 19:41                                                                             ` Eli Zaretskii
  0 siblings, 1 reply; 109+ messages in thread
From: Juri Linkov @ 2020-11-01 19:13 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 44155, rgm, schwab

>> >>  	if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj)
>> >> -	    && (c = XFIXNUM (obj)))
>> >> +	    && (c = XFIXNUM (obj)) && ! CHAR_BYTE8_P (c))
>> >
>> > If a test fails, it is better to fix the test and not make the code
>> > less powerful, don't you agree?
>> 
>> This means sweeping the problems under the carpet.
>
> Which problem?

Problem of ambiguous numbers 128 and 4194176 that are both printed as ?\200.

>> >   (decode-char 'eight-bit ?\200)
>> >
>> > Can't this be used in the test?
>> 
>> Using this code in tests means that the users should use the same code
>> in their programs.
>
> Why would they need to do that?  The test needs it because it wants to
> verify the result, but "normal" programs don't need to read back the
> values they printed.

Programs print lists of characters, and other programs read them.

>> Thus 'print' should print '(33 4194176) as such ugly code:
>> `(?! ,(decode-char 'eight-bit ?\200))
>
> I don't see why.  ?\200 and 4194176 are two forms of the same
> character.

?\200 and 128 are two forms of the same character too.






* bug#44155: Print integers as characters
  2020-11-01 19:13                                                                           ` Juri Linkov
@ 2020-11-01 19:41                                                                             ` Eli Zaretskii
  2020-11-01 20:16                                                                               ` Juri Linkov
  0 siblings, 1 reply; 109+ messages in thread
From: Eli Zaretskii @ 2020-11-01 19:41 UTC (permalink / raw)
  To: Juri Linkov; +Cc: 44155, rgm, schwab

> From: Juri Linkov <juri@linkov.net>
> Cc: rgm@gnu.org,  44155@debbugs.gnu.org,  schwab@linux-m68k.org
> Date: Sun, 01 Nov 2020 21:13:03 +0200
> 
> >> >>  	if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj)
> >> >> -	    && (c = XFIXNUM (obj)))
> >> >> +	    && (c = XFIXNUM (obj)) && ! CHAR_BYTE8_P (c))
> >> >
> >> > If a test fails, it is better to fix the test and not make the code
> >> > less powerful, don't you agree?
> >> 
> >> This means sweeping the problems under the carpet.
> >
> > Which problem?
> 
> Problem of ambiguous numbers 128 and 4194176 that are both printed as ?\200.

Octal escapes are generally a sign of a raw byte.  This is not
different from buffer display -- how do you know what ?\200 means
inside buffer text?

> ?\200 and 128 are two forms of the same character too.

See my question above.  I don't think what you say is true.






* bug#44155: Print integers as characters
  2020-11-01 19:41                                                                             ` Eli Zaretskii
@ 2020-11-01 20:16                                                                               ` Juri Linkov
  0 siblings, 0 replies; 109+ messages in thread
From: Juri Linkov @ 2020-11-01 20:16 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 44155, rgm, schwab

>> Problem of ambiguous numbers 128 and 4194176 that are both printed as ?\200.
>
> Octal escapes are generally a sign of a raw byte.  This is not
> different from buffer display -- how do you know what ?\200 means
> inside buffer text?
>
>> ?\200 and 128 are two forms of the same character too.
>
> See my question above.  I don't think what you say is true.

Typing 'C-x C-e' after ?\200 displays: 128 (#o200, #x80, ?\x80)






* bug#44155: Print integers as characters
  2020-11-01 18:35                                         ` Juri Linkov
@ 2020-11-01 20:52                                           ` Mattias Engdegård
  2020-11-02 21:36                                             ` Juri Linkov
  0 siblings, 1 reply; 109+ messages in thread
From: Mattias Engdegård @ 2020-11-01 20:52 UTC (permalink / raw)
  To: Juri Linkov; +Cc: 44155, Andreas Schwab

On 1 Nov 2020, at 19:35, Juri Linkov <juri@linkov.net> wrote:

> Thanks for bringing a fresh perspective to this feature request.

You are very gracious. The devil is in the details, as always!

> (prin1-char 10) ?\C-j
> (prin1-char 127) ?\C-?
> 
> Or should 10 be printed as '?\n'?

Yes, I think ?\n is more useful. As a character, 10 is more commonly thought of as newline than as control-j.

>> What to use for the rest depends on the user's preference really --
>> for example, 31 might be printed as 31, ?\037, #o37 or #x1f.
> 
> Maybe more user choices should be supported by the variable?

Maybe, but only if we can identify sensible such choices. Otherwise we should just try to pick the best representation in each case. Giving users too much choice isn't necessarily doing them a favour!

I'd suggest plain number syntax for control characters without named escapes, for several reasons:

* Such numbers are less likely to represent characters and more likely to be, well, numbers.
* It would allow a separate radix control to govern their output format.
* Writing ?\x1f is no clearer than #x1f, and sometimes more confusing: \xff is a raw byte in a string, but ?\xff is always 255.

Thus we would have 10 -> ?\n, 13 -> ?\r, 127 -> ?\d, 65 -> ?A, 255 -> ?ÿ, but 31 -> 31, 129 -> 129, 4194303 -> 4194303.
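
To make the rule concrete, here is a rough standalone sketch in C. It is not the print.c code: the function names are invented for illustration, only a subset of the named escapes is listed, the backslash quoting of characters such as '(' and ';' is left out, and excluding surrogates is my own assumption to keep the UTF-8 output valid.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return the letter of the named escape for I, or 0 if it has none.
   Only a subset of the escapes is listed in this sketch.  */
static char
named_escape_sketch (long i)
{
  switch (i)
    {
    case '\n': return 'n';
    case '\r': return 'r';
    case '\t': return 't';
    case ' ':  return 's';
    case 127:  return 'd';
    }
  return 0;
}

/* Encode code point C as UTF-8 into OUT; return the byte count.  */
static size_t
utf8_encode_sketch (long c, char *out)
{
  if (c < 0x80)
    {
      out[0] = (char) c;
      return 1;
    }
  if (c < 0x800)
    {
      out[0] = (char) (0xC0 | (c >> 6));
      out[1] = (char) (0x80 | (c & 0x3F));
      return 2;
    }
  if (c < 0x10000)
    {
      out[0] = (char) (0xE0 | (c >> 12));
      out[1] = (char) (0x80 | ((c >> 6) & 0x3F));
      out[2] = (char) (0x80 | (c & 0x3F));
      return 3;
    }
  out[0] = (char) (0xF0 | (c >> 18));
  out[1] = (char) (0x80 | ((c >> 12) & 0x3F));
  out[2] = (char) (0x80 | ((c >> 6) & 0x3F));
  out[3] = (char) (0x80 | (c & 0x3F));
  return 4;
}

/* Write the suggested representation of I into BUF and return BUF.
   Named escapes and non-control characters use ?X syntax; everything
   else (other controls, raw bytes, non-characters) stays a number.  */
static char *
format_integer_sketch (long i, char *buf, size_t size)
{
  assert (buf != NULL && size >= 24);
  char esc = named_escape_sketch (i);
  if (esc)
    snprintf (buf, size, "?\\%c", esc);
  else if ((i > 32 && i < 127)
           || (i >= 0xA0 && i <= 0x10FFFF
               && !(i >= 0xD800 && i <= 0xDFFF)))
    {
      buf[0] = '?';
      size_t n = utf8_encode_sketch (i, buf + 1);
      buf[1 + n] = '\0';
    }
  else
    snprintf (buf, size, "%ld", i);
  assert (strlen (buf) < size);
  return buf;
}
```

With a 32-byte buffer, 65 comes out as ?A and 10 as ?\n, while 31, 129 and 4194303 remain plain decimal numbers.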

>> Whether to print 32 as ?‹SPACE› or ?\s is a matter of taste.
> 
> ?\s is less error-prone.

Yes, I agree. (I prefer ?\s or 32 as characters, but " " in strings.)

>> For that matter, the variable name should perhaps start with 'print-'
>> like other variables that control printing.  Maybe we should separate
>> the default radix and print integers as characters?  Thus, we'd have:
> 
> The variable name was modeled after the similar variable float-output-format.

I see, interesting! One possibility would be to use a string in the same way, thus "%x", "%c" etc, but it makes less sense for integers than floating-point: no precision field, and many format alternatives such as %#x do not produce valid Lisp read syntax. Better keep it simple.

>> print-integer-radix -- 2, 8, 16, 10 or nil (which means 10)
>> 
>> print-integers-as-characters -- nil or t
> 
> What should be printed when both variables are bound to non-default values,
> e.g. print-integers-as-characters to t, and print-integer-radix to 16?
> Maybe to print with character syntax and the given radix, e.g. '?\x1f'.

Well, it should clearly use character syntax for printable characters and the given radix for non-characters. As you correctly point out, what to use for non-printable characters (C0 and C1 controls, raw bytes) is less obvious. I'd probably just use the given radix; I see no readability advantage in printing ?\x1f over #x1f.
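
As a rough illustration of how the two settings could combine, here is a standalone C sketch. Everything in it is hypothetical: the function name, the ASCII-only notion of "printable", the restriction of the radix to non-negative numbers, and the support for radix 8 and 16 only (anything else falls back to decimal). Backslash quoting of characters such as '(' is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Write I into BUF and return BUF.  Printable ASCII uses ?X syntax
   when AS_CHARS is set; other non-negative numbers use RADIX.  */
static char *
format_with_radix_sketch (long i, int radix, bool as_chars,
                          char *buf, size_t size)
{
  assert (buf != NULL && size >= 24);
  if (as_chars && i > 32 && i < 127)
    snprintf (buf, size, "?%c", (char) i);
  else if (i >= 0 && radix == 16)
    snprintf (buf, size, "#x%lx", (unsigned long) i);
  else if (i >= 0 && radix == 8)
    snprintf (buf, size, "#o%lo", (unsigned long) i);
  else
    snprintf (buf, size, "%ld", i);
  assert (strlen (buf) < size);
  return buf;
}
```

With both settings active, 65 prints as ?A and 31 as #x1f; with character printing off, 65 prints as #x41.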

Since your original motivation was to print characters in pretty-printed nested Lisp expressions, perhaps we should just define print-integers-as-characters as a Boolean and skip the radix for the time being? We could add a print radix control later on if desired. (That would save us the hassle of dealing with bignums, for that matter.)

^ permalink raw reply	[flat|nested] 109+ messages in thread

* bug#44155: Print integers as characters
  2020-11-01 20:52                                           ` Mattias Engdegård
@ 2020-11-02 21:36                                             ` Juri Linkov
  2020-11-02 23:03                                               ` Mattias Engdegård
  0 siblings, 1 reply; 109+ messages in thread
From: Juri Linkov @ 2020-11-02 21:36 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: 44155, Andreas Schwab

> Thus we would have 10 -> ?\n, 13 -> ?\r, 127 -> ?\d, 65 -> ?A,
> 255 -> ?ÿ, but 31 -> 31, 129 -> 129, 4194303 -> 4194303.

Hopefully, printing some characters as numbers will fix
the currently broken test.

> Since your original motivation was to print characters in pretty-printed
> nested Lisp expressions, perhaps we should just define
> print-integers-as-characters as a Boolean and skip the radix for the time
> being? We could add a print radix control later on if desired. (That would
> save us the hassle to deal with bignums, for that matter.)

This was my intention - to start with something simple that does only
what was needed (to print integers as characters), then extend it later
when a need such as printing hex numbers arises.  I added hex numbers only
as proof that the variable integer-output-format is extensible enough
to support more formats in the future.

But as you point out, this is achievable by adding another variable like
print-integer-radix.

PS: I noticed an inconsistency in these names: "integer" in print-integer-radix
is singular, but "integers" in print-integers-as-characters is plural.

^ permalink raw reply	[flat|nested] 109+ messages in thread

* bug#44155: Print integers as characters
  2020-11-02 21:36                                             ` Juri Linkov
@ 2020-11-02 23:03                                               ` Mattias Engdegård
  2020-11-03  8:30                                                 ` Juri Linkov
  2020-11-03 15:24                                                 ` Eli Zaretskii
  0 siblings, 2 replies; 109+ messages in thread
From: Mattias Engdegård @ 2020-11-02 23:03 UTC (permalink / raw)
  To: Juri Linkov; +Cc: 44155, Andreas Schwab

[-- Attachment #1: Type: text/plain, Size: 970 bytes --]

2 nov. 2020 kl. 22.36 skrev Juri Linkov <juri@linkov.net>:
> 
>> Thus we would have 10 -> ?\n, 13 -> ?\r, 127 -> ?\d, 65 -> ?A,
>> 255 -> ?ÿ, but 31 -> 31, 129 -> 129, 4194303 -> 4194303.
> 
> Hopefully, printing some characters as numbers will fix
> the currently broken test.

It does! Here is a proposed patch. We could add a separate radix control later if you like.

One detail that I'm undecided about is whether to remove the more obscure control escapes \f, \a, \v, \e and \d, on the grounds that they are less likely to be used as actual characters and that users may prefer to see them as numbers instead. C, and most languages that inherit their escapes from C, lack \e and \d; \f and \a are rare today, and \v is an anachronism.

> PS: I noticed an inconsistency in these names: "integer" in print-integer-radix
> is singular, but "integers" in print-integers-as-characters is plural.

Actually, 'integer' in 'integer radix' plays the part of an adjective!


[-- Attachment #2: 0001-Reduce-integer-output-format-to-print-integers-as-ch.patch --]
[-- Type: application/octet-stream, Size: 9171 bytes --]

From 0dc27757cd53bca3e05c93f29ca96d0845a50ec2 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org>
Date: Mon, 2 Nov 2020 23:37:16 +0100
Subject: [PATCH] Reduce integer-output-format to print-integers-as-characters

The variable now only controls whether characters are printed, not
the radix.  Control chars are printed in human-readable syntax
such as ?\n if available, as numbers otherwise (bug#44155).
Done in collaboration with Juri Linkov.

* src/print.c (named_escape): New function.
(print_object): Change semantics as described above.
(syms_of_print): Rename integer-output-format.  Update doc string.
* doc/lispref/streams.texi (Output Variables):
* etc/NEWS:
* test/src/print-tests.el (print-integers-as-characters):
Rename and update according to new semantics.  The test now passes.
---
 doc/lispref/streams.texi | 13 ++++----
 etc/NEWS                 | 11 ++++---
 src/print.c              | 65 ++++++++++++++++++++++++++--------------
 test/src/print-tests.el  | 34 ++++++++++-----------
 4 files changed, 71 insertions(+), 52 deletions(-)

diff --git a/doc/lispref/streams.texi b/doc/lispref/streams.texi
index f171f13779..4bc97e4c48 100644
--- a/doc/lispref/streams.texi
+++ b/doc/lispref/streams.texi
@@ -903,10 +903,11 @@ Output Variables
 you can use, see the variable's documentation string.
 @end defvar
 
-@defvar integer-output-format
-This variable specifies how to print integer numbers.  The default is
-@code{nil}, meaning use the decimal format.  When bound to @code{t},
-print integers as characters when an integer represents a character
-(@pxref{Basic Char Syntax}).  When bound to the number @code{16},
-print non-negative integers in the hexadecimal format.
+@defvar print-integers-as-characters
+When this variable is non-@code{nil}, integers that represent
+printable characters or control characters with their own escape
+syntax such as newline will be printed using Lisp character syntax
+(@pxref{Basic Char Syntax}).  Other numbers are printed the usual way.
+For example, the list @code{(4 65 -1 10)} will be printed as
+@samp{(4 ?A -1 ?\n)}.
 @end defvar
diff --git a/etc/NEWS b/etc/NEWS
index e11effc9e8..810d6794f2 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1689,12 +1689,6 @@ ledit.el, lmenu.el, lucid.el and old-whitespace.el.
 \f
 * Lisp Changes in Emacs 28.1
 
-** New variable 'integer-output-format' determines how to print integer values.
-When this variable is bound to the value 't', integers are printed by
-printing functions as characters when an integer represents a character.
-When bound to the number 16, non-negative integers are printed in the
-hexadecimal format.
-
 +++
 ** 'define-globalized-minor-mode' now takes a ':predicate' parameter.
 This can be used to control which major modes the minor mode should be
@@ -1887,6 +1881,11 @@ file can affect code in another.  For details, see the manual section
 'replace-regexp-in-string', 'catch', 'throw', 'error', 'signal'
 and 'play-sound-file'.
 
++++
+** New variable 'print-integers-as-characters' modifies integer printing.
+When this variable is non-nil, integers representing characters are
+printed using Lisp character syntax, such as '?*' for 42.
+
 \f
 * Changes in Emacs 28.1 on Non-Free Operating Systems
 
diff --git a/src/print.c b/src/print.c
index fa65a3cb26..89efcb2006 100644
--- a/src/print.c
+++ b/src/print.c
@@ -1848,6 +1848,25 @@ print_vectorlike (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag,
   return true;
 }
 
+static char
+named_escape (int i)
+{
+  switch (i)
+    {
+    case '\a': return 'a';
+    case '\b': return 'b';
+    case '\t': return 't';
+    case '\n': return 'n';
+    case '\v': return 'v';
+    case '\f': return 'f';
+    case '\r': return 'r';
+    case 27:   return 'e';
+    case ' ':  return 's';
+    case 127:  return 'd';
+    }
+  return 0;
+}
+
 static void
 print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
 {
@@ -1908,29 +1927,31 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
     {
     case_Lisp_Int:
       {
-	int c;
-	intmax_t i;
+        EMACS_INT i = XFIXNUM (obj);
+        char escaped_name;
 
-	if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj)
-	    && (c = XFIXNUM (obj)))
+	if (print_integers_as_characters && i >= 0 && i <= MAX_UNICODE_CHAR
+            && ((escaped_name = named_escape (i))
+                || (i >= 32 && i <= 127)
+                || i >= 0xa0))
 	  {
 	    printchar ('?', printcharfun);
-	    if (escapeflag
-		&& (c == ';' || c == '(' || c == ')' || c == '{' || c == '}'
-		    || c == '[' || c == ']' || c == '\"' || c == '\'' || c == '\\'))
+            if (escaped_name)
+              {
+                printchar ('\\', printcharfun);
+                i = escaped_name;
+              }
+            else if (escapeflag
+                     && (i == ';' || i == '\"' || i == '\'' || i == '\\'
+                         || i == '(' || i == ')'
+                         || i == '{' || i == '}'
+                         || i == '[' || i == ']'))
 	      printchar ('\\', printcharfun);
-	    printchar (c, printcharfun);
-	  }
-	else if (INTEGERP (Vinteger_output_format)
-		 && integer_to_intmax (Vinteger_output_format, &i)
-		 && i == 16 && !NILP (Fnatnump (obj)))
-	  {
-	    int len = sprintf (buf, "#x%"pI"x", (EMACS_UINT) XFIXNUM (obj));
-	    strout (buf, len, len, printcharfun);
+	    printchar (i, printcharfun);
 	  }
 	else
 	  {
-	    int len = sprintf (buf, "%"pI"d", XFIXNUM (obj));
+	    int len = sprintf (buf, "%"pI"d", i);
 	    strout (buf, len, len, printcharfun);
 	  }
       }
@@ -2270,12 +2291,12 @@ syms_of_print (void)
 that represents the number without losing information.  */);
   Vfloat_output_format = Qnil;
 
-  DEFVAR_LISP ("integer-output-format", Vinteger_output_format,
-	       doc: /* The format used to print integers.
-When t, print characters from integers that represent a character.
-When a number 16, print non-negative integers in the hexadecimal format.
-Otherwise, by default print integers in the decimal format.  */);
-  Vinteger_output_format = Qnil;
+  DEFVAR_BOOL ("print-integers-as-characters", print_integers_as_characters,
+	       doc: /* Non-nil means integers are printed using characters syntax.
+Only non-control characters, and control characters with named escape
+sequences such as newline, are printed this way.  Other integers,
+including those corresponding to raw bytes, are not affected.  */);
+  print_integers_as_characters = false;
 
   DEFVAR_LISP ("print-length", Vprint_length,
 	       doc: /* Maximum length of list to print before abbreviating.
diff --git a/test/src/print-tests.el b/test/src/print-tests.el
index 7b026b6b21..0053f3cac0 100644
--- a/test/src/print-tests.el
+++ b/test/src/print-tests.el
@@ -383,25 +383,23 @@ print-hash-table-test
       (let ((print-length 1))
         (format "%S" h))))))
 
-(print-tests--deftest print-integer-output-format ()
+(print-tests--deftest print-integers-as-characters ()
   ;; Bug#44155.
-  (let ((integer-output-format t)
-        (syms (list ?? ?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ ?Á)))
-    (should (equal (read (print-tests--prin1-to-string syms)) syms))
-    (should (equal (print-tests--prin1-to-string syms)
-                   (concat "(" (mapconcat #'prin1-char syms " ") ")"))))
-  (let ((integer-output-format t)
-        (syms (list -1 0 1 ?\120 4194175 4194176 (max-char) (1+ (max-char)))))
-    (should (equal (read (print-tests--prin1-to-string syms)) syms)))
-  (let ((integer-output-format 16)
-        (syms (list -1 0 1 most-positive-fixnum (1+ most-positive-fixnum))))
-    (should (equal (read (print-tests--prin1-to-string syms)) syms))
-    (should (equal (print-tests--prin1-to-string syms)
-                   (concat "(" (mapconcat
-                                (lambda (i)
-                                  (if (and (>= i 0) (<= i most-positive-fixnum))
-                                      (format "#x%x" i) (format "%d" i)))
-                                syms " ") ")")))))
+  (let* ((print-integers-as-characters t)
+         (chars '(?? ?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ ?f ?~ ?Á 32
+                  ?\n ?\r ?\t ?\b ?\f ?\a ?\v ?\e ?\d))
+         (nums '(-1 -65 0 1 31 #x80 #x9f #x110000 #x3fff80 #x3fffff))
+         (printed-chars (print-tests--prin1-to-string chars))
+         (printed-nums (print-tests--prin1-to-string nums)))
+    (should (equal (read printed-chars) chars))
+    (should (equal
+             printed-chars
+             (concat
+              "(?? ?\\; ?\\( ?\\) ?\\{ ?\\} ?\\[ ?\\] ?\\\" ?\\' ?\\\\"
+              " ?f ?~ ?Á ?\\s ?\\n ?\\r ?\\t ?\\b ?\\f ?\\a ?\\v ?\\e ?\\d)")))
+    (should (equal (read printed-nums) nums))
+    (should (equal printed-nums
+                   "(-1 -65 0 1 31 128 159 1114112 4194176 4194303)"))))
 
 (provide 'print-tests)
 ;;; print-tests.el ends here
-- 
2.21.1 (Apple Git-122.3)


^ permalink raw reply related	[flat|nested] 109+ messages in thread

* bug#44155: Print integers as characters
  2020-11-02 23:03                                               ` Mattias Engdegård
@ 2020-11-03  8:30                                                 ` Juri Linkov
  2020-11-03 15:24                                                 ` Eli Zaretskii
  1 sibling, 0 replies; 109+ messages in thread
From: Juri Linkov @ 2020-11-03  8:30 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: 44155, Andreas Schwab

>> Hopefully, printing some characters as numbers will fix
>> the currently broken test.
>
> It does! Here is a proposed patch. We could add a separate radix control later if you like.

Thanks, I like your patch, hope that Eli will like it too.

> One detail that I'm undecided about is whether to remove the more obscure
> control escapes \f, \a, \v, \e and \d, on the grounds that they are less
> likely to be used as actual characters and that users may prefer to see
> them as numbers instead. C, and most languages that inherit their escapes
> from C, lack \e and \d; \f and \a are rare today, and \v is an anachronism.

I don't think that \f is rare, it's used as a page separator
in many Emacs Lisp files.  But it would be surprising to me to see
127 printed as ?\d, maybe because C lacks it.





^ permalink raw reply	[flat|nested] 109+ messages in thread

* bug#44155: Print integers as characters
  2020-11-02 23:03                                               ` Mattias Engdegård
  2020-11-03  8:30                                                 ` Juri Linkov
@ 2020-11-03 15:24                                                 ` Eli Zaretskii
  2020-11-03 18:47                                                   ` Mattias Engdegård
  1 sibling, 1 reply; 109+ messages in thread
From: Eli Zaretskii @ 2020-11-03 15:24 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: 44155, schwab, juri

> From: Mattias Engdegård <mattiase@acm.org>
> Date: Tue, 3 Nov 2020 00:03:31 +0100
> Cc: Eli Zaretskii <eliz@gnu.org>, Andreas Schwab <schwab@suse.de>,
>         44155@debbugs.gnu.org
> 
> +@defvar print-integers-as-characters
> +When this variable is non-@code{nil}, integers that represent
> +printable characters or control characters with their own escape
> +syntax such as newline will be printed using Lisp character syntax

What is meant by "printable characters" here?  One could think you
mean [:print:], but that doesn't seem to be what the code does.

> +  DEFVAR_BOOL ("print-integers-as-characters", print_integers_as_characters,
> +	       doc: /* Non-nil means integers are printed using characters syntax.
> +Only non-control characters, and control characters with named escape
> +sequences such as newline, are printed this way.  Other integers,
> +including those corresponding to raw bytes, are not affected.  */);

And here, what does "non-control characters" mean?





^ permalink raw reply	[flat|nested] 109+ messages in thread

* bug#44155: Print integers as characters
  2020-11-03 15:24                                                 ` Eli Zaretskii
@ 2020-11-03 18:47                                                   ` Mattias Engdegård
  2020-11-03 19:36                                                     ` Eli Zaretskii
  0 siblings, 1 reply; 109+ messages in thread
From: Mattias Engdegård @ 2020-11-03 18:47 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 44155, schwab, juri

[-- Attachment #1: Type: text/plain, Size: 781 bytes --]

3 nov. 2020 kl. 16.24 skrev Eli Zaretskii <eliz@gnu.org>:

> What is meant by "printable characters" here?  One could think you
> mean [:print:], but that doesn't seem to be what the code does.

Non-control characters (characters other than control characters), in this case. I wanted to keep things simple and not involve the Unicode database in the printer.

(For that matter, [:print:] is a regexp feature and doesn't really define the meaning of 'printable', but your question was valid.)

On the other hand, printing all non-controls using the ?X syntax is maybe not ideal. Attached is a new patch that uses Unicode properties to select only printable base characters.

This patch also removes \a, \v, \e and \d from the characters printed as escaped controls.


[-- Attachment #2: 0001-Reduce-integer-output-format-to-print-integers-as-ch.patch --]
[-- Type: application/octet-stream, Size: 11404 bytes --]

From 3da6d9055b0ae68fc7b3bbee52885113c8c30b6d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org>
Date: Mon, 2 Nov 2020 23:37:16 +0100
Subject: [PATCH] Reduce integer-output-format to print-integers-as-characters

The variable now only controls whether characters are printed, not
the radix.  Control chars are printed in human-readable syntax
such as ?\n if available, as numbers otherwise (bug#44155).
Done in collaboration with Juri Linkov.

* src/character.c (printable_base_p):
* src/print.c (named_escape): New functions.
(print_object): Change semantics as described above.
(syms_of_print): Rename integer-output-format.  Update doc string.
* doc/lispref/streams.texi (Output Variables):
* etc/NEWS:
* test/src/print-tests.el (print-integers-as-characters):
Rename and update according to new semantics.  The test now passes.
---
 doc/lispref/streams.texi | 13 +++++----
 etc/NEWS                 | 11 ++++---
 src/character.c          | 21 ++++++++++++++
 src/character.h          |  1 +
 src/print.c              | 63 ++++++++++++++++++++++++++--------------
 test/src/print-tests.el  | 39 +++++++++++++------------
 6 files changed, 96 insertions(+), 52 deletions(-)

diff --git a/doc/lispref/streams.texi b/doc/lispref/streams.texi
index f171f13779..4bc97e4c48 100644
--- a/doc/lispref/streams.texi
+++ b/doc/lispref/streams.texi
@@ -903,10 +903,11 @@ Output Variables
 you can use, see the variable's documentation string.
 @end defvar
 
-@defvar integer-output-format
-This variable specifies how to print integer numbers.  The default is
-@code{nil}, meaning use the decimal format.  When bound to @code{t},
-print integers as characters when an integer represents a character
-(@pxref{Basic Char Syntax}).  When bound to the number @code{16},
-print non-negative integers in the hexadecimal format.
+@defvar print-integers-as-characters
+When this variable is non-@code{nil}, integers that represent
+printable characters or control characters with their own escape
+syntax such as newline will be printed using Lisp character syntax
+(@pxref{Basic Char Syntax}).  Other numbers are printed the usual way.
+For example, the list @code{(4 65 -1 10)} will be printed as
+@samp{(4 ?A -1 ?\n)}.
 @end defvar
diff --git a/etc/NEWS b/etc/NEWS
index e11effc9e8..384c64a91e 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1689,12 +1689,6 @@ ledit.el, lmenu.el, lucid.el and old-whitespace.el.
 \f
 * Lisp Changes in Emacs 28.1
 
-** New variable 'integer-output-format' determines how to print integer values.
-When this variable is bound to the value 't', integers are printed by
-printing functions as characters when an integer represents a character.
-When bound to the number 16, non-negative integers are printed in the
-hexadecimal format.
-
 +++
 ** 'define-globalized-minor-mode' now takes a ':predicate' parameter.
 This can be used to control which major modes the minor mode should be
@@ -1887,6 +1881,11 @@ file can affect code in another.  For details, see the manual section
 'replace-regexp-in-string', 'catch', 'throw', 'error', 'signal'
 and 'play-sound-file'.
 
++++
+** New variable 'print-integers-as-characters' modifies integer printing.
+When this variable is non-nil, character syntax is used for printing
+numbers for which this makes sense, such as '?*' for 42.
+
 \f
 * Changes in Emacs 28.1 on Non-Free Operating Systems
 
diff --git a/src/character.c b/src/character.c
index 5860f6a0c8..6d18e78f26 100644
--- a/src/character.c
+++ b/src/character.c
@@ -982,6 +982,27 @@ printablep (int c)
 	    || gen_cat == UNICODE_CATEGORY_Cn)); /* unassigned */
 }
 
+/* Return true if C is a printable independent character.  */
+bool
+printable_base_p (int c)
+{
+  Lisp_Object category = CHAR_TABLE_REF (Vunicode_category_table, c);
+  if (! FIXNUMP (category))
+    return false;
+  EMACS_INT gen_cat = XFIXNUM (category);
+
+  /* See UTS #18.  */
+  return (!(gen_cat == UNICODE_CATEGORY_Mn       /* mark, nonspacing */
+            || gen_cat == UNICODE_CATEGORY_Mc    /* mark, combining */
+            || gen_cat == UNICODE_CATEGORY_Me    /* mark, enclosing */
+            || gen_cat == UNICODE_CATEGORY_Zl    /* separator, line */
+            || gen_cat == UNICODE_CATEGORY_Zp    /* separator, paragraph */
+            || gen_cat == UNICODE_CATEGORY_Cc    /* other, control */
+            || gen_cat == UNICODE_CATEGORY_Cs    /* other, surrogate */
+            || gen_cat == UNICODE_CATEGORY_Cf    /* other, format */
+            || gen_cat == UNICODE_CATEGORY_Cn)); /* other, unassigned */
+}
+
 /* Return true if C is a horizontal whitespace character, as defined
    by https://www.unicode.org/reports/tr18/tr18-19.html#blank.  */
 bool
diff --git a/src/character.h b/src/character.h
index af5023f77c..260c550108 100644
--- a/src/character.h
+++ b/src/character.h
@@ -583,6 +583,7 @@ char_surrogate_p (int c)
 extern bool graphicp (int);
 extern bool printablep (int);
 extern bool blankp (int);
+extern bool printable_base_p (int);
 
 /* Look up the element in char table OBJ at index CH, and return it as
    an integer.  If the element is not a character, return CH itself.  */
diff --git a/src/print.c b/src/print.c
index fa65a3cb26..f7158dbac0 100644
--- a/src/print.c
+++ b/src/print.c
@@ -1848,6 +1848,24 @@ print_vectorlike (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag,
   return true;
 }
 
+static char
+named_escape (int i)
+{
+  switch (i)
+    {
+    case '\b': return 'b';
+    case '\t': return 't';
+    case '\n': return 'n';
+    case '\f': return 'f';
+    case '\r': return 'r';
+    case ' ':  return 's';
+      /* \a, \v, \e and \d are excluded from printing as escapes since
+         they are somewhat rare as characters and more likely to be
+         plain integers. */
+    }
+  return 0;
+}
+
 static void
 print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
 {
@@ -1908,29 +1926,30 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
     {
     case_Lisp_Int:
       {
-	int c;
-	intmax_t i;
+        EMACS_INT i = XFIXNUM (obj);
+        char escaped_name;
 
-	if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj)
-	    && (c = XFIXNUM (obj)))
+	if (print_integers_as_characters && i >= 0 && i <= MAX_UNICODE_CHAR
+            && ((escaped_name = named_escape (i))
+                || printable_base_p (i)))
 	  {
 	    printchar ('?', printcharfun);
-	    if (escapeflag
-		&& (c == ';' || c == '(' || c == ')' || c == '{' || c == '}'
-		    || c == '[' || c == ']' || c == '\"' || c == '\'' || c == '\\'))
+            if (escaped_name)
+              {
+                printchar ('\\', printcharfun);
+                i = escaped_name;
+              }
+            else if (escapeflag
+                     && (i == ';' || i == '\"' || i == '\'' || i == '\\'
+                         || i == '(' || i == ')'
+                         || i == '{' || i == '}'
+                         || i == '[' || i == ']'))
 	      printchar ('\\', printcharfun);
-	    printchar (c, printcharfun);
-	  }
-	else if (INTEGERP (Vinteger_output_format)
-		 && integer_to_intmax (Vinteger_output_format, &i)
-		 && i == 16 && !NILP (Fnatnump (obj)))
-	  {
-	    int len = sprintf (buf, "#x%"pI"x", (EMACS_UINT) XFIXNUM (obj));
-	    strout (buf, len, len, printcharfun);
+	    printchar (i, printcharfun);
 	  }
 	else
 	  {
-	    int len = sprintf (buf, "%"pI"d", XFIXNUM (obj));
+	    int len = sprintf (buf, "%"pI"d", i);
 	    strout (buf, len, len, printcharfun);
 	  }
       }
@@ -2270,12 +2289,12 @@ syms_of_print (void)
 that represents the number without losing information.  */);
   Vfloat_output_format = Qnil;
 
-  DEFVAR_LISP ("integer-output-format", Vinteger_output_format,
-	       doc: /* The format used to print integers.
-When t, print characters from integers that represent a character.
-When a number 16, print non-negative integers in the hexadecimal format.
-Otherwise, by default print integers in the decimal format.  */);
-  Vinteger_output_format = Qnil;
+  DEFVAR_BOOL ("print-integers-as-characters", print_integers_as_characters,
+	       doc: /* Non-nil means integers are printed using characters syntax.
+Only printable characters, and control characters with named escape
+sequences such as newline, are printed this way.  Other integers,
+including those corresponding to raw bytes, are not affected.  */);
+  print_integers_as_characters = false;
 
   DEFVAR_LISP ("print-length", Vprint_length,
 	       doc: /* Maximum length of list to print before abbreviating.
diff --git a/test/src/print-tests.el b/test/src/print-tests.el
index 7b026b6b21..05b1e4e6e4 100644
--- a/test/src/print-tests.el
+++ b/test/src/print-tests.el
@@ -383,25 +383,28 @@ print-hash-table-test
       (let ((print-length 1))
         (format "%S" h))))))
 
-(print-tests--deftest print-integer-output-format ()
+(print-tests--deftest print-integers-as-characters ()
   ;; Bug#44155.
-  (let ((integer-output-format t)
-        (syms (list ?? ?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ ?Á)))
-    (should (equal (read (print-tests--prin1-to-string syms)) syms))
-    (should (equal (print-tests--prin1-to-string syms)
-                   (concat "(" (mapconcat #'prin1-char syms " ") ")"))))
-  (let ((integer-output-format t)
-        (syms (list -1 0 1 ?\120 4194175 4194176 (max-char) (1+ (max-char)))))
-    (should (equal (read (print-tests--prin1-to-string syms)) syms)))
-  (let ((integer-output-format 16)
-        (syms (list -1 0 1 most-positive-fixnum (1+ most-positive-fixnum))))
-    (should (equal (read (print-tests--prin1-to-string syms)) syms))
-    (should (equal (print-tests--prin1-to-string syms)
-                   (concat "(" (mapconcat
-                                (lambda (i)
-                                  (if (and (>= i 0) (<= i most-positive-fixnum))
-                                      (format "#x%x" i) (format "%d" i)))
-                                syms " ") ")")))))
+  (let* ((print-integers-as-characters t)
+         (chars '(?? ?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ ?f ?~ ?Á 32
+                  ?\n ?\r ?\t ?\b ?\f ?\a ?\v ?\e ?\d))
+         (nums '(-1 -65 0 1 31 #x80 #x9f #x110000 #x3fff80 #x3fffff))
+         (nonprints '(#xd800 #xdfff #x030a #xffff #x200c))
+         (printed-chars (print-tests--prin1-to-string chars))
+         (printed-nums (print-tests--prin1-to-string nums))
+         (printed-nonprints (print-tests--prin1-to-string nonprints)))
+    (should (equal (read printed-chars) chars))
+    (should (equal
+             printed-chars
+             (concat
+              "(?? ?\\; ?\\( ?\\) ?\\{ ?\\} ?\\[ ?\\] ?\\\" ?\\' ?\\\\"
+              " ?f ?~ ?Á ?\\s ?\\n ?\\r ?\\t ?\\b ?\\f 7 11 27 127)")))
+    (should (equal (read printed-nums) nums))
+    (should (equal printed-nums
+                   "(-1 -65 0 1 31 128 159 1114112 4194176 4194303)"))
+    (should (equal (read printed-nonprints) nonprints))
+    (should (equal printed-nonprints
+                   "(55296 57343 778 65535 8204)"))))
 
 (provide 'print-tests)
 ;;; print-tests.el ends here
-- 
2.21.1 (Apple Git-122.3)


^ permalink raw reply related	[flat|nested] 109+ messages in thread

* bug#44155: Print integers as characters
  2020-11-03 18:47                                                   ` Mattias Engdegård
@ 2020-11-03 19:36                                                     ` Eli Zaretskii
  2020-11-04 11:03                                                       ` Mattias Engdegård
  0 siblings, 1 reply; 109+ messages in thread
From: Eli Zaretskii @ 2020-11-03 19:36 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: 44155, schwab, juri

> From: Mattias Engdegård <mattiase@acm.org>
> Date: Tue, 3 Nov 2020 19:47:17 +0100
> Cc: juri@linkov.net, schwab@suse.de, 44155@debbugs.gnu.org
> 
> > What is meant by "printable characters" here?  One could think you
> > mean [:print:], but that doesn't seem to be what the code does.
> 
> Non-control characters (characters other than control characters), in this case. I wanted to keep things simple and not involve the Unicode database in the printer.
> 
> (For that matter, [:print:] is a regexp feature and doesn't really define the meaning of 'printable', but your question was valid.)
> 
> On the other hand, printing all non-controls using the ?X syntax is maybe not ideal. Attached is a new patch that uses Unicode properties to select only printable base characters.

Thanks, but my main question is still not answered.  I asked it from
the POV of documentation: we should provide a more specific
description of which characters will be printed as characters, so that
users are not surprised.  The text in NEWS still says "printable
characters" without defining that term, and so does the doc string of
print-integers-as-characters.

And now there's another question, which is what caused you to filter
characters like you did?  E.g., what's wrong with combining classes?
why not simply use graphicp?






* bug#44155: Print integers as characters
  2020-11-03 19:36                                                     ` Eli Zaretskii
@ 2020-11-04 11:03                                                       ` Mattias Engdegård
  2020-11-04 15:38                                                         ` Eli Zaretskii
  0 siblings, 1 reply; 109+ messages in thread
From: Mattias Engdegård @ 2020-11-04 11:03 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 44155, schwab, juri

[-- Attachment #1: Type: text/plain, Size: 1879 bytes --]

3 nov. 2020 kl. 20.36 skrev Eli Zaretskii <eliz@gnu.org>:

> Thanks, but my main question is still not answered.  I asked it from
> the POV of documentation: we should provide a more specific
> description of which characters will be printed as characters, so that
> users are not surprised.  The text in NEWS still says "printable
> characters" without defining that term, and so does the doc string of
> print-integers-as-characters.

'Printable' was used informally, not in an exact technical sense. Intuitively, it should be the set of characters that make sense to print using the '?X' syntax. I initially thought that 'graphic' was too technical but it is more precise. 'Independently printable graphic character' is descriptive but a mouthful; perhaps 'independent graphic char' would do?

> And now there's another question, which is what caused you to filter
> characters like you did?  E.g., what's wrong with combining classes?
> why not simply use graphicp?

For the ?X syntax to make sense, X must be visible; thus controls are out, and so are formatting chars (language tags etc). Spaces should probably have been excluded as well since it's typically not possible to see what kind of space follows the '?' (SPC is explicitly rendered as ?\s).

Furthermore, X must be independent since it isn't a grapheme cluster but a single code point. Therefore combining chars cannot be included as they would attach to the '?'.

'graphicp' cannot be used because it includes combining, enclosing and nonspacing marks (M) and formats (Cf); otherwise it's fine.
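
As an illustration of the selection rule described above, here is a minimal standalone C sketch. The enum and the name category_prints_as_char are hypothetical stand-ins, not Emacs identifiers, and only a handful of the kept categories are listed; real code would look up the general category of a code point in the Unicode tables. The sketch assumes the exclusion list is the marks (Mn, Mc, Me), the separators (Zs, Zl, Zp) and the categories Cc, Cs, Cf and Cn.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for Unicode general categories.  Only a few
   of the categories that are kept are listed here.  */
enum general_category
{
  CAT_Lu, CAT_Nd, CAT_Po, CAT_Sm, CAT_Co,   /* letter, number, punctuation,
                                               symbol, private use */
  CAT_Mn, CAT_Mc, CAT_Me,                   /* marks */
  CAT_Zs, CAT_Zl, CAT_Zp,                   /* separators */
  CAT_Cc, CAT_Cs, CAT_Cf, CAT_Cn            /* control, surrogate, format,
                                               unassigned */
};

/* Return true if a character of category CAT is visible on its own
   and so could sensibly follow the '?' of the ?X syntax.  */
static bool
category_prints_as_char (enum general_category cat)
{
  switch (cat)
    {
    case CAT_Mn: case CAT_Mc: case CAT_Me:   /* would attach to the '?' */
    case CAT_Zs: case CAT_Zl: case CAT_Zp:   /* invisible spacing */
    case CAT_Cc: case CAT_Cs: case CAT_Cf: case CAT_Cn:
      return false;
    default:
      return true;
    }
}
```

Under these assumptions a letter (Lu) or a private-use character (Co) is accepted, while a nonspacing mark (Mn), a space (Zs) or a format character (Cf) falls back to plain decimal output.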

While we could put the exact list of excluded general categories in the documentation, it is not very important because the selection only matters for usability and aesthetics, not (realistically) for code behaviour.

The attached patch excludes spaces (Zs) and revises the terminology.
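
To make the intended output concrete, the following ASCII-only sketch formats a single integer the way this message describes. It is not the Emacs printer: ascii_graphic_p and format_int_as_char are hypothetical names, ascii_graphic_p is a crude stand-in for a Unicode-aware predicate, and the sketch assumes escaping is always enabled. named_escape mirrors the helper of the same name in the attached patch.

```c
#include <stdio.h>
#include <string.h>

/* Return the letter of the named escape for I, or 0 if there is none.
   \a, \v, \e and \d are deliberately left out, as in the patch.  */
static char
named_escape (int i)
{
  switch (i)
    {
    case '\b': return 'b';
    case '\t': return 't';
    case '\n': return 'n';
    case '\f': return 'f';
    case '\r': return 'r';
    case ' ':  return 's';
    }
  return 0;
}

/* ASCII-only stand-in for the graphic-base test: visible, non-space
   characters below DEL.  */
static int
ascii_graphic_p (int i)
{
  return i > ' ' && i < 127;
}

/* Write the printed form of I into BUF, which holds SIZE bytes.  */
static void
format_int_as_char (int i, char *buf, size_t size)
{
  char esc = named_escape (i);

  if (esc)
    snprintf (buf, size, "?\\%c", esc);          /* e.g. 10 -> ?\n */
  else if (ascii_graphic_p (i))
    {
      /* Characters with special meaning to the Lisp reader get a
         backslash so that the output can be read back.  */
      if (strchr (";\"'\\(){}[]", i))
        snprintf (buf, size, "?\\%c", i);
      else
        snprintf (buf, size, "?%c", i);
    }
  else
    snprintf (buf, size, "%d", i);               /* plain decimal */
}
```

With these definitions the integers 4, 65, -1 and 10 come out as 4, ?A, -1 and ?\n, and 32 comes out as ?\s, which matches the example in the documentation part of the patch.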


[-- Attachment #2: 0001-Reduce-integer-output-format-to-print-integers-as-ch.patch --]
[-- Type: application/octet-stream, Size: 11560 bytes --]

From fd24ef7e7b71308ff29b8d1b2f7be64254469521 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org>
Date: Mon, 2 Nov 2020 23:37:16 +0100
Subject: [PATCH] Reduce integer-output-format to print-integers-as-characters

The variable now only controls whether characters are printed, not
the radix.  Control chars are printed in human-readable syntax
only when special escapes such as ?\n are available.  Spaces,
formatting and combining chars are excluded (bug#44155).
Done in collaboration with Juri Linkov.

* src/character.c (graphic_base_p):
* src/print.c (named_escape): New functions.
(print_object): Change semantics as described above.
(syms_of_print): Rename integer-output-format.  Update doc string.
* doc/lispref/streams.texi (Output Variables):
* etc/NEWS:
* test/src/print-tests.el (print-integers-as-characters):
Rename and update according to new semantics.  The test now passes.
---
 doc/lispref/streams.texi | 13 ++++----
 etc/NEWS                 | 11 ++++---
 src/character.c          | 21 +++++++++++++
 src/character.h          |  1 +
 src/print.c              | 64 ++++++++++++++++++++++++++--------------
 test/src/print-tests.el  | 39 +++++++++++++-----------
 6 files changed, 97 insertions(+), 52 deletions(-)

diff --git a/doc/lispref/streams.texi b/doc/lispref/streams.texi
index f171f13779..799d35b070 100644
--- a/doc/lispref/streams.texi
+++ b/doc/lispref/streams.texi
@@ -903,10 +903,11 @@ Output Variables
 you can use, see the variable's documentation string.
 @end defvar
 
-@defvar integer-output-format
-This variable specifies how to print integer numbers.  The default is
-@code{nil}, meaning use the decimal format.  When bound to @code{t},
-print integers as characters when an integer represents a character
-(@pxref{Basic Char Syntax}).  When bound to the number @code{16},
-print non-negative integers in the hexadecimal format.
+@defvar print-integers-as-characters
+When this variable is non-@code{nil}, integers that represent
+independent graphic characters or control characters with their own
+escape syntax such as newline will be printed using Lisp character
+syntax (@pxref{Basic Char Syntax}).  Other numbers are printed the
+usual way.  For example, the list @code{(4 65 -1 10)} will be printed
+as @samp{(4 ?A -1 ?\n)}.
 @end defvar
diff --git a/etc/NEWS b/etc/NEWS
index e11effc9e8..384c64a91e 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1689,12 +1689,6 @@ ledit.el, lmenu.el, lucid.el and old-whitespace.el.
 \f
 * Lisp Changes in Emacs 28.1
 
-** New variable 'integer-output-format' determines how to print integer values.
-When this variable is bound to the value 't', integers are printed by
-printing functions as characters when an integer represents a character.
-When bound to the number 16, non-negative integers are printed in the
-hexadecimal format.
-
 +++
 ** 'define-globalized-minor-mode' now takes a ':predicate' parameter.
 This can be used to control which major modes the minor mode should be
@@ -1887,6 +1881,11 @@ file can affect code in another.  For details, see the manual section
 'replace-regexp-in-string', 'catch', 'throw', 'error', 'signal'
 and 'play-sound-file'.
 
++++
+** New variable 'print-integers-as-characters' modifies integer printing.
+When this variable is non-nil, character syntax is used for printing
+numbers for which this makes sense, such as '?*' for 42.
+
 \f
 * Changes in Emacs 28.1 on Non-Free Operating Systems
 
diff --git a/src/character.c b/src/character.c
index 5860f6a0c8..00b73293a3 100644
--- a/src/character.c
+++ b/src/character.c
@@ -982,6 +982,27 @@ printablep (int c)
 	    || gen_cat == UNICODE_CATEGORY_Cn)); /* unassigned */
 }
 
+/* Return true if C is a graphic character that can be printed independently.  */
+bool
+graphic_base_p (int c)
+{
+  Lisp_Object category = CHAR_TABLE_REF (Vunicode_category_table, c);
+  if (! FIXNUMP (category))
+    return false;
+  EMACS_INT gen_cat = XFIXNUM (category);
+
+  return (!(gen_cat == UNICODE_CATEGORY_Mn       /* mark, nonspacing */
+            || gen_cat == UNICODE_CATEGORY_Mc    /* mark, combining */
+            || gen_cat == UNICODE_CATEGORY_Me    /* mark, enclosing */
+            || gen_cat == UNICODE_CATEGORY_Zs    /* separator, space */
+            || gen_cat == UNICODE_CATEGORY_Zl    /* separator, line */
+            || gen_cat == UNICODE_CATEGORY_Zp    /* separator, paragraph */
+            || gen_cat == UNICODE_CATEGORY_Cc    /* other, control */
+            || gen_cat == UNICODE_CATEGORY_Cs    /* other, surrogate */
+            || gen_cat == UNICODE_CATEGORY_Cf    /* other, format */
+            || gen_cat == UNICODE_CATEGORY_Cn)); /* other, unassigned */
+}
+
 /* Return true if C is a horizontal whitespace character, as defined
    by https://www.unicode.org/reports/tr18/tr18-19.html#blank.  */
 bool
diff --git a/src/character.h b/src/character.h
index af5023f77c..cbf43097ae 100644
--- a/src/character.h
+++ b/src/character.h
@@ -583,6 +583,7 @@ char_surrogate_p (int c)
 extern bool graphicp (int);
 extern bool printablep (int);
 extern bool blankp (int);
+extern bool graphic_base_p (int);
 
 /* Look up the element in char table OBJ at index CH, and return it as
    an integer.  If the element is not a character, return CH itself.  */
diff --git a/src/print.c b/src/print.c
index fa65a3cb26..f2e2dd131d 100644
--- a/src/print.c
+++ b/src/print.c
@@ -1848,6 +1848,24 @@ print_vectorlike (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag,
   return true;
 }
 
+static char
+named_escape (int i)
+{
+  switch (i)
+    {
+    case '\b': return 'b';
+    case '\t': return 't';
+    case '\n': return 'n';
+    case '\f': return 'f';
+    case '\r': return 'r';
+    case ' ':  return 's';
+      /* \a, \v, \e and \d are excluded from printing as escapes since
+         they are somewhat rare as characters and more likely to be
+         plain integers. */
+    }
+  return 0;
+}
+
 static void
 print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
 {
@@ -1908,29 +1926,30 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
     {
     case_Lisp_Int:
       {
-	int c;
-	intmax_t i;
+        EMACS_INT i = XFIXNUM (obj);
+        char escaped_name;
 
-	if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj)
-	    && (c = XFIXNUM (obj)))
+	if (print_integers_as_characters && i >= 0 && i <= MAX_UNICODE_CHAR
+            && ((escaped_name = named_escape (i))
+                || graphic_base_p (i)))
 	  {
 	    printchar ('?', printcharfun);
-	    if (escapeflag
-		&& (c == ';' || c == '(' || c == ')' || c == '{' || c == '}'
-		    || c == '[' || c == ']' || c == '\"' || c == '\'' || c == '\\'))
+            if (escaped_name)
+              {
+                printchar ('\\', printcharfun);
+                i = escaped_name;
+              }
+            else if (escapeflag
+                     && (i == ';' || i == '\"' || i == '\'' || i == '\\'
+                         || i == '(' || i == ')'
+                         || i == '{' || i == '}'
+                         || i == '[' || i == ']'))
 	      printchar ('\\', printcharfun);
-	    printchar (c, printcharfun);
-	  }
-	else if (INTEGERP (Vinteger_output_format)
-		 && integer_to_intmax (Vinteger_output_format, &i)
-		 && i == 16 && !NILP (Fnatnump (obj)))
-	  {
-	    int len = sprintf (buf, "#x%"pI"x", (EMACS_UINT) XFIXNUM (obj));
-	    strout (buf, len, len, printcharfun);
+	    printchar (i, printcharfun);
 	  }
 	else
 	  {
-	    int len = sprintf (buf, "%"pI"d", XFIXNUM (obj));
+	    int len = sprintf (buf, "%"pI"d", i);
 	    strout (buf, len, len, printcharfun);
 	  }
       }
@@ -2270,12 +2289,13 @@ syms_of_print (void)
 that represents the number without losing information.  */);
   Vfloat_output_format = Qnil;
 
-  DEFVAR_LISP ("integer-output-format", Vinteger_output_format,
-	       doc: /* The format used to print integers.
-When t, print characters from integers that represent a character.
-When a number 16, print non-negative integers in the hexadecimal format.
-Otherwise, by default print integers in the decimal format.  */);
-  Vinteger_output_format = Qnil;
+  DEFVAR_BOOL ("print-integers-as-characters", print_integers_as_characters,
+	       doc: /* Non-nil means integers are printed using character syntax.
+Only independent graphic characters, and control characters with named
+escape sequences such as newline, are printed this way.  Other
+integers, including those corresponding to raw bytes, are not
+affected.  */);
+  print_integers_as_characters = false;
 
   DEFVAR_LISP ("print-length", Vprint_length,
 	       doc: /* Maximum length of list to print before abbreviating.
diff --git a/test/src/print-tests.el b/test/src/print-tests.el
index 7b026b6b21..202555adb3 100644
--- a/test/src/print-tests.el
+++ b/test/src/print-tests.el
@@ -383,25 +383,28 @@ print-hash-table-test
       (let ((print-length 1))
         (format "%S" h))))))
 
-(print-tests--deftest print-integer-output-format ()
+(print-tests--deftest print-integers-as-characters ()
   ;; Bug#44155.
-  (let ((integer-output-format t)
-        (syms (list ?? ?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ ?Á)))
-    (should (equal (read (print-tests--prin1-to-string syms)) syms))
-    (should (equal (print-tests--prin1-to-string syms)
-                   (concat "(" (mapconcat #'prin1-char syms " ") ")"))))
-  (let ((integer-output-format t)
-        (syms (list -1 0 1 ?\120 4194175 4194176 (max-char) (1+ (max-char)))))
-    (should (equal (read (print-tests--prin1-to-string syms)) syms)))
-  (let ((integer-output-format 16)
-        (syms (list -1 0 1 most-positive-fixnum (1+ most-positive-fixnum))))
-    (should (equal (read (print-tests--prin1-to-string syms)) syms))
-    (should (equal (print-tests--prin1-to-string syms)
-                   (concat "(" (mapconcat
-                                (lambda (i)
-                                  (if (and (>= i 0) (<= i most-positive-fixnum))
-                                      (format "#x%x" i) (format "%d" i)))
-                                syms " ") ")")))))
+  (let* ((print-integers-as-characters t)
+         (chars '(?? ?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ ?f ?~ ?Á 32
+                  ?\n ?\r ?\t ?\b ?\f ?\a ?\v ?\e ?\d))
+         (nums '(-1 -65 0 1 31 #x80 #x9f #x110000 #x3fff80 #x3fffff))
+         (nonprints '(#xd800 #xdfff #x030a #xffff #x2002 #x200c))
+         (printed-chars (print-tests--prin1-to-string chars))
+         (printed-nums (print-tests--prin1-to-string nums))
+         (printed-nonprints (print-tests--prin1-to-string nonprints)))
+    (should (equal (read printed-chars) chars))
+    (should (equal
+             printed-chars
+             (concat
+              "(?? ?\\; ?\\( ?\\) ?\\{ ?\\} ?\\[ ?\\] ?\\\" ?\\' ?\\\\"
+              " ?f ?~ ?Á ?\\s ?\\n ?\\r ?\\t ?\\b ?\\f 7 11 27 127)")))
+    (should (equal (read printed-nums) nums))
+    (should (equal printed-nums
+                   "(-1 -65 0 1 31 128 159 1114112 4194176 4194303)"))
+    (should (equal (read printed-nonprints) nonprints))
+    (should (equal printed-nonprints
+                   "(55296 57343 778 65535 8194 8204)"))))
 
 (provide 'print-tests)
 ;;; print-tests.el ends here
-- 
2.21.1 (Apple Git-122.3)



* bug#44155: Print integers as characters
  2020-11-04 11:03                                                       ` Mattias Engdegård
@ 2020-11-04 15:38                                                         ` Eli Zaretskii
  2020-11-04 16:46                                                           ` Mattias Engdegård
  0 siblings, 1 reply; 109+ messages in thread
From: Eli Zaretskii @ 2020-11-04 15:38 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: 44155, schwab, juri

> From: Mattias Engdegård <mattiase@acm.org>
> Date: Wed, 4 Nov 2020 12:03:32 +0100
> Cc: juri@linkov.net, schwab@suse.de, 44155@debbugs.gnu.org
> 
> 'Printable' was used informally, not in an exact technical sense. Intuitively, it should be the set of characters that make sense to print using the '?X' syntax. I initially thought that 'graphic' was too technical but it is more precise. 'Independently printable graphic character' is descriptive but a mouthful; perhaps 'independent graphic char' would do?

I'm not sure.  I think we should use something more familiar, or
explain it in more detail.  We already mention Unicode properties
elsewhere in the manual, so we could define this in those terms, and
send the reader there for the details, for example.

> For the ?X syntax to make sense, X must be visible; thus controls are out, and so are formatting chars (language tags etc). Spaces should probably have been excluded as well since it's typically not possible to see what kind of space follows the '?' (SPC is explicitly rendered as ?\s).
> 
> Furthermore, X must be independent since it isn't a grapheme cluster but a single code point. Therefore combining chars cannot be included as they would attach to the '?'.
> 
> 'graphicp' cannot be used because it includes combining, enclosing and nonspacing marks (M) and formats (Cf); otherwise it's fine.
> 
> While we could put the exact list of excluded general categories in the documentation, it is not very important because the selection only matters for usability and aesthetics, not (realistically) for code behaviour.
> 
> The attached patch excludes spaces (Zs) and revises the terminology.

I'm not going to argue about this aspect, but just FTR: whether to
include combining characters is a decision that we make here; it is
not a necessity.  We are perfectly capable of displaying combining
characters without the risk of them being composed with surrounding
characters: we could either precede them with U+25CC DOTTED CIRCLE,
or use the technique that describe-char-padded-string in
descr-text.el uses.

Thanks.






* bug#44155: Print integers as characters
  2020-11-04 15:38                                                         ` Eli Zaretskii
@ 2020-11-04 16:46                                                           ` Mattias Engdegård
  2020-11-04 16:58                                                             ` Mattias Engdegård
  0 siblings, 1 reply; 109+ messages in thread
From: Mattias Engdegård @ 2020-11-04 16:46 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 44155, schwab, juri

[-- Attachment #1: Type: text/plain, Size: 918 bytes --]

4 nov. 2020 kl. 16.38 skrev Eli Zaretskii <eliz@gnu.org>:

> I'm not sure.  I think we should use something more familiar, or
> explain it in more detail.  We already mention Unicode properties
> elsewhere in the manual, so we could define this in those terms, and
> send the reader there for the details, for example.

Thanks for the review. Please look at the revised patch below with your requested changes.

> I'm not going to argue about this aspect, but just FTR: whether to
> include combining characters is a decision that we make here; it is
> not a necessity.  We are perfectly capable of displaying combining
> characters without the risk of them being composed with surrounding
> characters: we could either precede them with U+25CC DOTTED CIRCLE,
> or use the technique that describe-char-padded-string in
> descr-text.el uses.

No, we cannot, because the output must be valid Lisp.
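
One way to read this objection is as a round-trip problem, sketched below in standalone C. Everything here is an assumption made for illustration: toy_read_char_literal is a hypothetical model of reading ?X in which the value is the single code point right after the '?', and the UTF-8 helpers assume valid input. None of this is Emacs code.

```c
#include <stddef.h>

/* Encode CP (assumed to be at most 0x10FFFF) as UTF-8 into OUT and
   return the number of bytes written.  */
static size_t
utf8_encode (unsigned cp, unsigned char *out)
{
  if (cp < 0x80)
    {
      out[0] = (unsigned char) cp;
      return 1;
    }
  if (cp < 0x800)
    {
      out[0] = (unsigned char) (0xC0 | (cp >> 6));
      out[1] = (unsigned char) (0x80 | (cp & 0x3F));
      return 2;
    }
  if (cp < 0x10000)
    {
      out[0] = (unsigned char) (0xE0 | (cp >> 12));
      out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
      out[2] = (unsigned char) (0x80 | (cp & 0x3F));
      return 3;
    }
  out[0] = (unsigned char) (0xF0 | (cp >> 18));
  out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
  out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
  out[3] = (unsigned char) (0x80 | (cp & 0x3F));
  return 4;
}

/* Decode the first code point of S, assumed to be well-formed UTF-8.  */
static unsigned
utf8_decode_first (const unsigned char *s)
{
  if (s[0] < 0x80)
    return s[0];
  if (s[0] < 0xE0)
    return ((s[0] & 0x1Fu) << 6) | (s[1] & 0x3Fu);
  if (s[0] < 0xF0)
    return ((s[0] & 0x0Fu) << 12) | ((s[1] & 0x3Fu) << 6) | (s[2] & 0x3Fu);
  return ((s[0] & 0x07u) << 18) | ((s[1] & 0x3Fu) << 12)
         | ((s[2] & 0x3Fu) << 6) | (s[3] & 0x3Fu);
}

/* Toy model of the ?X syntax: the value is the one code point that
   immediately follows the '?'.  */
static unsigned
toy_read_char_literal (const unsigned char *text)
{
  return utf8_decode_first (text + 1);
}

/* "Print" CP as '?' + U+25CC DOTTED CIRCLE + CP, then read it back
   with the toy reader and return what was recovered.  */
static unsigned
padded_round_trip (unsigned cp)
{
  unsigned char buf[16];
  size_t n = 0;

  buf[n++] = '?';
  n += utf8_encode (0x25CC, buf + n);
  n += utf8_encode (cp, buf + n);
  buf[n] = '\0';
  return toy_read_char_literal (buf);
}
```

In this model, padding U+030A COMBINING RING ABOVE with a dotted circle makes the reader recover U+25CC rather than U+030A, so the printed form no longer denotes the original integer.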


[-- Attachment #2: 0001-Reduce-integer-output-format-to-print-integers-as-ch.patch --]
[-- Type: application/octet-stream, Size: 11560 bytes --]

From aadbdd31b85e8b4459d903ef1bed1fdf8272588f Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org>
Date: Mon, 2 Nov 2020 23:37:16 +0100
Subject: [PATCH] Reduce integer-output-format to print-integers-as-characters

The variable now only controls whether characters are printed, not
the radix.  Control chars are printed in human-readable syntax
only when special escapes such as ?\n are available.  Spaces,
formatting and combining chars are excluded (bug#44155).
Done in collaboration with Juri Linkov.

* src/character.c (graphic_base_p):
* src/print.c (named_escape): New functions.
(print_object): Change semantics as described above.
(syms_of_print): Rename integer-output-format.  Update doc string.
* doc/lispref/streams.texi (Output Variables):
* etc/NEWS:
* test/src/print-tests.el (print-integers-as-characters):
Rename and update according to new semantics.  The test now passes.
---
 doc/lispref/streams.texi | 13 ++++----
 etc/NEWS                 | 11 ++++---
 src/character.c          | 21 +++++++++++++
 src/character.h          |  1 +
 src/print.c              | 64 ++++++++++++++++++++++++++--------------
 test/src/print-tests.el  | 39 +++++++++++++-----------
 6 files changed, 97 insertions(+), 52 deletions(-)

diff --git a/doc/lispref/streams.texi b/doc/lispref/streams.texi
index f171f13779..799d35b070 100644
--- a/doc/lispref/streams.texi
+++ b/doc/lispref/streams.texi
@@ -903,10 +903,11 @@ Output Variables
 you can use, see the variable's documentation string.
 @end defvar
 
-@defvar integer-output-format
-This variable specifies how to print integer numbers.  The default is
-@code{nil}, meaning use the decimal format.  When bound to @code{t},
-print integers as characters when an integer represents a character
-(@pxref{Basic Char Syntax}).  When bound to the number @code{16},
-print non-negative integers in the hexadecimal format.
+@defvar print-integers-as-characters
+When this variable is non-@code{nil}, integers that represent
+independent graphic characters or control characters with their own
+escape syntax such as newline will be printed using Lisp character
+syntax (@pxref{Basic Char Syntax}).  Other numbers are printed the
+usual way.  For example, the list @code{(4 65 -1 10)} will be printed
+as @samp{(4 ?A -1 ?\n)}.
 @end defvar
diff --git a/etc/NEWS b/etc/NEWS
index d15f3ed1ae..e3ac15f7e3 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1697,12 +1697,6 @@ ledit.el, lmenu.el, lucid.el and old-whitespace.el.
 \f
 * Lisp Changes in Emacs 28.1
 
-** New variable 'integer-output-format' determines how to print integer values.
-When this variable is bound to the value 't', integers are printed by
-printing functions as characters when an integer represents a character.
-When bound to the number 16, non-negative integers are printed in the
-hexadecimal format.
-
 +++
 ** 'define-globalized-minor-mode' now takes a ':predicate' parameter.
 This can be used to control which major modes the minor mode should be
@@ -1895,6 +1889,11 @@ file can affect code in another.  For details, see the manual section
 'replace-regexp-in-string', 'catch', 'throw', 'error', 'signal'
 and 'play-sound-file'.
 
++++
+** New variable 'print-integers-as-characters' modifies integer printing.
+When this variable is non-nil, character syntax is used for printing
+numbers for which this makes sense, such as '?*' for 42.
+
 \f
 * Changes in Emacs 28.1 on Non-Free Operating Systems
 
diff --git a/src/character.c b/src/character.c
index 5860f6a0c8..00b73293a3 100644
--- a/src/character.c
+++ b/src/character.c
@@ -982,6 +982,27 @@ printablep (int c)
 	    || gen_cat == UNICODE_CATEGORY_Cn)); /* unassigned */
 }
 
+/* Return true if C is a graphic character that can be printed independently.  */
+bool
+graphic_base_p (int c)
+{
+  Lisp_Object category = CHAR_TABLE_REF (Vunicode_category_table, c);
+  if (! FIXNUMP (category))
+    return false;
+  EMACS_INT gen_cat = XFIXNUM (category);
+
+  return (!(gen_cat == UNICODE_CATEGORY_Mn       /* mark, nonspacing */
+            || gen_cat == UNICODE_CATEGORY_Mc    /* mark, combining */
+            || gen_cat == UNICODE_CATEGORY_Me    /* mark, enclosing */
+            || gen_cat == UNICODE_CATEGORY_Zs    /* separator, space */
+            || gen_cat == UNICODE_CATEGORY_Zl    /* separator, line */
+            || gen_cat == UNICODE_CATEGORY_Zp    /* separator, paragraph */
+            || gen_cat == UNICODE_CATEGORY_Cc    /* other, control */
+            || gen_cat == UNICODE_CATEGORY_Cs    /* other, surrogate */
+            || gen_cat == UNICODE_CATEGORY_Cf    /* other, format */
+            || gen_cat == UNICODE_CATEGORY_Cn)); /* other, unassigned */
+}
+
 /* Return true if C is a horizontal whitespace character, as defined
    by https://www.unicode.org/reports/tr18/tr18-19.html#blank.  */
 bool
diff --git a/src/character.h b/src/character.h
index af5023f77c..cbf43097ae 100644
--- a/src/character.h
+++ b/src/character.h
@@ -583,6 +583,7 @@ char_surrogate_p (int c)
 extern bool graphicp (int);
 extern bool printablep (int);
 extern bool blankp (int);
+extern bool graphic_base_p (int);
 
 /* Look up the element in char table OBJ at index CH, and return it as
    an integer.  If the element is not a character, return CH itself.  */
diff --git a/src/print.c b/src/print.c
index fa65a3cb26..f2e2dd131d 100644
--- a/src/print.c
+++ b/src/print.c
@@ -1848,6 +1848,24 @@ print_vectorlike (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag,
   return true;
 }
 
+static char
+named_escape (int i)
+{
+  switch (i)
+    {
+    case '\b': return 'b';
+    case '\t': return 't';
+    case '\n': return 'n';
+    case '\f': return 'f';
+    case '\r': return 'r';
+    case ' ':  return 's';
+      /* \a, \v, \e and \d are excluded from printing as escapes since
+         they are somewhat rare as characters and more likely to be
+         plain integers. */
+    }
+  return 0;
+}
+
 static void
 print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
 {
@@ -1908,29 +1926,30 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
     {
     case_Lisp_Int:
       {
-	int c;
-	intmax_t i;
+        EMACS_INT i = XFIXNUM (obj);
+        char escaped_name;
 
-	if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj)
-	    && (c = XFIXNUM (obj)))
+	if (print_integers_as_characters && i >= 0 && i <= MAX_UNICODE_CHAR
+            && ((escaped_name = named_escape (i))
+                || graphic_base_p (i)))
 	  {
 	    printchar ('?', printcharfun);
-	    if (escapeflag
-		&& (c == ';' || c == '(' || c == ')' || c == '{' || c == '}'
-		    || c == '[' || c == ']' || c == '\"' || c == '\'' || c == '\\'))
+            if (escaped_name)
+              {
+                printchar ('\\', printcharfun);
+                i = escaped_name;
+              }
+            else if (escapeflag
+                     && (i == ';' || i == '\"' || i == '\'' || i == '\\'
+                         || i == '(' || i == ')'
+                         || i == '{' || i == '}'
+                         || i == '[' || i == ']'))
 	      printchar ('\\', printcharfun);
-	    printchar (c, printcharfun);
-	  }
-	else if (INTEGERP (Vinteger_output_format)
-		 && integer_to_intmax (Vinteger_output_format, &i)
-		 && i == 16 && !NILP (Fnatnump (obj)))
-	  {
-	    int len = sprintf (buf, "#x%"pI"x", (EMACS_UINT) XFIXNUM (obj));
-	    strout (buf, len, len, printcharfun);
+	    printchar (i, printcharfun);
 	  }
 	else
 	  {
-	    int len = sprintf (buf, "%"pI"d", XFIXNUM (obj));
+	    int len = sprintf (buf, "%"pI"d", i);
 	    strout (buf, len, len, printcharfun);
 	  }
       }
@@ -2270,12 +2289,13 @@ syms_of_print (void)
 that represents the number without losing information.  */);
   Vfloat_output_format = Qnil;
 
-  DEFVAR_LISP ("integer-output-format", Vinteger_output_format,
-	       doc: /* The format used to print integers.
-When t, print characters from integers that represent a character.
-When a number 16, print non-negative integers in the hexadecimal format.
-Otherwise, by default print integers in the decimal format.  */);
-  Vinteger_output_format = Qnil;
+  DEFVAR_BOOL ("print-integers-as-characters", print_integers_as_characters,
+	       doc: /* Non-nil means integers are printed using character syntax.
+Only independent graphic characters, and control characters with named
+escape sequences such as newline, are printed this way.  Other
+integers, including those corresponding to raw bytes, are not
+affected.  */);
+  print_integers_as_characters = false;
 
   DEFVAR_LISP ("print-length", Vprint_length,
 	       doc: /* Maximum length of list to print before abbreviating.
diff --git a/test/src/print-tests.el b/test/src/print-tests.el
index 7b026b6b21..202555adb3 100644
--- a/test/src/print-tests.el
+++ b/test/src/print-tests.el
@@ -383,25 +383,28 @@ print-hash-table-test
       (let ((print-length 1))
         (format "%S" h))))))
 
-(print-tests--deftest print-integer-output-format ()
+(print-tests--deftest print-integers-as-characters ()
   ;; Bug#44155.
-  (let ((integer-output-format t)
-        (syms (list ?? ?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ ?Á)))
-    (should (equal (read (print-tests--prin1-to-string syms)) syms))
-    (should (equal (print-tests--prin1-to-string syms)
-                   (concat "(" (mapconcat #'prin1-char syms " ") ")"))))
-  (let ((integer-output-format t)
-        (syms (list -1 0 1 ?\120 4194175 4194176 (max-char) (1+ (max-char)))))
-    (should (equal (read (print-tests--prin1-to-string syms)) syms)))
-  (let ((integer-output-format 16)
-        (syms (list -1 0 1 most-positive-fixnum (1+ most-positive-fixnum))))
-    (should (equal (read (print-tests--prin1-to-string syms)) syms))
-    (should (equal (print-tests--prin1-to-string syms)
-                   (concat "(" (mapconcat
-                                (lambda (i)
-                                  (if (and (>= i 0) (<= i most-positive-fixnum))
-                                      (format "#x%x" i) (format "%d" i)))
-                                syms " ") ")")))))
+  (let* ((print-integers-as-characters t)
+         (chars '(?? ?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ ?f ?~ ?Á 32
+                  ?\n ?\r ?\t ?\b ?\f ?\a ?\v ?\e ?\d))
+         (nums '(-1 -65 0 1 31 #x80 #x9f #x110000 #x3fff80 #x3fffff))
+         (nonprints '(#xd800 #xdfff #x030a #xffff #x2002 #x200c))
+         (printed-chars (print-tests--prin1-to-string chars))
+         (printed-nums (print-tests--prin1-to-string nums))
+         (printed-nonprints (print-tests--prin1-to-string nonprints)))
+    (should (equal (read printed-chars) chars))
+    (should (equal
+             printed-chars
+             (concat
+              "(?? ?\\; ?\\( ?\\) ?\\{ ?\\} ?\\[ ?\\] ?\\\" ?\\' ?\\\\"
+              " ?f ?~ ?Á ?\\s ?\\n ?\\r ?\\t ?\\b ?\\f 7 11 27 127)")))
+    (should (equal (read printed-nums) nums))
+    (should (equal printed-nums
+                   "(-1 -65 0 1 31 128 159 1114112 4194176 4194303)"))
+    (should (equal (read printed-nonprints) nonprints))
+    (should (equal printed-nonprints
+                   "(55296 57343 778 65535 8194 8204)"))))
 
 (provide 'print-tests)
 ;;; print-tests.el ends here
-- 
2.21.1 (Apple Git-122.3)



* bug#44155: Print integers as characters
  2020-11-04 16:46                                                           ` Mattias Engdegård
@ 2020-11-04 16:58                                                             ` Mattias Engdegård
  2020-11-06 13:02                                                               ` Mattias Engdegård
  0 siblings, 1 reply; 109+ messages in thread
From: Mattias Engdegård @ 2020-11-04 16:58 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 44155, schwab, juri

[-- Attachment #1: Type: text/plain, Size: 85 bytes --]

The last patch was incorrect; here is the right one. Apologies for the confusion.


[-- Attachment #2: 0001-Reduce-integer-output-format-to-print-integers-as-ch.patch --]
[-- Type: application/octet-stream, Size: 11788 bytes --]

From 2a7bd3b8393f182e42d77e929d5e02a137c8e89b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org>
Date: Mon, 2 Nov 2020 23:37:16 +0100
Subject: [PATCH] Reduce integer-output-format to print-integers-as-characters

The variable now only controls whether characters are printed, not
the radix.  Control chars are printed in human-readable syntax
only when special escapes such as ?\n are available.  Spaces,
formatting and combining chars are excluded (bug#44155).
Done in collaboration with Juri Linkov.

* src/character.c (graphic_base_p):
* src/print.c (named_escape): New functions.
(print_object): Change semantics as described above.
(syms_of_print): Rename integer-output-format.  Update doc string.
* doc/lispref/streams.texi (Output Variables):
* etc/NEWS:
* test/src/print-tests.el (print-integers-as-characters):
Rename and update according to new semantics.  The test now passes.
---
 doc/lispref/streams.texi | 18 +++++++----
 etc/NEWS                 | 11 ++++---
 src/character.c          | 21 +++++++++++++
 src/character.h          |  1 +
 src/print.c              | 64 ++++++++++++++++++++++++++--------------
 test/src/print-tests.el  | 39 +++++++++++++-----------
 6 files changed, 102 insertions(+), 52 deletions(-)

diff --git a/doc/lispref/streams.texi b/doc/lispref/streams.texi
index f171f13779..0534afb67f 100644
--- a/doc/lispref/streams.texi
+++ b/doc/lispref/streams.texi
@@ -903,10 +903,16 @@ Output Variables
 you can use, see the variable's documentation string.
 @end defvar
 
-@defvar integer-output-format
-This variable specifies how to print integer numbers.  The default is
-@code{nil}, meaning use the decimal format.  When bound to @code{t},
-print integers as characters when an integer represents a character
-(@pxref{Basic Char Syntax}).  When bound to the number @code{16},
-print non-negative integers in the hexadecimal format.
+@defvar print-integers-as-characters
+When this variable is non-@code{nil}, integers that represent
+graphic base characters will be printed using Lisp character syntax
+(@pxref{Basic Char Syntax}). Other numbers are printed the usual way.
+For example, the list @code{(4 65 -1 10)} would be printed as
+@samp{(4 ?A -1 ?\n)}.
+
+More precisely, values printed in character syntax are those
+representing characters belonging to the Unicode general categories
+Letter, Number, Punctuation, Symbol and Private-use
+(@pxref{Character Properties}), as well as the control characters
+having their own escape syntax such as newline.
 @end defvar
diff --git a/etc/NEWS b/etc/NEWS
index d15f3ed1ae..9dcdcc3079 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1697,12 +1697,6 @@ ledit.el, lmenu.el, lucid.el and old-whitespace.el.
 \f
 * Lisp Changes in Emacs 28.1
 
-** New variable 'integer-output-format' determines how to print integer values.
-When this variable is bound to the value 't', integers are printed by
-printing functions as characters when an integer represents a character.
-When bound to the number 16, non-negative integers are printed in the
-hexadecimal format.
-
 +++
 ** 'define-globalized-minor-mode' now takes a ':predicate' parameter.
 This can be used to control which major modes the minor mode should be
@@ -1895,6 +1889,11 @@ file can affect code in another.  For details, see the manual section
 'replace-regexp-in-string', 'catch', 'throw', 'error', 'signal'
 and 'play-sound-file'.
 
++++
+** New variable 'print-integers-as-characters' modifies integer printing.
+If this variable is non-nil, character syntax is used for printing
+numbers when this makes sense, such as '?A' for 65.
+
 \f
 * Changes in Emacs 28.1 on Non-Free Operating Systems
 
diff --git a/src/character.c b/src/character.c
index 5860f6a0c8..00b73293a3 100644
--- a/src/character.c
+++ b/src/character.c
@@ -982,6 +982,27 @@ printablep (int c)
 	    || gen_cat == UNICODE_CATEGORY_Cn)); /* unassigned */
 }
 
+/* Return true if C is a graphic character that can be printed independently.  */
+bool
+graphic_base_p (int c)
+{
+  Lisp_Object category = CHAR_TABLE_REF (Vunicode_category_table, c);
+  if (! FIXNUMP (category))
+    return false;
+  EMACS_INT gen_cat = XFIXNUM (category);
+
+  return (!(gen_cat == UNICODE_CATEGORY_Mn       /* mark, nonspacing */
+            || gen_cat == UNICODE_CATEGORY_Mc    /* mark, combining */
+            || gen_cat == UNICODE_CATEGORY_Me    /* mark, enclosing */
+            || gen_cat == UNICODE_CATEGORY_Zs    /* separator, space */
+            || gen_cat == UNICODE_CATEGORY_Zl    /* separator, line */
+            || gen_cat == UNICODE_CATEGORY_Zp    /* separator, paragraph */
+            || gen_cat == UNICODE_CATEGORY_Cc    /* other, control */
+            || gen_cat == UNICODE_CATEGORY_Cs    /* other, surrogate */
+            || gen_cat == UNICODE_CATEGORY_Cf    /* other, format */
+            || gen_cat == UNICODE_CATEGORY_Cn)); /* other, unassigned */
+}
+
 /* Return true if C is a horizontal whitespace character, as defined
    by https://www.unicode.org/reports/tr18/tr18-19.html#blank.  */
 bool
diff --git a/src/character.h b/src/character.h
index af5023f77c..cbf43097ae 100644
--- a/src/character.h
+++ b/src/character.h
@@ -583,6 +583,7 @@ char_surrogate_p (int c)
 extern bool graphicp (int);
 extern bool printablep (int);
 extern bool blankp (int);
+extern bool graphic_base_p (int);
 
 /* Look up the element in char table OBJ at index CH, and return it as
    an integer.  If the element is not a character, return CH itself.  */
diff --git a/src/print.c b/src/print.c
index fa65a3cb26..f2e2dd131d 100644
--- a/src/print.c
+++ b/src/print.c
@@ -1848,6 +1848,24 @@ print_vectorlike (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag,
   return true;
 }
 
+static char
+named_escape (int i)
+{
+  switch (i)
+    {
+    case '\b': return 'b';
+    case '\t': return 't';
+    case '\n': return 'n';
+    case '\f': return 'f';
+    case '\r': return 'r';
+    case ' ':  return 's';
+      /* \a, \v, \e and \d are excluded from printing as escapes since
+         they are somewhat rare as characters and more likely to be
+         plain integers. */
+    }
+  return 0;
+}
+
 static void
 print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
 {
@@ -1908,29 +1926,30 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
     {
     case_Lisp_Int:
       {
-	int c;
-	intmax_t i;
+        EMACS_INT i = XFIXNUM (obj);
+        char escaped_name;
 
-	if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj)
-	    && (c = XFIXNUM (obj)))
+	if (print_integers_as_characters && i >= 0 && i <= MAX_UNICODE_CHAR
+            && ((escaped_name = named_escape (i))
+                || graphic_base_p (i)))
 	  {
 	    printchar ('?', printcharfun);
-	    if (escapeflag
-		&& (c == ';' || c == '(' || c == ')' || c == '{' || c == '}'
-		    || c == '[' || c == ']' || c == '\"' || c == '\'' || c == '\\'))
+            if (escaped_name)
+              {
+                printchar ('\\', printcharfun);
+                i = escaped_name;
+              }
+            else if (escapeflag
+                     && (i == ';' || i == '\"' || i == '\'' || i == '\\'
+                         || i == '(' || i == ')'
+                         || i == '{' || i == '}'
+                         || i == '[' || i == ']'))
 	      printchar ('\\', printcharfun);
-	    printchar (c, printcharfun);
-	  }
-	else if (INTEGERP (Vinteger_output_format)
-		 && integer_to_intmax (Vinteger_output_format, &i)
-		 && i == 16 && !NILP (Fnatnump (obj)))
-	  {
-	    int len = sprintf (buf, "#x%"pI"x", (EMACS_UINT) XFIXNUM (obj));
-	    strout (buf, len, len, printcharfun);
+	    printchar (i, printcharfun);
 	  }
 	else
 	  {
-	    int len = sprintf (buf, "%"pI"d", XFIXNUM (obj));
+	    int len = sprintf (buf, "%"pI"d", i);
 	    strout (buf, len, len, printcharfun);
 	  }
       }
@@ -2270,12 +2289,13 @@ syms_of_print (void)
 that represents the number without losing information.  */);
   Vfloat_output_format = Qnil;
 
-  DEFVAR_LISP ("integer-output-format", Vinteger_output_format,
-	       doc: /* The format used to print integers.
-When t, print characters from integers that represent a character.
-When a number 16, print non-negative integers in the hexadecimal format.
-Otherwise, by default print integers in the decimal format.  */);
-  Vinteger_output_format = Qnil;
+  DEFVAR_BOOL ("print-integers-as-characters", print_integers_as_characters,
+	       doc: /* Non-nil means integers are printed using characters syntax.
+Only independent graphic characters, and control characters with named
+escape sequences such as newline, are printed this way.  Other
+integers, including those corresponding to raw bytes, are printed
+as numbers the usual way.  */);
+  print_integers_as_characters = false;
 
   DEFVAR_LISP ("print-length", Vprint_length,
 	       doc: /* Maximum length of list to print before abbreviating.
diff --git a/test/src/print-tests.el b/test/src/print-tests.el
index 7b026b6b21..202555adb3 100644
--- a/test/src/print-tests.el
+++ b/test/src/print-tests.el
@@ -383,25 +383,28 @@ print-hash-table-test
       (let ((print-length 1))
         (format "%S" h))))))
 
-(print-tests--deftest print-integer-output-format ()
+(print-tests--deftest print-integers-as-characters ()
   ;; Bug#44155.
-  (let ((integer-output-format t)
-        (syms (list ?? ?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ ?Á)))
-    (should (equal (read (print-tests--prin1-to-string syms)) syms))
-    (should (equal (print-tests--prin1-to-string syms)
-                   (concat "(" (mapconcat #'prin1-char syms " ") ")"))))
-  (let ((integer-output-format t)
-        (syms (list -1 0 1 ?\120 4194175 4194176 (max-char) (1+ (max-char)))))
-    (should (equal (read (print-tests--prin1-to-string syms)) syms)))
-  (let ((integer-output-format 16)
-        (syms (list -1 0 1 most-positive-fixnum (1+ most-positive-fixnum))))
-    (should (equal (read (print-tests--prin1-to-string syms)) syms))
-    (should (equal (print-tests--prin1-to-string syms)
-                   (concat "(" (mapconcat
-                                (lambda (i)
-                                  (if (and (>= i 0) (<= i most-positive-fixnum))
-                                      (format "#x%x" i) (format "%d" i)))
-                                syms " ") ")")))))
+  (let* ((print-integers-as-characters t)
+         (chars '(?? ?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ ?f ?~ ?Á 32
+                  ?\n ?\r ?\t ?\b ?\f ?\a ?\v ?\e ?\d))
+         (nums '(-1 -65 0 1 31 #x80 #x9f #x110000 #x3fff80 #x3fffff))
+         (nonprints '(#xd800 #xdfff #x030a #xffff #x2002 #x200c))
+         (printed-chars (print-tests--prin1-to-string chars))
+         (printed-nums (print-tests--prin1-to-string nums))
+         (printed-nonprints (print-tests--prin1-to-string nonprints)))
+    (should (equal (read printed-chars) chars))
+    (should (equal
+             printed-chars
+             (concat
+              "(?? ?\\; ?\\( ?\\) ?\\{ ?\\} ?\\[ ?\\] ?\\\" ?\\' ?\\\\"
+              " ?f ?~ ?Á ?\\s ?\\n ?\\r ?\\t ?\\b ?\\f 7 11 27 127)")))
+    (should (equal (read printed-nums) nums))
+    (should (equal printed-nums
+                   "(-1 -65 0 1 31 128 159 1114112 4194176 4194303)"))
+    (should (equal (read printed-nonprints) nonprints))
+    (should (equal printed-nonprints
+                   "(55296 57343 778 65535 8194 8204)"))))
 
 (provide 'print-tests)
 ;;; print-tests.el ends here
-- 
2.21.1 (Apple Git-122.3)



* bug#44155: Print integers as characters
  2020-11-04 16:58                                                             ` Mattias Engdegård
@ 2020-11-06 13:02                                                               ` Mattias Engdegård
  0 siblings, 0 replies; 109+ messages in thread
From: Mattias Engdegård @ 2020-11-06 13:02 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 44155-done, Andreas Schwab, Juri Linkov

4 nov. 2020 kl. 17.58 skrev Mattias Engdegård <mattiase@acm.org>:

> The last patch was incorrect; here is the right one. Apologies for the confusion.

Pushed to master, since there wasn't much left to discuss. As usual, it can be modified or reverted as needed.
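
For anyone skimming the thread, the rule that the patch applies can be sketched in standalone C roughly as below. This is an illustration only, not code from the patch: format_integer_as_char is a made-up helper, it handles ASCII only, it always escapes reader-special characters (as prin1 does), and isgraph stands in for graphic_base_p, which really consults the Unicode category table. named_escape follows the function of the same name in the patch.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Same mapping as named_escape in the patch: control characters (and
   space) that have a readable ?\X form.  Returns 0 if there is none.  */
static char
named_escape (long i)
{
  switch (i)
    {
    case '\b': return 'b';
    case '\t': return 't';
    case '\n': return 'n';
    case '\f': return 'f';
    case '\r': return 'r';
    case ' ':  return 's';
    }
  return 0;
}

/* Hypothetical helper, not part of Emacs: format I into BUF the way
   prin1 would with print-integers-as-characters non-nil.
   ASSUMPTION: ASCII only, with isgraph standing in for graphic_base_p.  */
static void
format_integer_as_char (long i, char *buf, size_t size)
{
  int ascii = i >= 0 && i <= 127;
  char esc = ascii ? named_escape (i) : 0;

  if (esc)
    /* Control character or space with its own escape, e.g. ?\n.  */
    snprintf (buf, size, "?\\%c", esc);
  else if (ascii && isgraph ((int) i))
    {
      /* Characters special to the Lisp reader get a backslash.  */
      if (strchr (";\"'\\(){}[]", (int) i))
        snprintf (buf, size, "?\\%c", (char) i);
      else
        snprintf (buf, size, "?%c", (char) i);
    }
  else
    /* Everything else is printed as a plain decimal number.  */
    snprintf (buf, size, "%ld", i);
}
```

With this sketch, 65 formats as ?A, 10 as ?\n, 32 as ?\s and 40 as ?\(, while 4, -1, 7 and 127 stay plain numbers, which agrees with the ASCII cases in the updated test.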







* bug#43866: 26.3; italian postfix additions
  2020-10-22 12:59                                   ` Eli Zaretskii
  2020-10-22 20:56                                     ` bug#44155: Print integers as characters Juri Linkov
@ 2022-04-30 12:19                                     ` Lars Ingebrigtsen
  2022-04-30 12:29                                       ` Eli Zaretskii
  1 sibling, 1 reply; 109+ messages in thread
From: Lars Ingebrigtsen @ 2022-04-30 12:19 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rpluim, 43866, Juri Linkov

Eli Zaretskii <eliz@gnu.org> writes:

>> +  DEFVAR_LISP ("print-integers-as-chars", Vprint_integers_as_chars,
>> +	       doc: /* Print integers as characters.  */);
>> +  Vprint_integers_as_chars = Qnil;
>
> I wonder whether it wouldn't be cleaner to add another optional
> argument to prin1, and let it bind some internal variable so that
> print_object does this, instead  of exposing this knob to Lisp.
> Because print_object is used all over the place, and who knows what
> will this do to other callers?

There's also prin1-to-string, and adding a parameter to these functions
just for this doesn't seem quite right.

However, I agree with you that adding a new print-* variable is bad, too
(because users will invariably set them in .emacs and then things break
in some obscure package).

So I wonder whether we could come up with a new convention for print
variables like this, which would allow us to extend printing more
without adding new print variables.

What about -- adding a new parameter to prin1 and prin1-to-string that's
a plist of printing features?  That is, something like:

(prin1 object nil '(length 20 integers-as-chars t))

And this would allow us to introduce a special value for that parameter,
like t, which means "use the standard values for everything".

That means we could get rid of the gazillion places where we have

(let ((print-length nil)
      (print-level nil))
  (prin1 object))

That would instead just be (prin1 object nil t).  (And the same with
prin1-to-string.)  This would hopefully be less error-prone than what we
have today -- we've had so many bug reports from packages forgetting to
bind one or the other when saving data.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no


* bug#43866: 26.3; italian postfix additions
  2022-04-30 12:19                                     ` bug#43866: 26.3; italian postfix additions Lars Ingebrigtsen
@ 2022-04-30 12:29                                       ` Eli Zaretskii
  2022-04-30 14:49                                         ` Lars Ingebrigtsen
  0 siblings, 1 reply; 109+ messages in thread
From: Eli Zaretskii @ 2022-04-30 12:29 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: rpluim, 43866, juri

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: Juri Linkov <juri@linkov.net>,  rpluim@gmail.com,  43866@debbugs.gnu.org
> Date: Sat, 30 Apr 2022 14:19:32 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> +  DEFVAR_LISP ("print-integers-as-chars", Vprint_integers_as_chars,
> >> +	       doc: /* Print integers as characters.  */);
> >> +  Vprint_integers_as_chars = Qnil;
> >
> > I wonder whether it wouldn't be cleaner to add another optional
> > argument to prin1, and let it bind some internal variable so that
> > print_object does this, instead  of exposing this knob to Lisp.
> > Because print_object is used all over the place, and who knows what
> > will this do to other callers?
> 
> There's also prin1-to-string, and adding a parameter to these functions
> just for this doesn't seem quite right.
> 
> However, I agree with you that adding a new print-* variable is bad, too
> (because users will invariably set them in .emacs and then things break
> in some obscure package).
> 
> So I wonder whether we could come up with a new convention for print
> variables like this, which would allow us to extend printing more
> without adding new print variables.
> 
> What about -- adding a new parameter to prin1 and prin1-to-string that's
> a plist of printing features?  That is, something like:
> 
> (prin1 object nil '(length 20 integers-as-chars t))

My worries were mainly because this new variable affected print_object
directly, and because print_object is called in many places unrelated
to prin1 etc.

I'm okay with what you propose, but I don't see how would that
eliminate the reasons for my worries.  The implementation of the
effect of this argument is still in print_object, so the question that
is of interest to me is how will we communicate these arguments to
print_object?






* bug#43866: 26.3; italian postfix additions
  2022-04-30 12:29                                       ` Eli Zaretskii
@ 2022-04-30 14:49                                         ` Lars Ingebrigtsen
  2022-04-30 15:26                                           ` Eli Zaretskii
  0 siblings, 1 reply; 109+ messages in thread
From: Lars Ingebrigtsen @ 2022-04-30 14:49 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rpluim, 43866, juri

Eli Zaretskii <eliz@gnu.org> writes:

> I'm okay with what you propose, but I don't see how would that
> eliminate the reasons for my worries.  The implementation of the
> effect of this argument is still in print_object, so the question that
> is of interest to me is how will we communicate these arguments to
> print_object?

I was thinking that prin1* would just set/bind a new global variable
(but one that isn't visible to the Lisp level).
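
A rough sketch of that idea in standalone C, purely illustrative: none of these names exist in Emacs, the settings struct and both functions are hypothetical, and a real implementation would presumably use specbind rather than a manual save and restore (which is not safe against non-local exits).

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical settings record; the field names are illustrative.  */
struct print_settings
{
  int length;              /* max elements to print, 0 = unlimited */
  bool integers_as_chars;  /* print 65 as ?A */
};

/* "Standard values for everything", i.e. what passing t would mean.  */
static const struct print_settings standard_settings = { 0, false };

/* Internal state consulted by the printer.  It has file scope, so code
   outside this file (standing in for Lisp) cannot set it directly.  */
static struct print_settings current_settings = { 0, false };

/* Stand-in for print_object: it only reads the internal settings.
   ASSUMPTION: only ASCII letters and digits are shown as characters.  */
static void
print_integer (long i, char *buf, size_t size)
{
  if (current_settings.integers_as_chars
      && i >= 0 && i <= 127 && isalnum ((int) i))
    snprintf (buf, size, "?%c", (char) i);
  else
    snprintf (buf, size, "%ld", i);
}

/* Stand-in for prin1: bind the internal variable for the duration of
   one call, then restore it.  OVERRIDES may be NULL (no change).  */
static void
prin1_integer (long i, const struct print_settings *overrides,
               char *buf, size_t size)
{
  struct print_settings saved = current_settings;
  if (overrides)
    current_settings = *overrides;
  print_integer (i, buf, size);
  current_settings = saved;
}
```

Calling prin1_integer with NULL leaves whatever is current in effect, passing &standard_settings forces the defaults, and the previous settings are back in place once the call returns, so other callers of the printer are not affected.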

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#43866: 26.3; italian postfix additions
  2022-04-30 14:49                                         ` Lars Ingebrigtsen
@ 2022-04-30 15:26                                           ` Eli Zaretskii
  2022-04-30 18:49                                             ` Lars Ingebrigtsen
  0 siblings, 1 reply; 109+ messages in thread
From: Eli Zaretskii @ 2022-04-30 15:26 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: rpluim, 43866, juri

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: juri@linkov.net,  rpluim@gmail.com,  43866@debbugs.gnu.org
> Date: Sat, 30 Apr 2022 16:49:14 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > I'm okay with what you propose, but I don't see how would that
> > eliminate the reasons for my worries.  The implementation of the
> > effect of this argument is still in print_object, so the question that
> > is of interest to me is how will we communicate these arguments to
> > print_object?
> 
> I was thinking that prin1* would just set/bind a new global variable
> (but one that isn't visible to the Lisp level).

Then this sounds almost exactly like what I suggested back then, so I
agree, of course.






* bug#43866: 26.3; italian postfix additions
  2022-04-30 15:26                                           ` Eli Zaretskii
@ 2022-04-30 18:49                                             ` Lars Ingebrigtsen
  2022-05-29 13:35                                               ` Lars Ingebrigtsen
  0 siblings, 1 reply; 109+ messages in thread
From: Lars Ingebrigtsen @ 2022-04-30 18:49 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rpluim, 43866, juri

Heh, `print-integers-as-characters' already exists -- it was added in
2020.

Anyway, I still think adding a parameter like described to prin1 would
be nice, but it's not necessary for this feature, which somehow had
something to do with Italian postfix.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no







* bug#43866: 26.3; italian postfix additions
  2022-04-30 18:49                                             ` Lars Ingebrigtsen
@ 2022-05-29 13:35                                               ` Lars Ingebrigtsen
  0 siblings, 0 replies; 109+ messages in thread
From: Lars Ingebrigtsen @ 2022-05-29 13:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rpluim, 43866, juri

Lars Ingebrigtsen <larsi@gnus.org> writes:

> Anyway, I still think adding a parameter like described to prin1 would
> be nice, but it's not necessary for this feature, which somehow had
> something to do with Italian postfix.

Re-skimming this bug thread, I think the original issue was fixed by
Eli at the time -- E= was added (for euro sign) -- but then the
discussion went on to whether we should have an input method/command
based on

/usr/share/X11/locale/en_US.UTF-8/Compose

So I'm closing this bug report and opening a new one that's about that.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





