* bug#43866: 26.3; italian postfix additions
From: Francesco Potortì @ 2020-10-08 12:05 UTC
  To: 43866

Since the inception of mule, many years ago, I have set up an environment
where I switch between italian-postfix and american input methods.

Now I realise that a long time ago I made an addition to italian that
has never gone into emacs.

The rationale is that in Italy latin-9 should be used instead of
latin1, which does not contain the euro symbol.  And that
italian-postfix should allow introducing the euro symbol.

Here is what I use on all machines where I have emacs:

================ start ================
;; Add the Euro symbol, use Latin-9 rather than Latin-1
(quail-define-package
 "italian-postfix" "Latin-9" "IT<" t
 "Italian (Italiano) input method with postfix modifiers

a` -> à    A` -> À    e' -> é    << -> «
e` -> è    E` -> È    E' -> É    >> -> »
i` -> ì    I` -> Ì    E= -> €    o_ -> º
o` -> ò    O` -> Ò               a_ -> ª
u` -> ù    U` -> Ù

Typewriter-style italian characters.

Doubling the postfix separates the letter and postfix: e.g.
a`` -> a`
" nil t nil nil nil nil nil nil nil nil t)

(quail-define-rules
 ("A`" ?À) ("a`" ?à)
 ("E`" ?È) ("E'" ?É) ("E=" ?€)
 ("e`" ?è) ("e'" ?é)
 ("I`" ?Ì) ("i`" ?ì)
 ("O`" ?Ò) ("o`" ?ò)
 ("U`" ?Ù) ("u`" ?ù)
 ("<<" ?«) (">>" ?»)
 ("o_" ?º) ("a_" ?ª)

 ("A``" ["A`"]) ("a``" ["a`"])
 ("E``" ["E`"]) ("E''" ["E'"])
 ("e``" ["e`"]) ("e''" ["e'"])
 ("I``" ["I`"]) ("i``" ["i`"])
 ("O``" ["O`"]) ("o``" ["o`"])
 ("U``" ["U`"]) ("u``" ["u`"])
 ("<<<" ["<<"]) (">>>" [">>"])
 ("o__" ["o_"]) ("a__" ["a_"])
 )
================ end ================

In GNU Emacs 26.3 (build 1, x86_64-pc-linux-gnu, X toolkit, Xaw3d scroll bars)
 of 2020-05-17, modified by Debian built on x86-csail-01
Windowing system distributor 'The X.Org Foundation', version 11.0.12008000
System Description: Debian GNU/Linux bullseye/sid

Important settings:
  value of $LC_COLLATE: it_IT.UTF-8
  value of $LC_CTYPE: it_IT.UTF-8
  value of $LC_NUMERIC: C
  value of $LANG: C.UTF-8
  locale-coding-system: utf-8-unix

Major mode: Mail

Minor modes in effect:
  filladapt-mode: t
  diff-auto-refine-mode: t
  desktop-save-mode: t
  epa-global-mail-mode: t
  epa-mail-mode: t
  shell-dirtrack-mode: t
  openwith-mode: t
  xterm-mouse-mode: t
  display-time-mode: t
  tooltip-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  column-number-mode: t
  line-number-mode: t
  abbrev-mode: t

Load-path shadows:
~/elisp/bhl hides /usr/share/emacs/site-lisp/bhl
/usr/share/emacs/site-lisp/elpa/debian-el-37/debian-autoloads hides /usr/share/emacs/site-lisp/elpa/gnuplot-mode-20141231/debian-autoloads
/usr/share/emacs/site-lisp/elpa/csv-mode-1.12/csv-mode-pkg hides /usr/share/emacs/site-lisp/elpa-src/csv-mode-1.12/csv-mode-pkg
/usr/share/emacs/site-lisp/elpa/csv-mode-1.12/csv-mode hides /usr/share/emacs/site-lisp/elpa-src/csv-mode-1.12/csv-mode
/usr/share/emacs/site-lisp/elpa/csv-mode-1.12/csv-mode-tests hides
/usr/share/emacs/site-lisp/elpa-src/csv-mode-1.12/csv-mode-tests /usr/share/emacs/site-lisp/elpa/csv-mode-1.12/csv-mode-autoloads hides /usr/share/emacs/site-lisp/elpa-src/csv-mode-1.12/csv-mode-autoloads /usr/share/emacs/site-lisp/elpa/debian-el-37/debian-el hides /usr/share/emacs/site-lisp/elpa-src/debian-el-37/debian-el /usr/share/emacs/site-lisp/elpa/debian-el-37/gnus-BTS hides /usr/share/emacs/site-lisp/elpa-src/debian-el-37/gnus-BTS /usr/share/emacs/site-lisp/elpa/debian-el-37/preseed hides /usr/share/emacs/site-lisp/elpa-src/debian-el-37/preseed /usr/share/emacs/site-lisp/elpa/debian-el-37/deb-view hides /usr/share/emacs/site-lisp/elpa-src/debian-el-37/deb-view /usr/share/emacs/site-lisp/elpa/debian-el-37/debian-el-autoloads hides /usr/share/emacs/site-lisp/elpa-src/debian-el-37/debian-el-autoloads /usr/share/emacs/site-lisp/elpa/debian-el-37/apt-utils hides /usr/share/emacs/site-lisp/elpa-src/debian-el-37/apt-utils /usr/share/emacs/site-lisp/elpa/debian-el-37/debian-bug hides /usr/share/emacs/site-lisp/elpa-src/debian-el-37/debian-bug /usr/share/emacs/site-lisp/elpa/debian-el-37/debian-el-pkg hides /usr/share/emacs/site-lisp/elpa-src/debian-el-37/debian-el-pkg /usr/share/emacs/site-lisp/elpa/debian-el-37/apt-sources hides /usr/share/emacs/site-lisp/elpa-src/debian-el-37/apt-sources /usr/share/emacs/site-lisp/elpa/debian-el-37/debian-autoloads hides /usr/share/emacs/site-lisp/elpa-src/debian-el-37/debian-autoloads /usr/share/emacs/site-lisp/elpa/dictionary-1.10/dictionary hides /usr/share/emacs/site-lisp/elpa-src/dictionary-1.10/dictionary /usr/share/emacs/site-lisp/elpa/dictionary-1.10/link hides /usr/share/emacs/site-lisp/elpa-src/dictionary-1.10/link /usr/share/emacs/site-lisp/elpa/dictionary-1.10/dictionary-pkg hides /usr/share/emacs/site-lisp/elpa-src/dictionary-1.10/dictionary-pkg /usr/share/emacs/site-lisp/elpa/dictionary-1.10/dictionary-autoloads hides /usr/share/emacs/site-lisp/elpa-src/dictionary-1.10/dictionary-autoloads 
/usr/share/emacs/site-lisp/elpa/dictionary-1.10/connection hides /usr/share/emacs/site-lisp/elpa-src/dictionary-1.10/connection /usr/share/emacs/site-lisp/elpa/gnuplot-mode-20141231/gnuplot hides /usr/share/emacs/site-lisp/elpa-src/gnuplot-mode-20141231/gnuplot /usr/share/emacs/site-lisp/elpa/gnuplot-mode-20141231/gnuplot-mode-pkg hides /usr/share/emacs/site-lisp/elpa-src/gnuplot-mode-20141231/gnuplot-mode-pkg /usr/share/emacs/site-lisp/elpa/debian-el-37/debian-autoloads hides /usr/share/emacs/site-lisp/elpa-src/gnuplot-mode-20141231/debian-autoloads /usr/share/emacs/site-lisp/elpa/gnuplot-mode-20141231/gnuplot-context hides /usr/share/emacs/site-lisp/elpa-src/gnuplot-mode-20141231/gnuplot-context /usr/share/emacs/site-lisp/elpa/gnuplot-mode-20141231/gnuplot-gui hides /usr/share/emacs/site-lisp/elpa-src/gnuplot-mode-20141231/gnuplot-gui /usr/share/emacs/site-lisp/elpa/gnuplot-mode-20141231/gnuplot-mode-autoloads hides /usr/share/emacs/site-lisp/elpa-src/gnuplot-mode-20141231/gnuplot-mode-autoloads /usr/share/emacs/site-lisp/elpa/markdown-mode-2.4/markdown-mode-autoloads hides /usr/share/emacs/site-lisp/elpa-src/markdown-mode-2.4/markdown-mode-autoloads /usr/share/emacs/site-lisp/elpa/markdown-mode-2.4/markdown-mode hides /usr/share/emacs/site-lisp/elpa-src/markdown-mode-2.4/markdown-mode /usr/share/emacs/site-lisp/elpa/markdown-mode-2.4/markdown-mode-pkg hides /usr/share/emacs/site-lisp/elpa-src/markdown-mode-2.4/markdown-mode-pkg /usr/share/emacs/site-lisp/flim/md4 hides /usr/share/emacs/26.3/lisp/md4 /usr/share/emacs/site-lisp/flim/hex-util hides /usr/share/emacs/26.3/lisp/hex-util ~/elisp/octave hides /usr/share/emacs/26.3/lisp/progmodes/octave /usr/share/emacs/site-lisp/flim/ntlm hides /usr/share/emacs/26.3/lisp/net/ntlm /usr/share/emacs/site-lisp/flim/hmac-md5 hides /usr/share/emacs/26.3/lisp/net/hmac-md5 /usr/share/emacs/site-lisp/flim/sasl-ntlm hides /usr/share/emacs/26.3/lisp/net/sasl-ntlm /usr/share/emacs/site-lisp/flim/sasl-digest hides 
/usr/share/emacs/26.3/lisp/net/sasl-digest /usr/share/emacs/site-lisp/flim/sasl hides /usr/share/emacs/26.3/lisp/net/sasl /usr/share/emacs/site-lisp/flim/sasl-cram hides /usr/share/emacs/26.3/lisp/net/sasl-cram /usr/share/emacs/site-lisp/flim/hmac-def hides /usr/share/emacs/26.3/lisp/net/hmac-def Features: (shadow emacsbug apropos vc-bzr mode-local calccomp calc-map calc-alg calc-vec calc-aent calc-menu calc-yank calc-ext reporter debian-bug anything-config anything woman cl etags two-column iso-transl org-rmail org-mhe org-irc org-info org-gnus nnir gnus-sum gnus-group gnus-undo gnus-start gnus-cloud nnimap nnmail mail-source utf7 netrc nnoo gnus-spec gnus-int gnus-range gnus-win gnus nnheader org-docview org-bibtex org-bbdb org-w3m org-element avl-tree generator org org-macro org-footnote org-pcomplete org-list org-faces org-entities org-version ob-emacs-lisp ob ob-tangle org-src ob-ref ob-lob ob-table ob-keys ob-exp ob-comint ob-core ob-eval org-compat org-macs org-loaddefs ispell xref project eieio-opt speedbar sb-image ezimage dframe completion dos-w32 find-cmd grep find-dired find-func pp cl-print help-fns radix-tree unrmail calc calc-loaddefs calc-macs deb-view network-stream starttls url-http tls gnutls url-gw nsm url-cache url-auth url url-proxy url-privacy url-expand url-methods url-history url-cookie w3m-filter w3m-form w3m-cookie url-domsuf w3m-bookmark w3m-tabmenu w3m-session w3m mailcap doc-view image-mode w3m-hist w3m-fb bookmark-w3m w3m-ems wid-edit w3m-ccl ccl w3m-favicon w3m-image w3m-proc w3m-util cal-move cal-x dabbrev arc-mode archive-mode macros locate edmacro kmacro rect tabify man shr-color timezone rmailsort rmailedit url-util shr svg xml browse-url add-log mailalias rmailout rmailkwd time-stamp cl-extra dired-aux wdired misearch multi-isearch make-mode jka-compr vc-git diff-mode markdown-mode subr-x noutline outline easy-mmode generic sh-script executable tex-mode compile vc-dir ewoc vc vc-dispatcher vc-svn json-mode rx bibtex-style 
vc-filewise vc-rcs octave texinfo pcase bibtex mhtml-mode css-mode smie color js json map imenu cc-mode cc-fonts cc-guess cc-menus cc-cmds cc-styles cc-align cc-engine cc-vars cc-defs sgml-mode dom qp rmailmm message rmc puny rfc822 mml mml-sec gnus-util mm-decode mm-bodies mm-encode mailabbrev gmm-utils mailheader mail-parse rfc2231 desktop frameset elec-pair cal-julian solar cal-dst pot skeleton warnings rmailsum rmail rmail-loaddefs sendmail rfc2047 rfc2045 ietf-drums mm-util mail-prsvr mime-compose epa-mail mail-utils epa derived epg view holidays hol-loaddefs appt diary-lib diary-loaddefs cal-menu calendar cal-loaddefs tramp tramp-compat tramp-loaddefs trampver ucs-normalize shell pcomplete comint ring parse-time format-spec advice bhl visual-fill-column switch-to-shell openwith hi-lock xt-mouse time-date ffap thingatpt scroll-in-place filladapt ansi-color time quail help-mode dired-x dired dired-loaddefs generic-x disp-table finder-inf info debian-el package easymenu epg-config url-handlers url-parse auth-source cl-seq eieio eieio-core cl-macs eieio-loaddefs password-cache url-vars seq byte-opt gv bytecomp byte-compile cconv cl-loaddefs cl-lib w3m-load mule-util tooltip eldoc electric uniquify ediff-hook vc-hooks lisp-float-type mwheel term/x-win x-win term/common-win x-dnd tool-bar dnd fontset image regexp-opt fringe tabulated-list replace newcomment text-mode elisp-mode lisp-mode prog-mode register page menu-bar rfn-eshadow isearch timer select scroll-bar mouse jit-lock font-lock syntax facemenu font-core term/tty-colors frame cl-generic cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech european ethiopic indian cyrillic chinese composite charscript charprop case-table epa-hook jka-cmpr-hook help simple abbrev obarray minibuffer cl-preloaded nadvice loaddefs button faces cus-face macroexp files text-properties overlay sha1 md5 base64 format env code-pages mule custom 
widget hashtable-print-readable backquote threads dbusbind inotify lcms2 dynamic-setting font-render-setting x-toolkit x multi-tty make-network-process emacs) Memory information: ((conses 16 859419 104132) (symbols 48 61004 1) (miscs 40 1838 1970) (strings 32 207996 12197) (string-bytes 1 6110429) (vectors 16 83287) (vector-slots 8 2187469 111622) (floats 8 914 1312) (intervals 56 48134 1037) (buffers 992 180)) ^ permalink raw reply [flat|nested] 109+ messages in thread

* bug#43866: 26.3; italian postfix additions
From: Eli Zaretskii @ 2020-10-08 12:26 UTC
  To: Francesco Potortì; +Cc: 43866

> From: Francesco Potortì <pot@gnu.org>
> Date: Thu, 08 Oct 2020 14:05:55 +0200
>
> Since the inception of mule, many years ago, I have set up an environment
> where I switch between italian-postfix and american input methods.
>
> Now I realise that a long time ago I made an addition to italian that
> has never gone into emacs.
>
> The rationale is that in Italy latin-9 should be used instead of
> latin1, which does not contain the euro symbol.  And that
> italian-postfix should allow introducing the euro symbol.

The Latin-1 vs Latin-9 part is not important nowadays, since Emacs
uses Unicode internally.

As for the Euro symbol, I guess we need to add it to all the Latin
input methods, not just the Italian one?

Thanks.

* bug#43866: 26.3; italian postfix additions
From: Francesco Potortì @ 2020-10-08 12:34 UTC
  To: Eli Zaretskii; +Cc: 43866

>> From: Francesco Potortì <pot@gnu.org>
>> Date: Thu, 08 Oct 2020 14:05:55 +0200
>>
>> Since the inception of mule, many years ago, I have set up an environment
>> where I switch between italian-postfix and american input methods.
>>
>> Now I realise that a long time ago I made an addition to italian that
>> has never gone into emacs.
>>
>> The rationale is that in Italy latin-9 should be used instead of
>> latin1, which does not contain the euro symbol.  And that
>> italian-postfix should allow introducing the euro symbol.
>
>The Latin-1 vs Latin-9 part is not important nowadays, since Emacs
>uses Unicode internally.

I don't know if that's ever used; maybe changing it is worthwhile anyway.

>As for the Euro symbol, I guess we need to add it to all the Latin
>input methods, not just the Italian one?

I guess yes.

* bug#43866: 26.3; italian postfix additions
From: Robert Pluim @ 2020-10-08 12:39 UTC
  To: Eli Zaretskii; +Cc: 43866

>>>>> On Thu, 08 Oct 2020 15:26:16 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Francesco Potortì <pot@gnu.org>
    >> Date: Thu, 08 Oct 2020 14:05:55 +0200
    >>
    >> Since the inception of mule, many years ago, I have set up an environment
    >> where I switch between italian-postfix and american input methods.
    >>
    >> Now I realise that a long time ago I made an addition to italian that
    >> has never gone into emacs.
    >>
    >> The rationale is that in Italy latin-9 should be used instead of
    >> latin1, which does not contain the euro symbol.  And that
    >> italian-postfix should allow introducing the euro symbol.

    Eli> The Latin-1 vs Latin-9 part is not important nowadays, since Emacs
    Eli> uses Unicode internally.

    Eli> As for the Euro symbol, I guess we need to add it to all the Latin
    Eli> input methods, not just the Italian one?

Itʼs already in latin-postfix and on C-x 8 * E, is that really
necessary?

Robert
--

* bug#43866: 26.3; italian postfix additions
From: Eli Zaretskii @ 2020-10-08 12:57 UTC
  To: Robert Pluim; +Cc: 43866

> From: Robert Pluim <rpluim@gmail.com>
> Cc: Francesco Potortì <pot@gnu.org>, 43866@debbugs.gnu.org
> Date: Thu, 08 Oct 2020 14:39:15 +0200
>
>     Eli> As for the Euro symbol, I guess we need to add it to all the Latin
>     Eli> input methods, not just the Italian one?
>
> Itʼs already in latin-postfix and on C-x 8 * E, is that really
> necessary?

You mean, do people (besides Francesco) still use italian-postfix?  I
don't know.

* bug#43866: 26.3; italian postfix additions
From: Robert Pluim @ 2020-10-08 13:54 UTC
  To: Eli Zaretskii; +Cc: 43866

>>>>> On Thu, 08 Oct 2020 15:57:14 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Robert Pluim <rpluim@gmail.com>
    >> Cc: Francesco Potortì <pot@gnu.org>, 43866@debbugs.gnu.org
    >> Date: Thu, 08 Oct 2020 14:39:15 +0200
    >>
    Eli> As for the Euro symbol, I guess we need to add it to all the Latin
    Eli> input methods, not just the Italian one?
    >>
    >> Itʼs already in latin-postfix and on C-x 8 * E, is that really
    >> necessary?

    Eli> You mean, do people (besides Francesco) still use italian-postfix?  I
    Eli> don't know.

What I meant is: everything is Unicode, and latin-postfix subsumes all
<language>-postfix, as far as I can tell, so people could just use
latin-postfix.

Robert
--

* bug#43866: 26.3; italian postfix additions
From: Robert Pluim @ 2020-10-08 14:24 UTC
  To: Eli Zaretskii; +Cc: 43866

>>>>> On Thu, 08 Oct 2020 15:54:37 +0200, Robert Pluim <rpluim@gmail.com> said:

>>>>> On Thu, 08 Oct 2020 15:57:14 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >>> From: Robert Pluim <rpluim@gmail.com>
    >>> Cc: Francesco Potortì <pot@gnu.org>, 43866@debbugs.gnu.org
    >>> Date: Thu, 08 Oct 2020 14:39:15 +0200
    >>>
    Eli> As for the Euro symbol, I guess we need to add it to all the Latin
    Eli> input methods, not just the Italian one?
    >>>
    >>> Itʼs already in latin-postfix and on C-x 8 * E, is that really
    >>> necessary?

    Eli> You mean, do people (besides Francesco) still use italian-postfix?  I
    Eli> don't know.

    Robert> What I meant is: everything is Unicode, and latin-postfix subsumes all
    Robert> <language>-postfix, as far as I can tell, so people could just use
    Robert> latin-postfix.

As a practical question, if we do decide to add the euro to a bunch of
latin input methods, where do we stop?  Should we add it to all the
greek-* methods on the grounds that Greece uses the Euro, but not to
"british" since the UK uses the pound?

(this is why I use C-x 8 * E to type €)

Robert
--

* bug#43866: 26.3; italian postfix additions
From: Eli Zaretskii @ 2020-10-08 14:32 UTC
  To: Robert Pluim; +Cc: 43866

> From: Robert Pluim <rpluim@gmail.com>
> Cc: 43866@debbugs.gnu.org
> Date: Thu, 08 Oct 2020 16:24:05 +0200
>
> As a practical question, if we do decide to add the euro to a bunch of
> latin input methods, where do we stop?  Should we add it to all the
> greek-* methods on the grounds that Greece uses the Euro, but not to
> "british" since the UK uses the pound?

I only thought about the Latin-N ones, mainly because all the legacy
encodings which didn't have a Euro sign at some point got upgraded to
newer Latin-N encodings which do.

* bug#43866: 26.3; italian postfix additions
From: Francesco Potortì @ 2020-10-08 13:26 UTC
  To: Robert Pluim; +Cc: 43866

>    >> The rationale is that in Italy latin-9 should be used instead of
>    >> latin1, which does not contain the euro symbol.  And that
>    >> italian-postfix should allow introducing the euro symbol.
>
>    Eli> As for the Euro symbol, I guess we need to add it to all the Latin
>    Eli> input methods, not just the Italian one?
>
>Itʼs already in latin-postfix and on C-x 8 * E, is that really
>necessary?

I don't use latin-postfix, because it gets in the way: there are many
more combinations than in italian-postfix which change what I am
writing - this is an added burden for a minuscule added benefit.

This is why italian-postfix exists, in spite of the existence of
latin-postfix.

I don't use C-x 8 either, because I'd have to learn a lot of complex
bindings: when I need some exotic char I use the terminal or the X
combinations, which are easier and not specific to Emacs, unless I am
forced to.

This has little to do with italian-postfix, in fact.

All in all, I don't get your objection.  What would be the drawback of
adding the E = keybinding?

* bug#43866: 26.3; italian postfix additions
From: Robert Pluim @ 2020-10-08 14:00 UTC
  To: Francesco Potortì; +Cc: 43866

>>>>> On Thu, 08 Oct 2020 15:26:07 +0200, Francesco Potortì <pot@gnu.org> said:

    >> >> The rationale is that in Italy latin-9 should be used instead of
    >> >> latin1, which does not contain the euro symbol.  And that
    >> >> italian-postfix should allow introducing the euro symbol.
    >>
    Eli> As for the Euro symbol, I guess we need to add it to all the Latin
    Eli> input methods, not just the Italian one?
    >>
    >> Itʼs already in latin-postfix and on C-x 8 * E, is that really
    >> necessary?

    Francesco> I don't use latin-postfix, because it gets in the way: there are many
    Francesco> more combinations than in italian-postfix which change what I am
    Francesco> writing - this is an added burden for a minuscule added benefit.

OK

    Francesco> This is why italian-postfix exists, in spite of the existence of
    Francesco> latin-postfix.

    Francesco> I don't use C-x 8 either, because I'd have to learn a lot of complex
    Francesco> bindings: when I need some exotic char I use the terminal or the X
    Francesco> combinations, which are easier and not specific to Emacs, unless I am
    Francesco> forced to.

This would not be for you: you already have your modified
italian-postfix.

    Francesco> This has little to do with italian-postfix, in fact.

    Francesco> All in all, I don't get your objection.  What would be the drawback of
    Francesco> adding the E = keybinding?

Think of it more as 'do we really need to change X number of input
methods, or is there a simpler way'.  And the answer appears to be 'no
such way exists'.

Robert
--

* bug#43866: 26.3; italian postfix additions
From: Juri Linkov @ 2020-10-13 20:07 UTC
  To: Robert Pluim; +Cc: 43866

> Itʼs already in latin-postfix and on C-x 8 * E, is that really
> necessary?

I wonder why C-x 8 provides key sequences that are not mnemonic
and so hard to remember?

Would it make sense to support exactly the same keys that are
provided by the X11 compose method?  I mean those in the file
/usr/share/X11/locale/en_US.UTF-8/Compose
also available at
https://help.ubuntu.com/community/ComposeKey
and
https://cgit.freedesktop.org/xorg/lib/libX11/plain/nls/en_US.UTF-8/Compose.pre

For example, for every such line:

  <Multi_key> <equal> <E> : "€" EuroSign # EURO SIGN

replace <Multi_key> with C-x 8, and bind such key sequences:

  C-x 8 = E  =>  "€"

and for all other keys as well, e.g.

  C-x 8 . . .  =>  "…" (HORIZONTAL ELLIPSIS)
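
The mapping proposed above can be sketched as a small parser that turns one line of the X11 Compose file into the keys typed after <Multi_key> and the text they produce.  This is only an illustration: the names `my-compose-keysym-alist' and `my-compose-parse-line' are hypothetical, the keysym table is deliberately tiny, and nothing here is taken from Emacs itself.

```elisp
;; Hypothetical, deliberately incomplete table of X11 keysym names.
(defvar my-compose-keysym-alist
  '(("equal" . "=") ("period" . ".") ("minus" . "-") ("less" . "<"))
  "Map a few X11 keysym names to the characters they stand for.")

(defun my-compose-parse-line (line)
  "Parse one Compose LINE into (KEYS . RESULT), or return nil.
KEYS is the string of characters typed after <Multi_key>; RESULT is
the quoted string the sequence produces.  Lines that do not start
with <Multi_key>, or that use a keysym missing from
`my-compose-keysym-alist', yield nil."
  (when (string-match
         "\\`<Multi_key>\\(\\(?:[ \t]*<[^>]+>\\)+\\)[ \t]*:[ \t]*\"\\([^\"]+\\)\""
         line)
    (let ((keysyms (match-string 1 line))
          (result (match-string 2 line))
          (start 0)
          (parts nil)
          (ok t))
      ;; Walk over each <keysym> in turn.
      (while (string-match "<\\([^>]+\\)>" keysyms start)
        (let* ((name (match-string 1 keysyms))
               ;; One-character keysym names (E, a, 1) are taken literally.
               (char (if (= (length name) 1)
                         name
                       (cdr (assoc name my-compose-keysym-alist)))))
          (setq start (match-end 0))
          (if char (push char parts) (setq ok nil))))
      (and ok (cons (apply #'concat (nreverse parts)) result)))))

(my-compose-parse-line
 "<Multi_key> <equal> <E> : \"€\" EuroSign # EURO SIGN")
;; => ("=E" . "€")
```

The resulting pair could then be bound under whatever prefix is chosen; a complete version would need the full keysym table and would have to decide what to do with sequences that clash with existing bindings.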

* bug#43866: 26.3; italian postfix additions
From: Eli Zaretskii @ 2020-10-14 2:31 UTC
  To: Juri Linkov; +Cc: rpluim, 43866

> From: Juri Linkov <juri@linkov.net>
> Cc: Eli Zaretskii <eliz@gnu.org>, 43866@debbugs.gnu.org
> Date: Tue, 13 Oct 2020 23:07:13 +0300
>
> I wonder why C-x 8 provides key sequences that are not mnemonic
> and so hard to remember?
>
> Would it make sense to support exactly the same keys that are
> provided by the X11 compose method?  I mean those in the file
> /usr/share/X11/locale/en_US.UTF-8/Compose
> also available at
> https://help.ubuntu.com/community/ComposeKey
> and
> https://cgit.freedesktop.org/xorg/lib/libX11/plain/nls/en_US.UTF-8/Compose.pre

How about making a new input method for those?  It seems to me that
C-x 8 is already too "fat".

* bug#43866: 26.3; italian postfix additions
From: Juri Linkov @ 2020-10-14 8:07 UTC
  To: Eli Zaretskii; +Cc: rpluim, 43866

>> I wonder why C-x 8 provides key sequences that are not mnemonic
>> and so hard to remember?
>>
>> Would it make sense to support exactly the same keys that are
>> provided by the X11 compose method?  I mean those in the file
>> /usr/share/X11/locale/en_US.UTF-8/Compose
>> also available at
>> https://help.ubuntu.com/community/ComposeKey
>> and
>> https://cgit.freedesktop.org/xorg/lib/libX11/plain/nls/en_US.UTF-8/Compose.pre
>
> How about making a new input method for those?  It seems to me that
> C-x 8 is already too "fat".

Yes, a new method might be useful as well.  But such a method can't
be enabled all the time, because sequences such as "= E" should be
inserted literally in normal circumstances.  So such a method needs to
be enabled temporarily, and it takes more time to enable/disable it.
Since it's useful only to insert a single special character sometimes,
it would be much easier to type some prefix key before typing "= E" to
insert € when such a need arises occasionally.

* bug#43866: 26.3; italian postfix additions
From: Eli Zaretskii @ 2020-10-14 15:07 UTC
  To: Juri Linkov; +Cc: rpluim, 43866

> From: Juri Linkov <juri@linkov.net>
> Cc: rpluim@gmail.com, 43866@debbugs.gnu.org
> Date: Wed, 14 Oct 2020 11:07:41 +0300
>
> > How about making a new input method for those?  It seems to me that
> > C-x 8 is already too "fat".
>
> Yes, a new method might be useful as well.  But such a method can't
> be enabled all the time, because sequences such as "= E" should be
> inserted literally in normal circumstances.  So such a method needs to
> be enabled temporarily, and it takes more time to enable/disable it.
> Since it's useful only to insert a single special character sometimes,
> it would be much easier to type some prefix key before typing "= E" to
> insert € when such a need arises occasionally.

But turning an input method on and off is just 1 key, C-\, whereas
C-x 8 is 2 keys, and not a very convenient sequence to type, at least
on QWERTY keyboards.  So it looks like a dedicated input method will
still be a win.

I don't think it's right that the only Unicode input method we have is
TeX -- that is great for TeX users, but many people don't use (La)TeX,
and will find it unintuitive to type the TeX sequences.

* bug#43866: 26.3; italian postfix additions
From: Juri Linkov @ 2020-10-14 19:40 UTC
  To: Eli Zaretskii; +Cc: rpluim, 43866

>> > How about making a new input method for those?  It seems to me that
>> > C-x 8 is already too "fat".
>>
>> Yes, a new method might be useful as well.  But such a method can't
>> be enabled all the time, because sequences such as "= E" should be
>> inserted literally in normal circumstances.  So such a method needs to
>> be enabled temporarily, and it takes more time to enable/disable it.
>> Since it's useful only to insert a single special character sometimes,
>> it would be much easier to type some prefix key before typing "= E" to
>> insert € when such a need arises occasionally.
>
> But turning an input method on and off is just 1 key, C-\, whereas

1 key C-\ to enable, and 1 key C-\ to disable.  Also might need to select
another input method name from 'C-u C-\' when also using other input methods.

> C-x 8 is 2 keys, and not a very convenient sequence to type, at least
> on QWERTY keyboards.

I agree, C-x 8 is not easy to type.

> So it looks like a dedicated input method will still be a win.

A win for some users, not a win for other users, so adding both (an
input method and a prefix key) would be fine for all.

> I don't think it's right that the only Unicode input method we have is
> TeX -- that is great for TeX users, but many people don't use (La)TeX,
> and will find it unintuitive to type the TeX sequences.

It seems the TeX input method requires typing whole Unicode names,
or at least unambiguous parts of names, e.g. '\euro' inserts €,
'\smile' inserts ⌣, but can't type '\smiling face with sunglasses'.
Also I see a hex Unicode input method in uni-input.el that supports
e.g. U<hex> or u<hex>, RFC1345 mnemonics in rfc1345.el,
SGML entities in sgml-input.el.  So adding an X11 Compose method would
be handy.

* bug#43866: 26.3; italian postfix additions
From: Eli Zaretskii @ 2020-10-15 2:34 UTC
  To: Juri Linkov; +Cc: rpluim, 43866

> From: Juri Linkov <juri@linkov.net>
> Cc: rpluim@gmail.com, 43866@debbugs.gnu.org
> Date: Wed, 14 Oct 2020 22:40:48 +0300
>
> >> Yes, a new method might be useful as well.  But such a method can't
> >> be enabled all the time, because sequences such as "= E" should be
> >> inserted literally in normal circumstances.  So such a method needs to
> >> be enabled temporarily, and it takes more time to enable/disable it.
> >> Since it's useful only to insert a single special character sometimes,
> >> it would be much easier to type some prefix key before typing "= E" to
> >> insert € when such a need arises occasionally.
> >
> > But turning an input method on and off is just 1 key, C-\, whereas
>
> 1 key C-\ to enable, and 1 key C-\ to disable.  Also might need to select
> another input method name from 'C-u C-\' when also using other input methods.

Btw, input methods that use =E or E= could (and in many cases do) have
==E and E== to insert just "=E" and "E=", so no toggling is needed.

> > I don't think it's right that the only Unicode input method we have is
> > TeX -- that is great for TeX users, but many people don't use (La)TeX,
> > and will find it unintuitive to type the TeX sequences.
>
> It seems the TeX input method requires typing whole Unicode names,
> or at least unambiguous parts of names, e.g. '\euro' inserts €,
> '\smile' inserts ⌣, but can't type '\smiling face with sunglasses'.
> Also I see a hex Unicode input method in uni-input.el that supports
> e.g. U<hex> or u<hex>, RFC1345 mnemonics in rfc1345.el,
> SGML entities in sgml-input.el.  So adding an X11 Compose method would
> be handy.

Agreed.
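
The doubling escape mentioned above can be sketched as a tiny Quail package.  The package name `euro-escape-sketch' and its title are made up for illustration; the argument list simply mirrors the `quail-define-package' call posted at the top of this thread.  For what it is worth, the rules posted there contain "E=" but no "E==" entry, so something like the second rule below would presumably be needed to make the escape work for the Euro sign too.

```elisp
(require 'quail)

;; Hypothetical package name and title, for illustration only.
(quail-define-package
 "euro-escape-sketch" "Latin-9" "E=" t
 "Sketch of a postfix rule with a doubling escape.

E=  -> €
E== -> E=
" nil t nil nil nil nil nil nil nil nil t)

(quail-define-rules
 ("E=" ?€)          ; postfix form, as in the italian-postfix rules above
 ("E==" ["E="]))    ; doubled postfix: insert the two characters literally
```

After evaluating this, `C-u C-\ euro-escape-sketch RET' should select the method in the current buffer.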

* bug#43866: 26.3; italian postfix additions
From: Juri Linkov @ 2020-10-19 20:45 UTC
  To: Eli Zaretskii; +Cc: rpluim, 43866

[-- Attachment #1: Type: text/plain, Size: 804 bytes --]

>> It seems the TeX input method requires typing whole Unicode names,
>> or at least unambiguous parts of names, e.g. '\euro' inserts €,
>> '\smile' inserts ⌣, but can't type '\smiling face with sunglasses'.
>> Also I see a hex Unicode input method in uni-input.el that supports
>> e.g. U<hex> or u<hex>, RFC1345 mnemonics in rfc1345.el,
>> SGML entities in sgml-input.el.  So adding an X11 Compose method would
>> be handy.
>
> Agreed.

Here is a working implementation.  It binds all key sequences to the key
'C-+' that has the mnemonics of adding a character.  'C-+' is free because
it can't be used to zoom text since its counterpart key 'C--' is already
taken to input numeric arguments.

'C-+ C-+' is bound to 'insert-char' like the current longer key sequence
'C-x 8 RET' that is hard to type.

[-- Attachment #2: x-compose.el --]
[-- Type: application/emacs-lisp, Size: 1474 bytes --]
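
The attachment is not reproduced in this archive.  As a rough sketch of the approach described in the message (a prefix key whose keymap holds Compose-style sequences, with the prefix typed twice running `insert-char'), something along these lines would do; `my-compose-map' and `my-compose-define' are hypothetical names, and there is no claim that x-compose.el is written this way.

```elisp
(defvar my-compose-map (make-sparse-keymap)
  "Hypothetical prefix keymap holding Compose-style key sequences.")

(defun my-compose-define (keys string)
  "Bind KEYS in `my-compose-map' to a command that inserts STRING."
  (define-key my-compose-map keys
    ;; Backquote builds the command, so no lexical closure is needed.
    `(lambda () (interactive) (insert ,string))))

;; The one sequence spelled out in this thread: = E gives the Euro sign.
(my-compose-define "=E" "€")

;; Typing the prefix twice runs `insert-char', like C-x 8 RET.
(define-key my-compose-map (kbd "C-+") #'insert-char)

;; Install the prefix key globally.
(global-set-key (kbd "C-+") my-compose-map)
```

With this evaluated, `C-+ = E' inserts €.  A real implementation would populate the keymap from the Compose file rather than by hand.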
* bug#43866: 26.3; italian postfix additions 2020-10-19 20:45 ` Juri Linkov @ 2020-10-19 23:12 ` Stefan Kangas 2020-10-20 18:42 ` Juri Linkov 2020-10-20 14:12 ` Eli Zaretskii 1 sibling, 1 reply; 109+ messages in thread From: Stefan Kangas @ 2020-10-19 23:12 UTC (permalink / raw) To: Juri Linkov, Eli Zaretskii; +Cc: rpluim, 43866 Juri Linkov <juri@linkov.net> writes: > Here is a working implementation. It binds all key sequences to the key > 'C-+' that has the mnemonic of adding a character. 'C-+' is free because > it can't be used to zoom text since its counterpart key 'C--' is already > taken to input numeric arguments. Right, but the idea of using it still makes me feel a bit uneasy. May I suggest that we use a different key for this? A while back, RMS suggested that we could bind `C-+' to text-scale-adjust even if we can't bind `C--'. I was not super enthusiastic about this at the time, but perhaps that idea is the least bad option. One could imagine that in combination with, for example, optionally binding the numerical prefix argument only to `M--'. We could perhaps then consider enabling that in the "beginner friendly profile" we have been discussing on emacs-devel (but that no one has yet seriously worked on). ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#43866: 26.3; italian postfix additions 2020-10-19 23:12 ` Stefan Kangas @ 2020-10-20 18:42 ` Juri Linkov 0 siblings, 0 replies; 109+ messages in thread From: Juri Linkov @ 2020-10-20 18:42 UTC (permalink / raw) To: Stefan Kangas; +Cc: 43866, rpluim > May I suggest that we use a different key for this? > > A while back, RMS suggested that we could bind `C-+' to > text-scale-adjust even if we can't bind `C--'. I was not super > enthusiastic about this at the time, but perhaps that idea is the least > bad option. > > One could imagine that in combination with, for example, optionally > binding the numerical prefix argument only to `M--'. We could perhaps > then consider enabling that in the "beginner friendly profile" we have > been discussing on emacs-devel (but that no one has yet seriously worked > on). Ah, C-+ could be suitable for a beginner profile indeed. What key do other programs use for a Compose-like Multi_key? https://help.ubuntu.com/community/ComposeKey says that the default Compose Multi_key is Shift+AltGr, and the Unicode composition key is Shift+Ctrl+U. And indeed Shift+Ctrl+U works in all applications and xterm, but not in Emacs on X (in Emacs on tty Shift+Ctrl+U works). And here is more information for different systems: https://en.wikipedia.org/wiki/Compose_key https://en.wikipedia.org/wiki/Unicode_input ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#43866: 26.3; italian postfix additions 2020-10-19 20:45 ` Juri Linkov 2020-10-19 23:12 ` Stefan Kangas @ 2020-10-20 14:12 ` Eli Zaretskii 2020-10-20 14:47 ` Robert Pluim ` (2 more replies) 1 sibling, 3 replies; 109+ messages in thread From: Eli Zaretskii @ 2020-10-20 14:12 UTC (permalink / raw) To: Juri Linkov; +Cc: rpluim, 43866 > From: Juri Linkov <juri@linkov.net> > Cc: rpluim@gmail.com, 43866@debbugs.gnu.org > Date: Mon, 19 Oct 2020 23:45:48 +0300 > > Here is a working implementation. It binds all key sequences to the key > 'C-+' that has the mnemonic of adding a character. 'C-+' is free because > it can't be used to zoom text since its counterpart key 'C--' is already > taken to input numeric arguments. 'C-+ C-+' is bound to 'insert-char' > like the current longer key sequence 'C-x 8 RET' that is hard to type. The implementation seems to rely on a file in the /usr/include tree that might not be there. This is a significant disadvantage, IMO. It means that, unlike all other similar facilities in Emacs, this one is not self-contained. Is it possible to lift this limitation? ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#43866: 26.3; italian postfix additions 2020-10-20 14:12 ` Eli Zaretskii @ 2020-10-20 14:47 ` Robert Pluim 2020-10-20 15:50 ` Eli Zaretskii 2020-10-20 18:44 ` Juri Linkov 2020-10-20 19:05 ` Juri Linkov 2020-10-20 19:56 ` Juri Linkov 2 siblings, 2 replies; 109+ messages in thread From: Robert Pluim @ 2020-10-20 14:47 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 43866, Juri Linkov >>>>> On Tue, 20 Oct 2020 17:12:02 +0300, Eli Zaretskii <eliz@gnu.org> said: >> From: Juri Linkov <juri@linkov.net> >> Cc: rpluim@gmail.com, 43866@debbugs.gnu.org >> Date: Mon, 19 Oct 2020 23:45:48 +0300 >> >> Here is a working implementation. It binds all key sequences to the key >> 'C-+' that has the mnemonic of adding a character. 'C-+' is free because >> it can't be used to zoom text since its counterpart key 'C--' is already >> taken to input numeric arguments. 'C-+ C-+' is bound to 'insert-char' >> like the current longer key sequence 'C-x 8 RET' that is hard to type. Eli> The implementation seems to rely on a file in the /usr/include tree Eli> that might not be there. This is a significant disadvantage, IMO. It Eli> means that, unlike all other similar facilities in Emacs, this one is Eli> not self-contained. Eli> Is it possible to lift this limitation? Aren't all those definitions in lisp/term/x-win.el anyway? Robert -- ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#43866: 26.3; italian postfix additions 2020-10-20 14:47 ` Robert Pluim @ 2020-10-20 15:50 ` Eli Zaretskii 2020-10-20 18:44 ` Juri Linkov 1 sibling, 0 replies; 109+ messages in thread From: Eli Zaretskii @ 2020-10-20 15:50 UTC (permalink / raw) To: Robert Pluim; +Cc: 43866, juri > From: Robert Pluim <rpluim@gmail.com> > Cc: Juri Linkov <juri@linkov.net>, 43866@debbugs.gnu.org > Date: Tue, 20 Oct 2020 16:47:12 +0200 > > Eli> The implementation seems to rely on a file in the /usr/include tree > Eli> that might not be there. This is a significant disadvantage, IMO. It > Eli> means that, unlike all other similar facilities in Emacs, this one is > Eli> not self-contained. > > Eli> Is it possible to lift this limitation? > > Aren't all those definitions in lisp/term/x-win.el anyway? Probably. But even that is sub-optimal (though better than reading a /usr/include file): it is only available on X. What about TTY sessions, what about w32? ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#43866: 26.3; italian postfix additions 2020-10-20 14:47 ` Robert Pluim 2020-10-20 15:50 ` Eli Zaretskii @ 2020-10-20 18:44 ` Juri Linkov 1 sibling, 0 replies; 109+ messages in thread From: Juri Linkov @ 2020-10-20 18:44 UTC (permalink / raw) To: Robert Pluim; +Cc: 43866 > Eli> The implementation seems to rely on a file in the /usr/include tree > Eli> that might not be there. This is a significant disadvantage, IMO. It > Eli> means that, unlike all other similar facilities in Emacs, this one is > Eli> not self-contained. > > Eli> Is it possible to lift this limitation? > > Aren't all those definitions in lisp/term/x-win.el anyway? It seems the list in lisp/term/x-win.el is not needed at run-time, since Eli wants to pre-generate these keymappings, so at the time of generation, keysymdef.h can be used because we need such mappings as from XK_Aogonek to U+0104, not from 0x01a1 to U+0104 like in x-win.el. The only remaining problem with keysymdef.h is how to process definitions like this one: #define XK_KP_Enter 0xff8d /* Enter */ There is no Unicode character for it, even in x-win.el. I guess it should be hard-coded to map directly to [kp-enter]. ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#43866: 26.3; italian postfix additions 2020-10-20 14:12 ` Eli Zaretskii 2020-10-20 14:47 ` Robert Pluim @ 2020-10-20 19:05 ` Juri Linkov 2020-10-21 8:11 ` Robert Pluim 2020-10-20 19:56 ` Juri Linkov 2 siblings, 1 reply; 109+ messages in thread From: Juri Linkov @ 2020-10-20 19:05 UTC (permalink / raw) To: Eli Zaretskii; +Cc: rpluim, 43866 > The implementation seems to rely on a file in the /usr/include tree > that might not be there. This is a significant disadvantage, IMO. It > means that, unlike all other similar facilities in Emacs, this one is > not self-contained. > > Is it possible to lift this limitation? Yes, this is easy to do. But I have one problem: /usr/share/X11/locale/en_US.UTF-8/Compose contains 83 lines where a key sequence maps to 2 characters, not to 1 character, e.g. <Multi_key> <acute> <Cyrillic_u> : "у́" # CYRILLIC SMALL LETTER U WITH COMBINING ACUTE ACCENT where "у́" is 2 characters: CYRILLIC SMALL LETTER U and COMBINING ACUTE ACCENT. iso-transl.el maps a key sequence to a single character only using (define-key map (apply 'vector '(?' ?у)) (vector ?у)) I don't know how to map a key sequence to 2 characters. When trying to map to 2 characters ?у and ?́ : (define-key map (apply 'vector '(?' ?у)) (vector ?у ?́ )) typing 'y inserts only the last character ?́ , not both ?у and ?́ . ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#43866: 26.3; italian postfix additions 2020-10-20 19:05 ` Juri Linkov @ 2020-10-21 8:11 ` Robert Pluim 2020-10-21 14:29 ` Eli Zaretskii 2020-10-21 17:30 ` Juri Linkov 0 siblings, 2 replies; 109+ messages in thread From: Robert Pluim @ 2020-10-21 8:11 UTC (permalink / raw) To: Juri Linkov; +Cc: 43866 >>>>> On Tue, 20 Oct 2020 22:05:31 +0300, Juri Linkov <juri@linkov.net> said: >> The implementation seems to rely on a file in the /usr/include tree >> that might not be there. This is a significant disadvantage, IMO. It >> means that, unlike all other similar facilities in Emacs, this one is >> not self-contained. >> >> Is it possible to lift this limitation? Juri> Yes, this is easy to do. But I have one problem: Juri> /usr/share/X11/locale/en_US.UTF-8/Compose contains 83 lines Juri> where a key sequence maps to 2 characters, not to 1 character, e.g. Juri> <Multi_key> <acute> <Cyrillic_u> : "у́" # CYRILLIC SMALL LETTER U WITH COMBINING ACUTE ACCENT Juri> where "у́" is 2 characters: CYRILLIC SMALL LETTER U and COMBINING ACUTE ACCENT. Juri> iso-transl.el maps a key sequence to a single character only using Juri> (define-key map (apply 'vector '(?' ?у)) (vector ?у)) Juri> I don't know how to map a key sequence to 2 characters. Juri> When trying to map to 2 characters ?у and ?́ : Juri> (define-key map (apply 'vector '(?' ?у)) (vector ?у ?́ )) Juri> typing 'y inserts only the last character ?́ , not both ?у and ?́ . Canʼt you pass a string containing ?y and ?́ as the last argument to define-key? (although you might want to use the ?\N{NAME} or ?\uXXXX syntax to stop Emacs combining that U+0301 with the question mark) Robert -- ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#43866: 26.3; italian postfix additions 2020-10-21 8:11 ` Robert Pluim @ 2020-10-21 14:29 ` Eli Zaretskii 2020-10-21 14:40 ` Robert Pluim 2020-10-21 17:30 ` Juri Linkov 1 sibling, 1 reply; 109+ messages in thread From: Eli Zaretskii @ 2020-10-21 14:29 UTC (permalink / raw) To: Robert Pluim; +Cc: 43866, juri > From: Robert Pluim <rpluim@gmail.com> > Cc: Eli Zaretskii <eliz@gnu.org>, 43866@debbugs.gnu.org > Date: Wed, 21 Oct 2020 10:11:55 +0200 > > Canʼt you pass a string containing ?y and ?́ as the last argument to > define-key? (although you might want to use the ?\N{NAME} or ?\uXXXX > syntax to stop Emacs combining that U+0301 with the question mark) The character composition happens only on display, the buffer or string still have two codepoints. ^ permalink raw reply [flat|nested] 109+ messages in thread
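The point that composition is purely a display matter can be checked from Lisp. The following is a minimal sketch, assuming the sequence under discussion is U+0443 CYRILLIC SMALL LETTER U followed by U+0301 COMBINING ACUTE ACCENT; the constant name `my-u-acute` is hypothetical and not taken from the thread.

```elisp
;; Hypothetical constant for illustration.  The string is written with
;; explicit \u escapes so the combining accent cannot visually attach
;; to a neighbouring character in the source file.
(defconst my-u-acute "\u0443\u0301"
  "CYRILLIC SMALL LETTER U followed by COMBINING ACUTE ACCENT.")

;; The display engine may draw a single glyph, but the string still
;; holds two separate characters.
(length my-u-acute)
(string-to-list my-u-acute)
```

Writing the string with escapes is one way to avoid the readability concern without resorting to the longer ?\N{NAME} form.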
* bug#43866: 26.3; italian postfix additions 2020-10-21 14:29 ` Eli Zaretskii @ 2020-10-21 14:40 ` Robert Pluim 2020-10-21 15:23 ` Eli Zaretskii 0 siblings, 1 reply; 109+ messages in thread From: Robert Pluim @ 2020-10-21 14:40 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 43866, juri >>>>> On Wed, 21 Oct 2020 17:29:58 +0300, Eli Zaretskii <eliz@gnu.org> said: >> From: Robert Pluim <rpluim@gmail.com> >> Cc: Eli Zaretskii <eliz@gnu.org>, 43866@debbugs.gnu.org >> Date: Wed, 21 Oct 2020 10:11:55 +0200 >> >> Canʼt you pass a string containing ?y and ?́ as the last argument to >> define-key? (although you might want to use the ?\N{NAME} or ?\uXXXX >> syntax to stop Emacs combining that U+0301 with the question mark) Eli> The character composition happens only on display, the buffer or Eli> string still have two codepoints. Yes. But when looking at the code it would look like a single glyph, which would be confusing. Robert -- ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#43866: 26.3; italian postfix additions 2020-10-21 14:40 ` Robert Pluim @ 2020-10-21 15:23 ` Eli Zaretskii 0 siblings, 0 replies; 109+ messages in thread From: Eli Zaretskii @ 2020-10-21 15:23 UTC (permalink / raw) To: Robert Pluim; +Cc: 43866, juri > From: Robert Pluim <rpluim@gmail.com> > Cc: juri@linkov.net, 43866@debbugs.gnu.org > Date: Wed, 21 Oct 2020 16:40:47 +0200 > > Eli> The character composition happens only on display, the buffer or > Eli> string still have two codepoints. > > Yes. But when looking at the code it would look like a single glyph, > which would be confusing. We could have a comment about that. IMO, using the ?\N{NAME} for that is gross. ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#43866: 26.3; italian postfix additions 2020-10-21 8:11 ` Robert Pluim 2020-10-21 14:29 ` Eli Zaretskii @ 2020-10-21 17:30 ` Juri Linkov 1 sibling, 0 replies; 109+ messages in thread From: Juri Linkov @ 2020-10-21 17:30 UTC (permalink / raw) To: Robert Pluim; +Cc: 43866 >> I don't know how to map a key sequence to 2 characters. >> When trying to map to 2 characters ?у and ?́ : > >> (define-key map (apply 'vector '(?' ?у)) (vector ?у ?́ )) > >> typing 'y inserts only the last character ?́ , not both ?у and ?́ . > > Canʼt you pass a string containing ?y and ?́ as the last argument to > define-key? (although you might want to use the ?\N{NAME} or ?\uXXXX > syntax to stop Emacs combining that U+0301 with the question mark) I tried to use a string as the last argument to define-key, and the result is weird: the Help buffer says that the key binding is actually a keyboard macro, and invoking it does strange things. For example, when a binding is a string "ö", then typing its keys calls the command 'upcase-word'. No idea why it works this way. ^ permalink raw reply [flat|nested] 109+ messages in thread
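One possible way around the string-binding surprise described above is to bind the key sequence to a small command that inserts the desired text. This is only a sketch under stated assumptions: the keymap `my-compose-map` and the command `my-insert-u-acute` are hypothetical names, and this is not presented as the approach the thread settled on.

```elisp
;; Hypothetical keymap and command names, for illustration only.
(defvar my-compose-map (make-sparse-keymap)
  "Sketch of a keymap holding compose-like key sequences.")

(defun my-insert-u-acute ()
  "Insert CYRILLIC SMALL LETTER U followed by COMBINING ACUTE ACCENT."
  (interactive)
  (insert "\u0443\u0301"))

;; Bind the two-key sequence (apostrophe, then Cyrillic u) to the
;; command rather than to a string, so it is not run as a keyboard macro.
(define-key my-compose-map (vector ?' ?\u0443) #'my-insert-u-acute)
```

Because the binding is an ordinary command, `describe-key` would report it as such, at the cost of one small function per multi-character result.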
* bug#43866: 26.3; italian postfix additions 2020-10-20 14:12 ` Eli Zaretskii 2020-10-20 14:47 ` Robert Pluim 2020-10-20 19:05 ` Juri Linkov @ 2020-10-20 19:56 ` Juri Linkov 2020-10-21 14:02 ` Eli Zaretskii 2 siblings, 1 reply; 109+ messages in thread From: Juri Linkov @ 2020-10-20 19:56 UTC (permalink / raw) To: Eli Zaretskii; +Cc: rpluim, 43866 > The implementation seems to rely on a file in the /usr/include tree > that might not be there. This is a significant disadvantage, IMO. It > means that, unlike all other similar facilities in Emacs, this one is > not self-contained. > > Is it possible to lift this limitation? I tried to generate an output with a list of characters, but can't find a print-related variable that would print a number as a character. For example, currently (prin1 ?⌘ (current-buffer)) => 8984 prints the number 8984, but I need to print the character, i.e. (prin1 ?⌘ (current-buffer)) => ?⌘ There is the variable 'float-output-format' that affects the output of floating-point numbers, e.g. (let ((float-output-format "%.2f")) (prin1 12.345 (current-buffer))) but I can't find a variable to print characters instead of integers. ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#43866: 26.3; italian postfix additions 2020-10-20 19:56 ` Juri Linkov @ 2020-10-21 14:02 ` Eli Zaretskii 2020-10-21 17:23 ` Juri Linkov 0 siblings, 1 reply; 109+ messages in thread From: Eli Zaretskii @ 2020-10-21 14:02 UTC (permalink / raw) To: Juri Linkov; +Cc: rpluim, 43866 > From: Juri Linkov <juri@linkov.net> > Cc: rpluim@gmail.com, 43866@debbugs.gnu.org > Date: Tue, 20 Oct 2020 22:56:07 +0300 > > I tried to generate an output with a list of characters, > but can't find a print-related variable that would > print a number as a character. > > For example, currently > > (prin1 ?⌘ (current-buffer)) => 8984 > > prints the number 8984, but I need to print the character, i.e. > > (prin1 ?⌘ (current-buffer)) => ?⌘ I don't think I understand what you are looking for. Would using the %c format in a call to 'format' be okay? If not, why not? ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#43866: 26.3; italian postfix additions 2020-10-21 14:02 ` Eli Zaretskii @ 2020-10-21 17:23 ` Juri Linkov 2020-10-21 18:16 ` Eli Zaretskii 0 siblings, 1 reply; 109+ messages in thread From: Juri Linkov @ 2020-10-21 17:23 UTC (permalink / raw) To: Eli Zaretskii; +Cc: rpluim, 43866 >> I tried to generate an output with a list of characters, >> but can't find a print-related variable that would >> print a number as a character. >> >> For example, currently >> >> (prin1 ?⌘ (current-buffer)) => 8984 >> >> prints the number 8984, but I need to print the character, i.e. >> >> (prin1 ?⌘ (current-buffer)) => ?⌘ > > I don't think I understand what you are looking for. Would using the > %c format in a call to 'format' be okay? If not, why not? The problem is that it's necessary to print a long list with vectors that contain characters. For example: (prin1 '(("'A" . [?Á]) ("'E" . [?É]) ("'I" . [?Í]) ("'O" . [?Ó]) ("'U" . [?Ú]) ("'Y" . [?Ý])) (current-buffer)) currently prints: (("'A" . [193]) ("'E" . [201]) ("'I" . [205]) ("'O" . [211]) ("'U" . [218]) ("'Y" . [221])) whereas it would be nicer to print characters as characters, not as integers: (("'A" . [?Á]) ("'E" . [?É]) ("'I" . [?Í]) ("'O" . [?Ó]) ("'U" . [?Ú]) ("'Y" . [?Ý])) I can't find a variable that could change the output format of integers to print them as characters. ^ permalink raw reply [flat|nested] 109+ messages in thread
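For comparison, the %c suggestion can be applied to this particular data shape without touching the printer. The sketch below assumes every entry is a (STRING . [CHAR ...]) pair as in the example; `my-format-char-alist` is a hypothetical helper, it does not escape characters that are special to the reader, and it is not a general replacement for `pp`.

```elisp
(defun my-format-char-alist (alist)
  "Return ALIST as text, writing each vector element in ?C syntax.
ALIST is assumed to hold (STRING . [CHAR ...]) entries."
  (concat "("
          (mapconcat
           (lambda (entry)
             ;; %S prints the key as a quoted string; %c prints each
             ;; integer in the vector as the character it denotes.
             (format "(%S . [%s])"
                     (car entry)
                     (mapconcat (lambda (c) (format "?%c" c))
                                (cdr entry) " ")))
           alist "\n ")
          ")"))

(my-format-char-alist '(("'A" . [?Á]) ("'E" . [?É])))
```

The limitation is visible in the code: the formatter has to know the structure of the data in advance, which is what a printer-level option would avoid.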
* bug#43866: 26.3; italian postfix additions 2020-10-21 17:23 ` Juri Linkov @ 2020-10-21 18:16 ` Eli Zaretskii 2020-10-21 18:27 ` Juri Linkov 0 siblings, 1 reply; 109+ messages in thread From: Eli Zaretskii @ 2020-10-21 18:16 UTC (permalink / raw) To: Juri Linkov; +Cc: rpluim, 43866 > From: Juri Linkov <juri@linkov.net> > Cc: rpluim@gmail.com, 43866@debbugs.gnu.org > Date: Wed, 21 Oct 2020 20:23:51 +0300 > > > I don't think I understand what you are looking for. Would using the > > %c format in a call to 'format' be okay? If not, why not? > > The problem is that it's necessary to print a long list with vectors > that contain characters. For example: > > (prin1 '(("'A" . [?Á]) > ("'E" . [?É]) > ("'I" . [?Í]) > ("'O" . [?Ó]) > ("'U" . [?Ú]) > ("'Y" . [?Ý])) > (current-buffer)) Why do you have to use prin1? ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#43866: 26.3; italian postfix additions 2020-10-21 18:16 ` Eli Zaretskii @ 2020-10-21 18:27 ` Juri Linkov 2020-10-21 18:35 ` Eli Zaretskii 0 siblings, 1 reply; 109+ messages in thread From: Juri Linkov @ 2020-10-21 18:27 UTC (permalink / raw) To: Eli Zaretskii; +Cc: rpluim, 43866 >> The problem is that it's necessary to print a long list with vectors >> that contain characters. For example: >> >> (prin1 '(("'A" . [?Á]) >> ("'E" . [?É]) >> ("'I" . [?Í]) >> ("'O" . [?Ó]) >> ("'U" . [?Ú]) >> ("'Y" . [?Ý])) >> (current-buffer)) > > Why do you have to use prin1? Actually I need to use pp-to-string to pretty-print the list, but pp-to-string calls '(prin1 object (current-buffer))'. ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#43866: 26.3; italian postfix additions 2020-10-21 18:27 ` Juri Linkov @ 2020-10-21 18:35 ` Eli Zaretskii 2020-10-21 19:39 ` Juri Linkov 0 siblings, 1 reply; 109+ messages in thread From: Eli Zaretskii @ 2020-10-21 18:35 UTC (permalink / raw) To: Juri Linkov; +Cc: rpluim, 43866 > From: Juri Linkov <juri@linkov.net> > Cc: rpluim@gmail.com, 43866@debbugs.gnu.org > Date: Wed, 21 Oct 2020 21:27:16 +0300 > > >> (prin1 '(("'A" . [?Á]) > >> ("'E" . [?É]) > >> ("'I" . [?Í]) > >> ("'O" . [?Ó]) > >> ("'U" . [?Ú]) > >> ("'Y" . [?Ý])) > >> (current-buffer)) > > > > Why do you have to use prin1? > > Actually I need to use pp-to-string to pretty-print the list, > but pp-to-string calls '(prin1 object (current-buffer))'. prin1 accepts a function as its 2nd argument; can you use that? ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#43866: 26.3; italian postfix additions 2020-10-21 18:35 ` Eli Zaretskii @ 2020-10-21 19:39 ` Juri Linkov 2020-10-22 12:59 ` Eli Zaretskii 0 siblings, 1 reply; 109+ messages in thread From: Juri Linkov @ 2020-10-21 19:39 UTC (permalink / raw) To: Eli Zaretskii; +Cc: rpluim, 43866 [-- Attachment #1: Type: text/plain, Size: 1084 bytes --] >> >> (prin1 '(("'A" . [?Á]) >> >> ("'E" . [?É]) >> >> ("'I" . [?Í]) >> >> ("'O" . [?Ó]) >> >> ("'U" . [?Ú]) >> >> ("'Y" . [?Ý])) >> >> (current-buffer)) >> > >> > Why do you have to use prin1? >> >> Actually I need to use pp-to-string to pretty-print the list, >> but pp-to-string calls '(prin1 object (current-buffer))'. > > prin1 accepts a function as its 2nd argument; can you use that? I tried to use a function in the 2nd argument, but it's called for every digit of the integer that represents a character, so I don't know what to do with these digits. However, do you think something like the following is a good idea? Let-binding a new variable 'print-integers-as-chars' to t: (let ((print-integers-as-chars t)) (pp '(("'A" . [?Á]) ("'E" . [?É]) ("'I" . [?Í]) ("'O" . [?Ó]) ("'U" . [?Ú]) ("'Y" . [?Ý])) (current-buffer))) prints integers as characters: (("'A" . [?Á]) ("'E" . [?É]) ("'I" . [?Í]) ("'O" . [?Ó]) ("'U" . [?Ú]) ("'Y" . 
[?Ý])) with this patch: [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: print-integers-as-chars.patch --] [-- Type: text/x-diff, Size: 1244 bytes --] diff --git a/src/print.c b/src/print.c index dca095f281..1755eea738 100644 --- a/src/print.c +++ b/src/print.c @@ -1908,8 +1908,16 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag) { case_Lisp_Int: { - int len = sprintf (buf, "%"pI"d", XFIXNUM (obj)); - strout (buf, len, len, printcharfun); + if (!NILP (Vprint_integers_as_chars) && CHARACTERP (obj)) + { + int len = sprintf (buf, "%s", SDATA (call1 (intern ("prin1-char"), obj))); + strout (buf, len, len, printcharfun); + } + else + { + int len = sprintf (buf, "%"pI"d", XFIXNUM (obj)); + strout (buf, len, len, printcharfun); + } } break; @@ -2247,6 +2255,10 @@ syms_of_print (void) that represents the number without losing information. */); Vfloat_output_format = Qnil; + DEFVAR_LISP ("print-integers-as-chars", Vprint_integers_as_chars, + doc: /* Print integers as characters. */); + Vprint_integers_as_chars = Qnil; + DEFVAR_LISP ("print-length", Vprint_length, doc: /* Maximum length of list to print before abbreviating. A value of nil means no limit. See also `eval-expression-print-length'. */); ^ permalink raw reply related [flat|nested] 109+ messages in thread
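The observation that a function passed as the second argument of prin1 is called once per digit can be reproduced with a few lines of Lisp. This is a minimal sketch; `my-collect-print-output` is a hypothetical helper name.

```elisp
(defun my-collect-print-output (object)
  "Return the list of characters `prin1' emits when printing OBJECT."
  (let (chars)
    ;; The function is called with one output character at a time.
    (prin1 object (lambda (c) (push c chars)))
    (nreverse chars)))

;; ?Á is the integer 193, so the function receives the digits
;; 1, 9 and 3 rather than the character itself.
(my-collect-print-output ?Á)
```

By the time the function runs, the integer has already been converted to decimal text, so it cannot tell a character code from any other number.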
* bug#43866: 26.3; italian postfix additions 2020-10-21 19:39 ` Juri Linkov @ 2020-10-22 12:59 ` Eli Zaretskii 2020-10-22 20:56 ` bug#44155: Print integers as characters Juri Linkov 2022-04-30 12:19 ` bug#43866: 26.3; italian postfix additions Lars Ingebrigtsen 0 siblings, 2 replies; 109+ messages in thread From: Eli Zaretskii @ 2020-10-22 12:59 UTC (permalink / raw) To: Juri Linkov; +Cc: rpluim, 43866 > From: Juri Linkov <juri@linkov.net> > Cc: rpluim@gmail.com, 43866@debbugs.gnu.org > Date: Wed, 21 Oct 2020 22:39:08 +0300 > > However, do you think something like the following is a good idea? > > Let-binding a new variable 'print-integers-as-chars' to t: > > (let ((print-integers-as-chars t)) > (pp '(("'A" . [?Á]) > ("'E" . [?É]) > ("'I" . [?Í]) > ("'O" . [?Ó]) > ("'U" . [?Ú]) > ("'Y" . [?Ý])) > (current-buffer))) > > prints integers as characters: > > (("'A" . [?Á]) > ("'E" . [?É]) > ("'I" . [?Í]) > ("'O" . [?Ó]) > ("'U" . [?Ú]) > ("'Y" . [?Ý])) > > with this patch: The idea is fine, but I have a few comments about implementation: > case_Lisp_Int: > { > - int len = sprintf (buf, "%"pI"d", XFIXNUM (obj)); > - strout (buf, len, len, printcharfun); > + if (!NILP (Vprint_integers_as_chars) && CHARACTERP (obj)) ^^^^^^^^^^^^^^^^^^^^^^^^ If this is supposed to be a boolean variable, please use DEFVAR_BOOL, with all the consequences. > + int len = sprintf (buf, "%s", SDATA (call1 (intern ("prin1-char"), obj))); Do we really need to call Lisp? I thought we were quite capable of printing characters from C, aren't we? > @@ -2247,6 +2255,10 @@ syms_of_print (void) > that represents the number without losing information. */); > Vfloat_output_format = Qnil; > > + DEFVAR_LISP ("print-integers-as-chars", Vprint_integers_as_chars, > + doc: /* Print integers as characters. 
*/); > + Vprint_integers_as_chars = Qnil; I wonder whether it wouldn't be cleaner to add another optional argument to prin1, and let it bind some internal variable so that print_object does this, instead of exposing this knob to Lisp. Because print_object is used all over the place, and who knows what this will do to other callers? Thanks. ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#44155: Print integers as characters 2020-10-22 12:59 ` Eli Zaretskii @ 2020-10-22 20:56 ` Juri Linkov 2020-10-22 22:39 ` Andreas Schwab 2020-11-01 12:03 ` Mattias Engdegård 2022-04-30 12:19 ` bug#43866: 26.3; italian postfix additions Lars Ingebrigtsen 1 sibling, 2 replies; 109+ messages in thread From: Juri Linkov @ 2020-10-22 20:56 UTC (permalink / raw) To: 44155 [-- Attachment #1: Type: text/plain, Size: 2932 bytes --] Tags: patch [Creating a separate feature request from bug#43866] >> Let-binding a new variable 'print-integers-as-chars' to t: >> >> (let ((print-integers-as-chars t)) >> (pp '(("'A" . [?Á]) >> ("'E" . [?É]) >> ("'I" . [?Í]) >> ("'O" . [?Ó]) >> ("'U" . [?Ú]) >> ("'Y" . [?Ý])) >> (current-buffer))) >> >> prints integers as characters: >> >> (("'A" . [?Á]) >> ("'E" . [?É]) >> ("'I" . [?Í]) >> ("'O" . [?Ó]) >> ("'U" . [?Ú]) >> ("'Y" . [?Ý])) >> >> with this patch: > > The idea is fine, but I have a few comments about implementation: > >> case_Lisp_Int: >> { >> - int len = sprintf (buf, "%"pI"d", XFIXNUM (obj)); >> - strout (buf, len, len, printcharfun); >> + if (!NILP (Vprint_integers_as_chars) && CHARACTERP (obj)) > ^^^^^^^^^^^^^^^^^^^^^^^^ > If this is supposed to be a boolean variable, please use DEFVAR_BOOL, > with all the consequences. Fixed in the next patch. >> + int len = sprintf (buf, "%s", SDATA (call1 (intern ("prin1-char"), obj))); > > Do we really need to call Lisp? I thought we were quite capable of > printing characters from C, aren't we? Thanks for the hint. Now the patch uses only C functions. (My initial idea was to use eval-expression-print-format as a base that has (let ((char-string (and (characterp value) (<= value eval-expression-print-maximum-character) (char-displayable-p value) (prin1-char value)))) but it seems only the condition 'characterp' is needed in C implementation.) >> @@ -2247,6 +2255,10 @@ syms_of_print (void) >> that represents the number without losing information. 
*/); >> Vfloat_output_format = Qnil; >> >> + DEFVAR_LISP ("print-integers-as-chars", Vprint_integers_as_chars, >> + doc: /* Print integers as characters. */); >> + Vprint_integers_as_chars = Qnil; > > I wonder whether it wouldn't be cleaner to add another optional > argument to prin1, and let it bind some internal variable so that > print_object does this, instead of exposing this knob to Lisp. > Because print_object is used all over the place, and who knows what > will this do to other callers? The variable 'print-integers-as-chars' is modeled after many similar variables that affect the prin1 output: - print-escape-control-characters - print-escape-newlines - print-escape-nonascii - print-escape-multibyte - print-length - print-level - print-quoted - print-circle - float-output-format But now this leads me to think that maybe the new variable should be like 'float-output-format', so it could be named 'integer-output-format' and support options for different integer formats: - 'character': print integers as characters; - 'decimal': the default format; - 'binary': print integers as e.g. #b010101; - 'octal': print integers as e.g. #o777; - 'hex': print integers as e.g. 
#x00ff; [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: print-integers-as-characters.patch --] [-- Type: text/x-diff, Size: 1219 bytes --] diff --git a/src/print.c b/src/print.c index dca095f281..909c55efed 100644 --- a/src/print.c +++ b/src/print.c @@ -1908,8 +1908,16 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag) { case_Lisp_Int: { - int len = sprintf (buf, "%"pI"d", XFIXNUM (obj)); - strout (buf, len, len, printcharfun); + if (print_integers_as_characters && CHARACTERP (obj)) + { + printchar ('?', printcharfun); + print_string (CALLN (Fstring, obj), printcharfun); + } + else + { + int len = sprintf (buf, "%"pI"d", XFIXNUM (obj)); + strout (buf, len, len, printcharfun); + } } break; @@ -2247,6 +2255,10 @@ syms_of_print (void) that represents the number without losing information. */); Vfloat_output_format = Qnil; + DEFVAR_BOOL ("print-integers-as-characters", print_integers_as_characters, + doc: /* Print integers as characters. */); + print_integers_as_characters = 0; + DEFVAR_LISP ("print-length", Vprint_length, doc: /* Maximum length of list to print before abbreviating. A value of nil means no limit. See also `eval-expression-print-length'. */); ^ permalink raw reply related [flat|nested] 109+ messages in thread
* bug#44155: Print integers as characters 2020-10-22 20:56 ` bug#44155: Print integers as characters Juri Linkov @ 2020-10-22 22:39 ` Andreas Schwab 2020-10-23 8:16 ` Juri Linkov 2020-10-23 8:32 ` Juri Linkov 2020-11-01 12:03 ` Mattias Engdegård 1 sibling, 2 replies; 109+ messages in thread From: Andreas Schwab @ 2020-10-22 22:39 UTC (permalink / raw) To: Juri Linkov; +Cc: 44155 On Okt 22 2020, Juri Linkov wrote: > diff --git a/src/print.c b/src/print.c > index dca095f281..909c55efed 100644 > --- a/src/print.c > +++ b/src/print.c > @@ -1908,8 +1908,16 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag) > { > case_Lisp_Int: > { > - int len = sprintf (buf, "%"pI"d", XFIXNUM (obj)); > - strout (buf, len, len, printcharfun); > + if (print_integers_as_characters && CHARACTERP (obj)) > + { > + printchar ('?', printcharfun); > + print_string (CALLN (Fstring, obj), printcharfun); That will create ambiguous output. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1 "And now for something completely different." ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#44155: Print integers as characters 2020-10-22 22:39 ` Andreas Schwab @ 2020-10-23 8:16 ` Juri Linkov 2020-10-23 8:32 ` Juri Linkov 1 sibling, 0 replies; 109+ messages in thread From: Juri Linkov @ 2020-10-23 8:16 UTC (permalink / raw) To: Andreas Schwab; +Cc: 44155 >> + if (print_integers_as_characters && CHARACTERP (obj)) >> + { >> + printchar ('?', printcharfun); >> + print_string (CALLN (Fstring, obj), printcharfun); > > That will create ambiguous output. No ambiguities found: (let ((strings (make-hash-table :test 'equal))) (dotimes (i (max-char)) (let ((s (string i))) (if (gethash s strings) (message "! %S %S" s (gethash s strings)) (puthash s i strings))))) ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#44155: Print integers as characters 2020-10-22 22:39 ` Andreas Schwab 2020-10-23 8:16 ` Juri Linkov @ 2020-10-23 8:32 ` Juri Linkov 2020-10-24 19:53 ` Juri Linkov 1 sibling, 1 reply; 109+ messages in thread From: Juri Linkov @ 2020-10-23 8:32 UTC (permalink / raw) To: Andreas Schwab; +Cc: 44155 >> + if (print_integers_as_characters && CHARACTERP (obj)) >> + { >> + printchar ('?', printcharfun); >> + print_string (CALLN (Fstring, obj), printcharfun); > > That will create ambiguous output. Or do you mean: (dotimes (i (max-char)) (condition-case err (unless (eq i (read (concat "?" (string i)))) (message "%d ?%s" i (string i))) (error (message "%d ?%s ;; %s" i (string i) (error-message-string err))))) 92 ?\ ;; End of file during parsing 4194176 ?\200 ... 4194302 ?\376 ^ permalink raw reply [flat|nested] 109+ messages in thread
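The failure on character 92 above comes from the backslash being the reader's escape character, so a bare "?" followed by "\" is an incomplete form. The sketch below shows one way to produce a form that reads back to the same integer. The escape list `my-char-escapes` and the helper `my-char-read-syntax` are assumptions for illustration, and the raw-byte range from 4194176 upwards is deliberately not handled.

```elisp
;; Assumed list of characters to prefix with a backslash; it is an
;; illustration, not a complete or authoritative list.
(defconst my-char-escapes '(?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\)
  "Characters that `my-char-read-syntax' escapes with a backslash.")

(defun my-char-read-syntax (c)
  "Return a ?C form for character C, escaping `my-char-escapes' members."
  (concat "?" (if (memq c my-char-escapes) "\\" "") (string c)))

;; With the escape in place, the printed form reads back as the
;; original character code instead of signalling end of file.
(read (my-char-read-syntax ?\\))
```

Only the backslash strictly needs the escape for `read` to succeed in this test; escaping the other characters mainly keeps the output friendly to editing commands that scan for delimiters.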
* bug#44155: Print integers as characters 2020-10-23 8:32 ` Juri Linkov @ 2020-10-24 19:53 ` Juri Linkov 2020-10-25 17:22 ` Eli Zaretskii 0 siblings, 1 reply; 109+ messages in thread From: Juri Linkov @ 2020-10-24 19:53 UTC (permalink / raw) To: Andreas Schwab; +Cc: 44155 [-- Attachment #1: Type: text/plain, Size: 1457 bytes --] >>> + if (print_integers_as_characters && CHARACTERP (obj)) >>> + { >>> + printchar ('?', printcharfun); >>> + print_string (CALLN (Fstring, obj), printcharfun); >> >> That will create ambiguous output. > > Or do you mean: > > (dotimes (i (max-char)) > (condition-case err > (unless (eq i (read (concat "?" (string i)))) > (message "%d ?%s" i (string i))) > (error (message "%d ?%s ;; %s" i (string i) (error-message-string err))))) > > 92 ?\ ;; End of file during parsing > 4194176 ?\200 > ... > 4194302 ?\376 With the following patch, this code (let ((integer-output-format t)) (pp '(?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ 4194176) (current-buffer))) outputs (?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ 4194176) and no ambiguities are found with (let ((integer-output-format t)) (dotimes (i (+ (max-char) 2)) (condition-case err (unless (eq i (read (format "%S" i))) (message "%d ?%s" i (string i))) (error (message "%d ?%s ;; %s" i (string i) (error-message-string err)))))) The list of escaped characters was taken from 'prin1-char', not from a similar list in 'print_object' in the 'case Lisp_Symbol' branch. Also, 'integer-output-format' prints integers in hex format when set to 16.
(let ((integer-output-format 16)) (pp '(?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ 4194176) (current-buffer))) => (#x3b #x28 #x29 #x7b #x7d #x5b #x5d #x22 #x27 #x5c #x3fff80) [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: integer-output-format.patch --] [-- Type: text/x-diff, Size: 1897 bytes --] diff --git a/src/print.c b/src/print.c index 53aa353769..53c8c4c91a 100644 --- a/src/print.c +++ b/src/print.c @@ -1908,8 +1908,29 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag) { case_Lisp_Int: { - int len = sprintf (buf, "%"pI"d", XFIXNUM (obj)); - strout (buf, len, len, printcharfun); + EMACS_INT c = XFIXNUM (obj); + + if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj) && c < 4194176) + { + printchar ('?', printcharfun); + + if (escapeflag + && (c == ';' || c == '(' || c == ')' || c == '{' || c == '}' + || c == '[' || c == ']' || c == '\"' || c == '\'' || c == '\\')) + printchar ('\\', printcharfun); + print_string (Fchar_to_string (obj), printcharfun); + } + else if (INTEGERP (Vinteger_output_format) + && XFIXNUM (Vinteger_output_format) == 16 && c >= 0) + { + int len = sprintf (buf, "#x%"pI"x", (EMACS_UINT) c); + strout (buf, len, len, printcharfun); + } + else + { + int len = sprintf (buf, "%"pI"d", c); + strout (buf, len, len, printcharfun); + } } break; @@ -2247,6 +2268,13 @@ syms_of_print (void) that represents the number without losing information. */); Vfloat_output_format = Qnil; + DEFVAR_LISP ("integer-output-format", Vinteger_output_format, + doc: /* The format used to print integers. +When 't', print integers as characters. +When a number 16, print numbers in hex format. +Otherwise, print integers in decimal format. */); + Vinteger_output_format = Qnil; + DEFVAR_LISP ("print-length", Vprint_length, doc: /* Maximum length of list to print before abbreviating. A value of nil means no limit. See also `eval-expression-print-length'. 
*/); ^ permalink raw reply related [flat|nested] 109+ messages in thread
* bug#44155: Print integers as characters 2020-10-24 19:53 ` Juri Linkov @ 2020-10-25 17:22 ` Eli Zaretskii 2020-10-25 19:09 ` Juri Linkov 0 siblings, 1 reply; 109+ messages in thread From: Eli Zaretskii @ 2020-10-25 17:22 UTC (permalink / raw) To: Juri Linkov; +Cc: 44155, schwab > From: Juri Linkov <juri@linkov.net> > Date: Sat, 24 Oct 2020 22:53:44 +0300 > Cc: 44155@debbugs.gnu.org > > + EMACS_INT c = XFIXNUM (obj); There's no need to use EMACS_INT, a character code is at most 22 bits, so it always fits into an 'int'. > + if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj) && c < 4194176) ^^^^^^^ Please use MAX_5_BYTE_CHAR here. Or, better yet, CHAR_BYTE8_P. And, btw, why not allow raw bytes here as well? is there some problem? > + { > + printchar ('?', printcharfun); > + > + if (escapeflag > + && (c == ';' || c == '(' || c == ')' || c == '{' || c == '}' > + || c == '[' || c == ']' || c == '\"' || c == '\'' || c == '\\')) > + printchar ('\\', printcharfun); > + print_string (Fchar_to_string (obj), printcharfun); Why are you using print_string here instead of printchar? IOW, what is the difference between printing a backslash and printing any other character, that you can use printchar for the former, but not for the latter? > + else if (INTEGERP (Vinteger_output_format) > + && XFIXNUM (Vinteger_output_format) == 16 && c >= 0) If you really want to allow Vinteger_output_format to be a bignum, you cannot use XFIXNUM with it, you need to use integer_to_intmax or somesuch. Otherwise, you should use FIXNUMP instead of INTEGERP. > + DEFVAR_LISP ("integer-output-format", Vinteger_output_format, > + doc: /* The format used to print integers. > +When 't', print integers as characters. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ But only integers that are small enough, yes? > +When a number 16, print numbers in hex format. This immediately begs the question: why cannot the value be 8 or 2? Thanks. P.S. This will eventually need a NEWS entry. 
^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#44155: Print integers as characters 2020-10-25 17:22 ` Eli Zaretskii @ 2020-10-25 19:09 ` Juri Linkov 2020-10-25 19:53 ` Eli Zaretskii 0 siblings, 1 reply; 109+ messages in thread From: Juri Linkov @ 2020-10-25 19:09 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 44155, schwab [-- Attachment #1: Type: text/plain, Size: 1914 bytes --] >> + if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj) && c < 4194176) > ^^^^^^^ > > Please use MAX_5_BYTE_CHAR here. Or, better yet, CHAR_BYTE8_P. Thanks, fixed. > And, btw, why not allow raw bytes here as well? is there some problem? Because of ambiguity, both of these return the same value: (read (concat "?" (string 128))) => 128 (read (concat "?" (string 4194176))) => 128 >> + print_string (Fchar_to_string (obj), printcharfun); > > Why are you using print_string here instead of printchar? IOW, what > is the difference between printing a backslash and printing any other > character, that you can use printchar for the former, but not for the > latter? It was needed in earlier versions, but not now; fixed. >> + else if (INTEGERP (Vinteger_output_format) >> + && XFIXNUM (Vinteger_output_format) == 16 && c >= 0) > > If you really want to allow Vinteger_output_format to be a bignum, you > cannot use XFIXNUM with it, you need to use integer_to_intmax or > somesuch. Otherwise, you should use FIXNUMP instead of INTEGERP. Fixed. >> + DEFVAR_LISP ("integer-output-format", Vinteger_output_format, >> + doc: /* The format used to print integers. >> +When 't', print integers as characters. > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > But only integers that are small enough, yes? Fixed the docstring as well. >> +When a number 16, print numbers in hex format. > > This immediately begs the question: why cannot the value be 8 or 2? Because octal and binary are not as widely used as hex. But the variable leaves room for further improvements to later support octal and binary too, and maybe string formats like in float-output-format. > P.S. 
This will eventually need a NEWS entry. And also updates in the Info manual will be in the final version of the patch. [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: integer-output-format-2.patch --] [-- Type: text/x-diff, Size: 2008 bytes --] diff --git a/src/print.c b/src/print.c index 53aa353769..b04d5023f8 100644 --- a/src/print.c +++ b/src/print.c @@ -1908,8 +1908,30 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag) { case_Lisp_Int: { - int len = sprintf (buf, "%"pI"d", XFIXNUM (obj)); - strout (buf, len, len, printcharfun); + int c = XFIXNUM (obj); + intmax_t i; + + if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj) && ! CHAR_BYTE8_P (c)) + { + printchar ('?', printcharfun); + if (escapeflag + && (c == ';' || c == '(' || c == ')' || c == '{' || c == '}' + || c == '[' || c == ']' || c == '\"' || c == '\'' || c == '\\')) + printchar ('\\', printcharfun); + printchar (c, printcharfun); + } + else if (INTEGERP (Vinteger_output_format) + && integer_to_intmax (Vinteger_output_format, &i) + && i == 16 && XFIXNUM (obj) >= 0) + { + int len = sprintf (buf, "#x%"pI"x", (EMACS_UINT) XFIXNUM (obj)); + strout (buf, len, len, printcharfun); + } + else + { + int len = sprintf (buf, "%"pI"d", XFIXNUM (obj)); + strout (buf, len, len, printcharfun); + } } break; @@ -2247,6 +2269,13 @@ syms_of_print (void) that represents the number without losing information. */); Vfloat_output_format = Qnil; + DEFVAR_LISP ("integer-output-format", Vinteger_output_format, + doc: /* The format used to print integers. +When 't', print characters from integers that represent characters. +When a number 16, print non-negative numbers in hex format. +Otherwise, print integers in decimal format. */); + Vinteger_output_format = Qnil; + DEFVAR_LISP ("print-length", Vprint_length, doc: /* Maximum length of list to print before abbreviating. A value of nil means no limit. See also `eval-expression-print-length'. 
*/); ^ permalink raw reply related [flat|nested] 109+ messages in thread
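The three-way dispatch in the patch above (character syntax when the variable is t, hexadecimal when it is 16, decimal otherwise) can be sketched outside Emacs as a small standalone C function. This is only an illustration under stated assumptions: format_integer, needs_escape and the int_format enum are hypothetical names, not Emacs internals, and the character branch is limited to graphic ASCII to keep the sketch short, whereas the patch hands any character to printchar.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the values of integer-output-format:
   nil (decimal), t (characters), 16 (hexadecimal).  */
enum int_format { FMT_DECIMAL, FMT_CHAR, FMT_HEX };

/* True for the characters the patch prefixes with a backslash.  */
static bool
needs_escape (long c)
{
  return c > 0 && c < 128 && strchr (";(){}[]\"'\\", (int) c) != NULL;
}

/* Write the printed form of N into BUF and return its length.
   Assumption: only graphic ASCII takes the character branch here.  */
static int
format_integer (long n, enum int_format fmt, bool escape,
                char *buf, size_t size)
{
  assert (buf != NULL && size > 0);
  if (fmt == FMT_CHAR && n > 0x20 && n < 0x7f)
    return snprintf (buf, size, "?%s%c",
                     escape && needs_escape (n) ? "\\" : "", (int) n);
  if (fmt == FMT_HEX && n >= 0)
    return snprintf (buf, size, "#x%lx", (unsigned long) n);
  /* Default, and fallback for values the other branches reject.  */
  return snprintf (buf, size, "%ld", n);
}
```

Under these assumptions, format_integer (';', FMT_CHAR, true, ...) yields ?\; and format_integer (4194176, FMT_HEX, true, ...) yields #x3fff80, the same shapes as the outputs quoted earlier in the thread, while a negative number in hex mode falls back to decimal.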
* bug#44155: Print integers as characters 2020-10-25 19:09 ` Juri Linkov @ 2020-10-25 19:53 ` Eli Zaretskii 2020-10-27 20:08 ` Juri Linkov 0 siblings, 1 reply; 109+ messages in thread From: Eli Zaretskii @ 2020-10-25 19:53 UTC (permalink / raw) To: Juri Linkov; +Cc: 44155, schwab > From: Juri Linkov <juri@linkov.net> > Cc: schwab@linux-m68k.org, 44155@debbugs.gnu.org > Date: Sun, 25 Oct 2020 21:09:07 +0200 > > > And, btw, why not allow raw bytes here as well? is there some problem? > > Because of ambiguity, both these return the same value: > > (read (concat "?" (string 128))) => 128 > (read (concat "?" (string 4194176))) => 128 And why is that a problem? Alternatively, we could print raw bytes in some special way. But not treating them as characters sounds like a subtlety that will be hard to explain. ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#44155: Print integers as characters 2020-10-25 19:53 ` Eli Zaretskii @ 2020-10-27 20:08 ` Juri Linkov 2020-10-28 15:51 ` Eli Zaretskii 0 siblings, 1 reply; 109+ messages in thread From: Juri Linkov @ 2020-10-27 20:08 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 44155, schwab [-- Attachment #1: Type: text/plain, Size: 728 bytes --] >> > And, btw, why not allow raw bytes here as well? is there some problem? >> >> Because of ambiguity, both these return the same value: >> >> (read (concat "?" (string 128))) => 128 >> (read (concat "?" (string 4194176))) => 128 > > And why is that a problem? I don't know, Andreas remarked that it creates ambiguous output, and I fixed the reported problem. > Alternatively, we could print raw bytes in some special way. But not > treating them as characters sounds some subtlety that will be hard to > explain. The existing 'prin1-char' used as a reference implementation doesn't print integers like 4194176 as characters, so the patch does the same. Anyway, here is a complete patch with tests and documentation: [-- Attachment #2: integer-output-format-3.patch --] [-- Type: text/x-diff, Size: 4855 bytes --] diff --git a/doc/lispref/streams.texi b/doc/lispref/streams.texi index 2cd61ad04f..f171f13779 100644 --- a/doc/lispref/streams.texi +++ b/doc/lispref/streams.texi @@ -902,3 +902,11 @@ Output Variables in the C function @code{sprintf}. For further restrictions on what you can use, see the variable's documentation string. @end defvar + +@defvar integer-output-format +This variable specifies how to print integer numbers. The default is +@code{nil}, meaning use the decimal format. When bound to @code{t}, +print integers as characters when an integer represents a character +(@pxref{Basic Char Syntax}). When bound to the number @code{16}, +print non-negative integers in the hexadecimal format. 
+@end defvar diff --git a/etc/NEWS b/etc/NEWS index a77c1c883e..2f7d08ad08 100644 --- a/etc/NEWS +++ b/etc/NEWS @@ -1631,6 +1631,12 @@ ledit.el, lmenu.el, lucid.el and old-whitespace.el. \f * Lisp Changes in Emacs 28.1 +** New variable 'integer-output-format' defines the format of integers. +When this variable is bound to the value 't', integers are printed by +printing functions as characters when an integer represents a character. +When bound to the number 16, non-negative integers are printed in the +hexadecimal format. + +++ ** 'define-globalized-minor-mode' now takes a :predicate parameter. This can be used to control which major modes the minor mode should be diff --git a/src/print.c b/src/print.c index 53aa353769..a5c56c6b48 100644 --- a/src/print.c +++ b/src/print.c @@ -1908,8 +1908,31 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag) { case_Lisp_Int: { - int len = sprintf (buf, "%"pI"d", XFIXNUM (obj)); - strout (buf, len, len, printcharfun); + int c; + intmax_t i; + + if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj) + && (c = XFIXNUM (obj)) && ! CHAR_BYTE8_P (c)) + { + printchar ('?', printcharfun); + if (escapeflag + && (c == ';' || c == '(' || c == ')' || c == '{' || c == '}' + || c == '[' || c == ']' || c == '\"' || c == '\'' || c == '\\')) + printchar ('\\', printcharfun); + printchar (c, printcharfun); + } + else if (INTEGERP (Vinteger_output_format) + && integer_to_intmax (Vinteger_output_format, &i) + && i == 16 && Fnatnump (obj)) + { + int len = sprintf (buf, "#x%"pI"x", (EMACS_UINT) XFIXNUM (obj)); + strout (buf, len, len, printcharfun); + } + else + { + int len = sprintf (buf, "%"pI"d", XFIXNUM (obj)); + strout (buf, len, len, printcharfun); + } } break; @@ -2247,6 +2270,13 @@ syms_of_print (void) that represents the number without losing information. */); Vfloat_output_format = Qnil; + DEFVAR_LISP ("integer-output-format", Vinteger_output_format, + doc: /* The format used to print integers. 
+When t, print characters from integers that represent a character. +When a number 16, print non-negative integers in the hexadecimal format. +Otherwise, by default print integers in the decimal format. */); + Vinteger_output_format = Qnil; + DEFVAR_LISP ("print-length", Vprint_length, doc: /* Maximum length of list to print before abbreviating. A value of nil means no limit. See also `eval-expression-print-length'. */); diff --git a/test/src/print-tests.el b/test/src/print-tests.el index eb9572dbdf..7b026b6b21 100644 --- a/test/src/print-tests.el +++ b/test/src/print-tests.el @@ -383,5 +383,25 @@ print-hash-table-test (let ((print-length 1)) (format "%S" h)))))) +(print-tests--deftest print-integer-output-format () + ;; Bug#44155. + (let ((integer-output-format t) + (syms (list ?? ?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ ?Á))) + (should (equal (read (print-tests--prin1-to-string syms)) syms)) + (should (equal (print-tests--prin1-to-string syms) + (concat "(" (mapconcat #'prin1-char syms " ") ")")))) + (let ((integer-output-format t) + (syms (list -1 0 1 ?\120 4194175 4194176 (max-char) (1+ (max-char))))) + (should (equal (read (print-tests--prin1-to-string syms)) syms))) + (let ((integer-output-format 16) + (syms (list -1 0 1 most-positive-fixnum (1+ most-positive-fixnum)))) + (should (equal (read (print-tests--prin1-to-string syms)) syms)) + (should (equal (print-tests--prin1-to-string syms) + (concat "(" (mapconcat + (lambda (i) + (if (and (>= i 0) (<= i most-positive-fixnum)) + (format "#x%x" i) (format "%d" i))) + syms " ") ")"))))) + (provide 'print-tests) ;;; print-tests.el ends here ^ permalink raw reply related [flat|nested] 109+ messages in thread
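A side effect of folding the assignment into the condition, (c = XFIXNUM (obj)) && ! CHAR_BYTE8_P (c), is that the assignment expression itself is tested for truth, so the integer 0 never takes the character branch and is printed through the numeric branches instead. The standalone sketch below mirrors only that condition. takes_char_branch is a hypothetical name, and the two constants are assumptions standing in for the upper character limit (4194303, the (max-char) value in this thread) and the last non-raw-byte character (4194175).

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed stand-ins for MAX_CHAR and MAX_5_BYTE_CHAR.  */
enum { SKETCH_MAX_CHAR = 0x3FFFFF, SKETCH_MAX_5_BYTE_CHAR = 0x3FFF7F };

static_assert (SKETCH_MAX_5_BYTE_CHAR < SKETCH_MAX_CHAR,
               "raw bytes sit at the top of the character range");

/* Mirror of the condition guarding the character branch.  The
   parenthesized assignment is false whenever VALUE is 0.  */
static bool
takes_char_branch (long value)
{
  int c;
  return (value >= 0 && value <= SKETCH_MAX_CHAR
          && (c = (int) value)
          && !(c > SKETCH_MAX_5_BYTE_CHAR));
}
```

With this model, takes_char_branch (0) is false while takes_char_branch ('a') is true. The round-trip test is unaffected, since 0 still reads back as 0 in decimal.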
* bug#44155: Print integers as characters 2020-10-27 20:08 ` Juri Linkov @ 2020-10-28 15:51 ` Eli Zaretskii 2020-10-28 19:41 ` Juri Linkov 0 siblings, 1 reply; 109+ messages in thread From: Eli Zaretskii @ 2020-10-28 15:51 UTC (permalink / raw) To: Juri Linkov; +Cc: 44155, schwab > From: Juri Linkov <juri@linkov.net> > Cc: schwab@linux-m68k.org, 44155@debbugs.gnu.org > Date: Tue, 27 Oct 2020 22:08:12 +0200 > > > Alternatively, we could print raw bytes in some special way. But not > > treating them as characters sounds some subtlety that will be hard to > > explain. > > The existing 'prin1-char' used as a reference implementation > doesn't print integers like 4194176 as characters, so the patch > does the same. I don't think it's right, FWIW. Displaying something like \100 would be better, IMO. > +@defvar integer-output-format > +This variable specifies how to print integer numbers. The default is > +@code{nil}, meaning use the decimal format. When bound to @code{t}, > +print integers as characters when an integer represents a character > +(@pxref{Basic Char Syntax}). When bound to the number @code{16}, > +print non-negative integers in the hexadecimal format. This should mention the functions affected by the variable. > +** New variable 'integer-output-format' defines the format of integers. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ "determines how to print integer values" > +When this variable is bound to the value 't', integers are printed by > +printing functions as characters when an integer represents a character. Please give at least one example of a function affected by this. Thanks. ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#44155: Print integers as characters 2020-10-28 15:51 ` Eli Zaretskii @ 2020-10-28 19:41 ` Juri Linkov 2020-10-29 14:20 ` Eli Zaretskii 0 siblings, 1 reply; 109+ messages in thread From: Juri Linkov @ 2020-10-28 19:41 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 44155, schwab [-- Attachment #1: Type: text/plain, Size: 1353 bytes --] >> > Alternatively, we could print raw bytes in some special way. But not >> > treating them as characters sounds some subtlety that will be hard to >> > explain. >> >> The existing 'prin1-char' used as a reference implementation >> doesn't print integers like 4194176 as characters, so the patch >> does the same. > > I don't think it's right, FWIW. Displaying something like \100 would > be better, IMO. Sorry, I don't understand why 4194176 could be printed as \100. >> +@defvar integer-output-format >> +This variable specifies how to print integer numbers. The default is >> +@code{nil}, meaning use the decimal format. When bound to @code{t}, >> +print integers as characters when an integer represents a character >> +(@pxref{Basic Char Syntax}). When bound to the number @code{16}, >> +print non-negative integers in the hexadecimal format. > > This should mention the functions affected by the variable. > >> +** New variable 'integer-output-format' defines the format of integers. > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > "determines how to print integer values" > >> +When this variable is bound to the value 't', integers are printed by >> +printing functions as characters when an integer represents a character. > > Please give at least one example of a function affected by this. Ok, fixed: [-- Attachment #2: integer-output-format-4.patch --] [-- Type: text/x-diff, Size: 4991 bytes --] diff --git a/doc/lispref/streams.texi b/doc/lispref/streams.texi index 2cd61ad04f..08d8032e6f 100644 --- a/doc/lispref/streams.texi +++ b/doc/lispref/streams.texi @@ -902,3 +902,12 @@ Output Variables in the C function @code{sprintf}. 
For further restrictions on what you can use, see the variable's documentation string. @end defvar + +@defvar integer-output-format +This variable specifies how to print integer numbers. The default is +@code{nil}, meaning use the decimal format. When bound to @code{t}, +print integers as characters when an integer represents a character +(@pxref{Basic Char Syntax}). When bound to the number @code{16}, +print non-negative integers in the hexadecimal format. +This variable affects all print functions. +@end defvar diff --git a/etc/NEWS b/etc/NEWS index 5e159480e0..202e449b16 100644 --- a/etc/NEWS +++ b/etc/NEWS @@ -1641,6 +1641,12 @@ ledit.el, lmenu.el, lucid.el and old-whitespace.el. \f * Lisp Changes in Emacs 28.1 +** New variable 'integer-output-format' determines how to print integer values. +When this variable is bound to the value 't', integers are printed by +printing functions as characters when an integer represents a character. +When bound to the number 16, non-negative integers are printed in the +hexadecimal format. + +++ ** 'define-globalized-minor-mode' now takes a :predicate parameter. This can be used to control which major modes the minor mode should be diff --git a/src/print.c b/src/print.c index 53aa353769..7b3dc61065 100644 --- a/src/print.c +++ b/src/print.c @@ -1908,8 +1908,31 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag) { case_Lisp_Int: { - int len = sprintf (buf, "%"pI"d", XFIXNUM (obj)); - strout (buf, len, len, printcharfun); + int c; + intmax_t i; + + if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj) + && (c = XFIXNUM (obj)) && ! 
CHAR_BYTE8_P (c)) + { + printchar ('?', printcharfun); + if (escapeflag + && (c == ';' || c == '(' || c == ')' || c == '{' || c == '}' + || c == '[' || c == ']' || c == '\"' || c == '\'' || c == '\\')) + printchar ('\\', printcharfun); + printchar (c, printcharfun); + } + else if (INTEGERP (Vinteger_output_format) + && integer_to_intmax (Vinteger_output_format, &i) + && i == 16 && Fnatnump (obj)) + { + int len = sprintf (buf, "#x%"pI"x", (EMACS_UINT) XFIXNUM (obj)); + strout (buf, len, len, printcharfun); + } + else + { + int len = sprintf (buf, "%"pI"d", XFIXNUM (obj)); + strout (buf, len, len, printcharfun); + } } break; @@ -2247,6 +2270,15 @@ syms_of_print (void) that represents the number without losing information. */); Vfloat_output_format = Qnil; + DEFVAR_LISP ("integer-output-format", Vinteger_output_format, + doc: /* The format used to print integers. +When t, print characters from integers that represent a character. +When a number 16, print non-negative integers in the hexadecimal format. +Otherwise, by default print integers in the decimal format. +This variable affects all print functions, for example, such function +as `print'. */); + Vinteger_output_format = Qnil; + DEFVAR_LISP ("print-length", Vprint_length, doc: /* Maximum length of list to print before abbreviating. A value of nil means no limit. See also `eval-expression-print-length'. */); diff --git a/test/src/print-tests.el b/test/src/print-tests.el index eb9572dbdf..7b026b6b21 100644 --- a/test/src/print-tests.el +++ b/test/src/print-tests.el @@ -383,5 +383,25 @@ print-hash-table-test (let ((print-length 1)) (format "%S" h)))))) +(print-tests--deftest print-integer-output-format () + ;; Bug#44155. + (let ((integer-output-format t) + (syms (list ?? 
?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ ?Á))) + (should (equal (read (print-tests--prin1-to-string syms)) syms)) + (should (equal (print-tests--prin1-to-string syms) + (concat "(" (mapconcat #'prin1-char syms " ") ")")))) + (let ((integer-output-format t) + (syms (list -1 0 1 ?\120 4194175 4194176 (max-char) (1+ (max-char))))) + (should (equal (read (print-tests--prin1-to-string syms)) syms))) + (let ((integer-output-format 16) + (syms (list -1 0 1 most-positive-fixnum (1+ most-positive-fixnum)))) + (should (equal (read (print-tests--prin1-to-string syms)) syms)) + (should (equal (print-tests--prin1-to-string syms) + (concat "(" (mapconcat + (lambda (i) + (if (and (>= i 0) (<= i most-positive-fixnum)) + (format "#x%x" i) (format "%d" i))) + syms " ") ")"))))) + (provide 'print-tests) ;;; print-tests.el ends here ^ permalink raw reply related [flat|nested] 109+ messages in thread
* bug#44155: Print integers as characters 2020-10-28 19:41 ` Juri Linkov @ 2020-10-29 14:20 ` Eli Zaretskii 2020-10-29 21:00 ` Juri Linkov 0 siblings, 1 reply; 109+ messages in thread From: Eli Zaretskii @ 2020-10-29 14:20 UTC (permalink / raw) To: Juri Linkov; +Cc: 44155, schwab > From: Juri Linkov <juri@linkov.net> > Cc: schwab@linux-m68k.org, 44155@debbugs.gnu.org > Date: Wed, 28 Oct 2020 21:41:46 +0200 > > >> The existing 'prin1-char' used as a reference implementation > >> doesn't print integers like 4194176 as characters, so the patch > >> does the same. > > > > I don't think it's right, FWIW. Displaying something like \100 would > > be better, IMO. > > Sorry, I don't understand why 4194176 could be printed as \100. I meant \200, sorry. That's the raw byte that 4194176 stands for. ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#44155: Print integers as characters 2020-10-29 14:20 ` Eli Zaretskii @ 2020-10-29 21:00 ` Juri Linkov 2020-10-30 7:35 ` Eli Zaretskii 0 siblings, 1 reply; 109+ messages in thread From: Juri Linkov @ 2020-10-29 21:00 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 44155, schwab [-- Attachment #1: Type: text/plain, Size: 482 bytes --] >> >> The existing 'prin1-char' used as a reference implementation >> >> doesn't print integers like 4194176 as characters, so the patch >> >> does the same. >> > >> > I don't think it's right, FWIW. Displaying something like \100 would >> > be better, IMO. >> >> Sorry, I don't understand why 4194176 could be printed as \100. > > I meant \200, sorry. That's the raw byte that 4194176 stands for. OK, in this patch the condition !CHAR_BYTE8_P(c) is removed, so it prints \200: [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: integer-output-format-4.patch --] [-- Type: text/x-diff, Size: 2037 bytes --] diff --git a/src/print.c b/src/print.c index 53aa353769..20841eba61 100644 --- a/src/print.c +++ b/src/print.c @@ -1908,8 +1908,31 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag) { case_Lisp_Int: { - int len = sprintf (buf, "%"pI"d", XFIXNUM (obj)); - strout (buf, len, len, printcharfun); + int c; + intmax_t i; + + if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj) + && (c = XFIXNUM (obj))) + { + printchar ('?', printcharfun); + if (escapeflag + && (c == ';' || c == '(' || c == ')' || c == '{' || c == '}' + || c == '[' || c == ']' || c == '\"' || c == '\'' || c == '\\')) + printchar ('\\', printcharfun); + printchar (c, printcharfun); + } + else if (INTEGERP (Vinteger_output_format) + && integer_to_intmax (Vinteger_output_format, &i) + && i == 16 && !NILP (Fnatnump (obj))) + { + int len = sprintf (buf, "#x%"pI"x", (EMACS_UINT) XFIXNUM (obj)); + strout (buf, len, len, printcharfun); + } + else + { + int len = sprintf (buf, "%"pI"d", XFIXNUM (obj)); + strout (buf, len, 
len, printcharfun); + } } break; @@ -2247,6 +2270,13 @@ syms_of_print (void) that represents the number without losing information. */); Vfloat_output_format = Qnil; + DEFVAR_LISP ("integer-output-format", Vinteger_output_format, + doc: /* The format used to print integers. +When t, print characters from integers that represent a character. +When a number 16, print non-negative integers in the hexadecimal format. +Otherwise, by default print integers in the decimal format. */); + Vinteger_output_format = Qnil; + DEFVAR_LISP ("print-length", Vprint_length, doc: /* Maximum length of list to print before abbreviating. A value of nil means no limit. See also `eval-expression-print-length'. */); ^ permalink raw reply related [flat|nested] 109+ messages in thread
* bug#44155: Print integers as characters 2020-10-29 21:00 ` Juri Linkov @ 2020-10-30 7:35 ` Eli Zaretskii 2020-10-31 20:11 ` Juri Linkov 0 siblings, 1 reply; 109+ messages in thread From: Eli Zaretskii @ 2020-10-30 7:35 UTC (permalink / raw) To: Juri Linkov; +Cc: 44155, schwab > From: Juri Linkov <juri@linkov.net> > Cc: schwab@linux-m68k.org, 44155@debbugs.gnu.org > Date: Thu, 29 Oct 2020 23:00:48 +0200 > > > I meant \200, sorry. That's the raw byte that 4194176 stands for. > > OK, in this patch the condition !CHAR_BYTE8_P(c) is removed, so > it prints \200: Thanks. ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#44155: Print integers as characters 2020-10-30 7:35 ` Eli Zaretskii @ 2020-10-31 20:11 ` Juri Linkov 2020-10-31 23:27 ` Glenn Morris 0 siblings, 1 reply; 109+ messages in thread From: Juri Linkov @ 2020-10-31 20:11 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 44155, schwab tags 44155 fixed close 44155 28.0.50 quit >> > I meant \200, sorry. That's the raw byte that 4194176 stands for. >> >> OK, in this patch the condition !CHAR_BYTE8_P(c) is removed, so >> it prints \200: > > Thanks. Now pushed to master and closed. ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#44155: Print integers as characters 2020-10-31 20:11 ` Juri Linkov @ 2020-10-31 23:27 ` Glenn Morris 2020-11-01 7:58 ` Juri Linkov 0 siblings, 1 reply; 109+ messages in thread From: Glenn Morris @ 2020-10-31 23:27 UTC (permalink / raw) To: Juri Linkov; +Cc: 44155, schwab New test fails on some systems. Ref: https://hydra.nixos.org/build/129474379 Reproduced on CentOS 8.2. Test print-integer-output-format condition: (ert-test-failed ((should (equal (read ...) syms)) :form (equal (-1 0 1 80 4194175 128 255 4194304) (-1 0 1 80 4194175 4194176 4194303 4194304)) :value nil :explanation (list-elt 5 (different-atoms (128 "#x80" "?") (4194176 "#x3fff80" "?\200"))))) FAILED 19/39 print-integer-output-format (0.002202 sec) ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#44155: Print integers as characters 2020-10-31 23:27 ` Glenn Morris @ 2020-11-01 7:58 ` Juri Linkov 2020-11-01 15:13 ` Eli Zaretskii 0 siblings, 1 reply; 109+ messages in thread From: Juri Linkov @ 2020-11-01 7:58 UTC (permalink / raw) To: Glenn Morris; +Cc: 44155, schwab > New test fails on some systems. > > (equal > (-1 0 1 80 4194175 128 255 4194304) > (-1 0 1 80 4194175 4194176 4194303 4194304)) > :value nil :explanation > (list-elt 5 > (different-atoms > (128 "#x80" "?") > (4194176 "#x3fff80" "?\200"))))) This is because 4194176 is printed as ?\200 that is parsed as 128. This patch should fix test failures by printing integers for ambiguous characters. I'm sure no user would complain that numbers between 4194176 and 4194303 are printed as integers. diff --git a/src/print.c b/src/print.c index fa65a3cb26..49daf753bd 100644 --- a/src/print.c +++ b/src/print.c @@ -1912,7 +1912,7 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag) intmax_t i; if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj) - && (c = XFIXNUM (obj))) + && (c = XFIXNUM (obj)) && ! CHAR_BYTE8_P (c)) { printchar ('?', printcharfun); if (escapeflag ^ permalink raw reply related [flat|nested] 109+ messages in thread
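The collision described above can be reproduced with plain arithmetic. The sketch below assumes that the raw-byte characters occupy 4194176..4194303 (0x3FFF80..0x3FFFFF) and map onto bytes 128..255 by subtracting 0x3FFF00, which agrees with the 128 and 255 in the failing test output. is_raw_byte_char, raw_byte_of and round_trip_via_escape are hypothetical stand-ins, not the Emacs macros or the reader itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed layout of the raw-byte range, taken from the numbers
   quoted in this thread.  */
enum
{
  RAW_BYTE_FIRST = 0x3FFF80,   /* 4194176 */
  RAW_BYTE_LAST = 0x3FFFFF,    /* 4194303, i.e. (max-char) */
  RAW_BYTE_OFFSET = 0x3FFF00
};

static bool
is_raw_byte_char (long c)
{
  return c >= RAW_BYTE_FIRST && c <= RAW_BYTE_LAST;
}

/* The byte value that a raw-byte character stands for.  */
static int
raw_byte_of (long c)
{
  assert (is_raw_byte_char (c));
  return (int) (c - RAW_BYTE_OFFSET);
}

/* Model of printing C as an escape such as ?\200 and reading it
   back: raw-byte characters come back as their byte value, and
   everything else is assumed to survive unchanged.  */
static long
round_trip_via_escape (long c)
{
  return is_raw_byte_char (c) ? raw_byte_of (c) : c;
}
```

In this model 4194176 comes back as 128 and 4194303 as 255, so the printed form no longer identifies the original integer, while 4194175 and below are unaffected. That is the asymmetry the one-line patch works around by keeping the raw-byte range in decimal.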
* bug#44155: Print integers as characters 2020-11-01 7:58 ` Juri Linkov @ 2020-11-01 15:13 ` Eli Zaretskii 2020-11-01 18:39 ` Juri Linkov 0 siblings, 1 reply; 109+ messages in thread From: Eli Zaretskii @ 2020-11-01 15:13 UTC (permalink / raw) To: Juri Linkov; +Cc: 44155, rgm, schwab > From: Juri Linkov <juri@linkov.net> > Cc: Eli Zaretskii <eliz@gnu.org>, 44155@debbugs.gnu.org, > schwab@linux-m68k.org > Date: Sun, 01 Nov 2020 09:58:25 +0200 > > > New test fails on some systems. > > > > (equal > > (-1 0 1 80 4194175 128 255 4194304) > > (-1 0 1 80 4194175 4194176 4194303 4194304)) > > :value nil :explanation > > (list-elt 5 > > (different-atoms > > (128 "#x80" "?") > > (4194176 "#x3fff80" "?\200"))))) > > This is because 4194176 is printed as ?\200 that is parsed as 128. > > This patch should fix test failures by printing integers > for ambiguous characters. I'm sure no user would complain > that numbers between 4194176 and 4194303 are printed as integers. > > diff --git a/src/print.c b/src/print.c > index fa65a3cb26..49daf753bd 100644 > --- a/src/print.c > +++ b/src/print.c > @@ -1912,7 +1912,7 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag) > intmax_t i; > > if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj) > - && (c = XFIXNUM (obj))) > + && (c = XFIXNUM (obj)) && ! CHAR_BYTE8_P (c)) > { > printchar ('?', printcharfun); > if (escapeflag If a test fails, it is better to fix the test and not make the code less powerful, don't you agree? To produce 4194176 from ?\200, one way is this: (decode-char 'eight-bit ?\200) Can't this be used in the test? ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#44155: Print integers as characters 2020-11-01 15:13 ` Eli Zaretskii @ 2020-11-01 18:39 ` Juri Linkov 2020-11-01 18:51 ` Eli Zaretskii 0 siblings, 1 reply; 109+ messages in thread From: Juri Linkov @ 2020-11-01 18:39 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 44155, rgm, schwab >> This is because 4194176 is printed as ?\200 that is parsed as 128. >> >> This patch should fix test failures by printing integers >> for ambiguous characters. I'm sure no user would complain >> that numbers between 4194176 and 4194303 are printed as integers. >> >> if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj) >> - && (c = XFIXNUM (obj))) >> + && (c = XFIXNUM (obj)) && ! CHAR_BYTE8_P (c)) > > If a test fails, it is better to fix the test and not make the code > less powerful, don't you agree? This means sweeping the problems under the carpet. > To produce 4194176 from ?\200, one way is this: > > (decode-char 'eight-bit ?\200) > > Can't this be used in the test? Using this code in tests means that the users should use the same code in their programs. Thus 'print' should print '(33 4194176) as such ugly code: `(?! ,(decode-char 'eight-bit ?\200)) ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#44155: Print integers as characters 2020-11-01 18:39 ` Juri Linkov @ 2020-11-01 18:51 ` Eli Zaretskii 2020-11-01 19:13 ` Juri Linkov 0 siblings, 1 reply; 109+ messages in thread From: Eli Zaretskii @ 2020-11-01 18:51 UTC (permalink / raw) To: Juri Linkov; +Cc: 44155, rgm, schwab > From: Juri Linkov <juri@linkov.net> > Cc: rgm@gnu.org, 44155@debbugs.gnu.org, schwab@linux-m68k.org > Date: Sun, 01 Nov 2020 20:39:48 +0200 > > >> if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj) > >> - && (c = XFIXNUM (obj))) > >> + && (c = XFIXNUM (obj)) && ! CHAR_BYTE8_P (c)) > > > > If a test fails, it is better to fix the test and not make the code > > less powerful, don't you agree? > > This means sweeping the problems under the carpet. Which problem? > > (decode-char 'eight-bit ?\200) > > > > Can't this be used in the test? > > Using this code in tests means that the users should use the same code > in their programs. Why would they need to do that? The test needs it because it wants to verify the result, but "normal" programs don't need to read back the values they printed. > Thus 'print' should print '(33 4194176) as such ugly code: > `(?! ,(decode-char 'eight-bit ?\200)) I don't see why. ?\200 and 4194176 are two forms of the same character. ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#44155: Print integers as characters 2020-11-01 18:51 ` Eli Zaretskii @ 2020-11-01 19:13 ` Juri Linkov 2020-11-01 19:41 ` Eli Zaretskii 0 siblings, 1 reply; 109+ messages in thread From: Juri Linkov @ 2020-11-01 19:13 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 44155, rgm, schwab >> >> if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj) >> >> - && (c = XFIXNUM (obj))) >> >> + && (c = XFIXNUM (obj)) && ! CHAR_BYTE8_P (c)) >> > >> > If a test fails, it is better to fix the test and not make the code >> > less powerful, don't you agree? >> >> This means sweeping the problems under the carpet. > > Which problem? Problem of ambiguous numbers 128 and 4194176 that are both printed as ?\200. >> > (decode-char 'eight-bit ?\200) >> > >> > Can't this be used in the test? >> >> Using this code in tests means that the users should use the same code >> in their programs. > > Why would they need to do that? The test needs it because it wants to > verify the result, but "normal" programs don't need to read back the > values they printed. Programs print the lists of characters, and other programs read them. >> Thus 'print' should print '(33 4194176) as such ugly code: >> `(?! ,(decode-char 'eight-bit ?\200)) > > I don't see why. ?\200 and 4194176 are two forms of the same > character. ?\200 and 128 are two forms of the same character too. ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#44155: Print integers as characters 2020-11-01 19:13 ` Juri Linkov @ 2020-11-01 19:41 ` Eli Zaretskii 2020-11-01 20:16 ` Juri Linkov 0 siblings, 1 reply; 109+ messages in thread From: Eli Zaretskii @ 2020-11-01 19:41 UTC (permalink / raw) To: Juri Linkov; +Cc: 44155, rgm, schwab > From: Juri Linkov <juri@linkov.net> > Cc: rgm@gnu.org, 44155@debbugs.gnu.org, schwab@linux-m68k.org > Date: Sun, 01 Nov 2020 21:13:03 +0200 > > >> >> if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj) > >> >> - && (c = XFIXNUM (obj))) > >> >> + && (c = XFIXNUM (obj)) && ! CHAR_BYTE8_P (c)) > >> > > >> > If a test fails, it is better to fix the test and not make the code > >> > less powerful, don't you agree? > >> > >> This means sweeping the problems under the carpet. > > > > Which problem? > > Problem of ambiguous numbers 128 and 4194176 that are both printed as ?\200. Octal escapes are generally a sign of a raw byte. This is not different from buffer display -- how do you know what does ?\200 mean inside buffer text? > ?\200 and 128 are two forms of the same character too. See my question above. I don't think what you say is true. ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#44155: Print integers as characters 2020-11-01 19:41 ` Eli Zaretskii @ 2020-11-01 20:16 ` Juri Linkov 0 siblings, 0 replies; 109+ messages in thread From: Juri Linkov @ 2020-11-01 20:16 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 44155, rgm, schwab >> Problem of ambiguous numbers 128 and 4194176 that are both printed as ?\200. > > Octal escapes are generally a sign of a raw byte. This is not > different from buffer display -- how do you know what does ?\200 mean > inside buffer text? > >> ?\200 and 128 are two forms of the same character too. > > See my question above. I don't think what you say is true. Typing 'C-x C-e' after ?\200 displays: 128 (#o200, #x80, ?\x80) ^ permalink raw reply [flat|nested] 109+ messages in thread
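The collision argued over in this sub-thread can be modelled outside Emacs. The following is a minimal sketch in C, not the Emacs implementation: the function names are hypothetical, the printer and reader are reduced to the octal-escape case only, and the offset #x3fff00 is an assumption derived from the values quoted above (raw byte #x80 shown as character #x3fff80, i.e. 4194176).

```c
#include <stdio.h>
#include <stdlib.h>

/* Assumption taken from the values quoted in this thread: raw byte
   #x80 is the character #x3fff80 (4194176), so the offset between a
   raw byte and its character code is #x3fff00.  */
#define RAW_BYTE_OFFSET     0x3FFF00L
#define FIRST_RAW_BYTE_CHAR 0x3FFF80L   /* 4194176 */
#define LAST_RAW_BYTE_CHAR  0x3FFFFFL   /* 4194303 */

/* Hypothetical stand-in for the CHAR_BYTE8_P check used in the patch.  */
int is_raw_byte_char(long c)
{
    return c >= FIRST_RAW_BYTE_CHAR && c <= LAST_RAW_BYTE_CHAR;
}

/* Simplified model of the printer before the patch: a raw-byte
   character is written as the octal escape of its byte, and so is a
   plain code in 128..255.  Only meant for inputs in those two ranges.  */
void print_as_octal_escape(long c, char *out, size_t size)
{
    long byte = is_raw_byte_char(c) ? c - RAW_BYTE_OFFSET : c;
    snprintf(out, size, "?\\%lo", (unsigned long) byte);
}

/* Simplified model of the reader: the octal digits after the
   two-character prefix are read back as a plain integer.  */
long read_octal_escape(const char *text)
{
    return strtol(text + 2, NULL, 8);
}

/* Print a code and read it back; raw-byte characters do not survive.  */
long round_trip(long c)
{
    char buf[16];
    print_as_octal_escape(c, buf, sizeof buf);
    return read_octal_escape(buf);
}
```

Under these assumptions round_trip(128) and round_trip(4194176) both return 128, which is the mismatch the failing test reported at list element 5.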
* bug#44155: Print integers as characters 2020-10-22 20:56 ` bug#44155: Print integers as characters Juri Linkov 2020-10-22 22:39 ` Andreas Schwab @ 2020-11-01 12:03 ` Mattias Engdegård 2020-11-01 18:35 ` Juri Linkov 1 sibling, 1 reply; 109+ messages in thread From: Mattias Engdegård @ 2020-11-01 12:03 UTC (permalink / raw) To: Juri Linkov, Eli Zaretskii, Andreas Schwab; +Cc: 44155 reopen 44155 stop I don't mind the basic idea, but I'm reopening the bug since it looks like there is some unfinished business. Hope you don't mind. > When t, print characters from integers that represent a character. In what way does 't' suggest a character? Wouldn't something like 'character' be more suggestive? The variable isn't named 'print-integers-as-chars'. > When a number 16, print non-negative integers in the hexadecimal format. Doesn't work for bignums: (let ((integer-output-format 16)) (print 394583945873948753948539845)) 394583945873948753948539845 This must be a bug since there is no reason why bignums should be treated specially. In general we try hard not to. Since there is a read syntax for binary and octal numbers as well, why not permit 2 and 8? (And why not print negative numbers in the selected radix?) And C0/C1 controls aren't printed well: (let ((integer-output-format t)) (print 10) (print 127)) ? ?\x7f^? I strongly suggest that the controls that have special escapes, like \n, use them. What to use for the rest depends on the user's preference really -- for example, 31 might be printed as 31, ?\037, #o37 or #x1f. Whether to print 32 as ?‹SPACE› or ?\s is a matter of taste. For that matter, the variable name should perhaps start with 'print-' like other variables that control printing. Maybe we should separate the default radix and print integers as characters? Thus, we'd have: print-integer-radix -- 2, 8, 16, 10 or nil (which means 10) print-integers-as-characters -- nil or t ^ permalink raw reply [flat|nested] 109+ messages in thread
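The radix suggestion in the message above (allow 2 and 8 as well as 16, and print negative numbers in the selected radix) could look roughly like the sketch below. It is written in plain C with a hypothetical function name, handles only machine integers (so it says nothing about the bignum case), and assumes that the sign goes after the #b, #o or #x prefix, as in #x-1f; that placement is an assumption about the reader, not something stated in this thread.

```c
#include <limits.h>
#include <string.h>

/* Write VALUE into OUT in RADIX (2, 8, 10 or 16), using the #b, #o
   and #x prefixes for the non-decimal cases.  Returns 0 on success,
   -1 if the radix is unsupported or OUT is too small.
   Hypothetical helper; not part of Emacs.  */
int format_in_radix(long long value, int radix, char *out, size_t size)
{
    const char *prefix;
    switch (radix) {
    case 2:  prefix = "#b"; break;
    case 8:  prefix = "#o"; break;
    case 10: prefix = "";   break;
    case 16: prefix = "#x"; break;
    default: return -1;
    }

    /* Work on the magnitude so the most negative value is handled too.  */
    unsigned long long magnitude =
        value < 0 ? 0ULL - (unsigned long long) value
                  : (unsigned long long) value;

    /* Collect the digits, least significant first.  */
    char digits[sizeof (unsigned long long) * CHAR_BIT];
    size_t count = 0;
    do {
        digits[count++] = "0123456789abcdef"[magnitude % (unsigned) radix];
        magnitude /= (unsigned) radix;
    } while (magnitude != 0);

    size_t prefix_len = strlen(prefix);
    size_t needed = prefix_len + (value < 0 ? 1 : 0) + count + 1;
    if (needed > size)
        return -1;

    /* Assemble prefix, optional sign (assumed to follow the prefix),
       then the digits in most-significant-first order.  */
    char *p = out;
    memcpy(p, prefix, prefix_len);
    p += prefix_len;
    if (value < 0)
        *p++ = '-';
    while (count > 0)
        *p++ = digits[--count];
    *p = '\0';
    return 0;
}
```

With this sketch 31 comes out as #x1f in radix 16 and #o37 in radix 8, matching two of the spellings listed in the message.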
* bug#44155: Print integers as characters 2020-11-01 12:03 ` Mattias Engdegård @ 2020-11-01 18:35 ` Juri Linkov 2020-11-01 20:52 ` Mattias Engdegård 0 siblings, 1 reply; 109+ messages in thread From: Juri Linkov @ 2020-11-01 18:35 UTC (permalink / raw) To: Mattias Engdegård; +Cc: 44155, Andreas Schwab > reopen 44155 > stop > > I don't mind the basic idea, but I'm reopening the bug since it looks > like there is some unfinished business. Hope you don't mind. Thanks for bringing a fresh perspective to this feature request. >> When t, print characters from integers that represent a character. > > In what way does 't' suggest a character? Wouldn't something like 'character' be more suggestive? > The variable isn't named 'print-integers-as-chars'. As the most frequent usage pattern, 't' is more convenient to use in code: (let ((integer-output-format t)) whereas this would be uglier and harder to type with: (let ((integer-output-format 'character)) >> When a number 16, print non-negative integers in the hexadecimal format. > > Doesn't work for bignums: > > (let ((integer-output-format 16)) > (print 394583945873948753948539845)) > > 394583945873948753948539845 Yes, this is a known current limitation. > This must be a bug since there is no reason why bignums should be treated specially. > In general we try hard not to. I agree, support for big numbers should be added as well. > Since there is a read syntax for binary and octal numbers as well, why not permit 2 and 8? > (And why not print negative numbers in the selected radix?) 2 and 8 could be added as well. > And C0/C1 controls aren't printed well: > > (let ((integer-output-format t)) > (print 10) > (print 127)) > > ? > > > ?\x7f^? > > I strongly suggest that the controls that have special escapes, like > \n, use them. prin1-char uses a more readable format; is this better? (prin1-char 10) ?\C-j (prin1-char 127) ?\C-? Or should 10 be printed as '?\n'? 
> What to use for the rest depends on the user's preference really -- > for example, 31 might be printed as 31, ?\037, #o37 or #x1f. Maybe more user choices should be supported by the variable? > Whether to print 32 as ?‹SPACE› or ?\s is a matter of taste. ?\s is less error-prone. > For that matter, the variable name should perhaps start with 'print-' > like other variables that control printing. Maybe we should separate > the default radix and print integers as characters? Thus, we'd have: The variable name was modeled after the similar variable float-output-format. > print-integer-radix -- 2, 8, 16, 10 or nil (which means 10) > > print-integers-as-characters -- nil or t What should be printed when both variables are bound to non-default values, e.g. print-integers-as-characters to t, and print-integer-radix to 16? Maybe to print with character syntax and the given radix, e.g. '?\x1f'. ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#44155: Print integers as characters 2020-11-01 18:35 ` Juri Linkov @ 2020-11-01 20:52 ` Mattias Engdegård 2020-11-02 21:36 ` Juri Linkov 0 siblings, 1 reply; 109+ messages in thread From: Mattias Engdegård @ 2020-11-01 20:52 UTC (permalink / raw) To: Juri Linkov; +Cc: 44155, Andreas Schwab 1 nov. 2020 kl. 19.35 skrev Juri Linkov <juri@linkov.net>: > Thanks for bringing a fresh perspective to this feature request. You are very gracious. The devil is in the details, as always! > (prin1-char 10) ?\C-j > (prin1-char 127) ?\C-? > > Or should 10 be printed as '?\n'? Yes, I think ?\n is more useful. As a character, 10 is more commonly thought of as newline than as control-j. >> What to use for the rest depends on the user's preference really -- >> for example, 31 might be printed as 31, ?\037, #o37 or #x1f. > > Maybe more user choices should be supported by the variable? Maybe, but only if we can identify sensible such choices. Otherwise we should just try to pick the best representation in each case. Giving users too much choice isn't necessarily doing them a favour! I'd suggest plain number syntax for control characters without named escapes, for several reasons: * Such numbers are less likely to represent characters and more likely to be, well, numbers. * It would allow a separate radix control to govern their output format. * Writing ?\x1f is no clearer than #x1f, and sometimes more confusing: \xff is a raw byte in a string, but ?\xff is always 255. Thus we would have 10 -> ?\n, 13 -> ?\r, 127 -> ?\d, 65 -> ?A, 255 -> ?ÿ, but 31 -> 31, 129 -> 129, 4194303 -> 4194303. >> Whether to print 32 as ?‹SPACE› or ?\s is a matter of taste. > > ?\s is less error-prone. Yes, I agree. (I prefer ?\s or 32 as characters, but " " in strings.) >> For that matter, the variable name should perhaps start with 'print-' >> like other variables that control printing. Maybe we should separate >> the default radix and print integers as characters? 
Thus, we'd have: > > The variable name was modeled after the similar variable float-output-format. I see, interesting! One possibility would be to use a string in the same way, thus "%x", "%c" etc, but it makes less sense for integers than floating-point: no precision field, and many format alternatives such as %#x do not produce valid Lisp read syntax. Better keep it simple. >> print-integer-radix -- 2, 8, 16, 10 or nil (which means 10) >> >> print-integers-as-characters -- nil or t > > What should be printed when both variables are bound to non-default values, > e.g. print-integers-as-characters to t, and print-integer-radix to 16? > Maybe to print with character syntax and the given radix, e.g. '?\x1f'. Well, it should clearly use character syntax for printable characters and the given radix for non-characters. As you correctly point out, what to use for non-printable characters (C0 and C1 controls, raw bytes) is less obvious. I'd probably just use the given radix; I see no readability advantage in printing ?\x1f to #x1f. Since your original motivation was to print characters in pretty-printed nested Lisp expressions, perhaps we should just define print-integers-as-characters as a Boolean and skip the radix for the time being? We could add a print radix control later on if desired. (That would save us the hassle to deal with bignums, for that matter.) ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#44155: Print integers as characters 2020-11-01 20:52 ` Mattias Engdegård @ 2020-11-02 21:36 ` Juri Linkov 2020-11-02 23:03 ` Mattias Engdegård 0 siblings, 1 reply; 109+ messages in thread From: Juri Linkov @ 2020-11-02 21:36 UTC (permalink / raw) To: Mattias Engdegård; +Cc: 44155, Andreas Schwab > Thus we would have 10 -> ?\n, 13 -> ?\r, 127 -> ?\d, 65 -> ?A, > 255 -> ?ÿ, but 31 -> 31, 129 -> 129, 4194303 -> 4194303. Hopefully, printing some characters as numbers will fix the currently broken test. > Since your original motivation was to print characters in pretty-printed > nested Lisp expressions, perhaps we should just define > print-integers-as-characters as a Boolean and skip the radix for the time > being? We could add a print radix control later on if desired. (That would > save us the hassle to deal with bignums, for that matter.) This was my intention: to start with something simple that does only what was needed (to print integers as characters), then extend it later when a need such as printing hex numbers arises. I added hex numbers only as a proof that the variable integer-output-format is extensible enough to support more formats in the future. But as you point out, this is achievable by adding another variable like print-integer-radix. PS: I noticed an inconsistency in these names: "integer" in print-integer-radix is singular, but "integers" in print-integers-as-characters is plural. ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#44155: Print integers as characters 2020-11-02 21:36 ` Juri Linkov @ 2020-11-02 23:03 ` Mattias Engdegård 2020-11-03 8:30 ` Juri Linkov 2020-11-03 15:24 ` Eli Zaretskii 0 siblings, 2 replies; 109+ messages in thread From: Mattias Engdegård @ 2020-11-02 23:03 UTC (permalink / raw) To: Juri Linkov; +Cc: 44155, Andreas Schwab [-- Attachment #1: Type: text/plain, Size: 970 bytes --] 2 nov. 2020 kl. 22.36 skrev Juri Linkov <juri@linkov.net>: > >> Thus we would have 10 -> ?\n, 13 -> ?\r, 127 -> ?\d, 65 -> ?A, >> 255 -> ?ÿ, but 31 -> 31, 129 -> 129, 4194303 -> 4194303. > > Hopefully, printing some characters as numbers will fix > the currently broken test. It does! Here is a proposed patch. We could add a separate radix control later if you like. One detail that I'm undecided about is whether to remove the more obscure control escapes \f, \a, \v, \e and \d, on the grounds that they are less likely to be used as actual characters and that users may prefer to see them as numbers instead. C, and most languages inheriting them from C, lack \e or \d; \f and \a are rare today, and \v is an anachronism. > PS: I notices inconsistency in these names: "integer" in print-integer-radix > is singular, but "integers" in print-integers-as-characters is plural. Actually, 'integer' in 'integer radix' plays the part of adjective! [-- Attachment #2: 0001-Reduce-integer-output-format-to-print-integers-as-ch.patch --] [-- Type: application/octet-stream, Size: 9171 bytes --] From 0dc27757cd53bca3e05c93f29ca96d0845a50ec2 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org> Date: Mon, 2 Nov 2020 23:37:16 +0100 Subject: [PATCH] Reduce integer-output-format to print-integers-as-characters The variable now only controls whether characters are printed, not the radix. Control chars are printed in human-readable syntax such as ?\n if available, as numbers otherwise (bug#44155). Done in collaboration with Juri Linkov. * src/print.c (named_escape): New function. 
(print_object): Change semantics as described above. (syms_of_print): Rename integer-output-format. Update doc string. * doc/lispref/streams.texi (Output Variables): * etc/NEWS: * test/src/print-tests.el (print-integers-as-characters): Rename and update according to new semantics. The test now passes. --- doc/lispref/streams.texi | 13 ++++---- etc/NEWS | 11 ++++--- src/print.c | 65 ++++++++++++++++++++++++++-------------- test/src/print-tests.el | 34 ++++++++++----------- 4 files changed, 71 insertions(+), 52 deletions(-) diff --git a/doc/lispref/streams.texi b/doc/lispref/streams.texi index f171f13779..4bc97e4c48 100644 --- a/doc/lispref/streams.texi +++ b/doc/lispref/streams.texi @@ -903,10 +903,11 @@ Output Variables you can use, see the variable's documentation string. @end defvar -@defvar integer-output-format -This variable specifies how to print integer numbers. The default is -@code{nil}, meaning use the decimal format. When bound to @code{t}, -print integers as characters when an integer represents a character -(@pxref{Basic Char Syntax}). When bound to the number @code{16}, -print non-negative integers in the hexadecimal format. +@defvar print-integers-as-characters +When this variable is non-@code{nil}, integers that represent +printable characters or control characters with their own escape +syntax such as newline will be printed using Lisp character syntax +(@pxref{Basic Char Syntax}). Other numbers are printed the usual way. +For example, the list @code{(4 65 -1 10)} will be printed as +@samp{(4 ?A -1 ?\n)}. @end defvar diff --git a/etc/NEWS b/etc/NEWS index e11effc9e8..810d6794f2 100644 --- a/etc/NEWS +++ b/etc/NEWS @@ -1689,12 +1689,6 @@ ledit.el, lmenu.el, lucid.el and old-whitespace.el. \f * Lisp Changes in Emacs 28.1 -** New variable 'integer-output-format' determines how to print integer values. -When this variable is bound to the value 't', integers are printed by -printing functions as characters when an integer represents a character. 
-When bound to the number 16, non-negative integers are printed in the -hexadecimal format. - +++ ** 'define-globalized-minor-mode' now takes a ':predicate' parameter. This can be used to control which major modes the minor mode should be @@ -1887,6 +1881,11 @@ file can affect code in another. For details, see the manual section 'replace-regexp-in-string', 'catch', 'throw', 'error', 'signal' and 'play-sound-file'. ++++ +** New variable 'print-integers-as-characters' modifies integer printing. +When this variable is non-nil, integers representing characters are +printed using Lisp character syntax, such as '?*' for 42. + \f * Changes in Emacs 28.1 on Non-Free Operating Systems diff --git a/src/print.c b/src/print.c index fa65a3cb26..89efcb2006 100644 --- a/src/print.c +++ b/src/print.c @@ -1848,6 +1848,25 @@ print_vectorlike (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag, return true; } +static char +named_escape (int i) +{ + switch (i) + { + case '\a': return 'a'; + case '\b': return 'b'; + case '\t': return 't'; + case '\n': return 'n'; + case '\v': return 'v'; + case '\f': return 'f'; + case '\r': return 'r'; + case 27: return 'e'; + case ' ': return 's'; + case 127: return 'd'; + } + return 0; +} + static void print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag) { @@ -1908,29 +1927,31 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag) { case_Lisp_Int: { - int c; - intmax_t i; + EMACS_INT i = XFIXNUM (obj); + char escaped_name; - if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj) - && (c = XFIXNUM (obj))) + if (print_integers_as_characters && i >= 0 && i <= MAX_UNICODE_CHAR + && ((escaped_name = named_escape (i)) + || (i >= 32 && i <= 127) + || i >= 0xa0)) { printchar ('?', printcharfun); - if (escapeflag - && (c == ';' || c == '(' || c == ')' || c == '{' || c == '}' - || c == '[' || c == ']' || c == '\"' || c == '\'' || c == '\\')) + if (escaped_name) + { + printchar ('\\', printcharfun); + i = 
escaped_name; + } + else if (escapeflag + && (i == ';' || i == '\"' || i == '\'' || i == '\\' + || i == '(' || i == ')' + || i == '{' || i == '}' + || i == '[' || i == ']')) printchar ('\\', printcharfun); - printchar (c, printcharfun); - } - else if (INTEGERP (Vinteger_output_format) - && integer_to_intmax (Vinteger_output_format, &i) - && i == 16 && !NILP (Fnatnump (obj))) - { - int len = sprintf (buf, "#x%"pI"x", (EMACS_UINT) XFIXNUM (obj)); - strout (buf, len, len, printcharfun); + printchar (i, printcharfun); } else { - int len = sprintf (buf, "%"pI"d", XFIXNUM (obj)); + int len = sprintf (buf, "%"pI"d", i); strout (buf, len, len, printcharfun); } } @@ -2270,12 +2291,12 @@ syms_of_print (void) that represents the number without losing information. */); Vfloat_output_format = Qnil; - DEFVAR_LISP ("integer-output-format", Vinteger_output_format, - doc: /* The format used to print integers. -When t, print characters from integers that represent a character. -When a number 16, print non-negative integers in the hexadecimal format. -Otherwise, by default print integers in the decimal format. */); - Vinteger_output_format = Qnil; + DEFVAR_BOOL ("print-integers-as-characters", print_integers_as_characters, + doc: /* Non-nil means integers are printed using characters syntax. +Only non-control characters, and control characters with named escape +sequences such as newline, are printed this way. Other integers, +including those corresponding to raw bytes, are not affected. */); + print_integers_as_characters = Qnil; DEFVAR_LISP ("print-length", Vprint_length, doc: /* Maximum length of list to print before abbreviating. 
diff --git a/test/src/print-tests.el b/test/src/print-tests.el index 7b026b6b21..0053f3cac0 100644 --- a/test/src/print-tests.el +++ b/test/src/print-tests.el @@ -383,25 +383,23 @@ print-hash-table-test (let ((print-length 1)) (format "%S" h)))))) -(print-tests--deftest print-integer-output-format () +(print-tests--deftest print-integers-as-characters () ;; Bug#44155. - (let ((integer-output-format t) - (syms (list ?? ?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ ?Á))) - (should (equal (read (print-tests--prin1-to-string syms)) syms)) - (should (equal (print-tests--prin1-to-string syms) - (concat "(" (mapconcat #'prin1-char syms " ") ")")))) - (let ((integer-output-format t) - (syms (list -1 0 1 ?\120 4194175 4194176 (max-char) (1+ (max-char))))) - (should (equal (read (print-tests--prin1-to-string syms)) syms))) - (let ((integer-output-format 16) - (syms (list -1 0 1 most-positive-fixnum (1+ most-positive-fixnum)))) - (should (equal (read (print-tests--prin1-to-string syms)) syms)) - (should (equal (print-tests--prin1-to-string syms) - (concat "(" (mapconcat - (lambda (i) - (if (and (>= i 0) (<= i most-positive-fixnum)) - (format "#x%x" i) (format "%d" i))) - syms " ") ")"))))) + (let* ((print-integers-as-characters t) + (chars '(?? ?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ ?f ?~ ?Á 32 + ?\n ?\r ?\t ?\b ?\f ?\a ?\v ?\e ?\d)) + (nums '(-1 -65 0 1 31 #x80 #x9f #x110000 #x3fff80 #x3fffff)) + (printed-chars (print-tests--prin1-to-string chars)) + (printed-nums (print-tests--prin1-to-string nums))) + (should (equal (read printed-chars) chars)) + (should (equal + printed-chars + (concat + "(?? 
?\\; ?\\( ?\\) ?\\{ ?\\} ?\\[ ?\\] ?\\\" ?\\' ?\\\\" + " ?f ?~ ?Á ?\\s ?\\n ?\\r ?\\t ?\\b ?\\f ?\\a ?\\v ?\\e ?\\d)"))) + (should (equal (read printed-nums) nums)) + (should (equal printed-nums + "(-1 -65 0 1 31 128 159 1114112 4194176 4194303)")))) (provide 'print-tests) ;;; print-tests.el ends here -- 2.21.1 (Apple Git-122.3) ^ permalink raw reply related [flat|nested] 109+ messages in thread
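The selection rule in the patch above can be tried outside Emacs with a standalone sketch. The version below is an illustration under stated assumptions rather than the patched print.c: the condition and the escape table are copied from the diff, but format_integer, uses_char_syntax, needs_backslash and encode_utf8 are hypothetical names, output goes to a caller-supplied buffer as UTF-8, and surrogate code points get no special treatment.

```c
#include <stdio.h>
#include <string.h>

#define MAX_UNICODE_CHAR 0x10FFFF

/* Letter of the named escape for I, or 0 if there is none
   (same table as in the patch above).  */
char named_escape(long i)
{
    switch (i) {
    case '\a': return 'a';
    case '\b': return 'b';
    case '\t': return 't';
    case '\n': return 'n';
    case '\v': return 'v';
    case '\f': return 'f';
    case '\r': return 'r';
    case 27:   return 'e';
    case ' ':  return 's';
    case 127:  return 'd';
    }
    return 0;
}

/* Nonzero if I is printed with character syntax under the patch's rule.  */
int uses_char_syntax(long i)
{
    return i >= 0 && i <= MAX_UNICODE_CHAR
           && (named_escape(i) || (i >= 32 && i <= 127) || i >= 0xa0);
}

/* Nonzero if I is one of the characters the patch backslash-escapes.  */
int needs_backslash(long i)
{
    return i == ';' || i == '"' || i == '\'' || i == '\\'
           || i == '(' || i == ')' || i == '{' || i == '}'
           || i == '[' || i == ']';
}

/* Hypothetical helper: store C as UTF-8 in OUT, return the byte count.  */
size_t encode_utf8(long c, char *out)
{
    if (c < 0x80) {
        out[0] = (char) c;
        return 1;
    }
    if (c < 0x800) {
        out[0] = (char) (0xC0 | (c >> 6));
        out[1] = (char) (0x80 | (c & 0x3F));
        return 2;
    }
    if (c < 0x10000) {
        out[0] = (char) (0xE0 | (c >> 12));
        out[1] = (char) (0x80 | ((c >> 6) & 0x3F));
        out[2] = (char) (0x80 | (c & 0x3F));
        return 3;
    }
    out[0] = (char) (0xF0 | (c >> 18));
    out[1] = (char) (0x80 | ((c >> 12) & 0x3F));
    out[2] = (char) (0x80 | ((c >> 6) & 0x3F));
    out[3] = (char) (0x80 | (c & 0x3F));
    return 4;
}

/* Format I the way the patch would with the variable set: character
   syntax where the rule applies, plain decimal otherwise.  */
void format_integer(long i, int escapeflag, char *out, size_t size)
{
    if (!uses_char_syntax(i)) {
        snprintf(out, size, "%ld", i);
        return;
    }
    char body[8];              /* backslash + up to 4 bytes + NUL */
    size_t len = 0;
    char escape = named_escape(i);
    if (escape) {
        body[len++] = '\\';
        body[len++] = escape;
    } else {
        if (escapeflag && needs_backslash(i))
            body[len++] = '\\';
        len += encode_utf8(i, body + len);
    }
    body[len] = '\0';
    snprintf(out, size, "?%s", body);
}
```

Fed the values from the earlier message, this sketch gives ?\n for 10, ?A for 65 and ?\d for 127, while 31, 129 and 4194303 stay decimal.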
* bug#44155: Print integers as characters 2020-11-02 23:03 ` Mattias Engdegård @ 2020-11-03 8:30 ` Juri Linkov 2020-11-03 15:24 ` Eli Zaretskii 1 sibling, 0 replies; 109+ messages in thread From: Juri Linkov @ 2020-11-03 8:30 UTC (permalink / raw) To: Mattias Engdegård; +Cc: 44155, Andreas Schwab >> Hopefully, printing some characters as numbers will fix >> the currently broken test. > > It does! Here is a proposed patch. We could add a separate radix control later if you like. Thanks, I like your patch, hope that Eli will like it too. > One detail that I'm undecided about is whether to remove the more obscure > control escapes \f, \a, \v, \e and \d, on the grounds that they are less > likely to be used as actual characters and that users may prefer to see > them as numbers instead. C, and most languages inheriting them from C, lack > \e or \d; \f and \a are rare today, and \v is an anachronism. I don't think that \f is rare, it's used as a page separator in many Emacs Lisp files. But it would be surprising to me to see 127 printed as ?\d, maybe because C lacks it. ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#44155: Print integers as characters 2020-11-02 23:03 ` Mattias Engdegård 2020-11-03 8:30 ` Juri Linkov @ 2020-11-03 15:24 ` Eli Zaretskii 2020-11-03 18:47 ` Mattias Engdegård 1 sibling, 1 reply; 109+ messages in thread From: Eli Zaretskii @ 2020-11-03 15:24 UTC (permalink / raw) To: Mattias Engdegård; +Cc: 44155, schwab, juri > From: Mattias Engdegård <mattiase@acm.org> > Date: Tue, 3 Nov 2020 00:03:31 +0100 > Cc: Eli Zaretskii <eliz@gnu.org>, Andreas Schwab <schwab@suse.de>, > 44155@debbugs.gnu.org > > +@defvar print-integers-as-characters > +When this variable is non-@code{nil}, integers that represent > +printable characters or control characters with their own escape > +syntax such as newline will be printed using Lisp character syntax What is meant by "printable characters" here? One could think you mean [:print:], but that doesn't seem to be what the code does. > + DEFVAR_BOOL ("print-integers-as-characters", print_integers_as_characters, > + doc: /* Non-nil means integers are printed using characters syntax. > +Only non-control characters, and control characters with named escape > +sequences such as newline, are printed this way. Other integers, > +including those corresponding to raw bytes, are not affected. */); And here, what does "non-control characters" mean? ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#44155: Print integers as characters 2020-11-03 15:24 ` Eli Zaretskii @ 2020-11-03 18:47 ` Mattias Engdegård 2020-11-03 19:36 ` Eli Zaretskii 0 siblings, 1 reply; 109+ messages in thread From: Mattias Engdegård @ 2020-11-03 18:47 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 44155, schwab, juri [-- Attachment #1: Type: text/plain, Size: 781 bytes --] 3 nov. 2020 kl. 16.24 skrev Eli Zaretskii <eliz@gnu.org>: > What is meant by "printable characters" here? One could think you > mean [:print:], but that doesn't seem to be what then code does. Non-control characters (characters other than control characters), in this case. I wanted to keep things simple and not involve the Unicode database in the printer. (For that matter, [:print:] is a regexp feature and doesn't really define the meaning of 'printable', but your question was valid.) On the other hand, printing all non-controls using the ?X syntax is maybe not ideal. Attached is a new patch that uses Unicode properties to select only printable base characters. This patch also removes \a, \v, \e and \d from the characters printed as escaped controls. [-- Attachment #2: 0001-Reduce-integer-output-format-to-print-integers-as-ch.patch --] [-- Type: application/octet-stream, Size: 11404 bytes --] From 3da6d9055b0ae68fc7b3bbee52885113c8c30b6d Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org> Date: Mon, 2 Nov 2020 23:37:16 +0100 Subject: [PATCH] Reduce integer-output-format to print-integers-as-characters The variable now only controls whether characters are printed, not the radix. Control chars are printed in human-readable syntax such as ?\n if available, as numbers otherwise (bug#44155). Done in collaboration with Juri Linkov. * src/character.c (printable_base_p): * src/print.c (named_escape): New functions. (print_object): Change semantics as described above. (syms_of_print): Rename integer-output-format. Update doc string. 
* doc/lispref/streams.texi (Output Variables): * etc/NEWS: * test/src/print-tests.el (print-integers-as-characters): Rename and update according to new semantics. The test now passes. --- doc/lispref/streams.texi | 13 +++++---- etc/NEWS | 11 ++++--- src/character.c | 21 ++++++++++++++ src/character.h | 1 + src/print.c | 63 ++++++++++++++++++++++++++-------------- test/src/print-tests.el | 39 +++++++++++++------------ 6 files changed, 96 insertions(+), 52 deletions(-) diff --git a/doc/lispref/streams.texi b/doc/lispref/streams.texi index f171f13779..4bc97e4c48 100644 --- a/doc/lispref/streams.texi +++ b/doc/lispref/streams.texi @@ -903,10 +903,11 @@ Output Variables you can use, see the variable's documentation string. @end defvar -@defvar integer-output-format -This variable specifies how to print integer numbers. The default is -@code{nil}, meaning use the decimal format. When bound to @code{t}, -print integers as characters when an integer represents a character -(@pxref{Basic Char Syntax}). When bound to the number @code{16}, -print non-negative integers in the hexadecimal format. +@defvar print-integers-as-characters +When this variable is non-@code{nil}, integers that represent +printable characters or control characters with their own escape +syntax such as newline will be printed using Lisp character syntax +(@pxref{Basic Char Syntax}). Other numbers are printed the usual way. +For example, the list @code{(4 65 -1 10)} will be printed as +@samp{(4 ?A -1 ?\n)}. @end defvar diff --git a/etc/NEWS b/etc/NEWS index e11effc9e8..384c64a91e 100644 --- a/etc/NEWS +++ b/etc/NEWS @@ -1689,12 +1689,6 @@ ledit.el, lmenu.el, lucid.el and old-whitespace.el. \f * Lisp Changes in Emacs 28.1 -** New variable 'integer-output-format' determines how to print integer values. -When this variable is bound to the value 't', integers are printed by -printing functions as characters when an integer represents a character. 
-When bound to the number 16, non-negative integers are printed in the -hexadecimal format. - +++ ** 'define-globalized-minor-mode' now takes a ':predicate' parameter. This can be used to control which major modes the minor mode should be @@ -1887,6 +1881,11 @@ file can affect code in another. For details, see the manual section 'replace-regexp-in-string', 'catch', 'throw', 'error', 'signal' and 'play-sound-file'. ++++ +** New variable 'print-integers-as-characters' modifies integer printing. +When this variable is non-nil, character syntax is used for printing +numbers for which this makes sense, such as '?*' for 42. + \f * Changes in Emacs 28.1 on Non-Free Operating Systems diff --git a/src/character.c b/src/character.c index 5860f6a0c8..6d18e78f26 100644 --- a/src/character.c +++ b/src/character.c @@ -982,6 +982,27 @@ printablep (int c) || gen_cat == UNICODE_CATEGORY_Cn)); /* unassigned */ } +/* Return true if C is a printable independent character. */ +bool +printable_base_p (int c) +{ + Lisp_Object category = CHAR_TABLE_REF (Vunicode_category_table, c); + if (! FIXNUMP (category)) + return false; + EMACS_INT gen_cat = XFIXNUM (category); + + /* See UTS #18. */ + return (!(gen_cat == UNICODE_CATEGORY_Mn /* mark, nonspacing */ + || gen_cat == UNICODE_CATEGORY_Mc /* mark, combining */ + || gen_cat == UNICODE_CATEGORY_Me /* mark, enclosing */ + || gen_cat == UNICODE_CATEGORY_Zl /* separator, line */ + || gen_cat == UNICODE_CATEGORY_Zp /* separator, paragraph */ + || gen_cat == UNICODE_CATEGORY_Cc /* other, control */ + || gen_cat == UNICODE_CATEGORY_Cs /* other, surrogate */ + || gen_cat == UNICODE_CATEGORY_Cf /* other, format */ + || gen_cat == UNICODE_CATEGORY_Cn)); /* other, unassigned */ +} + /* Return true if C is a horizontal whitespace character, as defined by https://www.unicode.org/reports/tr18/tr18-19.html#blank. 
*/ bool diff --git a/src/character.h b/src/character.h index af5023f77c..260c550108 100644 --- a/src/character.h +++ b/src/character.h @@ -583,6 +583,7 @@ char_surrogate_p (int c) extern bool graphicp (int); extern bool printablep (int); extern bool blankp (int); +extern bool printable_base_p (int); /* Look up the element in char table OBJ at index CH, and return it as an integer. If the element is not a character, return CH itself. */ diff --git a/src/print.c b/src/print.c index fa65a3cb26..f7158dbac0 100644 --- a/src/print.c +++ b/src/print.c @@ -1848,6 +1848,24 @@ print_vectorlike (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag, return true; } +static char +named_escape (int i) +{ + switch (i) + { + case '\b': return 'b'; + case '\t': return 't'; + case '\n': return 'n'; + case '\f': return 'f'; + case '\r': return 'r'; + case ' ': return 's'; + /* \a, \v, \e and \d are excluded from printing as escapes since + they are somewhat rare as characters and more likely to be + plain integers. 
*/ + } + return 0; +} + static void print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag) { @@ -1908,29 +1926,30 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag) { case_Lisp_Int: { - int c; - intmax_t i; + EMACS_INT i = XFIXNUM (obj); + char escaped_name; - if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj) - && (c = XFIXNUM (obj))) + if (print_integers_as_characters && i >= 0 && i <= MAX_UNICODE_CHAR + && ((escaped_name = named_escape (i)) + || printable_base_p (i))) { printchar ('?', printcharfun); - if (escapeflag - && (c == ';' || c == '(' || c == ')' || c == '{' || c == '}' - || c == '[' || c == ']' || c == '\"' || c == '\'' || c == '\\')) + if (escaped_name) + { + printchar ('\\', printcharfun); + i = escaped_name; + } + else if (escapeflag + && (i == ';' || i == '\"' || i == '\'' || i == '\\' + || i == '(' || i == ')' + || i == '{' || i == '}' + || i == '[' || i == ']')) printchar ('\\', printcharfun); - printchar (c, printcharfun); - } - else if (INTEGERP (Vinteger_output_format) - && integer_to_intmax (Vinteger_output_format, &i) - && i == 16 && !NILP (Fnatnump (obj))) - { - int len = sprintf (buf, "#x%"pI"x", (EMACS_UINT) XFIXNUM (obj)); - strout (buf, len, len, printcharfun); + printchar (i, printcharfun); } else { - int len = sprintf (buf, "%"pI"d", XFIXNUM (obj)); + int len = sprintf (buf, "%"pI"d", i); strout (buf, len, len, printcharfun); } } @@ -2270,12 +2289,12 @@ syms_of_print (void) that represents the number without losing information. */); Vfloat_output_format = Qnil; - DEFVAR_LISP ("integer-output-format", Vinteger_output_format, - doc: /* The format used to print integers. -When t, print characters from integers that represent a character. -When a number 16, print non-negative integers in the hexadecimal format. -Otherwise, by default print integers in the decimal format. 
*/); - Vinteger_output_format = Qnil; + DEFVAR_BOOL ("print-integers-as-characters", print_integers_as_characters, + doc: /* Non-nil means integers are printed using characters syntax. +Only printable characters, and control characters with named escape +sequences such as newline, are printed this way. Other integers, +including those corresponding to raw bytes, are not affected. */); + print_integers_as_characters = Qnil; DEFVAR_LISP ("print-length", Vprint_length, doc: /* Maximum length of list to print before abbreviating. diff --git a/test/src/print-tests.el b/test/src/print-tests.el index 7b026b6b21..05b1e4e6e4 100644 --- a/test/src/print-tests.el +++ b/test/src/print-tests.el @@ -383,25 +383,28 @@ print-hash-table-test (let ((print-length 1)) (format "%S" h)))))) -(print-tests--deftest print-integer-output-format () +(print-tests--deftest print-integers-as-characters () ;; Bug#44155. - (let ((integer-output-format t) - (syms (list ?? ?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ ?Á))) - (should (equal (read (print-tests--prin1-to-string syms)) syms)) - (should (equal (print-tests--prin1-to-string syms) - (concat "(" (mapconcat #'prin1-char syms " ") ")")))) - (let ((integer-output-format t) - (syms (list -1 0 1 ?\120 4194175 4194176 (max-char) (1+ (max-char))))) - (should (equal (read (print-tests--prin1-to-string syms)) syms))) - (let ((integer-output-format 16) - (syms (list -1 0 1 most-positive-fixnum (1+ most-positive-fixnum)))) - (should (equal (read (print-tests--prin1-to-string syms)) syms)) - (should (equal (print-tests--prin1-to-string syms) - (concat "(" (mapconcat - (lambda (i) - (if (and (>= i 0) (<= i most-positive-fixnum)) - (format "#x%x" i) (format "%d" i))) - syms " ") ")"))))) + (let* ((print-integers-as-characters t) + (chars '(?? 
?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ ?f ?~ ?Á 32 + ?\n ?\r ?\t ?\b ?\f ?\a ?\v ?\e ?\d)) + (nums '(-1 -65 0 1 31 #x80 #x9f #x110000 #x3fff80 #x3fffff)) + (nonprints '(#xd800 #xdfff #x030a #xffff #x200c)) + (printed-chars (print-tests--prin1-to-string chars)) + (printed-nums (print-tests--prin1-to-string nums)) + (printed-nonprints (print-tests--prin1-to-string nonprints))) + (should (equal (read printed-chars) chars)) + (should (equal + printed-chars + (concat + "(?? ?\\; ?\\( ?\\) ?\\{ ?\\} ?\\[ ?\\] ?\\\" ?\\' ?\\\\" + " ?f ?~ ?Á ?\\s ?\\n ?\\r ?\\t ?\\b ?\\f 7 11 27 127)"))) + (should (equal (read printed-nums) nums)) + (should (equal printed-nums + "(-1 -65 0 1 31 128 159 1114112 4194176 4194303)")) + (should (equal (read printed-nonprints) nonprints)) + (should (equal printed-nonprints + "(55296 57343 778 65535 8204)")))) (provide 'print-tests) ;;; print-tests.el ends here -- 2.21.1 (Apple Git-122.3) ^ permalink raw reply related [flat|nested] 109+ messages in thread
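As a rough illustration of the printing rule in the patch above, the following standalone C sketch reproduces the decision for ASCII input only. It is not the Emacs code: ascii_graphic_p, needs_backslash and format_integer are hypothetical names, and the ASCII range test is an assumed stand-in for the Unicode category lookup that the patch performs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Same mapping as named_escape in the patch: return the letter of the
   named escape for C, or 0 if C has none.  */
static char
named_escape (int c)
{
  switch (c)
    {
    case '\b': return 'b';
    case '\t': return 't';
    case '\n': return 'n';
    case '\f': return 'f';
    case '\r': return 'r';
    case ' ': return 's';
    }
  return 0;
}

/* Hypothetical ASCII-only stand-in for the Unicode category test:
   visible ASCII characters, excluding space and DEL.  */
static bool
ascii_graphic_p (long c)
{
  return c > 0x20 && c < 0x7f;
}

/* Characters that get a backslash when escapeflag is set.  */
static bool
needs_backslash (long c)
{
  return (c == ';' || c == '"' || c == '\'' || c == '\\'
          || c == '(' || c == ')'
          || c == '{' || c == '}'
          || c == '[' || c == ']');
}

/* Write the printed form of I into BUF (of SIZE bytes) and return BUF.
   Integers above 0x7f always print as numbers here; that is a limit of
   this sketch, not of the patch.  */
static const char *
format_integer (long i, bool as_characters, bool escapeflag,
                char *buf, size_t size)
{
  char esc = 0;
  if (as_characters && i >= 0 && i <= 0x7f
      && ((esc = named_escape ((int) i)) || ascii_graphic_p (i)))
    {
      if (esc)
        snprintf (buf, size, "?\\%c", esc);
      else if (escapeflag && needs_backslash (i))
        snprintf (buf, size, "?\\%c", (char) i);
      else
        snprintf (buf, size, "?%c", (char) i);
    }
  else
    snprintf (buf, size, "%ld", i);
  return buf;
}
```

Under these assumptions, 65 is rendered in character syntax and 10 and 32 use their named escapes, while 7 and 127 stay decimal because \a and \d are deliberately left out of named_escape.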
* bug#44155: Print integers as characters 2020-11-03 18:47 ` Mattias Engdegård @ 2020-11-03 19:36 ` Eli Zaretskii 2020-11-04 11:03 ` Mattias Engdegård 0 siblings, 1 reply; 109+ messages in thread From: Eli Zaretskii @ 2020-11-03 19:36 UTC (permalink / raw) To: Mattias Engdegård; +Cc: 44155, schwab, juri > From: Mattias Engdegård <mattiase@acm.org> > Date: Tue, 3 Nov 2020 19:47:17 +0100 > Cc: juri@linkov.net, schwab@suse.de, 44155@debbugs.gnu.org > > > What is meant by "printable characters" here? One could think you > > mean [:print:], but that doesn't seem to be what the code does. > > Non-control characters (characters other than control characters), in this case. I wanted to keep things simple and not involve the Unicode database in the printer. > > (For that matter, [:print:] is a regexp feature and doesn't really define the meaning of 'printable', but your question was valid.) > > On the other hand, printing all non-controls using the ?X syntax is maybe not ideal. Attached is a new patch that uses Unicode properties to select only printable base characters. Thanks, but my main question is still not answered. I asked it from the POV of documentation: we should provide a more specific description of which characters will be printed as characters, so that users are not surprised. The text in NEWS still says "printable characters" without defining that term, and so does the doc string of print-integers-as-characters. And now there's another question, which is what caused you to filter characters like you did? E.g., what's wrong with combining classes? Why not simply use graphicp? ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#44155: Print integers as characters 2020-11-03 19:36 ` Eli Zaretskii @ 2020-11-04 11:03 ` Mattias Engdegård 2020-11-04 15:38 ` Eli Zaretskii 0 siblings, 1 reply; 109+ messages in thread From: Mattias Engdegård @ 2020-11-04 11:03 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 44155, schwab, juri [-- Attachment #1: Type: text/plain, Size: 1879 bytes --] 3 nov. 2020 kl. 20.36 skrev Eli Zaretskii <eliz@gnu.org>: > Thanks, but my main question is still not answered. I asked it from > the POV of documentation: we should provide a more specific > description of which characters will be printed as characters, so that > users are not surprised. The text in NEWS still says "printable > characters" without defining that term, and so does the doc string of > print-integers-as-characters. 'Printable' was used informally, not in an exact technical meaning. Intuitively, it should be the set of characters that make sense to print using the '?X' syntax. I initially thought that 'graphic' was too technical but it is more precise. 'Independently printable graphic character' is descriptive but a mouthful; perhaps 'independent graphic char' would do? > And now there's another question, which is what caused you to filter > characters like you did? E.g., what's wrong with combining classes? > why not simply use graphicp? For the ?X syntax to make sense, X must be visible; thus controls are out, and so are formatting chars (language tags etc). Spaces should probably have been excluded as well since it's typically not possible to see what kind of space follows the '?' (SPC is explicitly rendered as ?\s). Furthermore, X must be independent since it isn't a grapheme cluster but a single code point. Therefore combining chars cannot be included as they would attach to the '?'. 'graphicp' cannot be used because it includes combining, enclosing and nonspacing marks (M) and formats (Cf); otherwise it's fine. 
While we could put the exact list of excluded general categories in the documentation, it is not very important because the selection only matters for usability and aesthetics, not (realistically) for code behaviour. The attached patch excludes spaces (Zs) and revises the terminology. [-- Attachment #2: 0001-Reduce-integer-output-format-to-print-integers-as-ch.patch --] [-- Type: application/octet-stream, Size: 11560 bytes --] From fd24ef7e7b71308ff29b8d1b2f7be64254469521 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org> Date: Mon, 2 Nov 2020 23:37:16 +0100 Subject: [PATCH] Reduce integer-output-format to print-integers-as-characters The variable now only controls whether characters are printed, not the radix. Control chars are printed in human-readable syntax only when special escapes such as ?\n are available. Spaces, formatting and combining chars are excluded (bug#44155). Done in collaboration with Juri Linkov. * src/character.c (graphic_base_p): * src/print.c (named_escape): New functions. (print_object): Change semantics as described above. (syms_of_print): Rename integer-output-format. Update doc string. * doc/lispref/streams.texi (Output Variables): * etc/NEWS: * test/src/print-tests.el (print-integers-as-characters): Rename and update according to new semantics. The test now passes. --- doc/lispref/streams.texi | 13 ++++---- etc/NEWS | 11 ++++--- src/character.c | 21 +++++++++++++ src/character.h | 1 + src/print.c | 64 ++++++++++++++++++++++++++-------------- test/src/print-tests.el | 39 +++++++++++++----------- 6 files changed, 97 insertions(+), 52 deletions(-) diff --git a/doc/lispref/streams.texi b/doc/lispref/streams.texi index f171f13779..799d35b070 100644 --- a/doc/lispref/streams.texi +++ b/doc/lispref/streams.texi @@ -903,10 +903,11 @@ Output Variables you can use, see the variable's documentation string. @end defvar -@defvar integer-output-format -This variable specifies how to print integer numbers. 
The default is -@code{nil}, meaning use the decimal format. When bound to @code{t}, -print integers as characters when an integer represents a character -(@pxref{Basic Char Syntax}). When bound to the number @code{16}, -print non-negative integers in the hexadecimal format. +@defvar print-integers-as-characters +When this variable is non-@code{nil}, integers that represent +independent graphic characters or control characters with their own +escape syntax such as newline will be printed using Lisp character +syntax (@pxref{Basic Char Syntax}). Other numbers are printed the +usual way. For example, the list @code{(4 65 -1 10)} will be printed +as @samp{(4 ?A -1 ?\n)}. @end defvar diff --git a/etc/NEWS b/etc/NEWS index e11effc9e8..384c64a91e 100644 --- a/etc/NEWS +++ b/etc/NEWS @@ -1689,12 +1689,6 @@ ledit.el, lmenu.el, lucid.el and old-whitespace.el. \f * Lisp Changes in Emacs 28.1 -** New variable 'integer-output-format' determines how to print integer values. -When this variable is bound to the value 't', integers are printed by -printing functions as characters when an integer represents a character. -When bound to the number 16, non-negative integers are printed in the -hexadecimal format. - +++ ** 'define-globalized-minor-mode' now takes a ':predicate' parameter. This can be used to control which major modes the minor mode should be @@ -1887,6 +1881,11 @@ file can affect code in another. For details, see the manual section 'replace-regexp-in-string', 'catch', 'throw', 'error', 'signal' and 'play-sound-file'. ++++ +** New variable 'print-integers-as-characters' modifies integer printing. +When this variable is non-nil, character syntax is used for printing +numbers for which this makes sense, such as '?*' for 42. 
+ \f * Changes in Emacs 28.1 on Non-Free Operating Systems diff --git a/src/character.c b/src/character.c index 5860f6a0c8..00b73293a3 100644 --- a/src/character.c +++ b/src/character.c @@ -982,6 +982,27 @@ printablep (int c) || gen_cat == UNICODE_CATEGORY_Cn)); /* unassigned */ } +/* Return true if C is graphic character that can be printed independently. */ +bool +graphic_base_p (int c) +{ + Lisp_Object category = CHAR_TABLE_REF (Vunicode_category_table, c); + if (! FIXNUMP (category)) + return false; + EMACS_INT gen_cat = XFIXNUM (category); + + return (!(gen_cat == UNICODE_CATEGORY_Mn /* mark, nonspacing */ + || gen_cat == UNICODE_CATEGORY_Mc /* mark, combining */ + || gen_cat == UNICODE_CATEGORY_Me /* mark, enclosing */ + || gen_cat == UNICODE_CATEGORY_Zs /* separator, space */ + || gen_cat == UNICODE_CATEGORY_Zl /* separator, line */ + || gen_cat == UNICODE_CATEGORY_Zp /* separator, paragraph */ + || gen_cat == UNICODE_CATEGORY_Cc /* other, control */ + || gen_cat == UNICODE_CATEGORY_Cs /* other, surrogate */ + || gen_cat == UNICODE_CATEGORY_Cf /* other, format */ + || gen_cat == UNICODE_CATEGORY_Cn)); /* other, unassigned */ +} + /* Return true if C is a horizontal whitespace character, as defined by https://www.unicode.org/reports/tr18/tr18-19.html#blank. */ bool diff --git a/src/character.h b/src/character.h index af5023f77c..cbf43097ae 100644 --- a/src/character.h +++ b/src/character.h @@ -583,6 +583,7 @@ char_surrogate_p (int c) extern bool graphicp (int); extern bool printablep (int); extern bool blankp (int); +extern bool graphic_base_p (int); /* Look up the element in char table OBJ at index CH, and return it as an integer. If the element is not a character, return CH itself. 
*/ diff --git a/src/print.c b/src/print.c index fa65a3cb26..f2e2dd131d 100644 --- a/src/print.c +++ b/src/print.c @@ -1848,6 +1848,24 @@ print_vectorlike (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag, return true; } +static char +named_escape (int i) +{ + switch (i) + { + case '\b': return 'b'; + case '\t': return 't'; + case '\n': return 'n'; + case '\f': return 'f'; + case '\r': return 'r'; + case ' ': return 's'; + /* \a, \v, \e and \d are excluded from printing as escapes since + they are somewhat rare as characters and more likely to be + plain integers. */ + } + return 0; +} + static void print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag) { @@ -1908,29 +1926,30 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag) { case_Lisp_Int: { - int c; - intmax_t i; + EMACS_INT i = XFIXNUM (obj); + char escaped_name; - if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj) - && (c = XFIXNUM (obj))) + if (print_integers_as_characters && i >= 0 && i <= MAX_UNICODE_CHAR + && ((escaped_name = named_escape (i)) + || graphic_base_p (i))) { printchar ('?', printcharfun); - if (escapeflag - && (c == ';' || c == '(' || c == ')' || c == '{' || c == '}' - || c == '[' || c == ']' || c == '\"' || c == '\'' || c == '\\')) + if (escaped_name) + { + printchar ('\\', printcharfun); + i = escaped_name; + } + else if (escapeflag + && (i == ';' || i == '\"' || i == '\'' || i == '\\' + || i == '(' || i == ')' + || i == '{' || i == '}' + || i == '[' || i == ']')) printchar ('\\', printcharfun); - printchar (c, printcharfun); - } - else if (INTEGERP (Vinteger_output_format) - && integer_to_intmax (Vinteger_output_format, &i) - && i == 16 && !NILP (Fnatnump (obj))) - { - int len = sprintf (buf, "#x%"pI"x", (EMACS_UINT) XFIXNUM (obj)); - strout (buf, len, len, printcharfun); + printchar (i, printcharfun); } else { - int len = sprintf (buf, "%"pI"d", XFIXNUM (obj)); + int len = sprintf (buf, "%"pI"d", i); strout (buf, len, len, 
printcharfun); } } @@ -2270,12 +2289,13 @@ syms_of_print (void) that represents the number without losing information. */); Vfloat_output_format = Qnil; - DEFVAR_LISP ("integer-output-format", Vinteger_output_format, - doc: /* The format used to print integers. -When t, print characters from integers that represent a character. -When a number 16, print non-negative integers in the hexadecimal format. -Otherwise, by default print integers in the decimal format. */); - Vinteger_output_format = Qnil; + DEFVAR_BOOL ("print-integers-as-characters", print_integers_as_characters, + doc: /* Non-nil means integers are printed using characters syntax. +Only independent graphic characters, and control characters with named +escape sequences such as newline, are printed this way. Other +integers, including those corresponding to raw bytes, are not +affected. */); + print_integers_as_characters = Qnil; DEFVAR_LISP ("print-length", Vprint_length, doc: /* Maximum length of list to print before abbreviating. diff --git a/test/src/print-tests.el b/test/src/print-tests.el index 7b026b6b21..202555adb3 100644 --- a/test/src/print-tests.el +++ b/test/src/print-tests.el @@ -383,25 +383,28 @@ print-hash-table-test (let ((print-length 1)) (format "%S" h)))))) -(print-tests--deftest print-integer-output-format () +(print-tests--deftest print-integers-as-characters () ;; Bug#44155. - (let ((integer-output-format t) - (syms (list ?? 
?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ ?Á))) - (should (equal (read (print-tests--prin1-to-string syms)) syms)) - (should (equal (print-tests--prin1-to-string syms) - (concat "(" (mapconcat #'prin1-char syms " ") ")")))) - (let ((integer-output-format t) - (syms (list -1 0 1 ?\120 4194175 4194176 (max-char) (1+ (max-char))))) - (should (equal (read (print-tests--prin1-to-string syms)) syms))) - (let ((integer-output-format 16) - (syms (list -1 0 1 most-positive-fixnum (1+ most-positive-fixnum)))) - (should (equal (read (print-tests--prin1-to-string syms)) syms)) - (should (equal (print-tests--prin1-to-string syms) - (concat "(" (mapconcat - (lambda (i) - (if (and (>= i 0) (<= i most-positive-fixnum)) - (format "#x%x" i) (format "%d" i))) - syms " ") ")"))))) + (let* ((print-integers-as-characters t) + (chars '(?? ?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ ?f ?~ ?Á 32 + ?\n ?\r ?\t ?\b ?\f ?\a ?\v ?\e ?\d)) + (nums '(-1 -65 0 1 31 #x80 #x9f #x110000 #x3fff80 #x3fffff)) + (nonprints '(#xd800 #xdfff #x030a #xffff #x2002 #x200c)) + (printed-chars (print-tests--prin1-to-string chars)) + (printed-nums (print-tests--prin1-to-string nums)) + (printed-nonprints (print-tests--prin1-to-string nonprints))) + (should (equal (read printed-chars) chars)) + (should (equal + printed-chars + (concat + "(?? ?\\; ?\\( ?\\) ?\\{ ?\\} ?\\[ ?\\] ?\\\" ?\\' ?\\\\" + " ?f ?~ ?Á ?\\s ?\\n ?\\r ?\\t ?\\b ?\\f 7 11 27 127)"))) + (should (equal (read printed-nums) nums)) + (should (equal printed-nums + "(-1 -65 0 1 31 128 159 1114112 4194176 4194303)")) + (should (equal (read printed-nonprints) nonprints)) + (should (equal printed-nonprints + "(55296 57343 778 65535 8194 8204)")))) (provide 'print-tests) ;;; print-tests.el ends here -- 2.21.1 (Apple Git-122.3) ^ permalink raw reply related [flat|nested] 109+ messages in thread
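The category filter described in the message above can be sketched outside Emacs as follows. This is only an illustration under stated assumptions: the enum, the sample table and both function names are hypothetical, and the table covers just a few code points taken from the patch's test, whereas Emacs looks the category up in its Unicode category table.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of Unicode general categories; Emacs itself uses
   UNICODE_CATEGORY_* constants.  */
enum gen_cat
{
  CAT_Lu, CAT_Po,
  CAT_Mn, CAT_Mc, CAT_Me,
  CAT_Zs, CAT_Zl, CAT_Zp,
  CAT_Cc, CAT_Cs, CAT_Cf, CAT_Cn
};

/* Return true unless CAT is one of the categories the patch excludes:
   marks, separators, controls, surrogates, formats and unassigned.  */
static bool
category_is_graphic_base (enum gen_cat cat)
{
  switch (cat)
    {
    case CAT_Mn: case CAT_Mc: case CAT_Me:
    case CAT_Zs: case CAT_Zl: case CAT_Zp:
    case CAT_Cc: case CAT_Cs: case CAT_Cf: case CAT_Cn:
      return false;
    default:
      return true;
    }
}

/* A few code points with their general categories.  */
struct sample { long code; enum gen_cat cat; };

static const struct sample samples[] = {
  { 0x002A, CAT_Po },   /* ASTERISK */
  { 0x00C1, CAT_Lu },   /* LATIN CAPITAL LETTER A WITH ACUTE */
  { 0x030A, CAT_Mn },   /* COMBINING RING ABOVE */
  { 0x2002, CAT_Zs },   /* EN SPACE */
  { 0x200C, CAT_Cf },   /* ZERO WIDTH NON-JOINER */
  { 0xD800, CAT_Cs },   /* first surrogate */
  { 0xFFFF, CAT_Cn },   /* noncharacter */
};

/* Look CODE up in the sample table and apply the filter.  Code points
   missing from this small table are treated as unassigned.  */
static bool
sample_graphic_base_p (long code)
{
  for (size_t k = 0; k < sizeof samples / sizeof samples[0]; k++)
    if (samples[k].code == code)
      return category_is_graphic_base (samples[k].cat);
  return false;
}
```

With this table, only U+002A and U+00C1 pass the filter; the remaining entries are the kind of values the patch's test expects to see printed as plain numbers.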
* bug#44155: Print integers as characters 2020-11-04 11:03 ` Mattias Engdegård @ 2020-11-04 15:38 ` Eli Zaretskii 2020-11-04 16:46 ` Mattias Engdegård 0 siblings, 1 reply; 109+ messages in thread From: Eli Zaretskii @ 2020-11-04 15:38 UTC (permalink / raw) To: Mattias Engdegård; +Cc: 44155, schwab, juri > From: Mattias Engdegård <mattiase@acm.org> > Date: Wed, 4 Nov 2020 12:03:32 +0100 > Cc: juri@linkov.net, schwab@suse.de, 44155@debbugs.gnu.org > > 'Printable' was used informally, not in an exact technical meaning. Intuitively, it should be the set of characters that make sense to print using the '?X' syntax. I initially thought that 'graphic' was too technical but it is more precise. 'Independently printable graphic character' is descriptive but a mouthful; perhaps 'independent graphic char' would do? I'm not sure. I think we should use something more familiar, or explain it in more detail. We already mention Unicode properties elsewhere in the manual, so we could define this in those terms, and send the reader there for the details, for example. > For the ?X syntax to make sense, X must be visible; thus controls are out, and so are formatting chars (language tags etc). Spaces should probably have been excluded as well since it's typically not possible to see what kind of space follows the '?' (SPC is explicitly rendered as ?\s). > > Furthermore, X must be independent since it isn't a grapheme cluster but a single code point. Therefore combining chars cannot be included as they would attach to the '?'. > > 'graphicp' cannot be used because it includes combining, enclosing and nonspacing marks (M) and formats (Cf); otherwise it's fine. > > While we could put the exact list of excluded general categories in the documentation, it is not very important because the selection only matters for usability and aesthetics, not (realistically) for code behaviour. > > The attached patch excludes spaces (Zs) and revises the terminology. 
I'm not going to argue about this aspect, but just FTR: whether to include combining characters is a decision that we make here, it is not a necessity. Because we are perfectly capable of displaying combining characters without risking them to become composed with surrounding characters: we could either precede them with U+25CC DOTTED CIRCLE, or use the technique describe-char-padded-string in descr-text.el uses. Thanks. ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#44155: Print integers as characters 2020-11-04 15:38 ` Eli Zaretskii @ 2020-11-04 16:46 ` Mattias Engdegård 2020-11-04 16:58 ` Mattias Engdegård 0 siblings, 1 reply; 109+ messages in thread From: Mattias Engdegård @ 2020-11-04 16:46 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 44155, schwab, juri [-- Attachment #1: Type: text/plain, Size: 918 bytes --] 4 nov. 2020 kl. 16.38 skrev Eli Zaretskii <eliz@gnu.org>: > I'm not sure. I think we should use something more familiar, or > explain it in more detail. We already mention Unicode properties > elsewhere in the manual, so we could define this in those terms, and > send the reader there for the details, for example. Thanks for the review. Please look at the revised patch below with your requested changes. > I'm not going to argue about this aspect, but just FTR: whether to > include combining characters is a decision that we make here, it is > not a necessity. Because we are perfectly capable of displaying > combining characters without risking them to become composed with > surrounding characters: we could either precede them with U+25CC > DOTTED CIRCLE, or use the technique describe-char-padded-string in > descr-text.el uses. No we cannot, because the output must be valid Lisp. [-- Attachment #2: 0001-Reduce-integer-output-format-to-print-integers-as-ch.patch --] [-- Type: application/octet-stream, Size: 11560 bytes --] From aadbdd31b85e8b4459d903ef1bed1fdf8272588f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org> Date: Mon, 2 Nov 2020 23:37:16 +0100 Subject: [PATCH] Reduce integer-output-format to print-integers-as-characters The variable now only controls whether characters are printed, not the radix. Control chars are printed in human-readable syntax only when special escapes such as ?\n are available. Spaces, formatting and combining chars are excluded (bug#44155). Done in collaboration with Juri Linkov. 
* src/character.c (graphic_base_p): * src/print.c (named_escape): New functions. (print_object): Change semantics as described above. (syms_of_print): Rename integer-output-format. Update doc string. * doc/lispref/streams.texi (Output Variables): * etc/NEWS: * test/src/print-tests.el (print-integers-as-characters): Rename and update according to new semantics. The test now passes. --- doc/lispref/streams.texi | 13 ++++---- etc/NEWS | 11 ++++--- src/character.c | 21 +++++++++++++ src/character.h | 1 + src/print.c | 64 ++++++++++++++++++++++++++-------------- test/src/print-tests.el | 39 +++++++++++++----------- 6 files changed, 97 insertions(+), 52 deletions(-) diff --git a/doc/lispref/streams.texi b/doc/lispref/streams.texi index f171f13779..799d35b070 100644 --- a/doc/lispref/streams.texi +++ b/doc/lispref/streams.texi @@ -903,10 +903,11 @@ Output Variables you can use, see the variable's documentation string. @end defvar -@defvar integer-output-format -This variable specifies how to print integer numbers. The default is -@code{nil}, meaning use the decimal format. When bound to @code{t}, -print integers as characters when an integer represents a character -(@pxref{Basic Char Syntax}). When bound to the number @code{16}, -print non-negative integers in the hexadecimal format. +@defvar print-integers-as-characters +When this variable is non-@code{nil}, integers that represent +independent graphic characters or control characters with their own +escape syntax such as newline will be printed using Lisp character +syntax (@pxref{Basic Char Syntax}). Other numbers are printed the +usual way. For example, the list @code{(4 65 -1 10)} will be printed +as @samp{(4 ?A -1 ?\n)}. @end defvar diff --git a/etc/NEWS b/etc/NEWS index d15f3ed1ae..e3ac15f7e3 100644 --- a/etc/NEWS +++ b/etc/NEWS @@ -1697,12 +1697,6 @@ ledit.el, lmenu.el, lucid.el and old-whitespace.el. \f * Lisp Changes in Emacs 28.1 -** New variable 'integer-output-format' determines how to print integer values. 
-When this variable is bound to the value 't', integers are printed by -printing functions as characters when an integer represents a character. -When bound to the number 16, non-negative integers are printed in the -hexadecimal format. - +++ ** 'define-globalized-minor-mode' now takes a ':predicate' parameter. This can be used to control which major modes the minor mode should be @@ -1895,6 +1889,11 @@ file can affect code in another. For details, see the manual section 'replace-regexp-in-string', 'catch', 'throw', 'error', 'signal' and 'play-sound-file'. ++++ +** New variable 'print-integers-as-characters' modifies integer printing. +When this variable is non-nil, character syntax is used for printing +numbers for which this makes sense, such as '?*' for 42. + \f * Changes in Emacs 28.1 on Non-Free Operating Systems diff --git a/src/character.c b/src/character.c index 5860f6a0c8..00b73293a3 100644 --- a/src/character.c +++ b/src/character.c @@ -982,6 +982,27 @@ printablep (int c) || gen_cat == UNICODE_CATEGORY_Cn)); /* unassigned */ } +/* Return true if C is graphic character that can be printed independently. */ +bool +graphic_base_p (int c) +{ + Lisp_Object category = CHAR_TABLE_REF (Vunicode_category_table, c); + if (! 
FIXNUMP (category)) + return false; + EMACS_INT gen_cat = XFIXNUM (category); + + return (!(gen_cat == UNICODE_CATEGORY_Mn /* mark, nonspacing */ + || gen_cat == UNICODE_CATEGORY_Mc /* mark, combining */ + || gen_cat == UNICODE_CATEGORY_Me /* mark, enclosing */ + || gen_cat == UNICODE_CATEGORY_Zs /* separator, space */ + || gen_cat == UNICODE_CATEGORY_Zl /* separator, line */ + || gen_cat == UNICODE_CATEGORY_Zp /* separator, paragraph */ + || gen_cat == UNICODE_CATEGORY_Cc /* other, control */ + || gen_cat == UNICODE_CATEGORY_Cs /* other, surrogate */ + || gen_cat == UNICODE_CATEGORY_Cf /* other, format */ + || gen_cat == UNICODE_CATEGORY_Cn)); /* other, unassigned */ +} + /* Return true if C is a horizontal whitespace character, as defined by https://www.unicode.org/reports/tr18/tr18-19.html#blank. */ bool diff --git a/src/character.h b/src/character.h index af5023f77c..cbf43097ae 100644 --- a/src/character.h +++ b/src/character.h @@ -583,6 +583,7 @@ char_surrogate_p (int c) extern bool graphicp (int); extern bool printablep (int); extern bool blankp (int); +extern bool graphic_base_p (int); /* Look up the element in char table OBJ at index CH, and return it as an integer. If the element is not a character, return CH itself. */ diff --git a/src/print.c b/src/print.c index fa65a3cb26..f2e2dd131d 100644 --- a/src/print.c +++ b/src/print.c @@ -1848,6 +1848,24 @@ print_vectorlike (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag, return true; } +static char +named_escape (int i) +{ + switch (i) + { + case '\b': return 'b'; + case '\t': return 't'; + case '\n': return 'n'; + case '\f': return 'f'; + case '\r': return 'r'; + case ' ': return 's'; + /* \a, \v, \e and \d are excluded from printing as escapes since + they are somewhat rare as characters and more likely to be + plain integers. 
*/ + } + return 0; +} + static void print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag) { @@ -1908,29 +1926,30 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag) { case_Lisp_Int: { - int c; - intmax_t i; + EMACS_INT i = XFIXNUM (obj); + char escaped_name; - if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj) - && (c = XFIXNUM (obj))) + if (print_integers_as_characters && i >= 0 && i <= MAX_UNICODE_CHAR + && ((escaped_name = named_escape (i)) + || graphic_base_p (i))) { printchar ('?', printcharfun); - if (escapeflag - && (c == ';' || c == '(' || c == ')' || c == '{' || c == '}' - || c == '[' || c == ']' || c == '\"' || c == '\'' || c == '\\')) + if (escaped_name) + { + printchar ('\\', printcharfun); + i = escaped_name; + } + else if (escapeflag + && (i == ';' || i == '\"' || i == '\'' || i == '\\' + || i == '(' || i == ')' + || i == '{' || i == '}' + || i == '[' || i == ']')) printchar ('\\', printcharfun); - printchar (c, printcharfun); - } - else if (INTEGERP (Vinteger_output_format) - && integer_to_intmax (Vinteger_output_format, &i) - && i == 16 && !NILP (Fnatnump (obj))) - { - int len = sprintf (buf, "#x%"pI"x", (EMACS_UINT) XFIXNUM (obj)); - strout (buf, len, len, printcharfun); + printchar (i, printcharfun); } else { - int len = sprintf (buf, "%"pI"d", XFIXNUM (obj)); + int len = sprintf (buf, "%"pI"d", i); strout (buf, len, len, printcharfun); } } @@ -2270,12 +2289,13 @@ syms_of_print (void) that represents the number without losing information. */); Vfloat_output_format = Qnil; - DEFVAR_LISP ("integer-output-format", Vinteger_output_format, - doc: /* The format used to print integers. -When t, print characters from integers that represent a character. -When a number 16, print non-negative integers in the hexadecimal format. -Otherwise, by default print integers in the decimal format. 
*/); - Vinteger_output_format = Qnil; + DEFVAR_BOOL ("print-integers-as-characters", print_integers_as_characters, + doc: /* Non-nil means integers are printed using characters syntax. +Only independent graphic characters, and control characters with named +escape sequences such as newline, are printed this way. Other +integers, including those corresponding to raw bytes, are not +affected. */); + print_integers_as_characters = Qnil; DEFVAR_LISP ("print-length", Vprint_length, doc: /* Maximum length of list to print before abbreviating. diff --git a/test/src/print-tests.el b/test/src/print-tests.el index 7b026b6b21..202555adb3 100644 --- a/test/src/print-tests.el +++ b/test/src/print-tests.el @@ -383,25 +383,28 @@ print-hash-table-test (let ((print-length 1)) (format "%S" h)))))) -(print-tests--deftest print-integer-output-format () +(print-tests--deftest print-integers-as-characters () ;; Bug#44155. - (let ((integer-output-format t) - (syms (list ?? ?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ ?Á))) - (should (equal (read (print-tests--prin1-to-string syms)) syms)) - (should (equal (print-tests--prin1-to-string syms) - (concat "(" (mapconcat #'prin1-char syms " ") ")")))) - (let ((integer-output-format t) - (syms (list -1 0 1 ?\120 4194175 4194176 (max-char) (1+ (max-char))))) - (should (equal (read (print-tests--prin1-to-string syms)) syms))) - (let ((integer-output-format 16) - (syms (list -1 0 1 most-positive-fixnum (1+ most-positive-fixnum)))) - (should (equal (read (print-tests--prin1-to-string syms)) syms)) - (should (equal (print-tests--prin1-to-string syms) - (concat "(" (mapconcat - (lambda (i) - (if (and (>= i 0) (<= i most-positive-fixnum)) - (format "#x%x" i) (format "%d" i))) - syms " ") ")"))))) + (let* ((print-integers-as-characters t) + (chars '(?? 
?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ ?f ?~ ?Á 32 + ?\n ?\r ?\t ?\b ?\f ?\a ?\v ?\e ?\d)) + (nums '(-1 -65 0 1 31 #x80 #x9f #x110000 #x3fff80 #x3fffff)) + (nonprints '(#xd800 #xdfff #x030a #xffff #x2002 #x200c)) + (printed-chars (print-tests--prin1-to-string chars)) + (printed-nums (print-tests--prin1-to-string nums)) + (printed-nonprints (print-tests--prin1-to-string nonprints))) + (should (equal (read printed-chars) chars)) + (should (equal + printed-chars + (concat + "(?? ?\\; ?\\( ?\\) ?\\{ ?\\} ?\\[ ?\\] ?\\\" ?\\' ?\\\\" + " ?f ?~ ?Á ?\\s ?\\n ?\\r ?\\t ?\\b ?\\f 7 11 27 127)"))) + (should (equal (read printed-nums) nums)) + (should (equal printed-nums + "(-1 -65 0 1 31 128 159 1114112 4194176 4194303)")) + (should (equal (read printed-nonprints) nonprints)) + (should (equal printed-nonprints + "(55296 57343 778 65535 8194 8204)")))) (provide 'print-tests) ;;; print-tests.el ends here -- 2.21.1 (Apple Git-122.3) ^ permalink raw reply related [flat|nested] 109+ messages in thread
* bug#44155: Print integers as characters 2020-11-04 16:46 ` Mattias Engdegård @ 2020-11-04 16:58 ` Mattias Engdegård 2020-11-06 13:02 ` Mattias Engdegård 0 siblings, 1 reply; 109+ messages in thread From: Mattias Engdegård @ 2020-11-04 16:58 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 44155, schwab, juri [-- Attachment #1: Type: text/plain, Size: 85 bytes --] The last patch was incorrect; here is the right one. Apologies for the confusion. [-- Attachment #2: 0001-Reduce-integer-output-format-to-print-integers-as-ch.patch --] [-- Type: application/octet-stream, Size: 11788 bytes --] From 2a7bd3b8393f182e42d77e929d5e02a137c8e89b Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org> Date: Mon, 2 Nov 2020 23:37:16 +0100 Subject: [PATCH] Reduce integer-output-format to print-integers-as-characters The variable now only controls whether characters are printed, not the radix. Control chars are printed in human-readable syntax only when special escapes such as ?\n are available. Spaces, formatting and combining chars are excluded (bug#44155). Done in collaboration with Juri Linkov. * src/character.c (graphic_base_p): * src/print.c (named_escape): New functions. (print_object): Change semantics as described above. (syms_of_print): Rename integer-output-format. Update doc string. * doc/lispref/streams.texi (Output Variables): * etc/NEWS: * test/src/print-tests.el (print-integers-as-characters): Rename and update according to new semantics. The test now passes. 
--- doc/lispref/streams.texi | 18 +++++++---- etc/NEWS | 11 ++++--- src/character.c | 21 +++++++++++++ src/character.h | 1 + src/print.c | 64 ++++++++++++++++++++++++++-------------- test/src/print-tests.el | 39 +++++++++++++----------- 6 files changed, 102 insertions(+), 52 deletions(-) diff --git a/doc/lispref/streams.texi b/doc/lispref/streams.texi index f171f13779..0534afb67f 100644 --- a/doc/lispref/streams.texi +++ b/doc/lispref/streams.texi @@ -903,10 +903,16 @@ Output Variables you can use, see the variable's documentation string. @end defvar -@defvar integer-output-format -This variable specifies how to print integer numbers. The default is -@code{nil}, meaning use the decimal format. When bound to @code{t}, -print integers as characters when an integer represents a character -(@pxref{Basic Char Syntax}). When bound to the number @code{16}, -print non-negative integers in the hexadecimal format. +@defvar print-integers-as-characters +When this variable is non-@code{nil}, integers that represent +graphic base characters will be printed using Lisp character syntax +(@pxref{Basic Char Syntax}). Other numbers are printed the usual way. +For example, the list @code{(4 65 -1 10)} would be printed as +@samp{(4 ?A -1 ?\n)}. + +More precisely, values printed in character syntax are those +representing characters belonging to the Unicode general categories +Letter, Number, Punctuation, Symbol and Private-use +(@pxref{Character Properties}), as well as the control characters +having their own escape syntax such as newline. @end defvar diff --git a/etc/NEWS b/etc/NEWS index d15f3ed1ae..9dcdcc3079 100644 --- a/etc/NEWS +++ b/etc/NEWS @@ -1697,12 +1697,6 @@ ledit.el, lmenu.el, lucid.el and old-whitespace.el. \f * Lisp Changes in Emacs 28.1 -** New variable 'integer-output-format' determines how to print integer values. -When this variable is bound to the value 't', integers are printed by -printing functions as characters when an integer represents a character. 
-When bound to the number 16, non-negative integers are printed in the -hexadecimal format. - +++ ** 'define-globalized-minor-mode' now takes a ':predicate' parameter. This can be used to control which major modes the minor mode should be @@ -1895,6 +1889,11 @@ file can affect code in another. For details, see the manual section 'replace-regexp-in-string', 'catch', 'throw', 'error', 'signal' and 'play-sound-file'. ++++ +** New variable 'print-integers-as-characters' modifies integer printing. +If this variable is non-nil, character syntax is used for printing +numbers when this makes sense, such as '?A' for 65. + \f * Changes in Emacs 28.1 on Non-Free Operating Systems diff --git a/src/character.c b/src/character.c index 5860f6a0c8..00b73293a3 100644 --- a/src/character.c +++ b/src/character.c @@ -982,6 +982,27 @@ printablep (int c) || gen_cat == UNICODE_CATEGORY_Cn)); /* unassigned */ } +/* Return true if C is graphic character that can be printed independently. */ +bool +graphic_base_p (int c) +{ + Lisp_Object category = CHAR_TABLE_REF (Vunicode_category_table, c); + if (! FIXNUMP (category)) + return false; + EMACS_INT gen_cat = XFIXNUM (category); + + return (!(gen_cat == UNICODE_CATEGORY_Mn /* mark, nonspacing */ + || gen_cat == UNICODE_CATEGORY_Mc /* mark, combining */ + || gen_cat == UNICODE_CATEGORY_Me /* mark, enclosing */ + || gen_cat == UNICODE_CATEGORY_Zs /* separator, space */ + || gen_cat == UNICODE_CATEGORY_Zl /* separator, line */ + || gen_cat == UNICODE_CATEGORY_Zp /* separator, paragraph */ + || gen_cat == UNICODE_CATEGORY_Cc /* other, control */ + || gen_cat == UNICODE_CATEGORY_Cs /* other, surrogate */ + || gen_cat == UNICODE_CATEGORY_Cf /* other, format */ + || gen_cat == UNICODE_CATEGORY_Cn)); /* other, unassigned */ +} + /* Return true if C is a horizontal whitespace character, as defined by https://www.unicode.org/reports/tr18/tr18-19.html#blank. 
*/ bool diff --git a/src/character.h b/src/character.h index af5023f77c..cbf43097ae 100644 --- a/src/character.h +++ b/src/character.h @@ -583,6 +583,7 @@ char_surrogate_p (int c) extern bool graphicp (int); extern bool printablep (int); extern bool blankp (int); +extern bool graphic_base_p (int); /* Look up the element in char table OBJ at index CH, and return it as an integer. If the element is not a character, return CH itself. */ diff --git a/src/print.c b/src/print.c index fa65a3cb26..f2e2dd131d 100644 --- a/src/print.c +++ b/src/print.c @@ -1848,6 +1848,24 @@ print_vectorlike (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag, return true; } +static char +named_escape (int i) +{ + switch (i) + { + case '\b': return 'b'; + case '\t': return 't'; + case '\n': return 'n'; + case '\f': return 'f'; + case '\r': return 'r'; + case ' ': return 's'; + /* \a, \v, \e and \d are excluded from printing as escapes since + they are somewhat rare as characters and more likely to be + plain integers. 
*/ + } + return 0; +} + static void print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag) { @@ -1908,29 +1926,30 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag) { case_Lisp_Int: { - int c; - intmax_t i; + EMACS_INT i = XFIXNUM (obj); + char escaped_name; - if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj) - && (c = XFIXNUM (obj))) + if (print_integers_as_characters && i >= 0 && i <= MAX_UNICODE_CHAR + && ((escaped_name = named_escape (i)) + || graphic_base_p (i))) { printchar ('?', printcharfun); - if (escapeflag - && (c == ';' || c == '(' || c == ')' || c == '{' || c == '}' - || c == '[' || c == ']' || c == '\"' || c == '\'' || c == '\\')) + if (escaped_name) + { + printchar ('\\', printcharfun); + i = escaped_name; + } + else if (escapeflag + && (i == ';' || i == '\"' || i == '\'' || i == '\\' + || i == '(' || i == ')' + || i == '{' || i == '}' + || i == '[' || i == ']')) printchar ('\\', printcharfun); - printchar (c, printcharfun); - } - else if (INTEGERP (Vinteger_output_format) - && integer_to_intmax (Vinteger_output_format, &i) - && i == 16 && !NILP (Fnatnump (obj))) - { - int len = sprintf (buf, "#x%"pI"x", (EMACS_UINT) XFIXNUM (obj)); - strout (buf, len, len, printcharfun); + printchar (i, printcharfun); } else { - int len = sprintf (buf, "%"pI"d", XFIXNUM (obj)); + int len = sprintf (buf, "%"pI"d", i); strout (buf, len, len, printcharfun); } } @@ -2270,12 +2289,13 @@ syms_of_print (void) that represents the number without losing information. */); Vfloat_output_format = Qnil; - DEFVAR_LISP ("integer-output-format", Vinteger_output_format, - doc: /* The format used to print integers. -When t, print characters from integers that represent a character. -When a number 16, print non-negative integers in the hexadecimal format. -Otherwise, by default print integers in the decimal format. 
*/); - Vinteger_output_format = Qnil; + DEFVAR_BOOL ("print-integers-as-characters", print_integers_as_characters, + doc: /* Non-nil means integers are printed using characters syntax. +Only independent graphic characters, and control characters with named +escape sequences such as newline, are printed this way. Other +integers, including those corresponding to raw bytes, are printed +affected. */); + print_integers_as_characters = Qnil; DEFVAR_LISP ("print-length", Vprint_length, doc: /* Maximum length of list to print before abbreviating. diff --git a/test/src/print-tests.el b/test/src/print-tests.el index 7b026b6b21..202555adb3 100644 --- a/test/src/print-tests.el +++ b/test/src/print-tests.el @@ -383,25 +383,28 @@ print-hash-table-test (let ((print-length 1)) (format "%S" h)))))) -(print-tests--deftest print-integer-output-format () +(print-tests--deftest print-integers-as-characters () ;; Bug#44155. - (let ((integer-output-format t) - (syms (list ?? ?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ ?Á))) - (should (equal (read (print-tests--prin1-to-string syms)) syms)) - (should (equal (print-tests--prin1-to-string syms) - (concat "(" (mapconcat #'prin1-char syms " ") ")")))) - (let ((integer-output-format t) - (syms (list -1 0 1 ?\120 4194175 4194176 (max-char) (1+ (max-char))))) - (should (equal (read (print-tests--prin1-to-string syms)) syms))) - (let ((integer-output-format 16) - (syms (list -1 0 1 most-positive-fixnum (1+ most-positive-fixnum)))) - (should (equal (read (print-tests--prin1-to-string syms)) syms)) - (should (equal (print-tests--prin1-to-string syms) - (concat "(" (mapconcat - (lambda (i) - (if (and (>= i 0) (<= i most-positive-fixnum)) - (format "#x%x" i) (format "%d" i))) - syms " ") ")"))))) + (let* ((print-integers-as-characters t) + (chars '(?? 
?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ ?f ?~ ?Á 32 + ?\n ?\r ?\t ?\b ?\f ?\a ?\v ?\e ?\d)) + (nums '(-1 -65 0 1 31 #x80 #x9f #x110000 #x3fff80 #x3fffff)) + (nonprints '(#xd800 #xdfff #x030a #xffff #x2002 #x200c)) + (printed-chars (print-tests--prin1-to-string chars)) + (printed-nums (print-tests--prin1-to-string nums)) + (printed-nonprints (print-tests--prin1-to-string nonprints))) + (should (equal (read printed-chars) chars)) + (should (equal + printed-chars + (concat + "(?? ?\\; ?\\( ?\\) ?\\{ ?\\} ?\\[ ?\\] ?\\\" ?\\' ?\\\\" + " ?f ?~ ?Á ?\\s ?\\n ?\\r ?\\t ?\\b ?\\f 7 11 27 127)"))) + (should (equal (read printed-nums) nums)) + (should (equal printed-nums + "(-1 -65 0 1 31 128 159 1114112 4194176 4194303)")) + (should (equal (read printed-nonprints) nonprints)) + (should (equal printed-nonprints + "(55296 57343 778 65535 8194 8204)")))) (provide 'print-tests) ;;; print-tests.el ends here -- 2.21.1 (Apple Git-122.3) ^ permalink raw reply related [flat|nested] 109+ messages in thread
* bug#44155: Print integers as characters 2020-11-04 16:58 ` Mattias Engdegård @ 2020-11-06 13:02 ` Mattias Engdegård 0 siblings, 0 replies; 109+ messages in thread From: Mattias Engdegård @ 2020-11-06 13:02 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 44155-done, Andreas Schwab, Juri Linkov 4 nov. 2020 kl. 17.58 skrev Mattias Engdegård <mattiase@acm.org>: > The last patch was incorrect; here is the right one. Apologies for the confusion. Pushed to master, since there wasn't much left to discuss. As usual, it can be modified or reverted as needed. ^ permalink raw reply [flat|nested] 109+ messages in thread
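The decision made in print_object by the patch above can be sketched outside Emacs as follows. This is a minimal illustration, not the pushed code: format_integer, ascii_graphic_p and needs_backslash are hypothetical names, ascii_graphic_p is an ASCII-only stand-in for graphic_base_p (which consults the Unicode category table), and escapeflag is assumed to be true as it is under prin1.

```c
#include <stdio.h>
#include <string.h>

/* Return the letter of the named escape for C, or 0 if there is none.
   Same cases as in the patch: \a, \v, \e and \d are deliberately left out.  */
static char
named_escape (long c)
{
  switch (c)
    {
    case '\b': return 'b';
    case '\t': return 't';
    case '\n': return 'n';
    case '\f': return 'f';
    case '\r': return 'r';
    case ' ': return 's';
    }
  return 0;
}

/* ASSUMPTION: stand-in for graphic_base_p, limited to printable ASCII
   other than space.  The real function uses Unicode general categories.  */
static int
ascii_graphic_p (long c)
{
  return c > ' ' && c < 127;
}

/* True if C must be preceded by a backslash in character syntax.  */
static int
needs_backslash (long c)
{
  return c != 0 && strchr (";\"'\\(){}[]", (int) c) != NULL;
}

/* Write the printed form of I into BUF (of size SIZE) and return BUF.
   AS_CHARACTERS plays the role of print-integers-as-characters.  */
static char *
format_integer (long i, int as_characters, char *buf, size_t size)
{
  char escape = named_escape (i);

  if (as_characters && (escape || ascii_graphic_p (i)))
    {
      if (escape)
        snprintf (buf, size, "?\\%c", escape);    /* e.g. ?\n, ?\s */
      else if (needs_backslash (i))
        snprintf (buf, size, "?\\%c", (char) i);  /* e.g. ?\( */
      else
        snprintf (buf, size, "?%c", (char) i);    /* e.g. ?A */
    }
  else
    snprintf (buf, size, "%ld", i);               /* plain decimal */
  return buf;
}
```

With as_characters set, the list (4 65 -1 10) from the manual text would come out element by element as 4, ?A, -1 and ?\n, while 7, 11, 27 and 127 stay decimal, as the new test expects.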
* bug#43866: 26.3; italian postfix additions 2020-10-22 12:59 ` Eli Zaretskii 2020-10-22 20:56 ` bug#44155: Print integers as characters Juri Linkov @ 2022-04-30 12:19 ` Lars Ingebrigtsen 2022-04-30 12:29 ` Eli Zaretskii 1 sibling, 1 reply; 109+ messages in thread From: Lars Ingebrigtsen @ 2022-04-30 12:19 UTC (permalink / raw) To: Eli Zaretskii; +Cc: rpluim, 43866, Juri Linkov Eli Zaretskii <eliz@gnu.org> writes: >> + DEFVAR_LISP ("print-integers-as-chars", Vprint_integers_as_chars, >> + doc: /* Print integers as characters. */); >> + Vprint_integers_as_chars = Qnil; > > I wonder whether it wouldn't be cleaner to add another optional > argument to prin1, and let it bind some internal variable so that > print_object does this, instead of exposing this knob to Lisp. > Because print_object is used all over the place, and who knows what > will this do to other callers? There's also prin1-to-string, and adding a parameter to these functions just for this doesn't seem quite right. However, I agree with you that adding a new print-* variable is bad, too (because users will invariably set them in .emacs and then things break in some obscure package). So I wonder whether we could come up with a new convention for print variables like this, which would allow us to extend printing more without adding new print variables. What about -- adding a new parameter to prin1 and prin1-to-string that's a plist of printing features? That is, something like: (prin1 object nil '(length 20 integers-as-chars t)) And this would allow us to introduce a special value for that parameter, like t, which means "use the standard values for everything". That means we could get rid of the gazillion places where we have (let ((print-length nil) (print-level nil)) (prin1 object)) That would instead just be (prin1 object nil t). (And the same with prin1-to-string.) 
This would hopefully be less error-prone than what we have today -- we've had so many bug reports from packages forgetting to bind one or the other when saving data. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#43866: 26.3; italian postfix additions 2022-04-30 12:19 ` bug#43866: 26.3; italian postfix additions Lars Ingebrigtsen @ 2022-04-30 12:29 ` Eli Zaretskii 2022-04-30 14:49 ` Lars Ingebrigtsen 0 siblings, 1 reply; 109+ messages in thread From: Eli Zaretskii @ 2022-04-30 12:29 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: rpluim, 43866, juri > From: Lars Ingebrigtsen <larsi@gnus.org> > Cc: Juri Linkov <juri@linkov.net>, rpluim@gmail.com, 43866@debbugs.gnu.org > Date: Sat, 30 Apr 2022 14:19:32 +0200 > > Eli Zaretskii <eliz@gnu.org> writes: > > >> + DEFVAR_LISP ("print-integers-as-chars", Vprint_integers_as_chars, > >> + doc: /* Print integers as characters. */); > >> + Vprint_integers_as_chars = Qnil; > > > > I wonder whether it wouldn't be cleaner to add another optional > > argument to prin1, and let it bind some internal variable so that > > print_object does this, instead of exposing this knob to Lisp. > > Because print_object is used all over the place, and who knows what > > will this do to other callers? > > There's also prin1-to-string, and adding a parameter to these functions > just for this doesn't seem quite right. > > However, I agree with you that adding a new print-* variable is bad, too > (because users will invariably set them in .emacs and then things break > in some obscure package). > > So I wonder whether we could come up with a new convention for print > variables like this, which would allow us to extend printing more > without adding new print variables. > > What about -- adding a new parameter to prin1 and prin1-to-string that's > a plist of printing features? That is, something like: > > (prin1 object nil '(length 20 integers-as-chars t)) My worries were mainly because this new variable affected print_object directly, and because print_object is called in many places unrelated to prin1 etc. I'm okay with what you propose, but I don't see how would that eliminate the reasons for my worries. 
The implementation of the effect of this argument is still in print_object, so the question that is of interest to me is how will we communicate these arguments to print_object? ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#43866: 26.3; italian postfix additions 2022-04-30 12:29 ` Eli Zaretskii @ 2022-04-30 14:49 ` Lars Ingebrigtsen 2022-04-30 15:26 ` Eli Zaretskii 0 siblings, 1 reply; 109+ messages in thread From: Lars Ingebrigtsen @ 2022-04-30 14:49 UTC (permalink / raw) To: Eli Zaretskii; +Cc: rpluim, 43866, juri Eli Zaretskii <eliz@gnu.org> writes: > I'm okay with what you propose, but I don't see how would that > eliminate the reasons for my worries. The implementation of the > effect of this argument is still in print_object, so the question that > is of interest to me is how will we communicate these arguments to > print_object? I was thinking that prin1* would just set/bind a new global variable (but one that isn't visible to the Lisp level). -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#43866: 26.3; italian postfix additions 2022-04-30 14:49 ` Lars Ingebrigtsen @ 2022-04-30 15:26 ` Eli Zaretskii 2022-04-30 18:49 ` Lars Ingebrigtsen 0 siblings, 1 reply; 109+ messages in thread From: Eli Zaretskii @ 2022-04-30 15:26 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: rpluim, 43866, juri > From: Lars Ingebrigtsen <larsi@gnus.org> > Cc: juri@linkov.net, rpluim@gmail.com, 43866@debbugs.gnu.org > Date: Sat, 30 Apr 2022 16:49:14 +0200 > > Eli Zaretskii <eliz@gnu.org> writes: > > > I'm okay with what you propose, but I don't see how would that > > eliminate the reasons for my worries. The implementation of the > > effect of this argument is still in print_object, so the question that > > is of interest to me is how will we communicate these arguments to > > print_object? > > I was thinking that prin1* would just set/bind a new global variable > (but one that isn't visible to the Lisp level). Then this sounds almost exactly like what I suggested back then, so I agree, of course. ^ permalink raw reply [flat|nested] 109+ messages in thread
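The arrangement agreed on here (prin1 binds an internal variable that print_object reads, with nothing visible at the Lisp level) might look roughly like the sketch below. Everything in it is hypothetical: struct print_settings, current_settings, print_integer and prin1_integer are illustrative names, not Emacs internals, and the character test is reduced to printable ASCII.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical settings record standing in for the proposed plist,
   e.g. (length 20 integers-as-chars t).  */
struct print_settings
{
  int length;               /* 0 means no limit */
  bool integers_as_chars;
};

/* Internal state consulted by the printer.  In this sketch it is a
   file-local variable, i.e. not reachable from user configuration.  */
static struct print_settings current_settings = { 0, false };

/* Stand-in for print_object: it only looks at the internal state.  */
static int
print_integer (long i, char *buf, size_t size)
{
  if (current_settings.integers_as_chars && i > ' ' && i < 127)
    return snprintf (buf, size, "?%c", (char) i);
  return snprintf (buf, size, "%ld", i);
}

/* Stand-in for prin1 with an overrides argument: bind the internal
   state, print, then restore the previous values.  A null OVERRIDES
   leaves the current settings untouched.  */
static int
prin1_integer (long i, const struct print_settings *overrides,
               char *buf, size_t size)
{
  struct print_settings saved = current_settings;
  if (overrides)
    current_settings = *overrides;
  int len = print_integer (i, buf, size);
  current_settings = saved;
  return len;
}
```

Other callers of the printer never see the override, which is the concern Eli raises about print_object being used all over the place. In Emacs itself the restore step would presumably have to survive non-local exits, which this plain save and restore does not attempt.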
* bug#43866: 26.3; italian postfix additions 2022-04-30 15:26 ` Eli Zaretskii @ 2022-04-30 18:49 ` Lars Ingebrigtsen 2022-05-29 13:35 ` Lars Ingebrigtsen 0 siblings, 1 reply; 109+ messages in thread From: Lars Ingebrigtsen @ 2022-04-30 18:49 UTC (permalink / raw) To: Eli Zaretskii; +Cc: rpluim, 43866, juri Heh, `print-integers-as-characters' already exists -- it was added in 2020. Anyway, I still think adding a parameter like described to prin1 would be nice, but it's not necessary for this feature, which somehow had something to do with Italian postfix. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#43866: 26.3; italian postfix additions 2022-04-30 18:49 ` Lars Ingebrigtsen @ 2022-05-29 13:35 ` Lars Ingebrigtsen 0 siblings, 0 replies; 109+ messages in thread From: Lars Ingebrigtsen @ 2022-05-29 13:35 UTC (permalink / raw) To: Eli Zaretskii; +Cc: rpluim, 43866, juri Lars Ingebrigtsen <larsi@gnus.org> writes: > Anyway, I still think adding a parameter like described to prin1 would > be nice, but it's not necessary for this feature, which somehow had > something to do with Italian postfix. Re-skimming this bug thread, I think the original issue was fixed by Eli at the time -- E= was added (for euro sign) -- but then the discussion went on to whether we should have an input method/command based on /usr/share/X11/locale/en_US.UTF-8/Compose So I'm closing this bug report and opening a new one that's about that. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#43866: 26.3; italian postfix additions 2020-10-14 2:31 ` Eli Zaretskii 2020-10-14 8:07 ` Juri Linkov @ 2020-10-15 3:52 ` Richard Stallman 1 sibling, 0 replies; 109+ messages in thread From: Richard Stallman @ 2020-10-15 3:52 UTC (permalink / raw) To: Eli Zaretskii; +Cc: rpluim, 43866, juri [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > Would it make sense to support exactly the same keys that are > > provided by the X11 compose method? I mean that are in the file > > /usr/share/X11/locale/en_US.UTF-8/Compose > > also available at > > https://help.ubuntu.com/community/ComposeKey > > and > > https://cgit.freedesktop.org/xorg/lib/libX11/plain/nls/en_US.UTF-8/Compose.pre > How about making a new input method for those? It seems to me that > C-x 8 is already too "fat". That may be useful, but it has a drawback compared with C-x 8. It is inconvenient to change input methods just for one character and then change back. C-x 8 avoids that inconvenience; you can use it to enter one character, any one character, without changing the current input method. -- Dr Richard Stallman Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#43866: 26.3; italian postfix additions 2020-10-13 20:07 ` Juri Linkov 2020-10-14 2:31 ` Eli Zaretskii @ 2020-10-14 4:38 ` Richard Stallman 2020-10-14 8:11 ` Juri Linkov ` (2 more replies) 1 sibling, 3 replies; 109+ messages in thread From: Richard Stallman @ 2020-10-14 4:38 UTC (permalink / raw) To: Juri Linkov; +Cc: rpluim, 43866 [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > Would it make sense to support exactly the same keys that are > provided by the X11 compose method? That might be a good idea. Also, I wonder if we could make that command more self-documenting. Maybe C-h in the argument for C-x 8 could display a buffer which displays characters you can choose. Each character would be followed by the sequence to type to choose that character. This should include all the characters Emacs supports, divided clearly into Unicode code blocks, with their Unicode names. Not just the ones that have specific short C-x 8 sequences defined in Emacs. It would be nice to have a prefix more mnemonic than C-x 8. But I have nothing to suggest. It would be good to shorten C-x 8 RET. That is my go-to method of inserting characters for which I don't know a sequence. Currently, 8 upper-case letters are valid after C-h 8, and 6 lower-case. Suppose we free up one case -- either the upper-case letters or the lower-case letters. Then we could make typing a letter of that case throw you into the minibuffer. In this way, we could replace C-x 8 RET UNICODE-NAME RET with C-x 8 UNICODE-NAME RET. Also, why not change the Unicode character names to lower-case? They would look nicer that way, I think. 
-- Dr Richard Stallman Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#43866: 26.3; italian postfix additions 2020-10-14 4:38 ` Richard Stallman @ 2020-10-14 8:11 ` Juri Linkov 2020-10-14 10:43 ` Robert Pluim 2020-10-14 14:56 ` Eli Zaretskii 2 siblings, 0 replies; 109+ messages in thread From: Juri Linkov @ 2020-10-14 8:11 UTC (permalink / raw) To: Richard Stallman; +Cc: rpluim, 43866 > > Would it make sense to support exactly the same keys that are > > provided by the X11 compose method? > > That might be a good idea. Do we have the rights to copy all key definitions from the X11 compose method? I guess there are no licensing restrictions? > Also, I wonder if we could make that command more self-documenting. > Maybe C-h in the argument for C-x 8 could display a buffer > which displays characters you can choose. > Each character would be followed by the sequence to type to choose that > character. Yes, displaying a separate buffer would be useful. Then maybe displaying these keys could be moved from the Help buffer of 'C-h b' that currently displays a very long list of 'C-x 8' keys at the beginning of the Help buffer, so it's very difficult to see the keys of the current mode that are at the end of the long Help buffer. > This should include all the characters Emacs supports, divided clearly > into Unicode code blocks, with their unicode names. Not just the ones > that have specific short C-x 8 sequences definied in Emacs. Maybe also 'C-u C-x =' could suggest how to input characters using C-x 8 mnemonics. > It would be nice to have a prefix more mnemonic than C-x 8. > But I have nothing to suggest. Yes, to find a more mnemonic and shorter key would be useful. Maybe this question could be asked on emacs-devel where someone might have ideas for such a key. > It would be good to shorten C-x 8 RET. That is my go-to method > of inserting characters for which I don't know a sequence. > > Currently, 8 upper-case letters are valid after C-h 8, and 6 > lower-case. 
Suppose we free up one case -- either the upper-case > letters or the lower-case letters. Then we could make typing > a letter of that case throw you into the minibuffer. Sorry, I don't understand. I tried to type 'C-h 8', and it's undefined. > In this way, we could replace C-x 8 RET UNICODE-NAME RET with > C-x 8 UNICODE-NAME RET. > > Also, why not change the Unicode character names to lower-case? > They would look nicer that way, I think. I don't know why the Unicode standard uses upper-case, but I see no problem in Emacs with upper-case letters when case-fold is non-nil, so you can type lower-case letters in completions. ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#43866: 26.3; italian postfix additions 2020-10-14 4:38 ` Richard Stallman 2020-10-14 8:11 ` Juri Linkov @ 2020-10-14 10:43 ` Robert Pluim 2020-10-15 3:54 ` Richard Stallman 2020-10-14 14:56 ` Eli Zaretskii 2 siblings, 1 reply; 109+ messages in thread From: Robert Pluim @ 2020-10-14 10:43 UTC (permalink / raw) To: Richard Stallman; +Cc: 43866, Juri Linkov >>>>> On Wed, 14 Oct 2020 00:38:48 -0400, Richard Stallman <rms@gnu.org> said: >> Would it make sense to support exactly the same keys that are >> provided by the X11 compose method? Richard> That might be a good idea. Could we provide this as an input method? Richard> Also, I wonder if we could make that command more self-documenting. Richard> Maybe C-h in the argument for C-x 8 could display a buffer Richard> which displays characters you can choose. Richard> Each character would be followed by the sequence to type to choose that Richard> character. The problem is that such a list is very long. 'C-h b' after 'C-x 8 RET' will display the bindings, but it does not currently contain the character names, and TAB after 'C-x 8 RET' will list all the names but not the sequences for entering them. There are completion frameworks that have solved this, eg with helm you can start typing right after 'C-x 8 RET' and it will narrow the list down automatically. Iʼm sure we could do something similar. Richard> This should include all the characters Emacs supports, divided clearly Richard> into Unicode code blocks, with their unicode names. Not just the ones Richard> that have specific short C-x 8 sequences definied in Emacs. Why does it matter which code block a character is in? Richard> It would be nice to have a prefix more mnemonic than C-x 8. Richard> But I have nothing to suggest. Richard> It would be good to shorten C-x 8 RET. That is my go-to method Richard> of inserting characters for which I don't know a sequence. Where would you put it? 
Note that if you do know the sequence you can use Alt or a dead accent key instead of 'C-x 8' (someone did suggest freeing up F2 recently) Richard> Currently, 8 upper-case letters are valid after C-h 8, and 6 Richard> lower-case. Suppose we free up one case -- either the upper-case Richard> letters or the lower-case letters. Then we could make typing Richard> a letter of that case throw you into the minibuffer. I think itʼs a tossup as to which of them would be easier to free up. The lower case bindings have one fewer prefix key, so perhaps lower case. Or perhaps a completely different binding. Richard> In this way, we could replace C-x 8 RET UNICODE-NAME RET with Richard> C-x 8 UNICODE-NAME RET. Richard> Also, why not change the Unicode character names to lower-case? Richard> They would look nicer that way, I think. The Unicode character names are always described in upper case, but I guess we could add a configuration option so that 'ucs-names' downcased them (the completion in C-x 8 RET is case insensitive) Robert -- ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#43866: 26.3; italian postfix additions 2020-10-14 10:43 ` Robert Pluim @ 2020-10-15 3:54 ` Richard Stallman 0 siblings, 0 replies; 109+ messages in thread From: Richard Stallman @ 2020-10-15 3:54 UTC (permalink / raw) To: Robert Pluim; +Cc: 43866, juri [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > Richard> Maybe C-h in the argument for C-x 8 could display a buffer > Richard> which displays characters you can choose. > Richard> Each character would be followed by the sequence to type to choose that > Richard> character. > The problem is that such a list is very long. 'C-h b' after 'C-x 8 > RET' will display the bindings, but it does not currently contain the > character names, and TAB after 'C-x 8 RET' will list all the names but > not the sequences for entering them. It would be a problem if they are displayed in an inconvenient way, not designed specifically for this purpose. My idea is to display them in a buffer which is divided into pages, so you could use C-x ] and C-x [ to move around in it, as well as search commands. > Why does it matter which code block a character is in? Organizing the buffer by code blocks makes it feasible to navigate through the long list of all the Unicode characters and find the one you want, without knowing its name in advance. -- Dr Richard Stallman Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 109+ messages in thread
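Organizing the character list by code block needs a lookup from code point to block name. A minimal sketch, assuming a hand-written table: struct block, blocks and block_name are hypothetical names, and only a handful of blocks are listed for illustration rather than the full Unicode block list.

```c
#include <stddef.h>

/* One Unicode block: inclusive code point range and its name.  */
struct block
{
  long first;
  long last;
  const char *name;
};

/* A few blocks only; a real table would cover every block.  */
static const struct block blocks[] = {
  { 0x0000, 0x007F, "Basic Latin" },
  { 0x0080, 0x00FF, "Latin-1 Supplement" },
  { 0x0100, 0x017F, "Latin Extended-A" },
  { 0x2000, 0x206F, "General Punctuation" },
  { 0x20A0, 0x20CF, "Currency Symbols" },
};

/* Return the name of the block containing C, or NULL if C is not in
   any block listed in the table above.  */
static const char *
block_name (long c)
{
  for (size_t n = 0; n < sizeof blocks / sizeof blocks[0]; n++)
    if (c >= blocks[n].first && c <= blocks[n].last)
      return blocks[n].name;
  return NULL;
}
```

Each block would then become one page of the buffer, so that C-x ] and C-x [ step from block to block; the euro sign, U+20AC, would land on the Currency Symbols page.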
* bug#43866: 26.3; italian postfix additions 2020-10-14 4:38 ` Richard Stallman 2020-10-14 8:11 ` Juri Linkov 2020-10-14 10:43 ` Robert Pluim @ 2020-10-14 14:56 ` Eli Zaretskii 2 siblings, 0 replies; 109+ messages in thread From: Eli Zaretskii @ 2020-10-14 14:56 UTC (permalink / raw) To: rms; +Cc: rpluim, 43866, juri > From: Richard Stallman <rms@gnu.org> > Date: Wed, 14 Oct 2020 00:38:48 -0400 > Cc: rpluim@gmail.com, 43866@debbugs.gnu.org > > Also, I wonder if we could make that command more self-documenting. > Maybe C-h in the argument for C-x 8 could display a buffer > which displays characters you can choose. I don't understand: "C-x 8 C-h" already shows such a buffer. > Also, why not change the Unicode character names to lower-case? > They would look nicer that way, I think. You can type in lower-case, then TAB will upcase them for you. ^ permalink raw reply [flat|nested] 109+ messages in thread
* bug#43866: 26.3; italian postfix additions
  2020-10-08 12:05 bug#43866: 26.3; italian postfix additions Francesco Potortì
  2020-10-08 12:26 ` Eli Zaretskii
@ 2020-10-08 15:23 ` Mattias Engdegård
  2020-10-08 15:35 ` Robert Pluim
  ` (2 more replies)
  1 sibling, 3 replies; 109+ messages in thread
From: Mattias Engdegård @ 2020-10-08 15:23 UTC
  To: Francesco Potortì, Eli Zaretskii, Robert Pluim; +Cc: 43866

> E= -> €
>
> Typewriter-style italian characters.

If they really are typewriter-style, wouldn't C= make more sense? E
overstruck with = would just be a smudgy mess even if typed on a
beautiful Olivetti.

Both C= and E= seem to work as X11 compose pairs. We could include both.
That is, c=, C=, e=, E=, and c==, C==, e==, E== as the literal cases.

* bug#43866: 26.3; italian postfix additions
  2020-10-08 15:23 ` Mattias Engdegård
@ 2020-10-08 15:35 ` Robert Pluim
  2020-10-08 16:22 ` Francesco Potortì
  2020-10-08 15:42 ` Eli Zaretskii
  2020-10-08 16:10 ` Francesco Potortì
  2 siblings, 1 reply; 109+ messages in thread
From: Robert Pluim @ 2020-10-08 15:35 UTC
  To: Mattias Engdegård; +Cc: 43866

>>>>> On Thu, 8 Oct 2020 17:23:35 +0200, Mattias Engdegård <mattiase@acm.org> said:

    >> E= -> €
    >>
    >> Typewriter-style italian characters.

    Mattias> If they really are typewriter-style, wouldn't C= make more sense? E overstruck with = would just be a smudgy mess even if typed on a beautiful Olivetti.

    Mattias> Both C= and E= seem to work as X11 compose pairs. We could include both.
    Mattias> That is, c=, C=, e=, E=, and c==, C==, e==, E== as the literal cases.

C= would make more sense, but E= (and of course =E for prefix input
methods) is more mnemonic. I can never remember how to type € on a mac
because itʼs not on something obvious like Option-E.

Thereʼs precedence from C-x 8 * as well, which uses E (though perhaps
it should have 'e' as well).

And for total coverage, we *must* add 'C-x 8 2 0 a c' ;-)

Robert
-- 

* bug#43866: 26.3; italian postfix additions
  2020-10-08 15:35 ` Robert Pluim
@ 2020-10-08 16:22 ` Francesco Potortì
  0 siblings, 0 replies; 109+ messages in thread
From: Francesco Potortì @ 2020-10-08 16:22 UTC
  To: Robert Pluim; +Cc: Mattias Engdegård, 43866

>And for total coverage, we *must* add 'C-x 8 2 0 a c' ;-)

Wow! But I can't imagine what that could produce :)

* bug#43866: 26.3; italian postfix additions
  2020-10-08 15:23 ` Mattias Engdegård
  2020-10-08 15:35 ` Robert Pluim
@ 2020-10-08 15:42 ` Eli Zaretskii
  2020-10-08 16:10 ` Francesco Potortì
  2 siblings, 0 replies; 109+ messages in thread
From: Eli Zaretskii @ 2020-10-08 15:42 UTC
  To: Mattias Engdegård; +Cc: 43866, rpluim

> From: Mattias Engdegård <mattiase@acm.org>
> Date: Thu, 8 Oct 2020 17:23:35 +0200
> Cc: 43866@debbugs.gnu.org
>
> Both C= and E= seem to work as X11 compose pairs. We could include both.

I think you are right.

* bug#43866: 26.3; italian postfix additions
  2020-10-08 15:23 ` Mattias Engdegård
  2020-10-08 15:35 ` Robert Pluim
  2020-10-08 15:42 ` Eli Zaretskii
@ 2020-10-08 16:10 ` Francesco Potortì
  2020-10-08 17:18 ` Robert Pluim
  2 siblings, 1 reply; 109+ messages in thread
From: Francesco Potortì @ 2020-10-08 16:10 UTC
  To: Mattias Engdegård; +Cc: 43866, Robert Pluim

>> E= -> €
>>
>> Typewriter-style italian characters.
>
>If they really are typewriter-style, wouldn't C= make more sense? E
>overstruck with = would just be a smudgy mess even if typed on a
>beautiful Olivetti.

The euro sign under E is on many Italian keyboards, so that sounds
natural to me. Never seen C= as a shortcut, my X understands e= and E=
but not c= or C=.

>
>Both C= and E= seem to work as X11 compose pairs. We could include both.
>That is, c=, C=, e=, E=, and c==, C==, e==, E== as the literal cases.

I would not include c= and C=. I would not include e= either. These
things are very annoying unless they are really useful, so better just
include what's really needed and nothing more.

* bug#43866: 26.3; italian postfix additions
  2020-10-08 16:10 ` Francesco Potortì
@ 2020-10-08 17:18 ` Robert Pluim
  2020-10-08 17:28 ` Francesco Potortì
  2020-10-08 17:59 ` Mattias Engdegård
  0 siblings, 2 replies; 109+ messages in thread
From: Robert Pluim @ 2020-10-08 17:18 UTC
  To: Francesco Potortì; +Cc: Mattias Engdegård, 43866

>>>>> On Thu, 08 Oct 2020 18:10:34 +0200, Francesco Potortì <pot@gnu.org> said:

    >> Both C= and E= seem to work as X11 compose pairs. We could include both.
    >> That is, c=, C=, e=, E=, and c==, C==, e==, E== as the literal cases.

    Francesco> I would not include c= and C=. I would not include e= either. These
    Francesco> things are very annoying unless they are really useful, so better just
    Francesco> include what's really needed and nothing more.

The advantage of e= is that (on a US-type keyboard at least) it
doesnʼt involve any modifiers. But then again adding it would increase
the chances of someone typing it by mistake.

Robert
-- 

* bug#43866: 26.3; italian postfix additions
  2020-10-08 17:18 ` Robert Pluim
@ 2020-10-08 17:28 ` Francesco Potortì
  2020-10-08 17:59 ` Mattias Engdegård
  1 sibling, 0 replies; 109+ messages in thread
From: Francesco Potortì @ 2020-10-08 17:28 UTC
  To: Robert Pluim; +Cc: Mattias Engdegård, 43866

>>>>>> On Thu, 08 Oct 2020 18:10:34 +0200, Francesco Potortì <pot@gnu.org> said:
>
>    >> Both C= and E= seem to work as X11 compose pairs. We could include both.
>    >> That is, c=, C=, e=, E=, and c==, C==, e==, E== as the literal cases.
>
>    Francesco> I would not include c= and C=. I would not include e= either. These
>    Francesco> things are very annoying unless they are really useful, so better just
>    Francesco> include what's really needed and nothing more.
>
>The advantage of e= is that (on a US-type keyboard at least) it
>doesnʼt involve any modifiers. But then again adding it would increase
>the chances of someone typing it by mistake.

Yes. That's exactly the point.

* bug#43866: 26.3; italian postfix additions
  2020-10-08 17:18 ` Robert Pluim
  2020-10-08 17:28 ` Francesco Potortì
@ 2020-10-08 17:59 ` Mattias Engdegård
  2020-10-08 19:55 ` Francesco Potortì
  2020-10-09 4:42 ` Lars Ingebrigtsen
  1 sibling, 2 replies; 109+ messages in thread
From: Mattias Engdegård @ 2020-10-08 17:59 UTC
  To: Robert Pluim; +Cc: 43866

On 8 Oct 2020, at 19:18, Robert Pluim <rpluim@gmail.com> wrote:

>    Francesco> I would not include c= and C=. I would not include e= either. These
>    Francesco> things are very annoying unless they are really useful, so better just
>    Francesco> include what's really needed and nothing more.
>
> The advantage of e= is that (on a US-type keyboard at least) it
> doesnʼt involve any modifiers. But then again adding it would increase
> the chances of someone typing it by mistake.

As you noted, e= is already in latin-postfix. It seems odd to require
E= in one mode and e= in another. It makes more sense for
italian-postfix to be a subset of latin-postfix, so that an Italian
user who needs foreign letters can switch without relearning
composition pairs.

* bug#43866: 26.3; italian postfix additions
  2020-10-08 17:59 ` Mattias Engdegård
@ 2020-10-08 19:55 ` Francesco Potortì
  2020-10-09 4:42 ` Lars Ingebrigtsen
  1 sibling, 0 replies; 109+ messages in thread
From: Francesco Potortì @ 2020-10-08 19:55 UTC
  To: Mattias Engdegård; +Cc: 43866, Robert Pluim

    Francesco> I would not include c= and C=. I would not include e= either. These
    Francesco> things are very annoying unless they are really useful, so better just
    Francesco> include what's really needed and nothing more.

Robert:
>> The advantage of e= is that (on a US-type keyboard at least) it
>> doesnʼt involve any modifiers. But then again adding it would increase
>> the chances of someone typing it by mistake.

Mattias:
>As you noted, e= is already in latin-postfix. It seems odd to require
>E= in one mode and e= in another. It makes more sense for
>italian-postfix to be a subset of latin-postfix, so that an Italian
>user who needs foreign letters can switch without relearning
>composition pairs.

I once tried using latin-postfix and I soon stopped, as it creates many
more artifacts that you don't want than ones you want. Having a lowcase
e for making the euro sign is just one more reason why I wouldn't use
latin-postfix.

If the agreed-upon solution is to put 'e =' on all European postfix
languages, I think that's better leaving them all as they are now.

If the only blocking issue here is inconsistency with latin-postfix,
then better use 'E =' in place of 'e =' on latin-postfix and adding the
same to all European languages. The only drawback would be for those
using latin-postfix, but after my experience using it, I don't think it
is really used in practice.

* bug#43866: 26.3; italian postfix additions
  2020-10-08 17:59 ` Mattias Engdegård
  2020-10-08 19:55 ` Francesco Potortì
@ 2020-10-09 4:42 ` Lars Ingebrigtsen
  2020-10-09 11:26 ` Mattias Engdegård
  1 sibling, 1 reply; 109+ messages in thread
From: Lars Ingebrigtsen @ 2020-10-09 4:42 UTC
  To: Mattias Engdegård; +Cc: 43866, Robert Pluim

Mattias Engdegård <mattiase@acm.org> writes:

> As you noted, e= is already in latin-postfix. It seems odd to require
> E= in one mode and e= in another. It makes more sense for
> italian-postfix to be a subset of latin-postfix, so that an Italian
> user who needs foreign letters can switch without relearning
> composition pairs.

I'm not sure I agree. An input method specialised to a specific
language doesn't have to be a superset of the larger group -- a
less-specific input method may be intended for users that have less
experience with the language, and therefore be "sloppier"; i.e., have
more input methods so that the user will find the character easier (but
have more false positives that then will have to be fixed manually).

So I think Francesco is right here -- just add E=, and do nothing else
here (for the Euro).

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

* bug#43866: 26.3; italian postfix additions
  2020-10-09 4:42 ` Lars Ingebrigtsen
@ 2020-10-09 11:26 ` Mattias Engdegård
  2020-10-09 11:53 ` Thien-Thi Nguyen
  0 siblings, 1 reply; 109+ messages in thread
From: Mattias Engdegård @ 2020-10-09 11:26 UTC
  To: Lars Ingebrigtsen; +Cc: 43866, Robert Pluim

On 9 Oct 2020, at 06:42, Lars Ingebrigtsen <larsi@gnus.org> wrote:

> I'm not sure I agree. An input method specialised to a specific
> language doesn't have to be a superset of the larger group -- a
> less-specific input method may be intended for users that have less
> experience with the language, and therefore be "sloppier"; i.e., have
> more input methods so that the user will find the character easier (but
> have more false positives that then will have to be fixed manually).

The choice of input method is not about proficiency in the language but
rather that a more constrained method is more efficient for monolingual
use. The many input sequences of 'latin-postfix' are not there to help
the learner stumble upon the right one by luck, but to allow the entry
of more characters.

However, Francesco is right in that latin-postfix is too heavily loaded
for smooth use, and I certainly understand why he prefers
italian-postfix. 'latin-alt-postfix' is somewhat more practical (and
uses e=).

> So I think Francesco is right here -- just add E=, and do nothing else
> here (for the Euro).

I wouldn't mind, although we may be straying a bit into tailoring parts
of Emacs to a single user.

* bug#43866: 26.3; italian postfix additions
  2020-10-09 11:26 ` Mattias Engdegård
@ 2020-10-09 11:53 ` Thien-Thi Nguyen
  2020-10-09 12:45 ` Robert Pluim
  0 siblings, 1 reply; 109+ messages in thread
From: Thien-Thi Nguyen @ 2020-10-09 11:53 UTC
  To: Mattias Engdegård; +Cc: Lars Ingebrigtsen, 43866, Robert Pluim

[-- Attachment #1: Type: text/plain, Size: 749 bytes --]

() Mattias Engdegård <mattiase@acm.org>
() Fri, 9 Oct 2020 13:26:12 +0200

   > So I think Francesco is right here -- just add E=, and do
   > nothing else here (for the Euro).

   I wouldn't mind, although we may be straying a bit into
   tailoring parts of Emacs to a single user.

FWIW, i use italian-postfix, too, and would welcome this (E=
only) change.

-- 
Thien-Thi Nguyen -----------------------------------------------
 (defun responsep (query)               ; (2020) Software Libero
   (pcase (context query)               ; = Dissenso Etico
     (`(technical ,ml) (correctp ml))
     ...))                              748E A0E8 1CB8 A748 9BFA
---------------------------------------  6CE4 6703 2224 4C80 7502

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 219 bytes --]

* bug#43866: 26.3; italian postfix additions
  2020-10-09 11:53 ` Thien-Thi Nguyen
@ 2020-10-09 12:45 ` Robert Pluim
  2020-10-09 14:31 ` Eli Zaretskii
  0 siblings, 1 reply; 109+ messages in thread
From: Robert Pluim @ 2020-10-09 12:45 UTC
  To: Thien-Thi Nguyen; +Cc: Mattias Engdegård, Lars Ingebrigtsen, 43866

>>>>> On Fri, 09 Oct 2020 07:53:58 -0400, Thien-Thi Nguyen <ttn@gnuvola.org> said:

    Thien-Thi> () Mattias Engdegård <mattiase@acm.org>
    Thien-Thi> () Fri, 9 Oct 2020 13:26:12 +0200

    >> So I think Francesco is right here -- just add E=, and do
    >> nothing else here (for the Euro).

    Thien-Thi> I wouldn't mind, although we may be straying a bit into
    Thien-Thi> tailoring parts of Emacs to a single user.

    Thien-Thi> FWIW, i use italian-postfix, too, and would welcome this (E=
    Thien-Thi> only) change.

I guess Iʼm the weirdo here: I use latin-prefix :-)

(it has € on ~e, which is not a great choice: various other
latin-prefix methods use ~e and ~E for other codepoints. Perhaps we
should add =E (or =e) to latin-prefix and maybe the other
latin-N-prefix methods)

Robert
-- 

* bug#43866: 26.3; italian postfix additions
  2020-10-09 12:45 ` Robert Pluim
@ 2020-10-09 14:31 ` Eli Zaretskii
  2020-10-09 14:48 ` Robert Pluim
  2020-10-09 15:05 ` Mattias Engdegård
  0 siblings, 2 replies; 109+ messages in thread
From: Eli Zaretskii @ 2020-10-09 14:31 UTC
  To: Robert Pluim; +Cc: mattiase, larsi, 43866, ttn

> From: Robert Pluim <rpluim@gmail.com>
> Date: Fri, 09 Oct 2020 14:45:34 +0200
> Cc: Mattias Engdegård <mattiase@acm.org>,
>  Lars Ingebrigtsen <larsi@gnus.org>, 43866@debbugs.gnu.org
>
>     Thien-Thi> FWIW, i use italian-postfix, too, and would welcome this (E=
>     Thien-Thi> only) change.
>
> I guess Iʼm the weirdo here: I use latin-prefix :-)
>
> (it has € on ~e, which is not a great choice: various other
> latin-prefix methods use ~e and ~E for other codepoints. Perhaps we
> should add =E (or =e) to latin-prefix and maybe the other
> latin-N-prefix methods)

Based on the discussion, I've decided to make a minimal change, so I
added E= as a sequence for the Euro sign to Latin-1 language input
methods (on the master branch).

Any reason not to close this bug report now?

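For readers who want to see what an "E= for the Euro sign" rule looks like in Quail terms, here is a minimal sketch. It is not the change Eli committed: "euro-postfix-sketch" is a made-up method name, and the argument list of `quail-define-package` simply copies the flags used in Francesco's original italian-postfix snippet. The second rule follows the same doubling convention as that snippet, so that typing the postfix twice gives the literal sequence.

```elisp
;; Illustration only: "euro-postfix-sketch" is a hypothetical method
;; name, not an input method shipped with Emacs.
(require 'quail)

(quail-define-package
 "euro-postfix-sketch" "Latin-1" "E€" t
 "Hypothetical demo input method.

E=  -> €
E== -> E=   (doubling the postfix yields the literal sequence)
" nil t nil nil nil nil nil nil nil nil t)

(quail-define-rules
 ("E=" ?€)            ; E followed by = produces the Euro sign
 ("E==" ["E="]))      ; E followed by == produces the two characters E=
```

After evaluating this, the method should be selectable with `C-u C-\ euro-postfix-sketch RET`. Only the capital E triggers the translation, which is the behaviour Francesco and Lars argued for earlier in the thread.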
* bug#43866: 26.3; italian postfix additions
  2020-10-09 14:31 ` Eli Zaretskii
@ 2020-10-09 14:48 ` Robert Pluim
  2020-10-09 15:04 ` Eli Zaretskii
  2020-10-09 15:05 ` Mattias Engdegård
  1 sibling, 1 reply; 109+ messages in thread
From: Robert Pluim @ 2020-10-09 14:48 UTC
  To: Eli Zaretskii; +Cc: mattiase, larsi, 43866, ttn

>>>>> On Fri, 09 Oct 2020 17:31:17 +0300, Eli Zaretskii <eliz@gnu.org> said:

    Eli> Based on the discussion, I've decided to make a minimal change, so I
    Eli> added E= as a sequence for the Euro sign to Latin-1 language input
    Eli> methods (on the master branch).

    Eli> Any reason not to close this bug report now?

If you've decided that the prefix methods donʼt get a similar
treatment then we can close it.

Robert
-- 

* bug#43866: 26.3; italian postfix additions
  2020-10-09 14:48 ` Robert Pluim
@ 2020-10-09 15:04 ` Eli Zaretskii
  2020-10-10 20:54 ` Lars Ingebrigtsen
  0 siblings, 1 reply; 109+ messages in thread
From: Eli Zaretskii @ 2020-10-09 15:04 UTC
  To: Robert Pluim; +Cc: mattiase, larsi, 43866, ttn

> From: Robert Pluim <rpluim@gmail.com>
> Cc: ttn@gnuvola.org, mattiase@acm.org, larsi@gnus.org, 43866@debbugs.gnu.org
> Date: Fri, 09 Oct 2020 16:48:07 +0200
>
>     Eli> Any reason not to close this bug report now?
>
> If you've decided that the prefix methods donʼt get a similar
> treatment then we can close it.

You mean, use =E for the Euro? I don't mind if there are no
objections.

* bug#43866: 26.3; italian postfix additions
  2020-10-09 15:04 ` Eli Zaretskii
@ 2020-10-10 20:54 ` Lars Ingebrigtsen
  2020-10-12 9:26 ` Robert Pluim
  0 siblings, 1 reply; 109+ messages in thread
From: Lars Ingebrigtsen @ 2020-10-10 20:54 UTC
  To: Eli Zaretskii; +Cc: mattiase, Robert Pluim, 43866, ttn

Eli Zaretskii <eliz@gnu.org> writes:

>> If you've decided that the prefix methods donʼt get a similar
>> treatment then we can close it.
>
> You mean, use =E for the Euro? I don't mind if there are no
> objections.

I think it sounds logical, but I don't think we should make such a
change without it being requested by somebody using those input
methods. Perhaps =E would be annoying for them?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

* bug#43866: 26.3; italian postfix additions
  2020-10-10 20:54 ` Lars Ingebrigtsen
@ 2020-10-12 9:26 ` Robert Pluim
  0 siblings, 0 replies; 109+ messages in thread
From: Robert Pluim @ 2020-10-12 9:26 UTC
  To: Lars Ingebrigtsen; +Cc: mattiase, 43866, ttn

>>>>> On Sat, 10 Oct 2020 22:54:14 +0200, Lars Ingebrigtsen <larsi@gnus.org> said:

    Lars> Eli Zaretskii <eliz@gnu.org> writes:
    >>> If you've decided that the prefix methods donʼt get a similar
    >>> treatment then we can close it.
    >>
    >> You mean, use =E for the Euro? I don't mind if there are no
    >> objections.

    Lars> I think it sounds logical, but I don't think we should make such a
    Lars> change without it being requested by somebody using those input
    Lars> methods. Perhaps =E would be annoying for them?

I donʼt think it would be annoying, but I agree thereʼs probably no
need to start adding things people haven't requested (and ~e already
exists).

Robert
-- 

* bug#43866: 26.3; italian postfix additions
  2020-10-09 14:31 ` Eli Zaretskii
  2020-10-09 14:48 ` Robert Pluim
@ 2020-10-09 15:05 ` Mattias Engdegård
  2020-10-09 15:08 ` Robert Pluim
  2020-10-09 15:10 ` Eli Zaretskii
  1 sibling, 2 replies; 109+ messages in thread
From: Mattias Engdegård @ 2020-10-09 15:05 UTC
  To: Eli Zaretskii; +Cc: 43866, larsi, Robert Pluim, ttn

On 9 Oct 2020, at 16:31, Eli Zaretskii <eliz@gnu.org> wrote:

> Based on the discussion, I've decided to make a minimal change, so I
> added E= as a sequence for the Euro sign to Latin-1 language input
> methods (on the master branch).

The minimal change would be to do it for italian-postfix only but
perhaps it doesn't hurt too much elsewhere.
(I don't think prefix methods need it.)

> Any reason not to close this bug report now?

Maybe it merits a NEWS entry?

* bug#43866: 26.3; italian postfix additions
  2020-10-09 15:05 ` Mattias Engdegård
@ 2020-10-09 15:08 ` Robert Pluim
  2020-10-09 15:28 ` Mattias Engdegård
  2020-10-09 15:10 ` Eli Zaretskii
  1 sibling, 1 reply; 109+ messages in thread
From: Robert Pluim @ 2020-10-09 15:08 UTC
  To: Mattias Engdegård; +Cc: 43866, larsi, ttn

>>>>> On Fri, 9 Oct 2020 17:05:23 +0200, Mattias Engdegård <mattiase@acm.org> said:

    Mattias> On 9 Oct 2020, at 16:31, Eli Zaretskii <eliz@gnu.org> wrote:
    >> Based on the discussion, I've decided to make a minimal change, so I
    >> added E= as a sequence for the Euro sign to Latin-1 language input
    >> methods (on the master branch).

    Mattias> The minimal change would be to do it for italian-postfix only but perhaps it doesn't hurt too much elsewhere.
    Mattias> (I don't think prefix methods need it.)

Why? If eg french-postfix has it, why not french-prefix?

    >> Any reason not to close this bug report now?

    Mattias> Maybe it merits a NEWS entry?

Itʼs a user-visible change, so I guess so.

Robert
-- 

* bug#43866: 26.3; italian postfix additions
  2020-10-09 15:08 ` Robert Pluim
@ 2020-10-09 15:28 ` Mattias Engdegård
  0 siblings, 0 replies; 109+ messages in thread
From: Mattias Engdegård @ 2020-10-09 15:28 UTC
  To: Robert Pluim; +Cc: 43866, larsi, ttn

On 9 Oct 2020, at 17:08, Robert Pluim <rpluim@gmail.com> wrote:

> Why? If eg french-postfix has it, why not french-prefix?

We would have to be rather sure about what it should be; each new
sequence will be a potential point of annoyance. The prefix methods
that define € use ~e, but our users apparently didn't want us to follow
existing practice for the postfix methods (e=).

* bug#43866: 26.3; italian postfix additions
  2020-10-09 15:05 ` Mattias Engdegård
  2020-10-09 15:08 ` Robert Pluim
@ 2020-10-09 15:10 ` Eli Zaretskii
  2020-10-09 15:21 ` Robert Pluim
  1 sibling, 1 reply; 109+ messages in thread
From: Eli Zaretskii @ 2020-10-09 15:10 UTC
  To: Mattias Engdegård; +Cc: 43866, larsi, rpluim, ttn

> From: Mattias Engdegård <mattiase@acm.org>
> Date: Fri, 9 Oct 2020 17:05:23 +0200
> Cc: Robert Pluim <rpluim@gmail.com>, ttn@gnuvola.org, larsi@gnus.org,
>  43866@debbugs.gnu.org
>
> On 9 Oct 2020, at 16:31, Eli Zaretskii <eliz@gnu.org> wrote:
>
> > Based on the discussion, I've decided to make a minimal change, so I
> > added E= as a sequence for the Euro sign to Latin-1 language input
> > methods (on the master branch).
>
> The minimal change would be to do it for italian-postfix only but perhaps it doesn't hurt too much elsewhere.

I couldn't explain to myself why Italian should have it, but, say,
German or French shouldn't.

> (I don't think prefix methods need it.)

OK.

> > Any reason not to close this bug report now?
>
> Maybe it merits a NEWS entry?

Sounds too small to announce, but if others think it should be in
NEWS, I won't object.

* bug#43866: 26.3; italian postfix additions
  2020-10-09 15:10 ` Eli Zaretskii
@ 2020-10-09 15:21 ` Robert Pluim
  0 siblings, 0 replies; 109+ messages in thread
From: Robert Pluim @ 2020-10-09 15:21 UTC
  To: Eli Zaretskii; +Cc: Mattias Engdegård, larsi, 43866, ttn

>>>>> On Fri, 09 Oct 2020 18:10:48 +0300, Eli Zaretskii <eliz@gnu.org> said:

    Eli> Sounds too small to announce, but if others think it should be in
    Eli> NEWS, I won't object.

git never forgets :-)

commit 3409fe0362c52127c52f854a7300f4dde4b8fffe
Author: Eli Zaretskii <eliz@gnu.org>
Date:   Thu Mar 29 19:45:13 2018 +0300

    Support Capital sharp S in German input methods

    * lisp/leim/quail/latin-post.el ("german-postfix"):
    * lisp/leim/quail/latin-pre.el ("german-prefix"): Add Capital
    sharp S.  (Bug#30988)

    * etc/NEWS: Mention the support of Capital sharp S.

Robert
-- 
