unofficial mirror of bug-gnu-emacs@gnu.org 
From: "Mattias Engdegård" <mattiase@acm.org>
To: Eli Zaretskii <eliz@gnu.org>
Cc: 44155@debbugs.gnu.org, schwab@suse.de, juri@linkov.net
Subject: bug#44155: Print integers as characters
Date: Tue, 3 Nov 2020 19:47:17 +0100
Message-ID: <650DFF04-509F-4B8C-9C53-F38DC10B9F97@acm.org>
In-Reply-To: <83ft5qcvl6.fsf@gnu.org>

[-- Attachment #1: Type: text/plain, Size: 781 bytes --]

3 nov. 2020 kl. 16.24 skrev Eli Zaretskii <eliz@gnu.org>:

> What is meant by "printable characters" here?  One could think you
> mean [:print:], but that doesn't seem to be what the code does.

In this case, any character that is not a control character. I wanted to keep things simple and not involve the Unicode database in the printer.

(For that matter, [:print:] is a regexp feature and doesn't really define the meaning of 'printable', but your question was valid.)

On the other hand, printing all non-controls using the ?X syntax is maybe not ideal. Attached is a new patch that uses Unicode properties to select only printable base characters.

This patch also removes \a, \v, \e and \d from the characters printed as escaped controls.
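
For illustration only, here is a rough standalone sketch of the resulting rule, restricted to ASCII. named_escape mirrors the function in the attached patch; ascii_printable_base_p, needs_backslash and format_integer are made-up names for this sketch, and the ASCII-only predicate is merely a stand-in for printable_base_p, which consults the Unicode general-category table in the real code. It assumes escapeflag is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Same mapping as named_escape in the patch: return the escape letter,
   or 0 if the code has no named escape that the printer uses.
   \a, \v, \e and \d are deliberately absent.  */
static char
named_escape (int c)
{
  switch (c)
    {
    case '\b': return 'b';
    case '\t': return 't';
    case '\n': return 'n';
    case '\f': return 'f';
    case '\r': return 'r';
    case ' ':  return 's';
    }
  return 0;
}

/* HYPOTHETICAL stand-in for printable_base_p, valid for ASCII only.
   The patch looks at the Unicode general category instead.  */
static bool
ascii_printable_base_p (int c)
{
  return c > ' ' && c < 127;
}

/* Characters that get a backslash after the '?' when escaping is on
   (hypothetical helper; the patch spells this out inline).  */
static bool
needs_backslash (int c)
{
  return c != 0 && strchr (";\"'\\(){}[]", c) != NULL;
}

/* Write into BUF (of size N) roughly what the patched printer would
   emit for integer I when print-integers-as-characters is non-nil.
   Hypothetical helper; codes above 127 always fall back to decimal
   here, which is NOT what the real code does.  */
static void
format_integer (long i, char *buf, size_t n)
{
  assert (buf != NULL && n > 0);
  if (i >= 0 && i <= 127)
    {
      int c = (int) i;
      char esc = named_escape (c);
      if (esc)
        {
          snprintf (buf, n, "?\\%c", esc);
          return;
        }
      if (ascii_printable_base_p (c))
        {
          if (needs_backslash (c))
            snprintf (buf, n, "?\\%c", c);
          else
            snprintf (buf, n, "?%c", c);
          return;
        }
    }
  snprintf (buf, n, "%ld", i);
}
```

Under these assumptions, calling format_integer on 4, 65, -1 and 10 in turn should give the same four elements as the (4 65 -1 10) example in the manual change, and 7, 11, 27 and 127 stay plain decimal numbers.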


[-- Attachment #2: 0001-Reduce-integer-output-format-to-print-integers-as-ch.patch --]
[-- Type: application/octet-stream, Size: 11404 bytes --]

From 3da6d9055b0ae68fc7b3bbee52885113c8c30b6d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org>
Date: Mon, 2 Nov 2020 23:37:16 +0100
Subject: [PATCH] Reduce integer-output-format to print-integers-as-characters

The variable now only controls whether characters are printed, not
the radix.  Control chars are printed in human-readable syntax
such as ?\n if available, as numbers otherwise (bug#44155).
Done in collaboration with Juri Linkov.

* src/character.c (printable_base_p):
* src/print.c (named_escape): New functions.
(print_object): Change semantics as described above.
(syms_of_print): Rename integer-output-format.  Update doc string.
* doc/lispref/streams.texi (Output Variables):
* etc/NEWS:
* test/src/print-tests.el (print-integers-as-characters):
Rename and update according to new semantics.  The test now passes.
---
 doc/lispref/streams.texi | 13 +++++----
 etc/NEWS                 | 11 ++++---
 src/character.c          | 21 ++++++++++++++
 src/character.h          |  1 +
 src/print.c              | 63 ++++++++++++++++++++++++++--------------
 test/src/print-tests.el  | 39 +++++++++++++------------
 6 files changed, 96 insertions(+), 52 deletions(-)

diff --git a/doc/lispref/streams.texi b/doc/lispref/streams.texi
index f171f13779..4bc97e4c48 100644
--- a/doc/lispref/streams.texi
+++ b/doc/lispref/streams.texi
@@ -903,10 +903,11 @@ Output Variables
 you can use, see the variable's documentation string.
 @end defvar
 
-@defvar integer-output-format
-This variable specifies how to print integer numbers.  The default is
-@code{nil}, meaning use the decimal format.  When bound to @code{t},
-print integers as characters when an integer represents a character
-(@pxref{Basic Char Syntax}).  When bound to the number @code{16},
-print non-negative integers in the hexadecimal format.
+@defvar print-integers-as-characters
+When this variable is non-@code{nil}, integers that represent
+printable characters or control characters with their own escape
+syntax such as newline will be printed using Lisp character syntax
+(@pxref{Basic Char Syntax}).  Other numbers are printed the usual way.
+For example, the list @code{(4 65 -1 10)} will be printed as
+@samp{(4 ?A -1 ?\n)}.
 @end defvar
diff --git a/etc/NEWS b/etc/NEWS
index e11effc9e8..384c64a91e 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1689,12 +1689,6 @@ ledit.el, lmenu.el, lucid.el and old-whitespace.el.
 \f
 * Lisp Changes in Emacs 28.1
 
-** New variable 'integer-output-format' determines how to print integer values.
-When this variable is bound to the value 't', integers are printed by
-printing functions as characters when an integer represents a character.
-When bound to the number 16, non-negative integers are printed in the
-hexadecimal format.
-
 +++
 ** 'define-globalized-minor-mode' now takes a ':predicate' parameter.
 This can be used to control which major modes the minor mode should be
@@ -1887,6 +1881,11 @@ file can affect code in another.  For details, see the manual section
 'replace-regexp-in-string', 'catch', 'throw', 'error', 'signal'
 and 'play-sound-file'.
 
++++
+** New variable 'print-integers-as-characters' modifies integer printing.
+When this variable is non-nil, character syntax is used for printing
+numbers for which this makes sense, such as '?*' for 42.
+
 \f
 * Changes in Emacs 28.1 on Non-Free Operating Systems
 
diff --git a/src/character.c b/src/character.c
index 5860f6a0c8..6d18e78f26 100644
--- a/src/character.c
+++ b/src/character.c
@@ -982,6 +982,27 @@ printablep (int c)
 	    || gen_cat == UNICODE_CATEGORY_Cn)); /* unassigned */
 }
 
+/* Return true if C is a printable independent character.  */
+bool
+printable_base_p (int c)
+{
+  Lisp_Object category = CHAR_TABLE_REF (Vunicode_category_table, c);
+  if (! FIXNUMP (category))
+    return false;
+  EMACS_INT gen_cat = XFIXNUM (category);
+
+  /* See UTS #18.  */
+  return (!(gen_cat == UNICODE_CATEGORY_Mn       /* mark, nonspacing */
+            || gen_cat == UNICODE_CATEGORY_Mc    /* mark, spacing combining */
+            || gen_cat == UNICODE_CATEGORY_Me    /* mark, enclosing */
+            || gen_cat == UNICODE_CATEGORY_Zl    /* separator, line */
+            || gen_cat == UNICODE_CATEGORY_Zp    /* separator, paragraph */
+            || gen_cat == UNICODE_CATEGORY_Cc    /* other, control */
+            || gen_cat == UNICODE_CATEGORY_Cs    /* other, surrogate */
+            || gen_cat == UNICODE_CATEGORY_Cf    /* other, format */
+            || gen_cat == UNICODE_CATEGORY_Cn)); /* other, unassigned */
+}
+
 /* Return true if C is a horizontal whitespace character, as defined
    by https://www.unicode.org/reports/tr18/tr18-19.html#blank.  */
 bool
diff --git a/src/character.h b/src/character.h
index af5023f77c..260c550108 100644
--- a/src/character.h
+++ b/src/character.h
@@ -583,6 +583,7 @@ char_surrogate_p (int c)
 extern bool graphicp (int);
 extern bool printablep (int);
 extern bool blankp (int);
+extern bool printable_base_p (int);
 
 /* Look up the element in char table OBJ at index CH, and return it as
    an integer.  If the element is not a character, return CH itself.  */
diff --git a/src/print.c b/src/print.c
index fa65a3cb26..f7158dbac0 100644
--- a/src/print.c
+++ b/src/print.c
@@ -1848,6 +1848,24 @@ print_vectorlike (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag,
   return true;
 }
 
+static char
+named_escape (int i)
+{
+  switch (i)
+    {
+    case '\b': return 'b';
+    case '\t': return 't';
+    case '\n': return 'n';
+    case '\f': return 'f';
+    case '\r': return 'r';
+    case ' ':  return 's';
+      /* \a, \v, \e and \d are excluded from printing as escapes since
+         they are somewhat rare as characters and more likely to be
+         plain integers. */
+    }
+  return 0;
+}
+
 static void
 print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
 {
@@ -1908,29 +1926,30 @@ print_object (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag)
     {
     case_Lisp_Int:
       {
-	int c;
-	intmax_t i;
+        EMACS_INT i = XFIXNUM (obj);
+        char escaped_name;
 
-	if (EQ (Vinteger_output_format, Qt) && CHARACTERP (obj)
-	    && (c = XFIXNUM (obj)))
+	if (print_integers_as_characters && i >= 0 && i <= MAX_UNICODE_CHAR
+            && ((escaped_name = named_escape (i))
+                || printable_base_p (i)))
 	  {
 	    printchar ('?', printcharfun);
-	    if (escapeflag
-		&& (c == ';' || c == '(' || c == ')' || c == '{' || c == '}'
-		    || c == '[' || c == ']' || c == '\"' || c == '\'' || c == '\\'))
+            if (escaped_name)
+              {
+                printchar ('\\', printcharfun);
+                i = escaped_name;
+              }
+            else if (escapeflag
+                     && (i == ';' || i == '\"' || i == '\'' || i == '\\'
+                         || i == '(' || i == ')'
+                         || i == '{' || i == '}'
+                         || i == '[' || i == ']'))
 	      printchar ('\\', printcharfun);
-	    printchar (c, printcharfun);
-	  }
-	else if (INTEGERP (Vinteger_output_format)
-		 && integer_to_intmax (Vinteger_output_format, &i)
-		 && i == 16 && !NILP (Fnatnump (obj)))
-	  {
-	    int len = sprintf (buf, "#x%"pI"x", (EMACS_UINT) XFIXNUM (obj));
-	    strout (buf, len, len, printcharfun);
+	    printchar (i, printcharfun);
 	  }
 	else
 	  {
-	    int len = sprintf (buf, "%"pI"d", XFIXNUM (obj));
+	    int len = sprintf (buf, "%"pI"d", i);
 	    strout (buf, len, len, printcharfun);
 	  }
       }
@@ -2270,12 +2289,12 @@ syms_of_print (void)
 that represents the number without losing information.  */);
   Vfloat_output_format = Qnil;
 
-  DEFVAR_LISP ("integer-output-format", Vinteger_output_format,
-	       doc: /* The format used to print integers.
-When t, print characters from integers that represent a character.
-When a number 16, print non-negative integers in the hexadecimal format.
-Otherwise, by default print integers in the decimal format.  */);
-  Vinteger_output_format = Qnil;
+  DEFVAR_BOOL ("print-integers-as-characters", print_integers_as_characters,
+	       doc: /* Non-nil means integers are printed using character syntax.
+Only printable characters, and control characters with named escape
+sequences such as newline, are printed this way.  Other integers,
+including those corresponding to raw bytes, are not affected.  */);
+  print_integers_as_characters = false;
 
   DEFVAR_LISP ("print-length", Vprint_length,
 	       doc: /* Maximum length of list to print before abbreviating.
diff --git a/test/src/print-tests.el b/test/src/print-tests.el
index 7b026b6b21..05b1e4e6e4 100644
--- a/test/src/print-tests.el
+++ b/test/src/print-tests.el
@@ -383,25 +383,28 @@ print-hash-table-test
       (let ((print-length 1))
         (format "%S" h))))))
 
-(print-tests--deftest print-integer-output-format ()
+(print-tests--deftest print-integers-as-characters ()
   ;; Bug#44155.
-  (let ((integer-output-format t)
-        (syms (list ?? ?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ ?Á)))
-    (should (equal (read (print-tests--prin1-to-string syms)) syms))
-    (should (equal (print-tests--prin1-to-string syms)
-                   (concat "(" (mapconcat #'prin1-char syms " ") ")"))))
-  (let ((integer-output-format t)
-        (syms (list -1 0 1 ?\120 4194175 4194176 (max-char) (1+ (max-char)))))
-    (should (equal (read (print-tests--prin1-to-string syms)) syms)))
-  (let ((integer-output-format 16)
-        (syms (list -1 0 1 most-positive-fixnum (1+ most-positive-fixnum))))
-    (should (equal (read (print-tests--prin1-to-string syms)) syms))
-    (should (equal (print-tests--prin1-to-string syms)
-                   (concat "(" (mapconcat
-                                (lambda (i)
-                                  (if (and (>= i 0) (<= i most-positive-fixnum))
-                                      (format "#x%x" i) (format "%d" i)))
-                                syms " ") ")")))))
+  (let* ((print-integers-as-characters t)
+         (chars '(?? ?\; ?\( ?\) ?\{ ?\} ?\[ ?\] ?\" ?\' ?\\ ?f ?~ ?Á 32
+                  ?\n ?\r ?\t ?\b ?\f ?\a ?\v ?\e ?\d))
+         (nums '(-1 -65 0 1 31 #x80 #x9f #x110000 #x3fff80 #x3fffff))
+         (nonprints '(#xd800 #xdfff #x030a #xffff #x200c))
+         (printed-chars (print-tests--prin1-to-string chars))
+         (printed-nums (print-tests--prin1-to-string nums))
+         (printed-nonprints (print-tests--prin1-to-string nonprints)))
+    (should (equal (read printed-chars) chars))
+    (should (equal
+             printed-chars
+             (concat
+              "(?? ?\\; ?\\( ?\\) ?\\{ ?\\} ?\\[ ?\\] ?\\\" ?\\' ?\\\\"
+              " ?f ?~ ?Á ?\\s ?\\n ?\\r ?\\t ?\\b ?\\f 7 11 27 127)")))
+    (should (equal (read printed-nums) nums))
+    (should (equal printed-nums
+                   "(-1 -65 0 1 31 128 159 1114112 4194176 4194303)"))
+    (should (equal (read printed-nonprints) nonprints))
+    (should (equal printed-nonprints
+                   "(55296 57343 778 65535 8204)"))))
 
 (provide 'print-tests)
 ;;; print-tests.el ends here
-- 
2.21.1 (Apple Git-122.3)


