unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
From: Noam Postavsky <npostavs@gmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: 32939@debbugs.gnu.org
Subject: bug#32939: 27.0.50; Don't reject confusable quotes, just fontify & mention them in error messages
Date: Mon, 29 Oct 2018 21:30:43 -0400	[thread overview]
Message-ID: <87woq0i5ks.fsf@gmail.com> (raw)
In-Reply-To: <838t2joh34.fsf@gnu.org> (Eli Zaretskii's message of "Sat, 27 Oct 2018 12:48:47 +0300")

[-- Attachment #1: Type: text/plain, Size: 583 bytes --]

Eli Zaretskii <eliz@gnu.org> writes:

> However, perhaps we should mention this behavior in "Basic Char
> Syntax" and the related use of this face in "Faces for Font Lock".
> WDYT?

I've tried adding some mentions of it in those places (unfilled to
simplify diff).  Though I'm not sure if it makes sense to talk about
this somewhat obscure case.

>> -(ert-deftest lread-tests--funny-quote-symbols ()

> Should we replace this test by one that tests the fontification?

Right, good point, added.  And I had missed disabling fontification for
backslash escaped characters, now fixed.


[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 11745 bytes --]

From b19dbd60af4282ab8dcd1512f6910da6671d8a4a Mon Sep 17 00:00:00 2001
From: Noam Postavsky <npostavs@gmail.com>
Date: Sat, 10 Mar 2018 18:12:55 -0500
Subject: [PATCH v3 2/2] Improve errors & warnings due to fancy quoted vars
 (Bug#32939)

Add some hints to the message for byte compiler free & unused variable
warnings, and 'void-variable' errors where the variable has confusable
quote characters in it.
* lisp/help.el (uni-confusables), uni-confusables-regexp): New
constants.
(help-command-error-confusable-suggestions): New function, added to
`command-error-function'.
(help-uni-confusable-suggestions): New function.
* lisp/emacs-lisp/bytecomp.el (byte-compile-variable-ref):
* lisp/emacs-lisp/cconv.el (cconv--analyze-use): Use it.

* lisp/emacs-lisp/lisp-mode.el
(lisp--match-confusable-symbol-character): New function.
(lisp-fdefs): Use it to fontify confusable characters with
font-lock-warning-face when they occur in symbol names.
* doc/lispref/modes.texi (Faces for Font Lock):
* doc/lispref/objects.texi (Basic Char Syntax): Recommend backslash
escaping of confusable characters, and mention new fontification.
* etc/NEWS: Announce the new fontification behavior.
* test/lisp/emacs-lisp/lisp-mode-tests.el (lisp-fontify-confusables):
New test.
---
 doc/lispref/modes.texi                  |  3 +-
 doc/lispref/objects.texi                |  7 ++++-
 etc/NEWS                                |  4 +++
 lisp/emacs-lisp/bytecomp.el             | 10 +++++--
 lisp/emacs-lisp/cconv.el                |  6 ++--
 lisp/emacs-lisp/lisp-mode.el            | 18 +++++++++++-
 lisp/help.el                            | 49 +++++++++++++++++++++++++++++++++
 test/lisp/emacs-lisp/lisp-mode-tests.el | 24 ++++++++++++++++
 8 files changed, 114 insertions(+), 7 deletions(-)

diff --git a/doc/lispref/modes.texi b/doc/lispref/modes.texi
index 49b7e1ea3f..243e535988 100644
--- a/doc/lispref/modes.texi
+++ b/doc/lispref/modes.texi
@@ -3190,7 +3190,8 @@ Faces for Font Lock
 @table @code
 @item font-lock-warning-face
 @vindex font-lock-warning-face
-for a construct that is peculiar, or that greatly changes the meaning of
+for a construct that is peculiar (e.g., an unescaped confusable quote
+in an Emacs Lisp symbol like @samp{‘foo}), or that greatly changes the meaning of
 other text, like @samp{;;;###autoload} in Emacs Lisp and @samp{#error}
 in C.
 
diff --git a/doc/lispref/objects.texi b/doc/lispref/objects.texi
index a0940032ee..f82ee7c49a 100644
--- a/doc/lispref/objects.texi
+++ b/doc/lispref/objects.texi
@@ -347,7 +347,12 @@ Basic Char Syntax
 characters.  However, you must add a backslash before any of the
 characters @samp{()[]\;"}, and you should add a backslash before any
 of the characters @samp{|'`#.,} to avoid confusing the Emacs commands
-for editing Lisp code.  You can also add a backslash before whitespace
+for editing Lisp code.  You should also add a backslash before Unicode
+characters which resemble the previously mentioned @acronym{ASCII}
+ones, to avoid confusing people reading your code.  Emacs will
+highlight some non-escaped commonly confused characters such as
+@samp{‘} to encourage this.
+                        You can also add a backslash before whitespace
 characters such as space, tab, newline and formfeed.  However, it is
 cleaner to use one of the easily readable escape sequences, such as
 @samp{\t} or @samp{\s}, instead of an actual whitespace character such
diff --git a/etc/NEWS b/etc/NEWS
index 3530e04467..395169253d 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1089,6 +1089,10 @@ and if the new behavior breaks your code please email
 32252@debbugs.gnu.org.  Because %o and %x can now format signed
 integers, they now support the + and space flags.
 
+** In Emacs Lisp mode, symbols with confusable quotes are highlighted.
+For example, the first character in '‘foo' would be highlighted in
+'font-lock-warning-face'.
+
 +++
 ** Omitting variables after '&optional' and '&rest' is now allowed.
 For example (defun foo (&optional)) is no longer an error.  This is
diff --git a/lisp/emacs-lisp/bytecomp.el b/lisp/emacs-lisp/bytecomp.el
index 8b67e1007d..31d589ab02 100644
--- a/lisp/emacs-lisp/bytecomp.el
+++ b/lisp/emacs-lisp/bytecomp.el
@@ -3407,7 +3407,10 @@ byte-compile-variable-ref
 		  (boundp var)
 		  (memq var byte-compile-bound-variables)
 		  (memq var byte-compile-free-references))
-	(byte-compile-warn "reference to free variable `%S'" var)
+        (let* ((varname (prin1-to-string var))
+               (suggestions (help-uni-confusable-suggestions varname)))
+          (byte-compile-warn "reference to free variable `%s'%s" varname
+                             (if suggestions (concat "\n  " suggestions) "")))
 	(push var byte-compile-free-references))
       (byte-compile-dynamic-variable-op 'byte-varref var))))
 
@@ -3423,7 +3426,10 @@ byte-compile-variable-set
 		  (boundp var)
 		  (memq var byte-compile-bound-variables)
 		  (memq var byte-compile-free-assignments))
-	(byte-compile-warn "assignment to free variable `%s'" var)
+        (let* ((varname (prin1-to-string var))
+               (suggestions (help-uni-confusable-suggestions varname)))
+          (byte-compile-warn "assignment to free variable `%s'%s" varname
+                             (if suggestions (concat "\n  " suggestions) "")))
 	(push var byte-compile-free-assignments))
       (byte-compile-dynamic-variable-op 'byte-varset var))))
 
diff --git a/lisp/emacs-lisp/cconv.el b/lisp/emacs-lisp/cconv.el
index 010026b416..b62c69868b 100644
--- a/lisp/emacs-lisp/cconv.el
+++ b/lisp/emacs-lisp/cconv.el
@@ -591,8 +591,10 @@ cconv--analyze-use
               (eq ?_ (aref (symbol-name var) 0))
 	      ;; As a special exception, ignore "ignore".
 	      (eq var 'ignored))
-       (byte-compile-warn "Unused lexical %s `%S'"
-                          varkind var)))
+       (let ((suggestions (help-uni-confusable-suggestions (symbol-name var))))
+         (byte-compile-warn "Unused lexical %s `%S'%s"
+                            varkind var
+                            (if suggestions (concat "\n  " suggestions) "")))))
     ;; If it's unused, there's no point converting it into a cons-cell, even if
     ;; it's captured and mutated.
     (`(,binder ,_ t t ,_)
diff --git a/lisp/emacs-lisp/lisp-mode.el b/lisp/emacs-lisp/lisp-mode.el
index f31b4d4dd2..619d525309 100644
--- a/lisp/emacs-lisp/lisp-mode.el
+++ b/lisp/emacs-lisp/lisp-mode.el
@@ -280,6 +280,19 @@ elisp--font-lock-backslash
          `(face ,font-lock-warning-face
                 help-echo "This \\ has no effect"))))
 
+(defun lisp--match-confusable-symbol-character  (limit)
+  ;; Match a confusable character within a Lisp symbol.
+  (catch 'matched
+    (while t
+      (if (re-search-forward uni-confusables-regexp limit t)
+          ;; Skip confusables which are backslash escaped, or inside
+          ;; strings or comments.
+          (save-match-data
+            (unless (or (eq (char-before (match-beginning 0)) ?\\)
+                        (nth 8 (syntax-ppss)))
+              (throw 'matched t)))
+        (throw 'matched nil)))))
+
 (let-when-compile
     ((lisp-fdefs '("defmacro" "defun"))
      (lisp-vdefs '("defvar"))
@@ -463,7 +476,10 @@ elisp--font-lock-backslash
            (3 'font-lock-regexp-grouping-construct prepend))
          (lisp--match-hidden-arg
           (0 '(face font-lock-warning-face
-               help-echo "Hidden behind deeper element; move to another line?")))
+                    help-echo "Hidden behind deeper element; move to another line?")))
+         (lisp--match-confusable-symbol-character
+          0 '(face font-lock-warning-face
+                    help-echo "Confusable character"))
          ))
       "Gaudy level highlighting for Emacs Lisp mode.")
 
diff --git a/lisp/help.el b/lisp/help.el
index 6e06fc9e1c..2cc28fca24 100644
--- a/lisp/help.el
+++ b/lisp/help.el
@@ -1489,6 +1489,55 @@ help--make-usage-docstring
     (help--docstring-quote (format "%S" (help--make-usage fn arglist)))))
 
 \f
+
+;; Just some quote-like characters for now.  TODO: generate this stuff
+;; from official Unicode data.
+(defconst uni-confusables
+  '((#x2018 . "'") ;; LEFT SINGLE QUOTATION MARK
+    (#x2019 . "'") ;; RIGHT SINGLE QUOTATION MARK
+    (#x201B . "'") ;; SINGLE HIGH-REVERSED-9 QUOTATION MARK
+    (#x201C . "\"") ;; LEFT DOUBLE QUOTATION MARK
+    (#x201D . "\"") ;; RIGHT DOUBLE QUOTATION MARK
+    (#x201F . "\"") ;; DOUBLE HIGH-REVERSED-9 QUOTATION MARK
+    (#x301E . "\"") ;; DOUBLE PRIME QUOTATION MARK
+    (#xFF02 . "'") ;; FULLWIDTH QUOTATION MARK
+    (#xFF07 . "'") ;; FULLWIDTH APOSTROPHE
+    ))
+
+(defconst uni-confusables-regexp
+  (concat "[" (mapcar #'car uni-confusables) "]"))
+
+(defun help-uni-confusable-suggestions (string)
+  "Return a message describing confusables in STRING."
+  (let ((i 0)
+        (confusables nil))
+    (while (setq i (string-match uni-confusables-regexp string i))
+      (let ((replacement (alist-get (aref string i) uni-confusables)))
+        (push (aref string i) confusables)
+        (setq string (replace-match replacement t t string))
+        (setq i (+ i (length replacement)))))
+    (when confusables
+      (format-message
+       (if (> (length confusables) 1)
+           "Found confusable characters: %s; perhaps you meant: `%s'?"
+         "Found confusable character: %s, perhaps you meant: `%s'?")
+       (mapconcat (lambda (c) (format-message "`%c'" c))
+                  confusables ", ")
+       string))))
+
+(defun help-command-error-confusable-suggestions (data _context _signal)
+  (pcase data
+    (`(void-variable ,var)
+     (let ((suggestions (help-uni-confusable-suggestions
+                         (symbol-name var))))
+       (when suggestions
+         (princ (concat "\n  " suggestions) t))))
+    (_ nil)))
+
+(add-function :after command-error-function
+              #'help-command-error-confusable-suggestions)
+
+\f
 (provide 'help)
 
 ;;; help.el ends here
diff --git a/test/lisp/emacs-lisp/lisp-mode-tests.el b/test/lisp/emacs-lisp/lisp-mode-tests.el
index 30f606d381..8339591740 100644
--- a/test/lisp/emacs-lisp/lisp-mode-tests.el
+++ b/test/lisp/emacs-lisp/lisp-mode-tests.el
@@ -20,6 +20,10 @@
 (require 'ert)
 (require 'cl-lib)
 (require 'lisp-mode)
+(require 'faceup)
+
+\f
+;;; Indentation
 
 (defconst lisp-mode-tests--correctly-indented-sexp "\
 \(a
@@ -256,7 +260,27 @@ lisp-mode-tests--correctly-indented-sexp
     (lisp-indent-line)
     (should (equal (buffer-string) "prompt> foo"))))
 
+\f
+;;; Fontification
 
+(ert-deftest lisp-fontify-confusables ()
+  "Unescaped 'smart quotes' should be fontified in `font-lock-warning-face'."
+  (with-temp-buffer
+    (dolist (ch
+             '(#x2018 ;; LEFT SINGLE QUOTATION MARK
+               #x2019 ;; RIGHT SINGLE QUOTATION MARK
+               #x201B ;; SINGLE HIGH-REVERSED-9 QUOTATION MARK
+               #x201C ;; LEFT DOUBLE QUOTATION MARK
+               #x201D ;; RIGHT DOUBLE QUOTATION MARK
+               #x201F ;; DOUBLE HIGH-REVERSED-9 QUOTATION MARK
+               #x301E ;; DOUBLE PRIME QUOTATION MARK
+               #xFF02 ;; FULLWIDTH QUOTATION MARK
+               #xFF07 ;; FULLWIDTH APOSTROPHE
+               ))
+      (insert (format "«w:%c»foo \\%cfoo\n" ch ch)))
+    (let ((faceup (buffer-string)))
+      (faceup-clean-buffer)
+      (should (faceup-test-font-lock-buffer 'emacs-lisp-mode faceup)))))
 
 (provide 'lisp-mode-tests)
 ;;; lisp-mode-tests.el ends here
-- 
2.11.0


  reply	other threads:[~2018-10-30  1:30 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-04 23:58 bug#32939: 27.0.50; Don't reject confusable quotes, just fontify & mention them in error messages Noam Postavsky
2018-10-05  8:44 ` Eli Zaretskii
2018-10-24 22:25   ` Noam Postavsky
2018-10-27  9:48     ` Eli Zaretskii
2018-10-30  1:30       ` Noam Postavsky [this message]
2018-10-30  7:12         ` Eli Zaretskii
2019-11-17  8:42           ` Lars Ingebrigtsen
2019-11-27 19:42             ` Noam Postavsky
2019-11-28 23:25               ` Noam Postavsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87woq0i5ks.fsf@gmail.com \
    --to=npostavs@gmail.com \
    --cc=32939@debbugs.gnu.org \
    --cc=eliz@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).