unofficial mirror of emacs-devel@gnu.org 
* Addressing confusion over uninterned symbols
@ 2021-04-12 22:20 Matt Armstrong
  2021-04-14 23:06 ` Matt Armstrong
  0 siblings, 1 reply; 3+ messages in thread
From: Matt Armstrong @ 2021-04-12 22:20 UTC (permalink / raw)
  To: emacs-devel


Recently I came to realize that I have been routinely confused by Emacs
macros that don't use 'gensym' or some equivalent.  I have long taken a
liking to running commands like emacs-lisp-macroexpand to debug my use
of macros, but tend to get confused when the macros use merely
'make-symbol' instead of 'gensym'.  I regularly run into situations
where the uninterned symbols introduced by the macros aren't distinct
from my own code.  I also tend to expand macros and edebug the result,
which often breaks unless `print-gensym' and `print-circle' are set,
which is inconvenient and annoying.
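
The difference can be seen in a small sketch. `my-demo-symbol-names' is a hypothetical name used only for illustration, and 'gensym' is assumed to be the core function available since Emacs 26:

```elisp
;; Two uninterned symbols made by `make-symbol' share a print name;
;; `gensym' appends the current value of `gensym-counter' to its
;; prefix, so successive calls produce different names.
(defun my-demo-symbol-names ()
  "Return (EQ-P SAME-NAME-P GENSYM-SAME-NAME-P) for freshly made symbols.
Hypothetical helper, for illustration only."
  (let ((a (make-symbol "tmp"))
        (b (make-symbol "tmp"))
        (g1 (gensym "tmp"))
        (g2 (gensym "tmp")))
    (list (eq a b)                                      ; distinct objects
          (string= (symbol-name a) (symbol-name b))     ; identical names
          (string= (symbol-name g1) (symbol-name g2))))) ; distinct names

(my-demo-symbol-names)  ; => (nil t nil)
```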

So, two questions.

First, would patches to switch some of the lower level Emacs macros to
'gensym' be welcome?  I'm thinking of those in macroexp.el itself.  Or,
are there reasons for those macros to continue to use plain old
'make-symbol'?

Second, is there any interest in the package I wrote, which effectively
hacks in a call to 'gensym' on behalf of all macros that don't appear to
have done so themselves, where needed?  I called it 'hacroexp' and I now
use 'hacroexp-1' instead of 'emacs-lisp-macroexpand', and am generally
happy with the result.  See attached:


[-- Attachment #2: hacroexp.el --]
[-- Type: application/emacs-lisp, Size: 10450 bytes --]


* Re: Addressing confusion over uninterned symbols
  2021-04-12 22:20 Addressing confusion over uninterned symbols Matt Armstrong
@ 2021-04-14 23:06 ` Matt Armstrong
  2021-04-14 23:16   ` Stefan Monnier
  0 siblings, 1 reply; 3+ messages in thread
From: Matt Armstrong @ 2021-04-14 23:06 UTC (permalink / raw)
  To: emacs-devel

I'll reword for brevity:

Are there reasons that macros like macroexp-let2* avoid using gensym?  I
am assuming it is merely historical?  I'd like to send patches in the
hopes of saving me and others some head scratching.
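
For concreteness, the kind of change in question can be sketched with a pair of hypothetical macros. `my-square-ms' and `my-square-gs' are made up for this illustration and are not the actual macroexp.el code:

```elisp
(defmacro my-square-ms (expr)
  "Square EXPR, evaluating it once.
The temporary symbol comes from `make-symbol'.  Hypothetical macro."
  (let ((tmp (make-symbol "tmp")))
    `(let ((,tmp ,expr)) (* ,tmp ,tmp))))

(defmacro my-square-gs (expr)
  "Like `my-square-ms', but the temporary symbol comes from `gensym'."
  (let ((tmp (gensym "tmp")))
    `(let ((,tmp ,expr)) (* ,tmp ,tmp))))

;; With the default print settings the first expansion prints a plain
;; `tmp', indistinguishable from a user variable of that name; the
;; second prints `tmp' followed by a counter value.
(let ((print-gensym nil)
      (print-circle nil))
  (list (prin1-to-string (macroexpand-1 '(my-square-ms (f))))
        (prin1-to-string (macroexpand-1 '(my-square-gs (f))))))
```

Both macros behave identically when evaluated; only the printed expansion differs.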



Matt Armstrong <matt@rfc20.org> writes:

> Recently I came to realize that I have been routinely confused by Emacs
> macros that don't use 'gensym' or some equivalent.  I have long taken a
> liking to running commands like emacs-lisp-macroexpand to debug my use
> of macros, but tend to get confused when the macros use merely
> 'make-symbol' instead of 'gensym'.  I regularly run into situations
> where the uninterned symbols introduced by the macros aren't distinct
> from my own code.  I also tend to expand macros and edebug the result,
> which often breaks unless `print-gensym' and `print-circle' are set,
> which is inconvenient and annoying.
>
> So, two questions.
>
> First, would patches to switch some of the lower level Emacs macros to
> 'gensym' be welcome?  I'm thinking of those in macroexp.el itself.  Or,
> are there reasons for those macros to continue to use plain old
> 'make-symbol'?
>
> Second, is there any interest in the package I wrote, which effectively
> hacks in a call to 'gensym' on behalf of all macros that don't appear to
> have done so themselves, where needed?  I called it 'hacroexp' and I now
> use 'hacroexp-1' instead of 'emacs-lisp-macroexpand', and am generally
> happy with the result.  See attached:
>
> ;;; hacroexp.el --- Hacked Humane Emacs Lisp Macro Expansion -*- lexical-binding: t; -*-
>
> ;; Copyright (C) 2021  Matt Armstrong
>
> ;; Author: Matt Armstrong <matt@rfc20.org>
> ;; Keywords: lisp
>
> ;; This program is free software; you can redistribute it and/or modify
> ;; it under the terms of the GNU General Public License as published by
> ;; the Free Software Foundation, either version 3 of the License, or
> ;; (at your option) any later version.
>
> ;; This program is distributed in the hope that it will be useful,
> ;; but WITHOUT ANY WARRANTY; without even the implied warranty of
> ;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> ;; GNU General Public License for more details.
>
> ;; You should have received a copy of the GNU General Public License
> ;; along with this program.  If not, see <https://www.gnu.org/licenses/>.
>
> ;;; Commentary:
>
> ;; This package provides replacements for several Emacs Lisp macro
> ;; expansion functions:
> ;;
> ;;     `hacroexpand-1' for `macroexpand-1'
> ;;     `hacroexpand-all' for `macroexpand-all'
> ;;     `hacroexp-1' for `emacs-lisp-macroexpand'
> ;;     `hacroexp-all' for `cl-prettyexpand' (sort-of)
> ;;
> ;; The latter two are interactive and suitable for use in
> ;; `emacs-lisp-mode'.
> ;;
> ;; The expanded forms produced by these functions aim to be more clear,
> ;; without requiring specific print time configuration variables and
> ;; relatively obscure Elisp syntactic features.  The problem solved centers
> ;; around an issue with the way uninterned symbols are printed.
> ;;
> ;; Consider the following form:
> ;;
> ;;    (let ((a (make-symbol "word"))
> ;;          (b (make-symbol "word")))
> ;;      (list a b a))
> ;;
> ;; After evaluation, using Emacs defaults, the form prints like this:
> ;;
> ;;    (word word word)
> ;;
> ;; It prints like a list of three 'word symbols, yet it is a list of two
> ;; different symbols, *neither* of them 'word.  Setting `print-gensym' makes
> ;; things slightly more clear:
> ;;
> ;;    (#:word #:word #:word)
> ;;
> ;; With this, the #: read syntax tells the Lisp reader to read each symbol
> ;; without interning it.  But this, too, is incomplete, because when read
> ;; each #:word will be a distinct, uninterned, symbol, which is different
> ;; from the original form.  This second problem is solved by setting
> ;; `print-circle' non-nil, in which case we see:
> ;;
> ;;    (#1=#:word #:word #1#)
> ;;
> ;; Of the three, this last form is the only one that reads back into an
> ;; equivalent structure.  Indeed, if you're happy printing expanded macros
> ;; by setting both `print-gensym' and `print-circle' non-nil, then you
> ;; don't need this package.
> ;;
> ;; This package provides a fourth option: replace symbols with potentially
> ;; indistinct names with distinct ones.  The `hacroexp-substitute'
> ;; function turns the above form into one that prints as follows:
> ;;
> ;;     (word570 word569 word570)
> ;;
> ;; With this you can leave `print-gensym' and `print-circle' nil (their
> ;; defaults).
> ;;
> ;; What does this have to do with macros?
> ;;
> ;; Emacs Lisp macros often generate forms containing uninterned symbols.
> ;; This way the symbols cannot conflict with other symbols.  For a discussion
> ;; about why this is important, see (info "(elisp)Problems with Macros").
> ;;
> ;; This is all well and good, but when a form's uninterned symbols do not
> ;; have unique names within Emacs, the resulting code can be confusing.
> ;;
> ;; Take this macro, taken directly from the Emacs Lisp manual:
> ;;
> ;;    (defmacro for (var from init to final do &rest body)
> ;;      "Execute a simple for loop: (for i from 1 to 10 do (print i))."
> ;;      (let ((tempvar (make-symbol "max")))
> ;;        `(let ((,var ,init)
> ;;               (,tempvar ,final))
> ;;           (while (<= ,var ,tempvar)
> ;;             ,@body
> ;;             (inc ,var)))))
> ;;
> ;; Now, let's use it in a plausible way:
> ;;
> ;;    (for min from 1 to 10 do
> ;;         (for max from 1 to 10 do
> ;;              (message "min %S max %S" min max)))
> ;;
> ;; Printing the result of `macroexpand-all' we see this:
> ;;
> ;;    (let ((min 1) (max 10))
> ;;      (while (<= min max)
> ;;        (let ((max 1) (max 10))
> ;;          (while (<= max max)
> ;;            (message "min %S max %S" min max)
> ;;            (inc max)))
> ;;        (inc min)))
> ;;
> ;; In this form there are three different symbols named "max", but they all
> ;; print the same.  The result is confusion!
> ;;
> ;; One good solution is to change the macro code to use `gensym' instead of
> ;; `make-symbol' directly.  `gensym' generates a plausibly unique name for
> ;; each new symbol, which tends to be easier to understand when printed.
> ;; However, using `gensym' over `make-symbol' is not an immediate solution
> ;; when debugging a program's use of macros in other packages.  Further,
> ;; preferring `gensym' is not a universal practice even today.  `gensym' did
> ;; not become part of core Elisp until Emacs 26.  Some of the more commonly
> ;; used core Elisp macros still do not use it, not to mention all of the
> ;; non-GNU Elisp written to date.
> ;;
> ;; Continuing with the above example, setting both `print-gensym' and
> ;; `print-circle' to `t' results in an unambiguous representation of the
> ;; form.  We see:
> ;;
> ;;    (let ((min 1) (#1=#:max 10))
> ;;      (while (<= min #1#)
> ;;        (let ((max 1) (#2=#:max 10))
> ;;          (while (<= max #2#)
> ;;            (message "min %S max %S" min max)
> ;;            (inc max)))
> ;;        (inc min)))
> ;;
> ;; Now the uninterned symbols are clearly different from the interned
> ;; symbols, and even though there are two distinct uninterned symbols named
> ;; "max" their printed forms are distinct from each other.  However, the
> ;; "circular" references to these variables (#1# and #2#) are hard to read.
> ;; We've gained correctness at the cost of readability.  This can be particularly
> ;; confusing in larger examples.
> ;;
> ;; Using `hacroexpand-all' instead of `macroexpand-all', and setting
> ;; `print-gensym' and `print-circle' back to nil, the form expands to:
> ;;
> ;;    (let ((min 1) (max497 10))
> ;;      (while (<= min max497)
> ;;        (let ((max 1) (max496 10))
> ;;          (while (<= max max496)
> ;;            (message "min %S max %S" min max)
> ;;            (inc max)))
> ;;        (inc min)))
> ;;
> ;; Each indistinct "max" symbol has been replaced with a distinct one
> ;; created with `gensym'.  The three different "max" symbols now have
> ;; distinct names.  The result is clear regardless of how the print-*
> ;; variables are set, and it can be inspected visually and even debugged
> ;; without confusion.
>
> ;;; Code:
>
> (require 'cl-seq)
>
> (defun hacroexp--indistinct-symbol-p (symbol)
>   "Return non-nil if SYMBOL is indistinctly named.
> All interned symbols are considered distinct, as is nil, as is
> any uninterned symbol ending in a digit.  In the latter case the
> assumption is that the uninterned symbol is produced by `gensym'
> or an equivalent, and is likely unique despite being uninterned."
>   (and (symbolp symbol)
>        symbol        ; i.e. not the nil symbol
>        (not (intern-soft symbol))
>        (not (string-match "[[:digit:]]$"
>                           (symbol-name symbol)))))
>
> (defun hacroexp--walk-tree (fn tree)
>   "Call FN for all atoms and cons cells in TREE."
>   (cl-subst-if t #'ignore tree :key fn))
>
> (defun hacroexp--indistinct-symbols (tree)
>   "Return all indistinctly named symbols in TREE.
> See `hacroexp--indistinct-symbol-p'."
>   (let ((indistinct-symbols nil))
>     (hacroexp--walk-tree
>      (lambda (symbol)
>        (when (and (hacroexp--indistinct-symbol-p symbol)
>                   (not (member symbol indistinct-symbols)))
>          (push symbol indistinct-symbols)))
>      tree)
>     indistinct-symbols))
>
> (defun hacroexp--make-humanized (symbol)
>   "Return a new uninterned symbol.
> The name is made by calling `gensym' with SYMBOL's name."
>   (gensym (symbol-name symbol)))
>
> (defun hacroexp-substitute (tree)
>   "Substitute distinctly named symbols in TREE.
> For each uninterned symbol with a potentially indistinct name,
> generate a new uninterned symbol with a distinct name and
> substitute all occurrences.  Return a copy of TREE with all
> substitutions made."
>   (let* ((symbols (hacroexp--indistinct-symbols tree))
>          (humanized (mapcar #'hacroexp--make-humanized symbols)))
>     (cl-sublis
>      (cl-pairlis symbols humanized)
>      tree)))
>
> (defun hacroexpand-1 (form)
>   "Perform (at most) one step of macroexpansion.
> Returns a copy of FORM after expansion and substitution of
> potentially indistinct uninterned symbol names.  See
> `macroexpand-1' and `hacroexp-substitute'."
>   (hacroexp-substitute (macroexpand-1 form)))
>
> (defun hacroexpand-all (form)
>   "Return the result of expanding macros at all levels in FORM.
> Returns a copy of FORM after expansion and substitution of
> potentially indistinct uninterned symbol names.  See
> `macroexpand-all' and `hacroexp-substitute'."
>   (hacroexp-substitute (macroexpand-all form)))
>
> (defun hacroexp--interactive (expand-function)
>   "Macroexpand the form after point using EXPAND-FUNCTION."
>   (let* ((start (point))
>          (exp (read (current-buffer)))
>          ;; Compute it before, since it may signal errors.
>          (new (funcall expand-function exp)))
>     (if (equal exp new)
>         (message "Not a macro call, nothing to expand")
>       (delete-region start (point))
>       (let ((print-gensym nil)
>             (print-circle nil)
>             (print-quoted t)
>             (print-level nil)
>             (print-length nil)
>             (print-escape-newlines t))
>          ;; (require 'cl-extra)
>          ;; (cl-prettyprint new)
>          (pp new (current-buffer)))
>       (if (bolp) (delete-char -1))
>       (indent-region start (point)))))
>
> (defun hacroexp-1 ()
>   "Replace the form after point with its macro expansion.
> Perform at most one expansion step.  This works like
> `emacs-lisp-macroexpand', but uses heuristics to replace
> indistinct names of uninterned symbols with distinct ones.  See also
> `hacroexp-all'."
>   (interactive)
>   (hacroexp--interactive #'hacroexpand-1))
>
> (defun hacroexp-all ()
>   "Replace the form after point with its full macro expansion.
> Expand macros at all levels of the form.  See also `hacroexp-1'."
>   (interactive)
>   (hacroexp--interactive #'hacroexpand-all))
>
> (provide 'hacroexp)
> ;;; hacroexp.el ends here
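
The substitution step at the heart of the file above can also be sketched on its own, without cl-seq. This is a simplified variant under stated assumptions, not the package's code: the names `my-rename--walk' and `my-rename-uninterned' are hypothetical, the digit-suffix heuristic is omitted so every uninterned symbol is renamed, and only cons cells are walked (vectors and other containers are left alone).

```elisp
(defun my-rename--walk (x table)
  "Return a copy of X with uninterned symbols replaced.
TABLE is an `eq' hash table mapping each original symbol to its
replacement, so repeated occurrences get the same new symbol.
Hypothetical helper, for illustration only."
  (cond ((consp x)
         (cons (my-rename--walk (car x) table)
               (my-rename--walk (cdr x) table)))
        ;; Non-nil symbol for which `intern-soft' returns nil:
        ;; an uninterned symbol.
        ((and x (symbolp x) (not (intern-soft x)))
         (or (gethash x table)
             (puthash x (gensym (symbol-name x)) table)))
        (t x)))

(defun my-rename-uninterned (tree)
  "Return a copy of TREE with each uninterned symbol given a gensym name."
  (my-rename--walk tree (make-hash-table :test #'eq)))

;; The first and third elements of the result are the same new symbol;
;; the second is a different one with a different name.
(let ((a (make-symbol "word"))
      (b (make-symbol "word")))
  (my-rename-uninterned (list a b a)))
```

The recursion on the cdr keeps the sketch short but can exceed `max-lisp-eval-depth' on very long lists.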




* Re: Addressing confusion over uninterned symbols
  2021-04-14 23:06 ` Matt Armstrong
@ 2021-04-14 23:16   ` Stefan Monnier
  0 siblings, 0 replies; 3+ messages in thread
From: Stefan Monnier @ 2021-04-14 23:16 UTC (permalink / raw)
  To: Matt Armstrong; +Cc: emacs-devel

> Are there reasons that macros like macroexp-let2* avoid using gensym?

The reason is/was to avoid cons'ing the corresponding strings (I
recommend `print-gensym` and `print-circle` for the rare cases where
you need to look at the macroexpanded code).
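
A minimal sketch of that recommendation, assuming a hypothetical helper named `my-print-readably` (not part of Emacs): with both variables bound, the printed text reads back into a structure with the same symbol sharing.

```elisp
(defun my-print-readably (form)
  "Return FORM printed with `print-gensym' and `print-circle' enabled.
Hypothetical helper, for illustration only."
  (let ((print-gensym t)
        (print-circle t))
    (prin1-to-string form)))

(let* ((a (make-symbol "word"))
       (b (make-symbol "word"))
       (copy (read (my-print-readably (list a b a)))))
  ;; After the round trip the first and third elements are still one
  ;; symbol, and the second is still a different one.
  (list (eq (nth 0 copy) (nth 2 copy))
        (eq (nth 0 copy) (nth 1 copy))))
;; => (t nil)
```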

Maybe we should have a variant of `print-gensym` which prints the
symbol's address along with the rest?


        Stefan



