* Re: Addressing confusion over uninterned symbols
From: Matt Armstrong @ 2021-04-14 23:06 UTC
To: emacs-devel
I'll reword for brevity:
Are there reasons that macros like macroexp-let2* avoid using gensym? I
am assuming it is merely historical? I'd like to send patches in the
hopes of saving me and others some head scratching.
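
To make the difference concrete, here is a minimal sketch. It is an illustration only, not code from macroexp.el, and the function name `demo-symbol-makers' is hypothetical. It assumes Emacs 26 or later, where `gensym' is part of core Elisp:

```elisp
;; Hypothetical helper, for illustration only.
(defun demo-symbol-makers ()
  "Compare symbols made by `make-symbol' with symbols made by `gensym'."
  (let* ((a (make-symbol "val"))
         (b (make-symbol "val"))
         (g (gensym "val"))
         (h (gensym "val")))
    (list (eq a b)                                  ; distinct symbols
          (string= (symbol-name a) (symbol-name b)) ; identical names
          (string= (symbol-name g) (symbol-name h)) ; names differ by counter
          (intern-soft g))))                        ; still uninterned

(demo-symbol-makers) ; => (nil t nil nil)
```

Both functions return uninterned symbols; the only difference is that `gensym' appends a counter to the name, so the printed forms can be told apart.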
Matt Armstrong <matt@rfc20.org> writes:
> Recently I came to realize that I have been routinely confused by Emacs
> macros that don't use 'gensym' or some equivalent. I have long taken a
> liking to running commands like emacs-lisp-macroexpand to debug my use
> of macros, but tend to get confused when the macros use merely
> 'make-symbol' instead of 'gensym'. I regularly run into situations
> where the uninterned symbols introduced by the macros aren't distinct
> from my own code. I also tend to expand macros and edebug the result,
> which often breaks unless `print-gensym' and `print-circle' are set,
> which is inconvenient and annoying.
>
> So, two questions.
>
> First, would patches to switch some of the lower level Emacs macros to
> 'gensym' be welcome? I'm thinking of those in macroexp.el itself. Or,
> are there reasons for those macros to continue to use plain old
> 'make-symbol'?
>
> Second, is there any interest in the package I wrote to effectively
> hack in a call to 'gensym' on behalf of all macros that don't appear to
> have done so themselves, where needed? I called it 'hacroexp' and I now
> use 'hacroexp-1' instead of 'emacs-lisp-macroexpand', and am generally
> happy with the result. See attached:
>
> ;;; hacroexp.el --- Hacked Humane Emacs Lisp Macro Expansion -*- lexical-binding: t; -*-
>
> ;; Copyright (C) 2021 Matt Armstrong
>
> ;; Author: Matt Armstrong <matt@rfc20.org>
> ;; Keywords: lisp
>
> ;; This program is free software; you can redistribute it and/or modify
> ;; it under the terms of the GNU General Public License as published by
> ;; the Free Software Foundation, either version 3 of the License, or
> ;; (at your option) any later version.
>
> ;; This program is distributed in the hope that it will be useful,
> ;; but WITHOUT ANY WARRANTY; without even the implied warranty of
> ;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> ;; GNU General Public License for more details.
>
> ;; You should have received a copy of the GNU General Public License
> ;; along with this program. If not, see <https://www.gnu.org/licenses/>.
>
> ;;; Commentary:
>
> ;; This package provides replacements for several Emacs Lisp macro
> ;; expansion functions:
> ;;
> ;; `hacroexpand-1' for `macroexpand-1'
> ;; `hacroexpand-all' for `macroexpand-all'
> ;; `hacroexp-1' for `emacs-lisp-macroexpand'
> ;; `hacroexp-all' for `cl-prettyexpand' (sort-of)
> ;;
> ;; The latter two are interactive and suitable for use in
> ;; `emacs-lisp-mode'.
> ;;
> ;; The expanded forms produced by these functions aim to be more clear,
> ;; without requiring specific print time configuration variables and
> ;; relatively obscure Elisp syntactic features. The problem solved centers
> ;; around an issue with the way uninterned symbols are printed.
> ;;
> ;; Consider the following form:
> ;;
> ;; (let ((a (make-symbol "word"))
> ;;       (b (make-symbol "word")))
> ;;   (list a b a))
> ;;
> ;; After evaluation, using Emacs defaults, the form prints like this:
> ;;
> ;; (word word word)
> ;;
> ;; It prints like a list of three 'word symbols, yet it is a list of two
> ;; different symbols, *neither* of them 'word. Setting `print-gensym' makes
> ;; things slightly more clear:
> ;;
> ;; (#:word #:word #:word)
> ;;
> ;; With this, the #: read syntax tells the Lisp reader to read each symbol
> ;; without interning it. But this, too, is incomplete, because when read
> ;; each #:word will be a distinct, uninterned, symbol, which is different
> ;; from the original form. This second problem is solved by setting
> ;; `print-circle' non-nil, in which case we see:
> ;;
> ;; (#1=#:word #:word #1#)
> ;;
> ;; Of the three, this last form is the only one that reads back into an
> ;; equivalent structure. Indeed, if you're happy printing expanded macros
> ;; by setting both `print-gensym' and `print-circle' non-nil, then you
> ;; don't need this package.
> ;;
> ;; This package provides a fourth option: replace symbols with potentially
> ;; indistinct names with distinct ones. The `hacroexp-substitute'
> ;; function turns the above form into one that prints as follows:
> ;;
> ;; (word570 word569 word570)
> ;;
> ;; With this you can leave `print-gensym' and `print-circle' nil (their
> ;; defaults).
> ;;
> ;; What does this have to do with macros?
> ;;
> ;; Emacs Lisp macros often generate forms containing uninterned symbols.
> ;; This way the symbols cannot conflict with other symbols. For a discussion
> ;; about why this is important, see (info "(elisp)Problems with Macros").
> ;;
> ;; This is all well and good, but when a form's uninterned symbols do not
> ;; have unique names within Emacs, the resulting code can be confusing.
> ;;
> ;; Consider this macro, taken directly from the Emacs Lisp manual:
> ;;
> ;; (defmacro for (var from init to final do &rest body)
> ;;   "Execute a simple for loop: (for i from 1 to 10 do (print i))."
> ;;   (let ((tempvar (make-symbol "max")))
> ;;     `(let ((,var ,init)
> ;;            (,tempvar ,final))
> ;;        (while (<= ,var ,tempvar)
> ;;          ,@body
> ;;          (inc ,var)))))
> ;;
> ;; Now, let's use it in a plausible way:
> ;;
> ;; (for min from 1 to 10 do
> ;;   (for max from 1 to 10 do
> ;;     (message "min %S max %S" min max)))
> ;;
> ;; Printing the result of `macroexpand-all' we see this:
> ;;
> ;; (let ((min 1) (max 10))
> ;;   (while (<= min max)
> ;;     (let ((max 1) (max 10))
> ;;       (while (<= max max)
> ;;         (message "min %S max %S" min max)
> ;;         (inc max)))
> ;;     (inc min)))
> ;;
> ;; In this form there are three different symbols named "max", but they all
> ;; print the same. The result is confusion!
> ;;
> ;; One good solution is to change the macro code to use `gensym' instead of
> ;; `make-symbol' directly. `gensym' generates a plausibly unique name for
> ;; each new symbol, which tends to be easier to understand when printed.
> ;; However, using `gensym' over `make-symbol' is not an immediate solution
> ;; when debugging a program's use of macros in other packages. Further,
> ;; preferring `gensym' is not a universal practice even today. `gensym' did
> ;; not become part of core Elisp until Emacs 26. Some of the more commonly
> ;; used core Elisp macros still do not use it, not to mention all of the
> ;; non-GNU Elisp written to date.
> ;;
> ;; Continuing with the above example, setting both `print-gensym' and
> ;; `print-circle' to `t' results in an unambiguous representation of the
> ;; form. We see:
> ;;
> ;; (let ((min 1) (#1=#:max 10))
> ;;   (while (<= min #1#)
> ;;     (let ((max 1) (#2=#:max 10))
> ;;       (while (<= max #2#)
> ;;         (message "min %S max %S" min max)
> ;;         (inc max)))
> ;;     (inc min)))
> ;;
> ;; Now the uninterned symbols are clearly different from the interned
> ;; symbols, and even though there are two distinct uninterned symbols named
> ;; "max" their printed forms are distinct from each other. However, the
> ;; "circular" references to these variables (#1# and #2#) are hard to read.
> ;; We've gained correctness at the cost of readability. This can be particularly
> ;; confusing in larger examples.
> ;;
> ;; Using `hacroexpand-all' instead of `macroexpand-all', and setting
> ;; `print-gensym' and `print-circle' back to nil, the form expands to:
> ;;
> ;; (let ((min 1) (max497 10))
> ;;   (while (<= min max497)
> ;;     (let ((max 1) (max496 10))
> ;;       (while (<= max max496)
> ;;         (message "min %S max %S" min max)
> ;;         (inc max)))
> ;;     (inc min)))
> ;;
> ;; Each indistinct "max" symbol has been replaced with a distinct one
> ;; created with `gensym'. The three different "max" symbols now have
> ;; distinct names. The result is clear regardless of how the print-*
> ;; variables are set, and it can be inspected visually and even debugged
> ;; without confusion.
>
> ;;; Code:
>
> (require 'cl-lib)
>
> (defun hacroexp--indistinct-symbol-p (symbol)
>   "Return non-nil if SYMBOL is indistinctly named.
> All interned symbols are considered distinct, as is nil, as is
> any uninterned symbol ending in a digit. In the latter case the
> assumption is that the uninterned symbol is produced by `gensym'
> or an equivalent, and is likely unique despite being uninterned."
>   (and (symbolp symbol)
>        symbol                      ; i.e. not the nil symbol
>        (not (intern-soft symbol))
>        (not (string-match "[[:digit:]]$"
>                           (symbol-name symbol)))))
>
> (defun hacroexp--walk-tree (fn tree)
>   "Call FN for all atoms and cons cells in TREE."
>   (cl-subst-if t #'ignore tree :key fn))
>
> (defun hacroexp--indistinct-symbols (tree)
>   "Return all indistinctly named symbols in TREE.
> See `hacroexp--indistinct-symbol-p'."
>   (let ((indistinct-symbols nil))
>     (hacroexp--walk-tree
>      (lambda (symbol)
>        (when (and (hacroexp--indistinct-symbol-p symbol)
>                   (not (member symbol indistinct-symbols)))
>          (push symbol indistinct-symbols)))
>      tree)
>     indistinct-symbols))
>
> (defun hacroexp--make-humanized (symbol)
>   "Return a new uninterned symbol.
> The name is made by calling `gensym' with SYMBOL's name."
>   (gensym (symbol-name symbol)))
>
> (defun hacroexp-substitute (tree)
>   "Substitute distinctly named symbols in TREE.
> For each uninterned symbol with a potentially indistinct name,
> generate a new uninterned symbol with a distinct name and
> substitute all occurrences. Return a copy of TREE with all
> substitutions made."
>   (let* ((symbols (hacroexp--indistinct-symbols tree))
>          (humanized (mapcar #'hacroexp--make-humanized symbols)))
>     (cl-sublis
>      (cl-pairlis symbols humanized)
>      tree)))
>
> (defun hacroexpand-1 (form)
>   "Perform (at most) one step of macroexpansion.
> Returns a copy of FORM after expansion and substitution of
> potentially indistinct uninterned symbol names. See
> `macroexpand-1' and `hacroexp-substitute'."
>   (hacroexp-substitute (macroexpand-1 form)))
>
> (defun hacroexpand-all (form)
>   "Return the result of expanding macros at all levels in FORM.
> Returns a copy of FORM after expansion and substitution of
> potentially indistinct uninterned symbol names. See
> `macroexpand-all' and `hacroexp-substitute'."
>   (hacroexp-substitute (macroexpand-all form)))
>
> (defun hacroexp--interactive (expand-function)
>   "Macroexpand the form after point using EXPAND-FUNCTION."
>   (let* ((start (point))
>          (exp (read (current-buffer)))
>          ;; Compute it before, since it may signal errors.
>          (new (funcall expand-function exp)))
>     (if (equal exp new)
>         (message "Not a macro call, nothing to expand")
>       (delete-region start (point))
>       (let ((print-gensym nil)
>             (print-circle nil)
>             (print-quoted t)
>             (print-level nil)
>             (print-length nil)
>             (print-escape-newlines t))
>         ;; (require 'cl-extra)
>         ;; (cl-prettyprint new)
>         (pp new (current-buffer)))
>       (if (bolp) (delete-char -1))
>       (indent-region start (point)))))
>
> (defun hacroexp-1 ()
>   "Replace the form after point with its macro expansion.
> Perform at most one expansion step. This works like
> `emacs-lisp-macroexpand', but uses heuristics to replace
> indistinct names of uninterned symbols with distinct ones. See also
> `hacroexp-all'."
>   (interactive)
>   (hacroexp--interactive #'hacroexpand-1))
>
> (defun hacroexp-all ()
>   "Replace the form after point with its full macro expansion.
> Expand macros at all levels of the form. See also
> `hacroexp-1'."
>   (interactive)
>   (hacroexp--interactive #'hacroexpand-all))
>
> (provide 'hacroexp)
> ;;; hacroexp.el ends here