From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail
From: Matt Armstrong <matt@rfc20.org>
Newsgroups: gmane.emacs.devel
Subject: Re: Addressing confusion over uninterned symbols
Date: Wed, 14 Apr 2021 16:06:35 -0700
Message-ID: <87sg3sa2kk.fsf@rfc20.org>
References: <87tuob402w.fsf@rfc20.org>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214";
	logging-data="2955"; mail-complaints-to="usenet@ciao.gmane.io"
To: emacs-devel@gnu.org
In-Reply-To: <87tuob402w.fsf@rfc20.org>
Xref: news.gmane.io gmane.emacs.devel:268070
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/268070>

I'll reword for brevity:

Is there a reason that macros like macroexp-let2* avoid using gensym?  I
assume it is merely historical.  I'd like to send patches in the hope
of saving myself and others some head scratching.
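
As a minimal sketch of the difference I mean (the macro and helper names
below are made up for illustration, nothing here is taken from
macroexp.el, and `gensym' needs Emacs 26 or later), compare two macros
that differ only in how they create their temporary:

```elisp
;;; -*- lexical-binding: t; -*-
;; Hypothetical example macros; only `make-symbol', `gensym' and
;; `macroexpand-all' are real Emacs functions.

(defmacro my-twice-make-symbol (form)
  "Evaluate FORM once and return a list holding its value twice.
The temporary variable is created with `make-symbol'."
  (let ((tmp (make-symbol "val")))
    `(let ((,tmp ,form))
       (list ,tmp ,tmp))))

(defmacro my-twice-gensym (form)
  "Like `my-twice-make-symbol', but the temporary comes from `gensym'."
  (let ((tmp (gensym "val")))
    `(let ((,tmp ,form))
       (list ,tmp ,tmp))))

(defun my-temp-names (macro)
  "Expand a nested call of MACRO and return its temporaries' printed names.
The result is a list (OUTER INNER) of strings, printed with
`print-gensym' and `print-circle' both nil."
  (let* ((print-gensym nil)
         (print-circle nil)
         ;; Expansion has the shape (let ((OUTER (let ((INNER 1)) ...))) ...).
         (outer (macroexpand-all `(,macro (,macro 1))))
         (inner (cadr (car (cadr outer)))))
    (list (prin1-to-string (caar (cadr outer)))
          (prin1-to-string (caar (cadr inner))))))
```

Calling (my-temp-names 'my-twice-make-symbol) gives two identical
strings even though the two temporaries are different symbols, while
(my-temp-names 'my-twice-gensym) gives two different strings, because
`gensym' appends its counter to the name.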


Matt Armstrong <matt@rfc20.org> writes:

> Recently I came to realize that I have been routinely confused by Emacs
> macros that don't use 'gensym' or some equivalent.  I have long taken a
> liking to running commands like emacs-lisp-macroexpand to debug my use
> of macros, but tend to get confused when the macros use merely
> 'make-symbol' instead of 'gensym'.  I regularly run into situations
> where the uninterned symbols introduced by the macros aren't distinct
> from my own code.  I also tend to expand macros and edebug the result,
> which often breaks unless `print-gensym' and `print-circle' are set,
> which is inconvenient and annoying.
>
> So, two questions.
>
> First, would patches to switch some of the lower level Emacs macros to
> 'gensym' be welcome?  I'm thinking of those in macroexp.el itself.  Or,
> are there reasons for those macros to continue to use plain old
> 'make-symbol'?
>
> Second, is there any interest in the package I wrote to effectively
> hack in a call to 'gensym' on behalf of all macros that don't appear to
> have done so themselves, where needed?  I called it 'hacroexp' and I now
> use 'hacroexp-1' instead of 'emacs-lisp-macroexpand', and am generally
> happy with the result.  See attached:
>
> ;;; hacroexp.el --- Hacked Humane Emacs Lisp Macro Expansion -*- lexical-binding: t; -*-
>
> ;; Copyright (C) 2021  Matt Armstrong
>
> ;; Author: Matt Armstrong <matt@rfc20.org>
> ;; Keywords: lisp
>
> ;; This program is free software; you can redistribute it and/or modify
> ;; it under the terms of the GNU General Public License as published by
> ;; the Free Software Foundation, either version 3 of the License, or
> ;; (at your option) any later version.
>
> ;; This program is distributed in the hope that it will be useful,
> ;; but WITHOUT ANY WARRANTY; without even the implied warranty of
> ;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> ;; GNU General Public License for more details.
>
> ;; You should have received a copy of the GNU General Public License
> ;; along with this program.  If not, see <https://www.gnu.org/licenses/>.
>
> ;;; Commentary:
>
> ;; This package provides replacements for several Emacs Lisp macro
> ;; expansion functions:
> ;;
> ;;     `hacroexpand-1' for `macroexpand-1'
> ;;     `hacroexpand-all' for `macroexpand-all'
> ;;     `hacroexp-1' for `emacs-lisp-macroexpand'
> ;;     `hacroexp-all' for `cl-prettyexpand' (sort-of)
> ;;
> ;; The latter two are interactive and suitable for use in
> ;; `emacs-lisp-mode'.
> ;;
> ;; The expanded forms produced by these functions aim to be more clear,
> ;; without requiring specific print time configuration variables and
> ;; relatively obscure Elisp syntactic features.  The problem solved centers
> ;; around an issue with the way uninterned symbols are printed.
> ;;
> ;; Consider the following form:
> ;;
> ;;    (let ((a (make-symbol "word"))
> ;;          (b (make-symbol "word")))
> ;;      (list a b a))
> ;;
> ;; After evaluation, using Emacs defaults, the form prints like this:
> ;;
> ;;    (word word word)
> ;;
> ;; It prints like a list of three 'word symbols, yet it is a list of two
> ;; different symbols *neither* of them 'word.  Setting `print-gensym' makes
> ;; things slightly more clear:
> ;;
> ;;    (#:word #:word #:word)
> ;;
> ;; With this, the #: read syntax tells the Lisp reader to read each symbol
> ;; without interning it.  But this, too, is incomplete, because when read
> ;; each #:word will be a distinct, uninterned, symbol, which is different
> ;; from the original form.  This second problem is solved by setting
> ;; `print-circle' non-nil, in which case we see:
> ;;
> ;;    (#1=#:word #:word #1#)
> ;;
> ;; Of the three, this last form is the only one that reads back into an
> ;; equivalent structure.  Indeed, if you're happy printing expanded macros
> ;; by setting both `print-gensym' and `print-circle' non-nil, then you
> ;; don't need this package.
> ;;
> ;; This package provides a fourth option: replace symbols with potentially
> ;; indistinct names with distinct ones.  The `hacroexp-substitute'
> ;; function turns the above form into one that prints as follows:
> ;;
> ;;     (word570 word569 word570)
> ;;
> ;; With this you can leave `print-gensym' and `print-circle' nil (their
> ;; defaults).
> ;;
> ;; What does this have to do with macros?
> ;;
> ;; Emacs Lisp macros often generate forms containing uninterned symbols.
> ;; This way the symbols cannot conflict with other symbols.  For a discussion
> ;; about why this is important, see (info "(elisp)Problems with Macros").
> ;;
> ;; This is all well and good, but when a form's uninterned symbols do not
> ;; have unique names within Emacs, the resulting code can be confusing.
> ;;
> ;; Consider this macro, taken directly from the Emacs Lisp manual:
> ;;
> ;;    (defmacro for (var from init to final do &rest body)
> ;;      "Execute a simple for loop: (for i from 1 to 10 do (print i))."
> ;;      (let ((tempvar (make-symbol "max")))
> ;;        `(let ((,var ,init)
> ;;               (,tempvar ,final))
> ;;           (while (<= ,var ,tempvar)
> ;;             ,@body
> ;;             (inc ,var)))))
> ;;
> ;; Now, let's use it in a plausible way:
> ;;
> ;;    (for min from 1 to 10 do
> ;;         (for max from 1 to 10 do
> ;;              (message "min %S max %S" min max)))
> ;;
> ;; Printing the result of `macroexpand-all' we see this:
> ;;
> ;;    (let ((min 1) (max 10))
> ;;      (while (<= min max)
> ;;        (let ((max 1) (max 10))
> ;;          (while (<= max max)
> ;;            (message "min %S max %S" min max)
> ;;            (inc max)))
> ;;        (inc min)))
> ;;
> ;; In this form there are three different symbols named "max", but they all
> ;; print the same.  The result is confusion!
> ;;
> ;; One good solution is to change the macro code to use `gensym' instead of
> ;; `make-symbol' directly.  `gensym' generates a plausibly unique name for
> ;; each new symbol, which tends to be easier to understand when printed.
> ;; However, using `gensym' over `make-symbol' is not an immediate solution
> ;; when debugging a program's use of macros in other packages.  Further,
> ;; preferring `gensym' is not a universal practice even today.  `gensym' did
> ;; not become part of core Elisp until Emacs 26.  Some of the more commonly
> ;; used core Elisp macros still do not use it, not to mention all of the
> ;; non-GNU Elisp written to date.
> ;;
> ;; Continuing with the above example, setting both `print-gensym' and
> ;; `print-circle' to `t' results in an unambiguous representation of the
> ;; form.  We see:
> ;;
> ;;    (let ((min 1) (#1=#:max 10))
> ;;      (while (<= min #1#)
> ;;        (let ((max 1) (#2=#:max 10))
> ;;          (while (<= max #2#)
> ;;            (message "min %S max %S" min max)
> ;;            (inc max)))
> ;;        (inc min)))
> ;;
> ;; Now the uninterned symbols are clearly different from the interned
> ;; symbols, and even though there are two distinct uninterned symbols named
> ;; "max" their printed forms are distinct from each other.  However, the
> ;; "circular" references to these variables (#1# and #2#) are hard to read.
> ;; We've gained correctness at the cost of readability.  This can be particularly
> ;; confusing in larger examples.
> ;;
> ;; Using `hacroexpand-all' instead of `macroexpand-all', and setting
> ;; `print-gensym' and `print-circle' back to nil, the form expands to:
> ;;
> ;;    (let ((min 1) (max497 10))
> ;;      (while (<= min max497)
> ;;        (let ((max 1) (max496 10))
> ;;          (while (<= max max496)
> ;;            (message "min %S max %S" min max)
> ;;            (inc max)))
> ;;        (inc min)))
> ;;
> ;; Each indistinct "max" symbol has been replaced with a distinct one
> ;; created with `gensym'.  The three different "max" symbols now have
> ;; distinct names.  The result is clear regardless of how the print-*
> ;; variables are set, and it can be inspected visually and even debugged
> ;; without confusion.
>
> ;;; Code:
>
> (require 'cl-lib)
>
> (defun hacroexp--indistinct-symbol-p (symbol)
>   "Return non-nil if SYMBOL is indistinctly named.
> All interned symbols are considered distinct, as is nil, as is
> any uninterned symbol ending in a digit.  In the latter case the
> assumption is that the uninterned symbol is produced by `gensym'
> or an equivalent, and is likely unique despite being uninterned."
>   (and (symbolp symbol)
>        symbol        ; i.e. not the nil symbol
>        (not (intern-soft symbol))
>        (not (string-match "[[:digit:]]$"
>                           (symbol-name symbol)))))
>
> (defun hacroexp--walk-tree (fn tree)
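>   "Call FN for all atoms and cons cells in TREE.
> FN is passed to `cl-subst-if' as the :key; because the predicate
> `ignore' never matches, nothing is substituted and only FN's side
> effects matter."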
>   (cl-subst-if t #'ignore tree :key fn))
>
> (defun hacroexp--indistinct-symbols (tree)
>   "Return all indistinctly named symbols in TREE.
> See `hacroexp--indistinct-symbol-p'."
>   (let ((indistinct-symbols nil))
>     (hacroexp--walk-tree
>      (lambda (symbol)
>        (when (and (hacroexp--indistinct-symbol-p symbol)
>                   (not (member symbol indistinct-symbols)))
>          (push symbol indistinct-symbols)))
>      tree)
>     indistinct-symbols))
>
> (defun hacroexp--make-humanized (symbol)
>   "Return a new uninterned symbol.
> The name is made by calling `gensym' with SYMBOL's name."
>   (gensym (symbol-name symbol)))
>
> (defun hacroexp-substitute (tree)
>   "Substitute distinctly named symbols in TREE.
> For each uninterned symbol with a potentially indistinct name,
> generate a new uninterned symbol with a distinct name and
> substitute all occurrences.  Return a copy of TREE with all
> substitutions made."
>   (let* ((symbols (hacroexp--indistinct-symbols tree))
>          (humanized (mapcar #'hacroexp--make-humanized symbols)))
>     (cl-sublis
>      (cl-pairlis symbols humanized)
>      tree)))
>
> (defun hacroexpand-1 (form)
>   "Perform (at most) one step of macroexpansion.
> Returns a copy of FORM after expansion and substitution of
> potentially indistinct uninterned symbol names.  See
> `macroexpand-1' and `hacroexp-substitute'."
>   (hacroexp-substitute (macroexpand-1 form)))
>
> (defun hacroexpand-all (form)
>   "Return the result of expanding macros at all levels in FORM.
> Returns a copy of FORM after expansion and substitution of
> potentially indistinct uninterned symbol names.  See
> `macroexpand-all' and `hacroexp-substitute'."
>   (hacroexp-substitute (macroexpand-all form)))
>
> (defun hacroexp--interactive (expand-function)
>   "Macroexpand the form after point using EXPAND-FUNCTION."
>   (let* ((start (point))
>          (exp (read (current-buffer)))
>          ;; Compute it before, since it may signal errors.
>          (new (funcall expand-function exp)))
>     (if (equal exp new)
>         (message "Not a macro call, nothing to expand")
>       (delete-region start (point))
>       (let ((print-gensym nil)
>             (print-circle nil)
>             (print-quoted t)
>             (print-level nil)
>             (print-length nil)
>             (print-escape-newlines t))
>          ;; (require 'cl-extra)
>          ;; (cl-prettyprint new)
>          (pp new (current-buffer)))
>       (if (bolp) (delete-char -1))
>       (indent-region start (point)))))
>
> (defun hacroexp-1 ()
>   "Replace the form after point with its macro expansion.
> Perform at most one expansion step.  This works like
> `emacs-lisp-macroexpand', but uses heuristics to replace
> indistinct names of uninterned symbols with distinct ones.  See
> also `hacroexp-all'."
>   (interactive)
>   (hacroexp--interactive #'hacroexpand-1))
>
> (defun hacroexp-all ()
>   "Replace the form after point with its macro expansion.
> Expand macros at all levels of the form.  See also
> `hacroexp-1'."
>   (interactive)
>   (hacroexp--interactive #'hacroexpand-all))
>
> (provide 'hacroexp)
> ;;; hacroexp.el ends here
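
For anyone who wants to check the read-back behaviour described in the
commentary above, here is a small sketch.  The function name
`my-demo-round-trip' and its arguments are made up for this example;
it relies only on `make-symbol', `prin1-to-string' and
`read-from-string'.

```elisp
;;; -*- lexical-binding: t; -*-
;; Hypothetical helper for experimenting with `print-gensym' and
;; `print-circle'; not part of hacroexp.el.

(defun my-demo-round-trip (gensym-flag circle-flag)
  "Print the list (A B A) of uninterned symbols and read it back.
GENSYM-FLAG and CIRCLE-FLAG are used as the values of `print-gensym'
and `print-circle' while printing.  Return a list
\(PRINTED FIRST-EQ-THIRD FIRST-EQ-SECOND) describing the re-read list."
  (let* ((a (make-symbol "word"))
         (b (make-symbol "word"))
         (printed (let ((print-gensym gensym-flag)
                        (print-circle circle-flag))
                    (prin1-to-string (list a b a))))
         (reread (car (read-from-string printed))))
    (list printed
          ;; In the original list this is t: both elements are A.
          (eq (nth 0 reread) (nth 2 reread))
          ;; In the original list this is nil: A and B are distinct.
          (eq (nth 0 reread) (nth 1 reread)))))

;; Defaults: everything reads back as the one interned symbol `word'.
(cdr (my-demo-round-trip nil nil))   ; => (t t)
;; `print-gensym' only: every element reads back as a fresh symbol.
(cdr (my-demo-round-trip t nil))     ; => (nil nil)
;; Both set: the sharing in the original list is preserved.
(cdr (my-demo-round-trip t t))       ; => (t nil)
```

Only the last combination reproduces the (t nil) pattern of the
original list, which is the point the commentary makes about needing
both variables set.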