From mboxrd@z Thu Jan 1 00:00:00 1970
From: Matt Armstrong
Newsgroups: gmane.emacs.devel
Subject: Re: Addressing confusion over uninterned symbols
Date: Wed, 14 Apr 2021 16:06:35 -0700
Message-ID: <87sg3sa2kk.fsf@rfc20.org>
References: <87tuob402w.fsf@rfc20.org>
In-Reply-To: <87tuob402w.fsf@rfc20.org>
To: emacs-devel@gnu.org
List-Id: "Emacs development discussions."
Xref: news.gmane.io gmane.emacs.devel:268070
Mime-Version: 1.0
Content-Type: text/plain

I'll reword for brevity:

Are there reasons that macros like macroexp-let2* avoid using gensym? I
am assuming it is merely historical? I'd like to send patches in the
hopes of saving me and others some head scratching.

Matt Armstrong writes:

> Recently I came to realize that I have been routinely confused by Emacs
> macros that don't use 'gensym' or some equivalent. I have long taken a
> liking to running commands like emacs-lisp-macroexpand to debug my use
> of macros, but tend to get confused when the macros use merely
> 'make-symbol' instead of 'gensym'. I regularly run into situations
> where the uninterned symbols introduced by the macros aren't distinct
> from my own code. I also tend to expand macros and edebug the result,
> which often breaks unless `print-gensym' and `print-circle' are set,
> which is inconvenient and annoying.
>
> So, two questions.
>
> First, would patches to switch some of the lower level Emacs macros to
> 'gensym' be welcome? I'm thinking of those in macroexp.el itself. Or,
> are there reasons for those macros to continue to use plain old
> 'make-symbol'?
>
> Second, is there any interest in the package I wrote to effectively
> hack in a call to 'gensym' on behalf of all macros that don't appear to
> have done so themselves, where needed? I called it 'hacroexp' and I now
> use 'hacroexp-1' instead of 'emacs-lisp-macroexpand', and am generally
> happy with the result. See attached:
>
> ;;; hacroexp.el --- Hacked Humane Emacs Lisp Macro Expansion -*- lexical-binding: t; -*-
>
> ;; Copyright (C) 2021 Matt Armstrong
>
> ;; Author: Matt Armstrong
> ;; Keywords: lisp
>
> ;; This program is free software; you can redistribute it and/or modify
> ;; it under the terms of the GNU General Public License as published by
> ;; the Free Software Foundation, either version 3 of the License, or
> ;; (at your option) any later version.
>
> ;; This program is distributed in the hope that it will be useful,
> ;; but WITHOUT ANY WARRANTY; without even the implied warranty of
> ;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> ;; GNU General Public License for more details.
>
> ;; You should have received a copy of the GNU General Public License
> ;; along with this program. If not, see .
>
> ;;; Commentary:
>
> ;; This package provides replacements for several Emacs Lisp macro
> ;; expansion functions:
> ;;
> ;;   `hacroexpand-1' for `macroexpand-1'
> ;;   `hacroexpand-all' for `macroexpand-all'
> ;;   `hacroexp-1' for `emacs-lisp-macroexpand'
> ;;   `hacroexp-all' for `cl-prettyexpand' (sort-of)
> ;;
> ;; The latter two are interactive and suitable for use in
> ;; `emacs-lisp-mode'.
> ;;
> ;; The expanded forms produced by these functions aim to be clearer,
> ;; without requiring specific print-time configuration variables and
> ;; relatively obscure Elisp syntactic features. The problem solved centers
> ;; around an issue with the way uninterned symbols are printed.
> ;;
> ;; Consider the following form:
> ;;
> ;;   (let ((a (make-symbol "word"))
> ;;         (b (make-symbol "word")))
> ;;     (list a b a))
> ;;
> ;; After evaluation, using Emacs defaults, the form prints like this:
> ;;
> ;;   (word word word)
> ;;
> ;; It prints like a list of three 'word symbols, yet it is a list of two
> ;; different symbols, *neither* of them 'word. Setting `print-gensym' makes
> ;; things slightly more clear:
> ;;
> ;;   (#:word #:word #:word)
> ;;
> ;; With this, the #: read syntax tells the Lisp reader to read each symbol
> ;; without interning it. But this, too, is incomplete, because when read
> ;; each #:word will be a distinct, uninterned, symbol, which is different
> ;; from the original form. This second problem is solved by setting
> ;; `print-circle' non-nil, in which case we see:
> ;;
> ;;   (#1=#:word #:word #1#)
> ;;
> ;; Of the three, this last form is the only one that reads back into an
> ;; equivalent structure. Indeed, if you're happy printing expanded macros
> ;; by setting both `print-gensym' and `print-circle' non-nil, then you
> ;; don't need this package.
> ;;
> ;; This package provides a fourth option: replace symbols that have
> ;; potentially indistinct names with distinctly named ones. The
> ;; `hacroexp-substitute' function turns the above form into one that
> ;; prints as follows:
> ;;
> ;;   (word570 word569 word570)
> ;;
> ;; With this you can leave `print-gensym' and `print-circle' nil (their
> ;; defaults).
> ;;
> ;; What does this have to do with macros?
> ;;
> ;; Emacs Lisp macros often generate forms containing uninterned symbols.
> ;; This way the symbols cannot conflict with other symbols. For a discussion
> ;; about why this is important, see (info "(elisp)Problems with Macros").
> ;;
> ;; This is all well and good, but when a form's uninterned symbols do not
> ;; have unique names within Emacs, the resulting code can be confusing.
> ;;
> ;; Take this macro, taken directly from the Emacs Lisp manual:
> ;;
> ;;   (defmacro for (var from init to final do &rest body)
> ;;     "Execute a simple for loop: (for i from 1 to 10 do (print i))."
> ;;     (let ((tempvar (make-symbol "max")))
> ;;       `(let ((,var ,init)
> ;;              (,tempvar ,final))
> ;;          (while (<= ,var ,tempvar)
> ;;            ,@body
> ;;            (inc ,var)))))
> ;;
> ;; Now, let's use it in a plausible way:
> ;;
> ;;   (for min from 1 to 10 do
> ;;        (for max from 1 to 10 do
> ;;             (message "min %S max %S" min max)))
> ;;
> ;; Printing the result of `macroexpand-all' we see this:
> ;;
> ;;   (let ((min 1) (max 10))
> ;;     (while (<= min max)
> ;;       (let ((max 1) (max 10))
> ;;         (while (<= max max)
> ;;           (message "min %S max %S" min max)
> ;;           (inc max)))
> ;;       (inc min)))
> ;;
> ;; In this form there are three different symbols named "max", but they all
> ;; print the same. The result is confusion!
> ;;
> ;; One good solution is to change the macro code to use `gensym' instead of
> ;; `make-symbol' directly. `gensym' generates a plausibly unique name for
> ;; each new symbol, which tends to be easier to understand when printed.
> ;; However, using `gensym' over `make-symbol' is not an immediate solution
> ;; when debugging a program's use of macros in other packages. Further,
> ;; preferring `gensym' is not a universal practice even today. `gensym' did
> ;; not become part of core Elisp until Emacs 26. Some of the more commonly
> ;; used core Elisp macros still do not use it, not to mention all of the
> ;; non-GNU Elisp written to date.
> ;;
> ;; Continuing with the above example, setting both `print-gensym' and
> ;; `print-circle' to `t' results in an unambiguous representation of the
> ;; form.
> ;; We see:
> ;;
> ;;   (let ((min 1) (#1=#:max 10))
> ;;     (while (<= min #1#)
> ;;       (let ((max 1) (#2=#:max 10))
> ;;         (while (<= max #2#)
> ;;           (message "min %S max %S" min max)
> ;;           (inc max)))
> ;;       (inc min)))
> ;;
> ;; Now the uninterned symbols are clearly different from the interned
> ;; symbols, and even though there are two distinct uninterned symbols named
> ;; "max" their printed forms are distinct from each other. However, the
> ;; "circular" references to these variables (#1# and #2#) are hard to read.
> ;; We've gained correctness at the cost of readability. This can be
> ;; particularly confusing in larger examples.
> ;;
> ;; Using `hacroexpand-all' instead of `macroexpand-all', and setting
> ;; `print-gensym' and `print-circle' back to nil, the form expands to:
> ;;
> ;;   (let ((min 1) (max497 10))
> ;;     (while (<= min max497)
> ;;       (let ((max 1) (max496 10))
> ;;         (while (<= max max496)
> ;;           (message "min %S max %S" min max)
> ;;           (inc max)))
> ;;       (inc min)))
> ;;
> ;; Each indistinct "max" symbol has been replaced with a distinct one
> ;; created with `gensym'. The three different "max" symbols now have
> ;; distinct names. The result is clear regardless of how the print-*
> ;; variables are set, and can be inspected visually and even debugged
> ;; without confusion.
>
> ;;; Code:
>
> (require 'cl-seq)
>
> (defun hacroexp--indistinct-symbol-p (symbol)
>   "Return non-nil if SYMBOL is indistinctly named.
> All interned symbols are considered distinct, as is nil, as is
> any uninterned symbol whose name ends in a digit. In the latter case
> the assumption is that the uninterned symbol is produced by `gensym'
> or an equivalent, and is likely unique despite being uninterned."
>   (and (symbolp symbol)
>        symbol ; i.e. not the nil symbol
>        (not (intern-soft symbol))
>        (not (string-match "[[:digit:]]$"
>                           (symbol-name symbol)))))
>
> (defun hacroexp--walk-tree (fn tree)
>   "Call FN for all atoms and cons cells in TREE."
>   (cl-subst-if t #'ignore tree :key fn))
>
> (defun hacroexp--indistinct-symbols (tree)
>   "Return all indistinctly named symbols in TREE.
> See `hacroexp--indistinct-symbol-p'."
>   (let ((indistinct-symbols nil))
>     (hacroexp--walk-tree
>      (lambda (symbol)
>        (when (and (hacroexp--indistinct-symbol-p symbol)
>                   (not (member symbol indistinct-symbols)))
>          (push symbol indistinct-symbols)))
>      tree)
>     indistinct-symbols))
>
> (defun hacroexp--make-humanized (symbol)
>   "Return a new uninterned symbol.
> The name is made by calling `gensym' with SYMBOL's name."
>   (gensym (symbol-name symbol)))
>
> (defun hacroexp-substitute (tree)
>   "Substitute distinctly named symbols in TREE.
> For each uninterned symbol with a potentially indistinct name,
> generate a new uninterned symbol with a distinct name and
> substitute all occurrences. Return a copy of TREE with all
> substitutions made."
>   (let* ((symbols (hacroexp--indistinct-symbols tree))
>          (humanized (mapcar #'hacroexp--make-humanized symbols)))
>     (cl-sublis
>      (cl-pairlis symbols humanized)
>      tree)))
>
> (defun hacroexpand-1 (form)
>   "Perform (at most) one step of macroexpansion.
> Returns a copy of FORM after expansion and substitution of
> potentially indistinct uninterned symbol names. See
> `macroexpand-1' and `hacroexp-substitute'."
>   (hacroexp-substitute (macroexpand-1 form)))
>
> (defun hacroexpand-all (form)
>   "Return the result of expanding macros at all levels in FORM.
> Returns a copy of FORM after expansion and substitution of
> potentially indistinct uninterned symbol names. See
> `macroexpand-all' and `hacroexp-substitute'."
>   (hacroexp-substitute (macroexpand-all form)))
>
> (defun hacroexp--interactive (expand-function)
>   "Macroexpand the form after point using EXPAND-FUNCTION."
>   (let* ((start (point))
>          (exp (read (current-buffer)))
>          ;; Compute it before, since it may signal errors.
>          (new (funcall expand-function exp)))
>     (if (equal exp new)
>         (message "Not a macro call, nothing to expand")
>       (delete-region start (point))
>       (let ((print-gensym nil)
>             (print-circle nil)
>             (print-quoted t)
>             (print-level nil)
>             (print-length nil)
>             (print-escape-newlines t))
>         ;; (require 'cl-extra)
>         ;; (cl-prettyprint new)
>         (pp new (current-buffer)))
>       (if (bolp) (delete-char -1))
>       (indent-region start (point)))))
>
> (defun hacroexp-1 ()
>   "Replace the form after point with its macro expansion.
> Perform at most one expansion step. This works like
> `emacs-lisp-macroexpand', but uses heuristics to replace
> indistinct names of uninterned symbols with distinct ones. See also
> `hacroexp-all'."
>   (interactive)
>   (hacroexp--interactive #'hacroexpand-1))
>
> (defun hacroexp-all ()
>   "Replace the form after point with its macro expansion.
> Expand macros at all levels of the form. See also
> `hacroexp-1'."
>   (interactive)
>   (hacroexp--interactive #'hacroexpand-all))
>
> (provide 'hacroexp)
> ;;; hacroexp.el ends here
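
For illustration, here is a minimal sketch of the kind of change being
asked about, applied to the manual's `for' macro rather than to anything
in macroexp.el. The name `my-for' is made up for this example, the
ignored keyword arguments are given a leading underscore, and `setq'
stands in for the manual's `inc' so the snippet is self-contained. It
assumes Emacs 26 or later, where `gensym' is available in core.

```elisp
;;; -*- lexical-binding: t; -*-

;; Hypothetical variant of the manual's `for' macro; `my-for' is not an
;; existing Emacs macro.  The only substantive change is `gensym' in
;; place of `make-symbol'.
(defmacro my-for (var _from init _to final _do &rest body)
  "Execute a simple for loop: (my-for i from 1 to 10 do (print i))."
  (let ((tempvar (gensym "max")))   ; uninterned, but with a numbered name
    `(let ((,var ,init)
           (,tempvar ,final))
       (while (<= ,var ,tempvar)
         ,@body
         (setq ,var (1+ ,var))))))

;; Expanding a nested use: the two temporaries print under different
;; names even with `print-gensym' and `print-circle' left at nil.
(let ((print-gensym nil)
      (print-circle nil))
  (prin1-to-string
   (macroexpand-all
    '(my-for min from 1 to 3 do
       (my-for max from 1 to 3 do
         (message "min %S max %S" min max))))))
```

The numeric suffixes come from `gensym-counter', so the exact names vary
from session to session. The temporaries are still uninterned, so the
hygiene argument for `make-symbol' is unchanged.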