From: Alan Mackenzie
Newsgroups: gmane.emacs.bugs
Subject: bug#65051: internal_equal manipulates symbols with position without checking symbols-with-pos-enabled.
Date: Sun, 6 Aug 2023 15:02:22 +0000
To: Mattias Engdegård
Cc: 65051@debbugs.gnu.org, Eli Zaretskii, Stefan Monnier

Hello, Mattias.

On Sun, Aug 06, 2023 at 15:37:24 +0200, Mattias Engdegård wrote:
> On 5 Aug 2023 at 23:07, Alan Mackenzie wrote:

> > diff --git a/doc/lispref/symbols.texi b/doc/lispref/symbols.texi
> > index 34db0caf3a8..a828d303c04 100644
> > --- a/doc/lispref/symbols.texi
> > +++ b/doc/lispref/symbols.texi
> > @@ -784,9 +784,15 @@ Symbols with Position
> >  @cindex bare symbol
> >  A @dfn{symbol with position} is a symbol, the @dfn{bare symbol},
> >  together with an unsigned integer called the @dfn{position}. These
> > -objects are intended for use by the byte compiler, which records in
> > -them the position of each symbol occurrence and uses those positions
> > -in warning and error messages.
> > +objects are stored internally much like vectors

> Not sure why we want to say how they are stored here. They can be
> stored in bubble memory for all the user cares.

The point is, they are _not_ stored in the obarray. Eli specifically
asked me to clarify this point, yesterday.

> > , and don't themselves
> > +have entries in the obarray (though their bare symbols do;
> > +@pxref{Creating Symbols}).
> > +
> > +Symbols with position are for the use of the byte compiler, which
> > +records in them the position of each symbol occurrence and uses those
> > +positions in warning and error messages. They shouldn't normally be
> > +used otherwise. Doing so can cause unexpected results with basic
> > +Emacs functions such as @code{eq} and @code{equal}.
> >  The printed representation of a symbol with position uses the hash
> >  notation outlined in @ref{Printed Representation}. It looks like
> > @@ -798,11 +804,20 @@ Symbols with Position
> >  For most purposes, when the flag variable
> >  @code{symbols-with-pos-enabled} is non-@code{nil}, symbols with
> > -positions behave just as bare symbols do. For example, @samp{(eq
> > -# foo)} has a value @code{t} when that variable
> > -is set (but @code{nil} when it isn't set). Most of the time in Emacs this
> > -variable is @code{nil}, but the byte compiler binds it to @code{t}
> > -when it runs.
> > +positions behave just as their bare symbols would. For example,
> > +@samp{(eq # foo)} has a value @code{t} when the
> > +variable is set; likewise, @code{equal} will treat a symbol with
> > +position argument as its bare symbol.
> > +
> > +When @code{symbols-with-pos-enabled} is @code{nil}, any symbols with
> > +position continue to exist, but do not behave as symbols, or have the
> > +other useful properties outlined in the previous paragraph. @code{eq}
> > +returns @code{t} when given identical arguments, and @code{equal}
> > +returns @code{t} when given arguments with @code{equal} components.

> Since the components are bare symbols and fixnums, equality and
> identity for them are equivalent, right?

No. If there are two distinct SWPs with the same bare symbol and the
same position, they should be equal, but not eq. But the real point is
to contrast how equal and eq work when symbols-with-pos-enabled is nil
with when it is non-nil.

> > +
> > +Most of the time in Emacs @code{symbols-with-pos-enabled} is
> > +@code{nil}, but the byte compiler and the native compiler bind it to
> > +@code{t} when they run.
> >  Typically, symbols with position are created by the byte compiler
> >  calling the reader function @code{read-positioning-symbols}
> > @@ -820,7 +835,7 @@ Symbols with Position
> >  a symbol with position, ignoring the position.
> >  @end defvar
> > -@defun symbol-with-pos-p symbol.
> > +@defun symbol-with-pos-p symbol
> >  This function returns @code{t} if @var{symbol} is a symbol with
> >  position, @code{nil} otherwise.
> >  @end defun
> > diff --git a/src/fns.c b/src/fns.c
> > index bfd19e8c8f2..d47098c8791 100644
> > --- a/src/fns.c
> > +++ b/src/fns.c
> > @@ -2773,10 +2773,13 @@ internal_equal (Lisp_Object o1, Lisp_Object o2, enum equal_kind equal_kind,
> >    /* A symbol with position compares the contained symbol, and is
> >       `equal' to the corresponding ordinary symbol. */
> > -  if (SYMBOL_WITH_POS_P (o1))
> > -    o1 = SYMBOL_WITH_POS_SYM (o1);
> > -  if (SYMBOL_WITH_POS_P (o2))
> > -    o2 = SYMBOL_WITH_POS_SYM (o2);
> > +  if (symbols_with_pos_enabled)
> > +    {
> > +      if (SYMBOL_WITH_POS_P (o1))
> > +        o1 = SYMBOL_WITH_POS_SYM (o1);
> > +      if (SYMBOL_WITH_POS_P (o2))
> > +        o2 = SYMBOL_WITH_POS_SYM (o2);
> > +    }

> OK. This reduces the number of branches in the hot path for ordinary
> (non-sympos) code by one while adding one to sym-pos code, and that
> should be a fair trade-off. The new branch should be well-predicted but
> is still consuming resources.

I did some simple timings on the old and new code, and the new code is
not slower. See my post to Eli from yesterday evening [European time]
on the bug #65017 thread.

> >    if (BASE_EQ (o1, o2))
> >      return true;
> > @@ -2824,8 +2827,8 @@ internal_equal (Lisp_Object o1, Lisp_Object o2, enum equal_kind equal_kind,
> >          if (ASIZE (o2) != size)
> >            return false;
> > -        /* Compare bignums, overlays, markers, and boolvectors
> > -           specially, by comparing their values. */
> > +        /* Compare bignums, overlays, markers, boolvectors, and
> > +           symbols with position specially, by comparing their values. */
> >          if (BIGNUMP (o1))
> >            return mpz_cmp (*xbignum_val (o1), *xbignum_val (o2)) == 0;
> >          if (OVERLAYP (o1))
> > @@ -2857,6 +2860,13 @@ internal_equal (Lisp_Object o1, Lisp_Object o2, enum equal_kind equal_kind,
> >          if (TS_NODEP (o1))
> >            return treesit_node_eq (o1, o2);
> >  #endif
> > +        if (SYMBOL_WITH_POS_P(o1)) /* symbols_with_pos_enabled is false. */
> > +          return (internal_equal (XSYMBOL_WITH_POS (o1)->sym,
> > +                                  XSYMBOL_WITH_POS (o2)->sym,
> > +                                  equal_kind, depth + 1, ht)
> > +                  && internal_equal (XSYMBOL_WITH_POS (o1)->pos,
> > +                                     XSYMBOL_WITH_POS (o2)->pos,
> > +                                     equal_kind, depth + 1, ht));

> Why recurse here if the components are a bare symbol and a fixnum,
> respectively? Maybe in case they might somehow be something else?

The code in the patch prevents an error being thrown in such a case.
The code should run vanishingly rarely anyhow, so it shouldn't matter
much.

> >          /* Aside from them, only true vectors, char-tables, compiled
> >             functions, and fonts (font-spec, font-entity, font-object)
> > diff --git a/test/src/fns-tests.el b/test/src/fns-tests.el
> > index 79ae4393f40..9c09e4f0c33 100644
> > --- a/test/src/fns-tests.el
> > +++ b/test/src/fns-tests.el
> > @@ -98,6 +98,26 @@
> >    (should-not (equal-including-properties #("a" 0 1 (k "v"))
> >                                            #("b" 0 1 (k "v")))))
> > +(ert-deftest fns-tests-equal-symbols-with-position ()
> > +  "Test `eq' and `equal' on symbols with position."
> > +  (let ((foo1 (position-symbol 'foo 42))
> > +        (foo2 (position-symbol 'foo 666))
> > +        (foo3 (position-symbol 'foo 42)))
> > +    (let (symbols-with-pos-enabled)
> > +      (should (eq foo1 foo1))

> Thank you! There is nothing wrong with the coverage of these tests with
> respect to your changes.
>
> However we should make an effort to prevent the compiler from
> optimising (eq X X) -> t etc, which it is completely entitled to do,
> ....

Why? (eq X X) is t in all circumstances, whether X is a symbol, a cons
structure, or anything else. What am I missing here?

> .... and also test both the interpreted and compiled version of `eq`
> and `equal`.

They're the same code in both cases. I'm missing something here, too,
I think.

> The test bytecomp--eq-symbols-with-pos-enabled already does most of
> this for a different reason. Perhaps it can be extended to cover
> `equal` as well?

I don't have such a test in my repository anywhere. Are you sure you
wrote it right?

-- 
Alan Mackenzie (Nuremberg, Germany).