From: Alan Mackenzie <acm@muc.de>
To: "Mattias Engdegård" <mattias.engdegard@gmail.com>
Cc: 65051@debbugs.gnu.org, Eli Zaretskii <eliz@gnu.org>,
	Stefan Monnier <monnier@iro.umontreal.ca>
Subject: bug#65051: internal_equal manipulates symbols with position without checking symbols-with-pos-enabled.
Date: Sun, 6 Aug 2023 15:02:22 +0000
Message-ID: <ZM-1_icwceg0gmPR@ACM>
In-Reply-To: <B33AD768-BF64-42A1-94DA-842F4D3784A7@gmail.com>

Hello, Mattias.

On Sun, Aug 06, 2023 at 15:37:24 +0200, Mattias Engdegård wrote:
> 5 aug. 2023 kl. 23.07 skrev Alan Mackenzie <acm@muc.de>:

> > diff --git a/doc/lispref/symbols.texi b/doc/lispref/symbols.texi
> > index 34db0caf3a8..a828d303c04 100644
> > --- a/doc/lispref/symbols.texi
> > +++ b/doc/lispref/symbols.texi
> > @@ -784,9 +784,15 @@ Symbols with Position
> > @cindex bare symbol
> > A @dfn{symbol with position} is a symbol, the @dfn{bare symbol},
> > together with an unsigned integer called the @dfn{position}.  These
> > -objects are intended for use by the byte compiler, which records in
> > -them the position of each symbol occurrence and uses those positions
> > -in warning and error messages.
> > +objects are stored internally much like vectors

> Not sure why we want to say how they are stored here. They can be
> stored in bubble memory for all the user cares.

The point is, they are _not_ stored in the obarray.  Eli specifically
asked me to clarify this point, yesterday.

> > , and don't themselves
> > +have entries in the obarray (though their bare symbols do;
> > +@pxref{Creating Symbols}).
> > +
> > +Symbols with position are for the use of the byte compiler, which
> > +records in them the position of each symbol occurrence and uses those
> > +positions in warning and error messages.  They shouldn't normally be
> > +used otherwise.  Doing so can cause unexpected results with basic
> > +Emacs functions such as @code{eq} and @code{equal}.

> > The printed representation of a symbol with position uses the hash
> > notation outlined in @ref{Printed Representation}.  It looks like
> > @@ -798,11 +804,20 @@ Symbols with Position

> > For most purposes, when the flag variable
> > @code{symbols-with-pos-enabled} is non-@code{nil}, symbols with
> > -positions behave just as bare symbols do.  For example, @samp{(eq
> > -#<symbol foo at 12345> foo)} has a value @code{t} when that variable
> > -is set (but @code{nil} when it isn't set).  Most of the time in Emacs this
> > -variable is @code{nil}, but the byte compiler binds it to @code{t}
> > -when it runs.
> > +positions behave just as their bare symbols would.  For example,
> > +@samp{(eq #<symbol foo at 12345> foo)} has a value @code{t} when the
> > +variable is set; likewise, @code{equal} will treat a symbol with
> > +position argument as its bare symbol.
> > +
> > +When @code{symbols-with-pos-enabled} is @code{nil}, any symbols with
> > +position continue to exist, but do not behave as symbols, or have the
> > +other useful properties outlined in the previous paragraph.  @code{eq}
> > +returns @code{t} when given identical arguments, and @code{equal}
> > +returns @code{t} when given arguments with @code{equal} components.

> Since the components are bare symbols and fixnums, equality and
> identity for them are equivalent, right?

No.  If there are two distinct SWPs with the same bare symbol and the
same position, they should be equal, but not eq.  But the real point is
to contrast how equal and eq work when symbols-with-pos-enabled is nil
with when it is non-nil.
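
To make that contrast concrete, here is a rough sketch in plain C.  It is
a toy model only: struct toy_symbol, struct toy_swp, toy_eq and toy_equal
are made-up names for illustration, not anything in src/fns.c or lisp.h,
and the flag is passed as a parameter rather than read from the real
symbols_with_pos_enabled variable.  It assumes the behaviour described in
the patched documentation above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for illustration; NOT Emacs's Lisp_Object layout.  */
struct toy_symbol { const char *name; };
struct toy_swp { const struct toy_symbol *sym; unsigned pos; };

/* Model of `eq' on two symbols with position.  With the flag set,
   each argument is treated as its bare symbol; with the flag clear,
   only object identity counts.  */
static bool
toy_eq (const struct toy_swp *a, const struct toy_swp *b, bool enabled)
{
  if (enabled)
    return a->sym == b->sym;
  return a == b;
}

/* Model of the patched `equal'.  With the flag set, the position is
   ignored; with the flag clear, both components must match.  */
static bool
toy_equal (const struct toy_swp *a, const struct toy_swp *b, bool enabled)
{
  if (enabled)
    return a->sym == b->sym;
  return a->sym == b->sym && a->pos == b->pos;
}

/* Mirrors foo1, foo2 and foo3 from the proposed test in the patch.  */
static void
toy_check (void)
{
  struct toy_symbol foo = { "foo" };
  struct toy_swp foo1 = { &foo, 42 };
  struct toy_swp foo2 = { &foo, 666 };
  struct toy_swp foo3 = { &foo, 42 };

  /* Flag clear: distinct objects, same components.  */
  assert (toy_eq (&foo1, &foo1, false));
  assert (!toy_eq (&foo1, &foo3, false));
  assert (toy_equal (&foo1, &foo3, false));
  assert (!toy_equal (&foo1, &foo2, false));

  /* Flag set: everything collapses to the bare symbol.  */
  assert (toy_eq (&foo1, &foo2, true));
  assert (toy_equal (&foo1, &foo2, true));
}
```

In this model foo1 and foo3 are the case I mean: equal but not eq while
the flag is clear, and both eq and equal once it is set.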

> > +
> > +Most of the time in Emacs @code{symbols-with-pos-enabled} is
> > +@code{nil}, but the byte compiler and the native compiler bind it to
> > +@code{t} when they run.

> > Typically, symbols with position are created by the byte compiler
> > calling the reader function @code{read-positioning-symbols}
> > @@ -820,7 +835,7 @@ Symbols with Position
> > a symbol with position, ignoring the position.
> > @end defvar

> > -@defun symbol-with-pos-p symbol.
> > +@defun symbol-with-pos-p symbol
> > This function returns @code{t} if @var{symbol} is a symbol with
> > position, @code{nil} otherwise.
> > @end defun
> > diff --git a/src/fns.c b/src/fns.c
> > index bfd19e8c8f2..d47098c8791 100644
> > --- a/src/fns.c
> > +++ b/src/fns.c
> > @@ -2773,10 +2773,13 @@ internal_equal (Lisp_Object o1, Lisp_Object o2, enum equal_kind equal_kind,

> >   /* A symbol with position compares the contained symbol, and is
> >      `equal' to the corresponding ordinary symbol.  */
> > -  if (SYMBOL_WITH_POS_P (o1))
> > -    o1 = SYMBOL_WITH_POS_SYM (o1);
> > -  if (SYMBOL_WITH_POS_P (o2))
> > -    o2 = SYMBOL_WITH_POS_SYM (o2);
> > +  if (symbols_with_pos_enabled)
> > +    {
> > +      if (SYMBOL_WITH_POS_P (o1))
> > +	o1 = SYMBOL_WITH_POS_SYM (o1);
> > +      if (SYMBOL_WITH_POS_P (o2))
> > +	o2 = SYMBOL_WITH_POS_SYM (o2);
> > +    }

> OK. This reduces the number of branches in the hot path for ordinary
> (non-sympos) code by one while adding one to sym-pos code, and that
> should be a fair trade-off. The new branch should be well-predicted but
> is still consuming resources.

I did some simple timings on the old and new code, and the new code is
not slower.  See my post to Eli from yesterday evening [European time] on
the bug #65017 thread.

> >   if (BASE_EQ (o1, o2))
> >     return true;
> > @@ -2824,8 +2827,8 @@ internal_equal (Lisp_Object o1, Lisp_Object o2, enum equal_kind equal_kind,
> > 	if (ASIZE (o2) != size)
> > 	  return false;

> > -	/* Compare bignums, overlays, markers, and boolvectors
> > -	   specially, by comparing their values.  */
> > +	/* Compare bignums, overlays, markers, boolvectors, and
> > +	   symbols with position specially, by comparing their values.  */
> > 	if (BIGNUMP (o1))
> > 	  return mpz_cmp (*xbignum_val (o1), *xbignum_val (o2)) == 0;
> > 	if (OVERLAYP (o1))
> > @@ -2857,6 +2860,13 @@ internal_equal (Lisp_Object o1, Lisp_Object o2, enum equal_kind equal_kind,
> > 	if (TS_NODEP (o1))
> > 	  return treesit_node_eq (o1, o2);
> > #endif
> > +	if (SYMBOL_WITH_POS_P(o1)) /* symbols_with_pos_enabled is false.  */
> > +	  return (internal_equal (XSYMBOL_WITH_POS (o1)->sym,
> > +				  XSYMBOL_WITH_POS (o2)->sym,
> > +				  equal_kind, depth + 1, ht)
> > +		  && internal_equal (XSYMBOL_WITH_POS (o1)->pos,
> > +				     XSYMBOL_WITH_POS (o2)->pos,
> > +				     equal_kind, depth + 1, ht));

> Why recurse here if the components are a bare symbol and a fixnum,
> respectively?

Maybe in case they might somehow be something else?  The code in the
patch prevents an error being thrown in such a case.  The code should
run vanishingly rarely anyhow, so it shouldn't matter much.

> > 	/* Aside from them, only true vectors, char-tables, compiled
> > 	   functions, and fonts (font-spec, font-entity, font-object)
> > diff --git a/test/src/fns-tests.el b/test/src/fns-tests.el
> > index 79ae4393f40..9c09e4f0c33 100644
> > --- a/test/src/fns-tests.el
> > +++ b/test/src/fns-tests.el
> > @@ -98,6 +98,26 @@
> >   (should-not (equal-including-properties #("a" 0 1 (k "v"))
> >                                           #("b" 0 1 (k "v")))))

> > +(ert-deftest fns-tests-equal-symbols-with-position ()
> > +  "Test `eq' and `equal' on symbols with position."
> > +  (let ((foo1 (position-symbol 'foo 42))
> > +        (foo2 (position-symbol 'foo 666))
> > +        (foo3 (position-symbol 'foo 42)))
> > +    (let (symbols-with-pos-enabled)
> > +      (should (eq foo1 foo1))

> Thank you! There is nothing wrong with the coverage of these tests with
> respect to your changes.

> However we should make an effort to prevent the compiler from
> optimising (eq X X) -> t etc, which it is completely entitled to doing,
> ....

Why?  (eq X X) is t in all circumstances, whether X is a symbol, a cons
structure, or anything else.  What am I missing, here?

> .... and also test both the interpreted and compiled version of `eq`
> and `equal`.

They're the same code in both cases.  I'm missing something here, too, I
think.

> The test bytecomp--eq-symbols-with-pos-enabled already does most of
> this for a different reason. Perhaps it can be extended to cover
> `equal` as well?

I don't have such a test in my repository anywhere.  Are you sure you
wrote it right?

-- 
Alan Mackenzie (Nuremberg, Germany).




