* Help please! To track down GC trying to free an already freed object.
@ 2019-04-02 11:25 Alan Mackenzie
  2019-04-02 15:04 ` Eli Zaretskii
  2019-04-02 19:09 ` Daniel Colascione
  0 siblings, 2 replies; 24+ messages in thread
From: Alan Mackenzie @ 2019-04-02 11:25 UTC (permalink / raw)
  To: emacs-devel

Hello, Emacs.

I get this problem after a recent merge of master into
/scratch/accurate-warning-pos (my branch where I'm trying to implement
correct source positions in the byte compiler's warning messages). This
was a large merge, including bringing in the portable dumper.

Emacs aborts at mark_object L+179 (in alloc.c), because a pseudovector
being freed already has type PVEC_FREE, i.e. has been freed already.
This object is a "symbol with position", a type of pseudovector which
doesn't yet exist outside of this scratch branch.

At a guess, I'm setting some data structure in the C code to a Lisp
structure containing this object, but failing to apply static protection
to this C variable. Or something like that.

This failure occurs during the byte compilation of .../lisp/registry.el
in a make or make bootstrap. The failure only occurs when this byte
compilation is started as -batch from the command line. So my use of
GDB is from the command line, not within a running Emacs.

With GDB, I can break at the creation of this symbol-with-position
object and again at its (first) freeing with this breakpoint:

    break setup_on_free_list if (v == 0x5555561d0450)

However, this isn't helping me to track down the Lisp object which
still references this symbol-with-position. I've tried to find the
address of Emacs's data segment, so as to be able to search through it
for 0x5555561d0455 in GDB, but this doesn't feel like a very useful
thing to do.

Could somebody who has experience in this sort of thing please suggest
how I might proceed with the debugging, or possibly offer me some other
sort of help or hints.

Thanks in advance!

-- 
Alan Mackenzie (Nuremberg, Germany).
^ permalink raw reply [flat|nested] 24+ messages in thread
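The message above breaks on 0x5555561d0450 but proposes searching memory for 0x5555561d0455. The two values are consistent if one is the raw address of the object and the other is the tagged word that refers to it. The sketch below only illustrates that arithmetic. It assumes a 64-bit build, a 3-bit type tag kept in the low bits of the word, and a vector-like tag value of 5; those values and all the names here (TAG_BITS, TAG_VECTORLIKE, tag_pointer, untag_pointer) are assumptions made for the illustration, not taken from the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: 3 low tag bits, vector-like tag value 5.
   These names and values are hypothetical, for illustration only.  */
enum { TAG_BITS = 3, TAG_VECTORLIKE = 5 };

/* Combine an 8-byte-aligned address with a type tag.  */
uintptr_t
tag_pointer (uintptr_t addr, unsigned tag)
{
  /* The low bits must be free, i.e. the address must be aligned.  */
  assert ((addr & ((1u << TAG_BITS) - 1)) == 0);
  return addr | tag;
}

/* Recover the raw address by clearing the tag bits.  */
uintptr_t
untag_pointer (uintptr_t word)
{
  return word & ~(uintptr_t) ((1u << TAG_BITS) - 1);
}
```

Under these assumptions, a search for references to the object has to look for the tagged word, while a breakpoint condition on a C pointer argument uses the raw address.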
* Re: Help please! To track down GC trying to free an already freed object. 2019-04-02 11:25 Help please! To track down GC trying to free an already freed object Alan Mackenzie @ 2019-04-02 15:04 ` Eli Zaretskii 2019-04-02 20:42 ` Alan Mackenzie 2019-04-02 19:09 ` Daniel Colascione 1 sibling, 1 reply; 24+ messages in thread From: Eli Zaretskii @ 2019-04-02 15:04 UTC (permalink / raw) To: Alan Mackenzie; +Cc: emacs-devel > Date: Tue, 2 Apr 2019 11:25:37 +0000 > From: Alan Mackenzie <acm@muc.de> > > With GDB, I can break at the creation of this symbol-with-position > object and again at its (first) freeing with this breakpoint: > > break setup_on_free_list if (v == 0x5555561d0450) > > . However, this isn't helping me to track down the Lisp object which > still references this symbol-with-position. I've tried to find the > address of Emacs's data segment, so as to be able to search through it > for 0x5555561d0455 in GDB, but this doesn't feel like a very useful > thing to do. > > Could somebody who has experience in this sort of thing please suggest > how I might proceed with the debugging, or possibly offer me some other > sort of help or hints. The usual method of debugging such problems is described in etc/DEBUG, it basically uses the last_marked[] array. You start with the object at last_marked[last_marked_index - 1], and go backwards (in circular manner), comparing the objects you find in the array with those you see in the call-stack frames that call mark_* functions. Just be very careful when you print the objects; e.g., never use 'pp', because the function it calls cannot handle marked objects. If you already tried this, please ask more specific questions. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Help please! To track down GC trying to free an already freed object.
  2019-04-02 15:04 ` Eli Zaretskii
@ 2019-04-02 20:42 ` Alan Mackenzie
  2019-04-03 4:43 ` Eli Zaretskii
  0 siblings, 1 reply; 24+ messages in thread
From: Alan Mackenzie @ 2019-04-02 20:42 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

Hello, Eli.

On Tue, Apr 02, 2019 at 18:04:22 +0300, Eli Zaretskii wrote:
> > Date: Tue, 2 Apr 2019 11:25:37 +0000
> > From: Alan Mackenzie <acm@muc.de>

> > With GDB, I can break at the creation of this symbol-with-position
> > object and again at its (first) freeing with this breakpoint:

> >     break setup_on_free_list if (v == 0x5555561d0450)

> > . However, this isn't helping me to track down the Lisp object which
> > still references this symbol-with-position. I've tried to find the
> > address of Emacs's data segment, so as to be able to search through it
> > for 0x5555561d0455 in GDB, but this doesn't feel like a very useful
> > thing to do.

> > Could somebody who has experience in this sort of thing please suggest
> > how I might proceed with the debugging, or possibly offer me some other
> > sort of help or hints.

> The usual method of debugging such problems is described in etc/DEBUG,

Apologies, I didn't see this. I read quite a bit of etc/DEBUG, but for
some reason completely missed the bit about GC problems.

> it basically uses the last_marked[] array. You start with the object
> at last_marked[last_marked_index - 1], and go backwards (in circular
> manner), comparing the objects you find in the array with those you
> see in the call-stack frames that call mark_* functions. Just be very
> careful when you print the objects; e.g., never use 'pp', because the
> function it calls cannot handle marked objects.

I'm having some difficulty seeing the entire last_marked array with GDB.
I will try to find a solution in the GDB manual.

> If you already tried this, please ask more specific questions.

No, I hadn't. I didn't know about last_marked.
I'll see if I can get further with its help. Thanks!

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Help please! To track down GC trying to free an already freed object.
  2019-04-02 20:42 ` Alan Mackenzie
@ 2019-04-03 4:43 ` Eli Zaretskii
  2019-04-04 18:57 ` Alan Mackenzie
  0 siblings, 1 reply; 24+ messages in thread
From: Eli Zaretskii @ 2019-04-03 4:43 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

> Date: Tue, 2 Apr 2019 20:42:37 +0000
> From: Alan Mackenzie <acm@muc.de>
> Cc: emacs-devel@gnu.org
>
> I'm having some difficulty seeing the entire last_marked array with GDB.
> I will try to find a solution in the GDB manual.

You want "set print elements unlimited", I think.

However, my recommendation is to examine the array one element at a
time, moving back to the previous one only when you understand what
the element you've looked at is and whether it is or isn't related to
the problem. Also, last_marked array is written cyclically, so you
may need to wrap around the index to see the objects in the right
order.

^ permalink raw reply [flat|nested] 24+ messages in thread
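The advice about wrapping the index when reading a cyclically written array can be made concrete with a toy model. This is a minimal sketch of a ring-buffer log and a backward walk over it, not the Emacs code: LOG_SIZE, mark_log, mark_log_index, log_mark and nth_most_recent are hypothetical names, and the tiny buffer size is chosen only so that the wrap-around is easy to follow.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a cyclic log of recently marked objects.
   LOG_SIZE, mark_log and mark_log_index are hypothetical names.  */
#define LOG_SIZE 8

static unsigned long mark_log[LOG_SIZE];
static size_t mark_log_index;   /* next slot to be written */

/* Record one object, wrapping to slot 0 when the end is reached.  */
void
log_mark (unsigned long obj)
{
  mark_log[mark_log_index++] = obj;
  if (mark_log_index == LOG_SIZE)
    mark_log_index = 0;
}

/* Return the entry written STEPS_BACK marks ago; 0 is the most recent.  */
unsigned long
nth_most_recent (size_t steps_back)
{
  assert (steps_back < LOG_SIZE);
  /* Adding LOG_SIZE before subtracting keeps the unsigned index from
     underflowing when the walk crosses slot 0.  */
  size_t slot = (mark_log_index + LOG_SIZE - 1 - steps_back) % LOG_SIZE;
  return mark_log[slot];
}
```

In this model the most recent entry sits just before the write index, and stepping back from slot 0 continues at the last slot. The same wrap-around has to be done by hand when inspecting such an array one element at a time in a debugger.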
* Re: Help please! To track down GC trying to free an already freed object.
  2019-04-03 4:43 ` Eli Zaretskii
@ 2019-04-04 18:57 ` Alan Mackenzie
  0 siblings, 0 replies; 24+ messages in thread
From: Alan Mackenzie @ 2019-04-04 18:57 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

Hello, Eli.

On Wed, Apr 03, 2019 at 07:43:22 +0300, Eli Zaretskii wrote:
> > Date: Tue, 2 Apr 2019 20:42:37 +0000
> > From: Alan Mackenzie <acm@muc.de>
> > Cc: emacs-devel@gnu.org

> > I'm having some difficulty seeing the entire last_marked array with GDB.
> > I will try to find a solution in the GDB manual.

> You want "set print elements unlimited", I think.

> However, my recommendation is to examine the array one element at a
> time, moving back to the previous one only when you understand what
> the element you've looked at is and whether it is or isn't related to
> the problem. Also, last_marked array is written cyclically, so you
> may need to wrap around the index to see the objects in the right
> order.

I've found the bug.

In the garbage collection, it's necessary for Qsymbols_with_pos_enabled
to be bound to nil. (That's the variable which enables symbols with
position). I had bound that variable to nil in Fgarbage_collect, not
noticing that there are calls to the C function garbage_collect which
bypass the primitive. This was the bug.

As a result, the pseudovector (Symbol "nil" at position 339) was caught
by a NILP, causing it not to get marked. So it got swept away, even
though it was still live.

So I've spent several days on this, but as a consolation I now know GDB
much better than I did before. ;-). My branch now builds successfully.

Thanks for all the help!

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply [flat|nested] 24+ messages in thread
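The failure described above, a nil test that looks through position wrappers while the collector is marking, can be reproduced in a toy model. The sketch below is not Emacs code: the object layout, the names (kind, object, nil_symbol, is_nil, mark) and the rule that a value testing as nil is skipped by the marker are all assumptions chosen to mirror the description in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy object model; none of these names are Emacs's own.  */
enum kind { KIND_SYMBOL, KIND_SYMBOL_WITH_POS };

struct object
{
  enum kind kind;
  struct object *bare;   /* for KIND_SYMBOL_WITH_POS: the wrapped symbol */
  bool marked;
};

static struct object nil_symbol = { KIND_SYMBOL, NULL, false };

/* Nil test that looks through a position wrapper only when
   POS_ENABLED is true.  */
bool
is_nil (const struct object *obj, bool pos_enabled)
{
  if (pos_enabled && obj->kind == KIND_SYMBOL_WITH_POS)
    obj = obj->bare;
  return obj == &nil_symbol;
}

/* Mark OBJ as live unless it tests as nil (assumed here to need no
   marking).  Returns true if OBJ was marked.  */
bool
mark (struct object *obj, bool pos_enabled)
{
  if (is_nil (obj, pos_enabled))
    return false;
  obj->marked = true;
  return true;
}
```

With the flag enabled, a separately allocated wrapper around nil tests as nil and is left unmarked, so a sweep would reclaim it while it is still referenced. With the flag off during collection, the wrapper is marked like any other object.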
* Re: Help please! To track down GC trying to free an already freed object. 2019-04-02 11:25 Help please! To track down GC trying to free an already freed object Alan Mackenzie 2019-04-02 15:04 ` Eli Zaretskii @ 2019-04-02 19:09 ` Daniel Colascione 2019-04-02 19:21 ` Eli Zaretskii 2019-04-02 20:24 ` Alan Mackenzie 1 sibling, 2 replies; 24+ messages in thread From: Daniel Colascione @ 2019-04-02 19:09 UTC (permalink / raw) To: Alan Mackenzie; +Cc: emacs-devel > Hello, Emacs. > > I get this problem after a recent merge of master into > /scratch/accurate-warning-pos (my branch where I'm trying to implement > correct source positions in the byte compiler's warning messages). This > was a large merge, including bringing in the portable dumper. > > Emacs aborts at mark_object L+179 (in alloc.c), because a pseudovector > being freed already has type PVEC_FREE, i.e. has been freed already. > This object is a "symbol with position", a type of pseudovector which > doesn't yet exist outside of this scratch branch. Out of curiosity, why do we need a new C-level type here? > At a guess, I'm setting some data structure in the C code to a Lisp > structure containing this object, but failing to apply static protection > to this C variable. Or something like that. > > This failure occurs during the byte compilation of .../lisp/registry.el > in a make or make bootstrap. The failure only occurs when this byte > compilation is started as -batch from the command line. So my use of > GDB is from the command line, not within a running Emacs. > > With GDB, I can break at the creation of this symbol-with-position > object and again at its (first) freeing with this breakpoint: > > break setup_on_free_list if (v == 0x5555561d0450) > > . However, this isn't helping me to track down the Lisp object which > still references this symbol-with-position. 
I've tried to find the > address of Emacs's data segment, so as to be able to search through it > for 0x5555561d0455 in GDB, but this doesn't feel like a very useful > thing to do. > > Could somebody who has experience in this sort of thing please suggest > how I might proceed with the debugging, or possibly offer me some other > sort of help or hints. > > Thanks in advance! rr is incredibly helpful for debugging this sort of problem. See https://rr-project.org/. You can record an rr session containing the crash, replay it, get to the crash, and then reverse-next, reverse-finish, and reverse-continue your way through the GC, running it in reverse until you find whatever it is that made mark_object on the dead object happen. Hardware watchpoints with rr are also very useful and work great in reverse mode: just use watch -l myvar and reverse-continue to see who last wrote a memory location, or use rwatch to see who last *read* a location. (The -l is important since it enables the use of hardware watchpoints.) ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Help please! To track down GC trying to free an already freed object. 2019-04-02 19:09 ` Daniel Colascione @ 2019-04-02 19:21 ` Eli Zaretskii 2019-04-02 20:46 ` Alan Mackenzie 2019-04-02 20:24 ` Alan Mackenzie 1 sibling, 1 reply; 24+ messages in thread From: Eli Zaretskii @ 2019-04-02 19:21 UTC (permalink / raw) To: Daniel Colascione; +Cc: acm, emacs-devel > Date: Tue, 2 Apr 2019 12:09:59 -0700 > From: "Daniel Colascione" <dancol@dancol.org> > Cc: emacs-devel@gnu.org > > rr is incredibly helpful for debugging this sort of problem. See > https://rr-project.org/. You can record an rr session containing the > crash, replay it, get to the crash, and then reverse-next, reverse-finish, > and reverse-continue your way through the GC, running it in reverse until > you find whatever it is that made mark_object on the dead object happen. GDB supports reverse execution as well, on some platforms. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Help please! To track down GC trying to free an already freed object. 2019-04-02 19:21 ` Eli Zaretskii @ 2019-04-02 20:46 ` Alan Mackenzie 2019-04-02 21:03 ` Daniel Colascione 2019-04-03 4:39 ` Eli Zaretskii 0 siblings, 2 replies; 24+ messages in thread From: Alan Mackenzie @ 2019-04-02 20:46 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Daniel Colascione, emacs-devel Hello, Eli. On Tue, Apr 02, 2019 at 22:21:26 +0300, Eli Zaretskii wrote: > > Date: Tue, 2 Apr 2019 12:09:59 -0700 > > From: "Daniel Colascione" <dancol@dancol.org> > > Cc: emacs-devel@gnu.org > > > > rr is incredibly helpful for debugging this sort of problem. See > > https://rr-project.org/. You can record an rr session containing the > > crash, replay it, get to the crash, and then reverse-next, reverse-finish, > > and reverse-continue your way through the GC, running it in reverse until > > you find whatever it is that made mark_object on the dead object happen. > GDB supports reverse execution as well, on some platforms. On my GNU/Linux system, I tried to run 'reverse-next', and got the error message: Target multi-thread does not support this command. . :-( I suppose I could reconfigure without multi threading, but then the bug (which is reproducible) probably wouldn't happen in the same place. -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Help please! To track down GC trying to free an already freed object. 2019-04-02 20:46 ` Alan Mackenzie @ 2019-04-02 21:03 ` Daniel Colascione 2019-04-03 4:39 ` Eli Zaretskii 1 sibling, 0 replies; 24+ messages in thread From: Daniel Colascione @ 2019-04-02 21:03 UTC (permalink / raw) To: Alan Mackenzie; +Cc: Eli Zaretskii, Daniel Colascione, emacs-devel > Hello, Eli. > > On Tue, Apr 02, 2019 at 22:21:26 +0300, Eli Zaretskii wrote: >> > Date: Tue, 2 Apr 2019 12:09:59 -0700 >> > From: "Daniel Colascione" <dancol@dancol.org> >> > Cc: emacs-devel@gnu.org >> > >> > rr is incredibly helpful for debugging this sort of problem. See >> > https://rr-project.org/. You can record an rr session containing the >> > crash, replay it, get to the crash, and then reverse-next, >> reverse-finish, >> > and reverse-continue your way through the GC, running it in reverse >> until >> > you find whatever it is that made mark_object on the dead object >> happen. > >> GDB supports reverse execution as well, on some platforms. > > On my GNU/Linux system, I tried to run 'reverse-next', and got the error > message: > > Target multi-thread does not support this command. > > . :-( I suppose I could reconfigure without multi threading, but then > the bug (which is reproducible) probably wouldn't happen in the same > place. I don't think I've ever gotten pure-GDB reverse execution to work correctly. rr Just Works for me in every instance I've tried it. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Help please! To track down GC trying to free an already freed object.
  2019-04-02 20:46 ` Alan Mackenzie
  2019-04-02 21:03 ` Daniel Colascione
@ 2019-04-03 4:39 ` Eli Zaretskii
  2019-04-03 10:01 ` Alan Mackenzie
  1 sibling, 1 reply; 24+ messages in thread
From: Eli Zaretskii @ 2019-04-03 4:39 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: dancol, emacs-devel

> Date: Tue, 2 Apr 2019 20:46:53 +0000
> From: Alan Mackenzie <acm@muc.de>
> Cc: Daniel Colascione <dancol@dancol.org>, emacs-devel@gnu.org
>
> > GDB supports reverse execution as well, on some platforms.
>
> On my GNU/Linux system, I tried to run 'reverse-next', and got the error
> message:
>
>     Target multi-thread does not support this command.

I think you are supposed to record the execution, and then say

  (gdb) target record-core

or

  (gdb) target record-btrace

before the reverse execution is available.

But I was always able to debug GC problems by using last_marked array.

^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Help please! To track down GC trying to free an already freed object. 2019-04-03 4:39 ` Eli Zaretskii @ 2019-04-03 10:01 ` Alan Mackenzie 2019-04-03 10:12 ` Eli Zaretskii 2019-04-03 15:23 ` Paul Eggert 0 siblings, 2 replies; 24+ messages in thread From: Alan Mackenzie @ 2019-04-03 10:01 UTC (permalink / raw) To: Eli Zaretskii; +Cc: dancol, emacs-devel Hello, Eli. On Wed, Apr 03, 2019 at 07:39:35 +0300, Eli Zaretskii wrote: > > Date: Tue, 2 Apr 2019 20:46:53 +0000 > > From: Alan Mackenzie <acm@muc.de> > > Cc: Daniel Colascione <dancol@dancol.org>, emacs-devel@gnu.org > > > GDB supports reverse execution as well, on some platforms. > > On my GNU/Linux system, I tried to run 'reverse-next', and got the error > > message: > > Target multi-thread does not support this command. > I think you are supposed to record the execution, and then say > (gdb) target record-core > or > (gdb) target record-btrace > before the reverse execution is available. Yes. I thought there was something missing. ;-) There's no mention of such recording in the GDB manual's "Reverse Execution" page, nor any cross reference to "Process Record and Replay" there. I'll try again and see if I can get it working. > But I was always able to debug GC problems by using last_marked array. The problem I think I'm up against is that the symbol-with-pos object is not being marked at a particular garbage_collect_1, and thus gets freed prematurely. I intend to get the hex values of the Lisp_Objects which constitute the list in which the symbol-with-pos is embedded and search for these in last_marked. Putting a conditional breakpoint on Fcons slows down Emacs somewhat. ;-) -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Help please! To track down GC trying to free an already freed object. 2019-04-03 10:01 ` Alan Mackenzie @ 2019-04-03 10:12 ` Eli Zaretskii 2019-04-03 15:23 ` Paul Eggert 1 sibling, 0 replies; 24+ messages in thread From: Eli Zaretskii @ 2019-04-03 10:12 UTC (permalink / raw) To: Alan Mackenzie; +Cc: dancol, emacs-devel > Date: Wed, 3 Apr 2019 10:01:13 +0000 > Cc: dancol@dancol.org, emacs-devel@gnu.org > From: Alan Mackenzie <acm@muc.de> > > The problem I think I'm up against is that the symbol-with-pos object is > not being marked at a particular garbage_collect_1, and thus gets freed > prematurely. > > I intend to get the hex values of the Lisp_Objects which constitute the > list in which the symbol-with-pos is embedded and search for these in > last_marked. Putting a conditional breakpoint on Fcons slows down Emacs > somewhat. ;-) GDB has memory-search commands, see the node "Searching Memory" in the GDB manual. Maybe this can help. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Help please! To track down GC trying to free an already freed object. 2019-04-03 10:01 ` Alan Mackenzie 2019-04-03 10:12 ` Eli Zaretskii @ 2019-04-03 15:23 ` Paul Eggert 1 sibling, 0 replies; 24+ messages in thread From: Paul Eggert @ 2019-04-03 15:23 UTC (permalink / raw) To: Alan Mackenzie, Eli Zaretskii; +Cc: dancol, emacs-devel Alan Mackenzie wrote: > There's no mention of > such recording in the GDB manual's "Reverse Execution" page, nor any > cross reference to "Process Record and Replay" there. I filed a bug report for that here: https://sourceware.org/bugzilla/show_bug.cgi?id=24417 ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Help please! To track down GC trying to free an already freed object.
  2019-04-02 19:09 ` Daniel Colascione
  2019-04-02 19:21 ` Eli Zaretskii
@ 2019-04-02 20:24 ` Alan Mackenzie
  2019-04-02 20:33 ` Daniel Colascione
  1 sibling, 1 reply; 24+ messages in thread
From: Alan Mackenzie @ 2019-04-02 20:24 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: emacs-devel

Hello, Daniel.

On Tue, Apr 02, 2019 at 12:09:59 -0700, Daniel Colascione wrote:
> > Hello, Emacs.

> > I get this problem after a recent merge of master into
> > /scratch/accurate-warning-pos (my branch where I'm trying to implement
> > correct source positions in the byte compiler's warning messages). This
> > was a large merge, including bringing in the portable dumper.

> > Emacs aborts at mark_object L+179 (in alloc.c), because a pseudovector
> > being freed already has type PVEC_FREE, i.e. has been freed already.
> > This object is a "symbol with position", a type of pseudovector which
> > doesn't yet exist outside of this scratch branch.

> Out of curiosity, why do we need a new C-level type here?

It's to help solve a bug in the byte compiler, which up until recently
was intractable. The byte compiler frequently (?usually) reports
incorrect line/column numbers in its warning messages. This is due to
the kludge it uses to keep track of them.

The only current candidate for a fix is for the reader, on a flag being
bound to non-nil, to return "symbols with position" rather than standard
symbols. The "position" associated with the symbol is its textual
offset from the beginning of the construct in the source file being read.

These symbols with position are implemented as pseudovectors with type
PVEC_SYMBOL_WITH_POS and behave as ordinary symbols for all purposes,
except for when a warning message is being output, when the position
supplies a correct file/line number for the message.

This works and works well. However it causes an unacceptable slowdown
in Emacs (around 8 - 15 per cent).
I'm working on a fix for this, and have made substantial progress.

The topic was discussed at length in emacs-devel starting November last
year in posts whose Subject: contained "scratch/accurate-warning-pos".

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Help please! To track down GC trying to free an already freed object.
  2019-04-02 20:24 ` Alan Mackenzie
@ 2019-04-02 20:33 ` Daniel Colascione
  2019-04-02 21:00 ` Alan Mackenzie
  0 siblings, 1 reply; 24+ messages in thread
From: Daniel Colascione @ 2019-04-02 20:33 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Daniel Colascione, emacs-devel

> Hello, Daniel.
>
> On Tue, Apr 02, 2019 at 12:09:59 -0700, Daniel Colascione wrote:
>> > Hello, Emacs.
>
>> > I get this problem after a recent merge of master into
>> > /scratch/accurate-warning-pos (my branch where I'm trying to implement
>> > correct source positions in the byte compiler's warning messages). This
>> > was a large merge, including bringing in the portable dumper.
>
>> > Emacs aborts at mark_object L+179 (in alloc.c), because a pseudovector
>> > being freed already has type PVEC_FREE, i.e. has been freed already.
>> > This object is a "symbol with position", a type of pseudovector which
>> > doesn't yet exist outside of this scratch branch.
>
>> Out of curiosity, why do we need a new C-level type here?
>
> It's to help solve a bug in the byte compiler, which up until recently
> was intractable. The byte compiler frequently (?usually) reports
> incorrect line/column numbers in its warning messages. This is due to
> the kludge it uses to keep track of them.
>
> The only current candidate for a fix is for the reader, on a flag being
> bound to non-nil, to return "symbols with position" rather than standard
> symbols. The "position" associated with the symbol is its textual
> offset from the beginning of the construct in the source file being read.

So if I read symbol foo from file1.el and symbol foo from file2.el, I get
two different symbol-with-location instances, each tagged with a different
source location? Do these symbol objects compare eq to each other?

^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Help please! To track down GC trying to free an already freed object.
  2019-04-02 20:33 ` Daniel Colascione
@ 2019-04-02 21:00 ` Alan Mackenzie
  2019-04-05 4:49 ` Alex
  0 siblings, 1 reply; 24+ messages in thread
From: Alan Mackenzie @ 2019-04-02 21:00 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: emacs-devel

Hello again, Daniel.

On Tue, Apr 02, 2019 at 13:33:02 -0700, Daniel Colascione wrote:
> > Hello, Daniel.

> > On Tue, Apr 02, 2019 at 12:09:59 -0700, Daniel Colascione wrote:
> >> > Hello, Emacs.

> >> > I get this problem after a recent merge of master into
> >> > /scratch/accurate-warning-pos (my branch where I'm trying to implement
> >> > correct source positions in the byte compiler's warning messages). This
> >> > was a large merge, including bringing in the portable dumper.

> >> > Emacs aborts at mark_object L+179 (in alloc.c), because a pseudovector
> >> > being freed already has type PVEC_FREE, i.e. has been freed already.
> >> > This object is a "symbol with position", a type of pseudovector which
> >> > doesn't yet exist outside of this scratch branch.

> >> Out of curiosity, why do we need a new C-level type here?

> > It's to help solve a bug in the byte compiler, which up until recently
> > was intractable. The byte compiler frequently (?usually) reports
> > incorrect line/column numbers in its warning messages. This is due to
> > the kludge it uses to keep track of them.

> > The only current candidate for a fix is for the reader, on a flag being
> > bound to non-nil, to return "symbols with position" rather than standard
> > symbols. The "position" associated with the symbol is its textual
> > offset from the beginning of the construct in the source file being read.

> So if I read symbol foo from file1.el and symbol foo from file2.el, I get
> two different symbol-with-location instances, each tagged with a different
> source location? Do these symbol objects compare eq to each other?

They do, yes.
Otherwise the byte compiler wouldn't work, as it frequently compares a
symbol-with-position with a constant ("ordinary") symbol using eq.

However, it is envisaged the flag symbols-with-pos-enable will be bound
to non-nil only by the byte compiler. The reader resets this position to
zero for each top-level form it reads.

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply [flat|nested] 24+ messages in thread
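The comparison behaviour described in the reply above can be sketched as a toy model: an identity test that, only while a flag is on, looks through position wrappers before comparing. This is an illustration under assumptions, not the branch's implementation. The struct layout and the names sym, strip_pos and toy_eq are hypothetical, and the flag is passed as an argument here purely to keep the example self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model; the struct and function names are hypothetical.  */
struct sym
{
  const char *name;
  struct sym *bare;   /* non-NULL if this is a positioned wrapper */
  long pos;           /* source offset; meaningful only for wrappers */
};

/* Return the underlying symbol, looking through one position wrapper.  */
const struct sym *
strip_pos (const struct sym *s)
{
  return s->bare ? s->bare : s;
}

/* Identity comparison.  With POS_ENABLED, wrappers compare as the
   symbols they wrap; otherwise only pointer identity counts.  */
bool
toy_eq (const struct sym *a, const struct sym *b, bool pos_enabled)
{
  if (pos_enabled)
    {
      a = strip_pos (a);
      b = strip_pos (b);
    }
  return a == b;
}
```

In this model a bare foo, foo at offset 339 and foo at offset 12 all compare equal while the flag is on, and compare as distinct objects once it is off. That matches the stated intent that only the byte compiler binds the flag to non-nil.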
* Re: Help please! To track down GC trying to free an already freed object. 2019-04-02 21:00 ` Alan Mackenzie @ 2019-04-05 4:49 ` Alex 2019-04-05 8:26 ` Alan Mackenzie 0 siblings, 1 reply; 24+ messages in thread From: Alex @ 2019-04-05 4:49 UTC (permalink / raw) To: Alan Mackenzie; +Cc: Daniel Colascione, emacs-devel Alan Mackenzie <acm@muc.de> writes: > Hello again, Daniel. > > On Tue, Apr 02, 2019 at 13:33:02 -0700, Daniel Colascione wrote: > >> So if I read symbol foo from file1.el and symbol foo from file2.el, I get >> two different symbol-with-location instances, each tagged with a different >> source location? Do these symbol objects compare eq to each other? > > They do, yes. Otherwise the byte compiler wouldn't work, as it > frequently compares a symbol-with-position with a constant ("ordinary") > symbol using eq. > > However, it is envisaged the flag symbols-with-pos-enable will be bound > to non-nil only by the byte compiler. The reader resets this position to > zero for each top-level form it reads. I apologize if this topic already reached its conclusion, but IMO having eq return true for two different object types is quite surprising behaviour. Is it out of the question to leave eq alone and introduce, e.g., eq-excluding-position that strips possible positions before comparison? ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Help please! To track down GC trying to free an already freed object. 2019-04-05 4:49 ` Alex @ 2019-04-05 8:26 ` Alan Mackenzie 2019-04-05 17:05 ` Comparing symbol-with-position using eq (was: Help please! To track down GC trying to free an already freed object.) Alex 0 siblings, 1 reply; 24+ messages in thread From: Alan Mackenzie @ 2019-04-05 8:26 UTC (permalink / raw) To: Alex; +Cc: Daniel Colascione, emacs-devel Hello, Alex. On Thu, Apr 04, 2019 at 22:49:22 -0600, Alex wrote: > Alan Mackenzie <acm@muc.de> writes: > > On Tue, Apr 02, 2019 at 13:33:02 -0700, Daniel Colascione wrote: > >> So if I read symbol foo from file1.el and symbol foo from file2.el, > >> I get two different symbol-with-location instances, each tagged with > >> a different source location? Do these symbol objects compare eq to > >> each other? > > They do, yes. Otherwise the byte compiler wouldn't work, as it > > frequently compares a symbol-with-position with a constant > > ("ordinary") symbol using eq. > > However, it is envisaged the flag symbols-with-pos-enable will be bound > > to non-nil only by the byte compiler. The reader resets this position to > > zero for each top-level form it reads. > I apologize if this topic already reached its conclusion, but IMO > having eq return true for two different object types is quite > surprising behaviour. We are comparing two symbols, both of which are 'foo, but one of which is annotated with its position in a source file. The two symbols are the same symbol. I understand the reaction to the idea, though. Even though the representation of these two objects is different, conceptually they are the same object. But consider: on a make bootstrap I did last night, there were 332 warning messages from the byte compiler. Of these, only 80 gave the correct line/column position, the other 252 being wrong. There have been several bug reports from users complaining about such false positions. This is what I'm trying to fix. 
> Is it out of the question to leave eq alone and introduce, e.g., > eq-excluding-position that strips possible positions before comparison? It is, rather. To implement this would involve rewriting everything which calls eq and is used by the byte compiler, to call eq-excluding-position instead. These functions would need to exist in two versions. There are rather a lot of functions which use eq. ;-) My actual strategy is to have two versions of each C primitive used by the byte compiler, and to switch over to the "symbol-with-position" version at the start of the byte compiler. -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 24+ messages in thread
* Comparing symbol-with-position using eq (was: Help please! To track down GC trying to free an already freed object.) 2019-04-05 8:26 ` Alan Mackenzie @ 2019-04-05 17:05 ` Alex 2019-04-05 18:21 ` Comparing symbol-with-position using eq Alan Mackenzie 0 siblings, 1 reply; 24+ messages in thread From: Alex @ 2019-04-05 17:05 UTC (permalink / raw) To: Alan Mackenzie; +Cc: Daniel Colascione, emacs-devel Hello, Alan. Alan Mackenzie <acm@muc.de> writes: > On Thu, Apr 04, 2019 at 22:49:22 -0600, Alex wrote: > >> I apologize if this topic already reached its conclusion, but IMO >> having eq return true for two different object types is quite >> surprising behaviour. > > We are comparing two symbols, both of which are 'foo, but one of which is > annotated with its position in a source file. The two symbols are the > same symbol. Is it not comparing a symbol with a pseudovector containing that symbol and a position? > I understand the reaction to the idea, though. Even though the > representation of these two objects is different, conceptually they are > the same object. Similar objects, but I don't believe that's enough for eq. Consider that it's regarded non-portable in Lisp to compare integers with eq since the same number may be represented by different objects, or (eq 3 3.0), or (eq (list 1 2) (list 1 2)). > But consider: on a make bootstrap I did last night, there were 332 > warning messages from the byte compiler. Of these, only 80 gave the > correct line/column position, the other 252 being wrong. There have been > several bug reports from users complaining about such false positions. > This is what I'm trying to fix. I agree that it's a problem very much worth fixing; thank you for working on it. >> Is it out of the question to leave eq alone and introduce, e.g., >> eq-excluding-position that strips possible positions before comparison? > > It is, rather. 
To implement this would involve rewriting everything > which calls eq and is used by the byte compiler, to call > eq-excluding-position instead. These functions would need to exist in > two versions. There are rather a lot of functions which use eq. ;-) Why would you need to rewrite the helper procedures that the byte compiler uses? What about stripping the position at each relevant call site? ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Comparing symbol-with-position using eq 2019-04-05 17:05 ` Comparing symbol-with-position using eq (was: Help please! To track down GC trying to free an already freed object.) Alex @ 2019-04-05 18:21 ` Alan Mackenzie 2019-04-05 20:18 ` Daniel Colascione 0 siblings, 1 reply; 24+ messages in thread From: Alan Mackenzie @ 2019-04-05 18:21 UTC (permalink / raw) To: Alex; +Cc: Daniel Colascione, emacs-devel Hello, Alex. On Fri, Apr 05, 2019 at 11:05:59 -0600, Alex wrote: > Hello, Alan. > Alan Mackenzie <acm@muc.de> writes: > > On Thu, Apr 04, 2019 at 22:49:22 -0600, Alex wrote: > >> I apologize if this topic already reached its conclusion, but IMO > >> having eq return true for two different object types is quite > >> surprising behaviour. > > We are comparing two symbols, both of which are 'foo, but one of which is > > annotated with its position in a source file. The two symbols are the > > same symbol. > Is it not comparing a symbol with a pseudovector containing that symbol > and a position? At the machine code level, that is what it's doing, yes. > > I understand the reaction to the idea, though. Even though the > > representation of these two objects is different, conceptually they are > > the same object. > Similar objects, but I don't believe that's enough for eq. Consider that > it's regarded non-portable in Lisp to compare integers with eq since the > same number may be represented by different objects, or (eq 3 3.0), or > (eq (list 1 2) (list 1 2)). The point is that comparing 'foo with (Symbol "foo" at 339) with `eq', and returning t doesn't do any harm. On the contrary, it enables correct source positions to be output in byte compiler warning messages. That it does no harm is verified by the fact that a make bootstrap with such annotated symbols works. However, there is a slight slowdown in this Emacs, compared with the master branch. 
The powers that be have intimated that this slowdown is unacceptable, so I'm having to make more far reaching changes in the C code to confine this slowdown to byte compilation. > > But consider: on a make bootstrap I did last night, there were 332 > > warning messages from the byte compiler. Of these, only 80 gave the > > correct line/column position, the other 252 being wrong. There have been > > several bug reports from users complaining about such false positions. > > This is what I'm trying to fix. > I agree that it's a problem very much worth fixing; thank you for > working on it. It's a difficult problem. The idea of annotating symbols with a source position (this was Stefan M.'s idea) is the only idea which has even come close to solving this problem. I was struggling with another approach back in 2016 which involved keeping the source location in a hash table indexed by the corresponding cons cell. This effort collapsed from the sheer tedium of the changes needed, coupled with the unlikelihood of getting the changes working, to say nothing of the fact it would have rendered the byte compiler unreadable. > >> Is it out of the question to leave eq alone and introduce, e.g., > >> eq-excluding-position that strips possible positions before comparison? > > It is, rather. To implement this would involve rewriting everything > > which calls eq and is used by the byte compiler, to call > > eq-excluding-position instead. These functions would need to exist in > > two versions. There are rather a lot of functions which use eq. ;-) > Why would you need to rewrite the helper procedures that the byte > compiler uses? What about stripping the position at each relevant call > site? I'm not sure what you mean here. If by "relevant call site" you mean "places where `eq' is used", there are just too many of them. They're in the C code as well as the Lisp. 
If you mean "places where the helper procedures are called", then stripping the positions would negate the whole point of the symbols with positions, since it is these helper procedures which output warning messages. Or did you mean something else? -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Comparing symbol-with-position using eq 2019-04-05 18:21 ` Comparing symbol-with-position using eq Alan Mackenzie @ 2019-04-05 20:18 ` Daniel Colascione 2019-04-05 21:54 ` Alan Mackenzie 0 siblings, 1 reply; 24+ messages in thread From: Daniel Colascione @ 2019-04-05 20:18 UTC (permalink / raw) To: Alan Mackenzie; +Cc: Daniel Colascione, Alex, emacs-devel > Hello, Alex. > > On Fri, Apr 05, 2019 at 11:05:59 -0600, Alex wrote: >> Hello, Alan. > >> Alan Mackenzie <acm@muc.de> writes: > >> > On Thu, Apr 04, 2019 at 22:49:22 -0600, Alex wrote: > >> >> I apologize if this topic already reached its conclusion, but IMO >> >> having eq return true for two different object types is quite >> >> surprising behaviour. > >> > We are comparing two symbols, both of which are 'foo, but one of which >> is >> > annotated with its position in a source file. The two symbols are the >> > same symbol. > >> Is it not comparing a symbol with a pseudovector containing that symbol >> and a position? > > At the machine code level, that is what it's doing, yes. > >> > I understand the reaction to the idea, though. Even though the >> > representation of these two objects is different, conceptually they >> are >> > the same object. > >> Similar objects, but I don't believe that's enough for eq. Consider that >> it's regarded non-portable in Lisp to compare integers with eq since the >> same number may be represented by different objects, or (eq 3 3.0), or >> (eq (list 1 2) (list 1 2)). > > The point is that comparing 'foo with (Symbol "foo" at 339) with `eq', > and returning t doesn't do any harm. On the contrary, it enables correct > source positions to be output in byte compiler warning messages. That it > does no harm is verified by the fact that a make bootstrap with such > annotated symbols works. > > However, there is a slight slowdown in this Emacs, compared with the > master branch. 
The powers that be have intimated that this slowdown is > unacceptable, so I'm having to make more far reaching changes in the C > code to confine this slowdown to byte compilation. I'm also concerned that by overloading eq this way we'll make it easy to "lose" information about positions. In general, when (eq a b), we can substitute a for b and vice versa. The objects are equivalent in the strongest sense. Now, they're not equivalent, and choosing a instead of b can lead to subtle bugs, especially since we're talking about error-path and warning-path code that might not be frequently exercised. You mention that we'd need to change the use of EQ throughout the byte compiler in order to work with positional symbols properly. Can we just do that, in one big renaming patch? In cases where we don't want positions, we can just define a macro making the new eq-for-position function equivalent to eq. But yes, it's kind of unfortunate that we haven't been using an explicit AST representation. ^ permalink raw reply [flat|nested] 24+ messages in thread
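Daniel's substitution concern above can be made concrete with a small model. This is only a sketch: it is not Emacs's real `Lisp_Object` or `EQ`, and the names `struct obj`, `strip_pos`, `lenient_eq` and `warning_pos` are hypothetical, chosen for illustration. It shows two objects that compare equal under a position-ignoring comparison yet answer differently when asked for a source position, so replacing one with the other silently drops information.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model, NOT Emacs's real Lisp_Object: an object is either a bare
   symbol or a wrapper holding a bare symbol plus a source position.  */
enum kind { BARE_SYMBOL, SYMBOL_WITH_POS };

struct obj
{
  enum kind kind;
  const struct obj *sym;  /* wrapped bare symbol; NULL for a bare symbol */
  long pos;               /* source position; unused for a bare symbol */
};

/* Return the bare symbol underlying O.  */
const struct obj *
strip_pos (const struct obj *o)
{
  return o->kind == SYMBOL_WITH_POS ? o->sym : o;
}

/* Position-ignoring comparison (hypothetical stand-in for a modified eq).  */
bool
lenient_eq (const struct obj *a, const struct obj *b)
{
  return strip_pos (a) == strip_pos (b);
}

/* Position to report in a warning, or -1 if none is known.  */
long
warning_pos (const struct obj *o)
{
  return o->kind == SYMBOL_WITH_POS ? o->pos : -1;
}
```

With a bare `foo` and a wrapper for `foo` at position 339, `lenient_eq` treats the two as the same, but only the wrapper can supply 339 to `warning_pos`. Code that picks either operand after a successful comparison may therefore end up with the one that has no position.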
* Re: Comparing symbol-with-position using eq 2019-04-05 20:18 ` Daniel Colascione @ 2019-04-05 21:54 ` Alan Mackenzie 2019-04-05 22:50 ` Paul Eggert 2019-04-06 12:23 ` Clément Pit-Claudel 0 siblings, 2 replies; 24+ messages in thread From: Alan Mackenzie @ 2019-04-05 21:54 UTC (permalink / raw) To: Daniel Colascione; +Cc: Alex, emacs-devel Hello, Daniel. On Fri, Apr 05, 2019 at 13:18:55 -0700, Daniel Colascione wrote: > > Hello, Alex. > > On Fri, Apr 05, 2019 at 11:05:59 -0600, Alex wrote: > >> Hello, Alan. > >> Alan Mackenzie <acm@muc.de> writes: > >> > On Thu, Apr 04, 2019 at 22:49:22 -0600, Alex wrote: > >> >> I apologize if this topic already reached its conclusion, but IMO > >> >> having eq return true for two different object types is quite > >> >> surprising behaviour. > >> > We are comparing two symbols, both of which are 'foo, but one of which > >> is > >> > annotated with its position in a source file. The two symbols are the > >> > same symbol. > >> Is it not comparing a symbol with a pseudovector containing that symbol > >> and a position? > > At the machine code level, that is what it's doing, yes. > >> > I understand the reaction to the idea, though. Even though the > >> > representation of these two objects is different, conceptually they > >> are > >> > the same object. > >> Similar objects, but I don't believe that's enough for eq. Consider that > >> it's regarded non-portable in Lisp to compare integers with eq since the > >> same number may be represented by different objects, or (eq 3 3.0), or > >> (eq (list 1 2) (list 1 2)). > > The point is that comparing 'foo with (Symbol "foo" at 339) with `eq', > > and returning t doesn't do any harm. On the contrary, it enables correct > > source positions to be output in byte compiler warning messages. That it > > does no harm is verified by the fact that a make bootstrap with such > > annotated symbols works. > > However, there is a slight slowdown in this Emacs, compared with the > > master branch. 
The powers that be have intimated that this slowdown is > > unacceptable, so I'm having to make more far reaching changes in the C > > code to confine this slowdown to byte compilation. > I'm also concerned that by overloading eq this way we'll make it easy to > "lose" information about positions. In general, when (eq a b), we can > substitute a for b and vice versa. You could still do that (not that you'd want to), and your code would still work up to the point where the byte compiler warning output wouldn't have a position to output, and would degrade to a less accurate position, in the limit not outputting a position at all. But this isn't going to happen in practice. A symbol with position is merely an annotated version of an ordinary symbol. It behaves identically to that ordinary symbol, provided only that the enabling flag, symbols-with-pos-enabled, is bound to non-nil. The normal way these annotated symbols come into existence is via the reader when a form is read with read-positioning-symbols (as contrasted with the standard read). All the details are in the code in branch scratch/accurate-warning-pos. > The objects are equivalent in the strongest sense. Now, they're not > equivalent, and choosing a instead of b can lead to subtle bugs, > especially since we're talking about error-path and warning-path code > that might not be frequently exercised. In the byte compiler, the warning path code is all too frequently exercised. ;-( But I've just found (and fixed) a subtle bug, which was what this thread was about. The fact that make bootstrap works with these annotated symbols is a very strong test. > You mention that we'd need to change the use of EQ throughout the byte > compiler in order to work with positional symbols properly. Can we just do > that, in one big renaming patch? I'm not quite sure what you mean here, but I think the answer's no. The byte compiler calls C primitives which use EQ. 
> In cases where we don't want positions, we can just define a macro > making the new eq-for-position function equivalent to eq. > But yes, it's kind of unfortunate that we haven't been using an explicit > AST representation. I've been thinking that for the time (nearly 3 years) that I've been trying to fix this bug. Is this how compilers for other Lisp systems are written? It seems horribly easy to compile as Emacs does, by taking the (read) starting form and gradually transforming it as a Lisp form. It is difficult to keep track of (text) source positions when one does this. -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 24+ messages in thread
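Alan's description of the enabling flag can be sketched roughly as follows. This is an assumption-laden model, not the code on scratch/accurate-warning-pos: `struct obj`, `bare` and `gated_eq` are hypothetical names, and the C variable `symbols_with_pos_enabled` merely stands in for the Lisp variable `symbols-with-pos-enabled` mentioned in the message. The idea illustrated is that the position-ignoring comparison is only attempted while the flag is set; otherwise the comparison is plain identity.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy object: HAS_POS distinguishes a position wrapper from a bare symbol.  */
struct obj
{
  bool has_pos;
  const struct obj *sym;  /* wrapped bare symbol when HAS_POS is true */
  long pos;               /* source position when HAS_POS is true */
};

/* Stand-in for the Lisp variable symbols-with-pos-enabled.  */
bool symbols_with_pos_enabled = false;

/* Return the bare symbol underlying O.  */
const struct obj *
bare (const struct obj *o)
{
  return o->has_pos ? o->sym : o;
}

/* Identity comparison that looks through position wrappers only while
   the flag is set (hypothetical; the real branch may differ).  */
bool
gated_eq (const struct obj *a, const struct obj *b)
{
  if (a == b)
    return true;                 /* fast path: same object */
  if (!symbols_with_pos_enabled)
    return false;                /* flag off: plain identity only */
  return bare (a) == bare (b);   /* flag on: ignore positions */
}
```

In this model a bare symbol and its positioned wrapper compare unequal while the flag is off and equal once it is on, which corresponds to binding the flag only for the duration of byte compilation.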
* Re: Comparing symbol-with-position using eq 2019-04-05 21:54 ` Alan Mackenzie @ 2019-04-05 22:50 ` Paul Eggert 2019-04-06 12:23 ` Clément Pit-Claudel 1 sibling, 0 replies; 24+ messages in thread From: Paul Eggert @ 2019-04-05 22:50 UTC (permalink / raw) To: Alan Mackenzie, Daniel Colascione; +Cc: Alex, emacs-devel On 4/5/19 4:54 PM, Alan Mackenzie wrote: > I've been thinking that for the time (nearly 3 years) that I've been > trying to fix this bug. Is this how compilers for other Lisp systems are > written? It seems horribly easy to compile as Emacs does, by taking the > (read) starting form and gradually transforming it as a Lisp form. Sure, it's standard for Lisp compilers to use a representation that is somewhat more complicated than the original. This kind of practice goes back a long way. For example, the Multics MACLISP compiler, although it didn't do a full AST, systematically used a different representation (i.e., not simple symbols) for variables, a representation that let the compiler issue more-precise diagnostics. See Bernard Greenberg's tutorial <https://multicians.org/lcp.html>. Although this sort of thing does complicate the compiler, that's typically better than complicating 'eq'. 'eq' is supposed to be verrry simple and straightforward. ^ permalink raw reply [flat|nested] 24+ messages in thread
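The alternative Paul describes, a richer compiler-side representation with `eq` left alone, might look something like the sketch below. It is not taken from the Multics MACLISP compiler or from Emacs; `struct symbol`, `struct var_ref`, `same_variable` and `format_warning` are hypothetical names invented for this illustration. The compiler carries its own record for each variable reference and compares the underlying symbols explicitly, while `eq` remains a plain identity test.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* An ordinary symbol, as the rest of the system sees it.  */
struct symbol
{
  const char *name;
};

/* Compiler-internal record (hypothetical): a symbol plus where it was read.  */
struct var_ref
{
  const struct symbol *sym;
  long line;
  long column;
};

/* eq stays a plain identity test.  */
bool
eq (const void *a, const void *b)
{
  return a == b;
}

/* The compiler asks explicitly whether two references name the same variable.  */
bool
same_variable (const struct var_ref *a, const struct var_ref *b)
{
  return eq (a->sym, b->sym);
}

/* Format "LINE:COLUMN: MSG `NAME'" into BUF; returns what snprintf returns.  */
int
format_warning (char *buf, size_t size, const struct var_ref *v,
                const char *msg)
{
  return snprintf (buf, size, "%ld:%ld: %s `%s'",
                   v->line, v->column, msg, v->sym->name);
}
```

Two references to the same symbol read at different places are distinct objects under `eq`, but `same_variable` reports them as the same variable and each can still produce a warning with its own position. The cost, as the thread notes, is that every part of the compiler has to work with the richer representation.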
* Re: Comparing symbol-with-position using eq 2019-04-05 21:54 ` Alan Mackenzie 2019-04-05 22:50 ` Paul Eggert @ 2019-04-06 12:23 ` Clément Pit-Claudel 1 sibling, 0 replies; 24+ messages in thread From: Clément Pit-Claudel @ 2019-04-06 12:23 UTC (permalink / raw) To: emacs-devel On 2019-04-05 17:54, Alan Mackenzie wrote: > I've been thinking that for the time (nearly 3 years) that I've been > trying to fix this bug. Is this how compilers for other Lisp systems are > written? It seems horribly easy to compile as Emacs does, by taking the > (read) starting form and gradually transforming it as a Lisp form. It is > difficult to keep track of (text) source positions when one does this. The following page might be of interest, about how Racket does this sort of thing: https://docs.racket-lang.org/reference/Syntax_Quoting__quote-syntax.html The idea is that macros (not just the byte-compiler) may want to access position information, to issue better diagnostics. This is useful for small languages implemented using macros, like cl-loop. Clément. ^ permalink raw reply [flat|nested] 24+ messages in thread