* bug#53164: After an ELC+ELN build, don't load the source file into a buffer. @ 2022-01-10 16:50 Alan Mackenzie 2022-01-10 17:15 ` bug#53164: (After an ELC+ELN build, don't load the source file into a buffer.) fixed Alan Mackenzie 2022-01-10 18:01 ` bug#53164: After an ELC+ELN build, don't load the source file into a buffer Eli Zaretskii 0 siblings, 2 replies; 10+ messages in thread From: Alan Mackenzie @ 2022-01-10 16:50 UTC (permalink / raw) To: 53164 Hello, Emacs. emacs-28 and master branches. In batch-byte+native-compile (lisp/emacs-lisp/comp.el), the code fails to remove the source file argument from command-line-args-left, thus causing Emacs (or bootstrap-emacs) to regard this argument as a file to be visited after the compilation is complete. This can lead (and has led) to bugs. Remove this element from command-line-args-left at the end of batch-byte+native-compile. -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#53164: (After an ELC+ELN build, don't load the source file into a buffer.) fixed. 2022-01-10 16:50 bug#53164: After an ELC+ELN build, don't load the source file into a buffer Alan Mackenzie @ 2022-01-10 17:15 ` Alan Mackenzie 2022-01-10 18:05 ` Eli Zaretskii 2022-01-10 18:01 ` bug#53164: After an ELC+ELN build, don't load the source file into a buffer Eli Zaretskii 1 sibling, 1 reply; 10+ messages in thread From: Alan Mackenzie @ 2022-01-10 17:15 UTC (permalink / raw) To: 53164-done Bug fixed in the emacs-28 branch. -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#53164: (After an ELC+ELN build, don't load the source file into a buffer.) fixed. 2022-01-10 17:15 ` bug#53164: (After an ELC+ELN build, don't load the source file into a buffer.) fixed Alan Mackenzie @ 2022-01-10 18:05 ` Eli Zaretskii 2022-01-10 19:34 ` Alan Mackenzie 0 siblings, 1 reply; 10+ messages in thread From: Eli Zaretskii @ 2022-01-10 18:05 UTC (permalink / raw) To: Alan Mackenzie; +Cc: 53164 > Date: Mon, 10 Jan 2022 17:15:34 +0000 > From: Alan Mackenzie <acm@muc.de> > > Bug fixed in the emacs-28 branch. I reverted that commit. Please don't make any non-trivial changes on the release branch without asking first, and definitely not without explaining the problem and the solution in much more detail than you did. FTR, I didn't see any build problems due to this issue, not even once. So I wonder what exactly did you see and in what scenario. Please elaborate. Thanks. ^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#53164: (After an ELC+ELN build, don't load the source file into a buffer.) fixed. 2022-01-10 18:05 ` Eli Zaretskii @ 2022-01-10 19:34 ` Alan Mackenzie 2022-01-10 19:53 ` Eli Zaretskii 0 siblings, 1 reply; 10+ messages in thread From: Alan Mackenzie @ 2022-01-10 19:34 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 53164 Hello, Eli. On Mon, Jan 10, 2022 at 20:05:22 +0200, Eli Zaretskii wrote: > > Date: Mon, 10 Jan 2022 17:15:34 +0000 > > From: Alan Mackenzie <acm@muc.de> > > Bug fixed in the emacs-28 branch. > I reverted that commit. Please don't make any non-trivial changes on > the release branch without asking first, and definitely not without > explaining the problem and the solution in much more detail than you > did. It seemed to me like such an obvious fix for a non-obvious bug which can cause serious amounts of time to be lost (see below), and with a negligible chance of doing any damage. But after reading the following, please consider putting it back. It will make building Emacs faster, even if only a little bit. > FTR, I didn't see any build problems due to this issue, not even > once. So I wonder what exactly did you see and in what scenario. > Please elaborate. It took me practically three days solid to diagnose this bug. The symptom, whilst bootstrapping the scratch/correct-warning-pos, was the following message on stderr: Error: (invalid-function #<symbol = at 18944>) .. I quickly tracked down the source of the `=' symbol to lisp/subr.el, in the code for `zerop'. Further progress was elusive. In the course of debugging it, I ended up writing my own backtrace routine, free of dependencies, which can work early in the bootstrap, unlike the standard backtrace. The course of events which led to that error message is this: 1. During the bootstrap with native compile, ELC+ELN compiles subr.el. It does so by getting the LAP code from bytecomp.el via a side channel. It completes the compilation. 2. bootstrap-emacs now contains the native-compiled version of subr.el. Unfortunately, the macro `zerop--anon-cmacro' a complier macro for `zerop', still contains symbols with position. 3. Having finished the native compilation, ELC+ELN visited subr.el in a buffer, due to this bug. 4. As part of visiting subr.el, Emacs calls after-find-file, which invokes find-file-hook. 5. One of the entries in find-file-hook is vc-refresh-state, which gets called. 6. This causes vc-git.elc to be loaded. Eventually, vc-git--out-ok gets called. 7. vc-git--out-ok invokes `zerop', or more precisely the code generated by the macro `zerop--anon-cmacro'. This contains #<symbol = at 18944>. 8. eval signals an error, since symbols-with-pos-enabled is nil and it thus can't handle the symbol with position. This error gets blocked from reaching a backtrace function by an inconsiderate condition-case, which just dumps the message onto stderr. The most obvious cause of the error seems to be at step 3, where bootstrap-emacs spuriously visits the source file. As I said above, please consider putting the fix to this bug back. I do not want anybody else to have to go through what I had to to track the bug down. Leaving stale arguments on the command line, later to be visited, cannot possibly be the Right Thing. I'm pretty sure it was not done deliberately, it was just a minor oversight which didn't seem to matter. Thanks. > Thanks. -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#53164: (After an ELC+ELN build, don't load the source file into a buffer.) fixed. 2022-01-10 19:34 ` Alan Mackenzie @ 2022-01-10 19:53 ` Eli Zaretskii 2022-01-10 20:21 ` Alan Mackenzie 0 siblings, 1 reply; 10+ messages in thread From: Eli Zaretskii @ 2022-01-10 19:53 UTC (permalink / raw) To: Alan Mackenzie; +Cc: 53164 > Date: Mon, 10 Jan 2022 19:34:58 +0000 > Cc: 53164@debbugs.gnu.org > From: Alan Mackenzie <acm@muc.de> > > > FTR, I didn't see any build problems due to this issue, not even > > once. So I wonder what exactly did you see and in what scenario. > > Please elaborate. > > It took me practically three days solid to diagnose this bug. The > symptom, whilst bootstrapping the scratch/correct-warning-pos, was the > following message on stderr: > > Error: (invalid-function #<symbol = at 18944>) > > .. I quickly tracked down the source of the `=' symbol to lisp/subr.el, > in the code for `zerop'. Further progress was elusive. In the course > of debugging it, I ended up writing my own backtrace routine, free of > dependencies, which can work early in the bootstrap, unlike the standard > backtrace. > > The course of events which led to that error message is this: > > 1. During the bootstrap with native compile, ELC+ELN compiles subr.el. > It does so by getting the LAP code from bytecomp.el via a side > channel. It completes the compilation. > 2. bootstrap-emacs now contains the native-compiled version of subr.el. > Unfortunately, the macro `zerop--anon-cmacro' a complier macro for > `zerop', still contains symbols with position. > 3. Having finished the native compilation, ELC+ELN visited subr.el in a > buffer, due to this bug. > 4. As part of visiting subr.el, Emacs calls after-find-file, which > invokes find-file-hook. > 5. One of the entries in find-file-hook is vc-refresh-state, which gets > called. > 6. This causes vc-git.elc to be loaded. Eventually, vc-git--out-ok gets > called. > 7. vc-git--out-ok invokes `zerop', or more precisely the code generated > by the macro `zerop--anon-cmacro'. This contains #<symbol = at > 18944>. > 8. eval signals an error, since symbols-with-pos-enabled is nil and it > thus can't handle the symbol with position. This error gets blocked > from reaching a backtrace function by an inconsiderate condition-case, > which just dumps the message onto stderr. > > The most obvious cause of the error seems to be at step 3, where > bootstrap-emacs spuriously visits the source file. After reading this, I still don't understand how come you bump into this problem, whereas I don't, and neither does anyone else who builds the release branch with native-compilation. Is this something specific to that branch you are working on? If so, why is it urgent to have the fix on the release branch? The branch on which you ware working will land on master. > As I said above, please consider putting the fix to this bug back. I do > not want anybody else to have to go through what I had to to track the > bug down. Leaving stale arguments on the command line, later to be > visited, cannot possibly be the Right Thing. I'm pretty sure it was not > done deliberately, it was just a minor oversight which didn't seem to > matter. There's always one more bug. When we are so far into the pretest process, problems that take unusual steps to reproduce are not important enough to delay the release. The judgment call I need to make is how important is it to have this in the release branch, and that's only after I understand how to trigger the problem in the first place. Thanks. ^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#53164: (After an ELC+ELN build, don't load the source file into a buffer.) fixed. 2022-01-10 19:53 ` Eli Zaretskii @ 2022-01-10 20:21 ` Alan Mackenzie 2022-01-11 12:27 ` Eli Zaretskii 0 siblings, 1 reply; 10+ messages in thread From: Alan Mackenzie @ 2022-01-10 20:21 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 53164, acm Hello, Eli. On Mon, Jan 10, 2022 at 21:53:36 +0200, Eli Zaretskii wrote: > > Date: Mon, 10 Jan 2022 19:34:58 +0000 > > Cc: 53164@debbugs.gnu.org > > From: Alan Mackenzie <acm@muc.de> > > > FTR, I didn't see any build problems due to this issue, not even > > > once. So I wonder what exactly did you see and in what scenario. > > > Please elaborate. > > It took me practically three days solid to diagnose this bug. The > > symptom, whilst bootstrapping the scratch/correct-warning-pos, was the > > following message on stderr: > > Error: (invalid-function #<symbol = at 18944>) > > .. I quickly tracked down the source of the `=' symbol to lisp/subr.el, > > in the code for `zerop'. Further progress was elusive. In the course > > of debugging it, I ended up writing my own backtrace routine, free of > > dependencies, which can work early in the bootstrap, unlike the standard > > backtrace. > > The course of events which led to that error message is this: > > 1. During the bootstrap with native compile, ELC+ELN compiles subr.el. > > It does so by getting the LAP code from bytecomp.el via a side > > channel. It completes the compilation. > > 2. bootstrap-emacs now contains the native-compiled version of subr.el. > > Unfortunately, the macro `zerop--anon-cmacro' a complier macro for > > `zerop', still contains symbols with position. > > 3. Having finished the native compilation, ELC+ELN visited subr.el in a > > buffer, due to this bug. > > 4. As part of visiting subr.el, Emacs calls after-find-file, which > > invokes find-file-hook. > > 5. One of the entries in find-file-hook is vc-refresh-state, which gets > > called. > > 6. This causes vc-git.elc to be loaded. Eventually, vc-git--out-ok gets > > called. > > 7. vc-git--out-ok invokes `zerop', or more precisely the code generated > > by the macro `zerop--anon-cmacro'. This contains #<symbol = at > > 18944>. > > 8. eval signals an error, since symbols-with-pos-enabled is nil and it > > thus can't handle the symbol with position. This error gets blocked > > from reaching a backtrace function by an inconsiderate condition-case, > > which just dumps the message onto stderr. > > The most obvious cause of the error seems to be at step 3, where > > bootstrap-emacs spuriously visits the source file. > After reading this, I still don't understand how come you bump into > this problem, whereas I don't, and neither does anyone else who builds > the release branch with native-compilation. Is this something > specific to that branch you are working on? Indeed, yes. It is the symbols with position which escaped from a compiler macro. > If so, why is it urgent to have the fix on the release branch? The > branch on which you ware working will land on master. There was no urgency. I though the convention was for documentation fixes and simple safe code fixes to go onto the release branch. The fix to this bug, a single line, is well understood and certainly comes into the category of simple and safe. > > As I said above, please consider putting the fix to this bug back. I do > > not want anybody else to have to go through what I had to to track the > > bug down. Leaving stale arguments on the command line, later to be > > visited, cannot possibly be the Right Thing. I'm pretty sure it was not > > done deliberately, it was just a minor oversight which didn't seem to > > matter. > There's always one more bug. When we are so far into the pretest > process, problems that take unusual steps to reproduce are not > important enough to delay the release. OK. But can I ask you to consider announcing on emacs-devel when the criteria for making bug fixes on the release branch change? Apologies if you have done, and I missed it. > The judgment call I need to make is how important is it to have this in > the release branch, and that's only after I understand how to trigger > the problem in the first place. It is not particularly important for the release branch. But it is a positive step (making the build faster, and removing a potential bug source). As for triggering it, I cannot give you a recipe except for committing my recent changes to scratch/correct-warning-pos, and asking you to repeat what I did (basically, a make bootstrap). Somebody else may trigger it in some other way, and the debugging effort is potentially unbounded. So, I still think it would be better in the release branch, but can see that it's not very important. There're three ways we could go. Commit it in emacs-28, master, or scratch/correct-warning-pos. I'm willing to commit it again in any of these places. > Thanks. -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#53164: (After an ELC+ELN build, don't load the source file into a buffer.) fixed. 2022-01-10 20:21 ` Alan Mackenzie @ 2022-01-11 12:27 ` Eli Zaretskii 2022-01-11 18:03 ` Alan Mackenzie 0 siblings, 1 reply; 10+ messages in thread From: Eli Zaretskii @ 2022-01-11 12:27 UTC (permalink / raw) To: Alan Mackenzie; +Cc: 53164 > Date: Mon, 10 Jan 2022 20:21:40 +0000 > Cc: 53164@debbugs.gnu.org, acm@muc.de > From: Alan Mackenzie <acm@muc.de> > > > After reading this, I still don't understand how come you bump into > > this problem, whereas I don't, and neither does anyone else who builds > > the release branch with native-compilation. Is this something > > specific to that branch you are working on? > > Indeed, yes. It is the symbols with position which escaped from a > compiler macro. > > > If so, why is it urgent to have the fix on the release branch? The > > branch on which you ware working will land on master. > > There was no urgency. I though the convention was for documentation > fixes and simple safe code fixes to go onto the release branch. The fix > to this bug, a single line, is well understood and certainly comes into > the category of simple and safe. The fix is well understood, but its possible effects aren't. We have been using the current code for more than a year with no problems whatsoever. > > There's always one more bug. When we are so far into the pretest > > process, problems that take unusual steps to reproduce are not > > important enough to delay the release. > > OK. But can I ask you to consider announcing on emacs-devel when the > criteria for making bug fixes on the release branch change? Apologies if > you have done, and I missed it. This is in CONTRIBUTE: If you are fixing a bug that exists in the current release, you should generally commit it to the release branch; it will be merged to the master branch later by the gitmerge function. However, when the release branch is for Emacs version NN.2 and later, or when it is for Emacs version NN.1 that is in the very last stages of its pretest, that branch is considered to be in a feature freeze: only bug fixes that are "safe" or are fixing major problems should go to the release branch, the rest should be committed to the master branch. This is so to avoid destabilizing the next Emacs release. If you are unsure whether your bug fix is "safe" enough for the release branch, ask on the emacs-devel mailing list. > There're three ways we could go. Commit it in emacs-28, master, or > scratch/correct-warning-pos. I'm willing to commit it again in any of > these places. Please install this on master (or leave it on your branch), and we will revisit this when time comes for Emacs 28.2. Thanks. ^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#53164: (After an ELC+ELN build, don't load the source file into a buffer.) fixed. 2022-01-11 12:27 ` Eli Zaretskii @ 2022-01-11 18:03 ` Alan Mackenzie 2022-01-11 18:45 ` Eli Zaretskii 0 siblings, 1 reply; 10+ messages in thread From: Alan Mackenzie @ 2022-01-11 18:03 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 53164 Hello, Eli. On Tue, Jan 11, 2022 at 14:27:14 +0200, Eli Zaretskii wrote: > > Date: Mon, 10 Jan 2022 20:21:40 +0000 > > Cc: 53164@debbugs.gnu.org, acm@muc.de > > From: Alan Mackenzie <acm@muc.de> > > > After reading this, I still don't understand how come you bump into > > > this problem, whereas I don't, and neither does anyone else who builds > > > the release branch with native-compilation. Is this something > > > specific to that branch you are working on? > > Indeed, yes. It is the symbols with position which escaped from a > > compiler macro. > > > If so, why is it urgent to have the fix on the release branch? The > > > branch on which you ware working will land on master. > > There was no urgency. I though the convention was for documentation > > fixes and simple safe code fixes to go onto the release branch. The fix > > to this bug, a single line, is well understood and certainly comes into > > the category of simple and safe. > The fix is well understood, but its possible effects aren't. OK, we will just have to disagree on this point. If I wasn't sure about the lack of possible negative effects, I wouldn't have committed the fix to the emacs-28 branch. > We have been using the current code for more than a year with no > problems whatsoever. None on the emacs-28 branch, perhaps. I had severe problems with the same code on a branch branched from master. Incidentally, I timed a bootstrap on the release branch with and without the fix, both starting from a warm file cache. With the fix, the build ran ~7% faster. > > > There's always one more bug. When we are so far into the pretest > > > process, problems that take unusual steps to reproduce are not > > > important enough to delay the release. > > OK. But can I ask you to consider announcing on emacs-devel when the > > criteria for making bug fixes on the release branch change? Apologies if > > you have done, and I missed it. > This is in CONTRIBUTE: > If you are fixing a bug that exists in the current release, you should > generally commit it to the release branch; it will be merged to the > master branch later by the gitmerge function. However, when the > release branch is for Emacs version NN.2 and later, or when it is for > Emacs version NN.1 that is in the very last stages of its pretest, > that branch is considered to be in a feature freeze: only bug fixes > that are "safe" or are fixing major problems should go to the release > branch, the rest should be committed to the master branch. This is so > to avoid destabilizing the next Emacs release. If you are unsure > whether your bug fix is "safe" enough for the release branch, ask on > the emacs-devel mailing list. OK, thanks. I'm not sure I understand any more what "safe" means in this context. > > There're three ways we could go. Commit it in emacs-28, master, or > > scratch/correct-warning-pos. I'm willing to commit it again in any of > > these places. > Please install this on master (or leave it on your branch), and we > will revisit this when time comes for Emacs 28.2. I'll install it on master this evening. Thanks! > Thanks. -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#53164: (After an ELC+ELN build, don't load the source file into a buffer.) fixed. 2022-01-11 18:03 ` Alan Mackenzie @ 2022-01-11 18:45 ` Eli Zaretskii 0 siblings, 0 replies; 10+ messages in thread From: Eli Zaretskii @ 2022-01-11 18:45 UTC (permalink / raw) To: Alan Mackenzie; +Cc: 53164 > Date: Tue, 11 Jan 2022 18:03:14 +0000 > Cc: 53164@debbugs.gnu.org > From: Alan Mackenzie <acm@muc.de> > > > The fix is well understood, but its possible effects aren't. > > OK, we will just have to disagree on this point. If I wasn't sure about > the lack of possible negative effects, I wouldn't have committed the fix > to the emacs-28 branch. We are mis-communicating. I didn't mean something as silly as saying that the result of removing a command-line argument is not well understood. What I meant to say is that since it's impossible to look at all the possible situations where this code is being used, we cannot know what would happen due to this removal. You basically only tested the fix in one situation, where you discovered the problem, and just assume that it cannot possibly have any adverse effects on the infinity of other cases. But that's just an assumption. The reason we have long pretests is precisely because code we think we understand has unintended consequences that take a lot of time to reveal. This change is no different. > > We have been using the current code for more than a year with no > > problems whatsoever. > > None on the emacs-28 branch, perhaps. I had severe problems with the > same code on a branch branched from master. Incidentally, I timed a > bootstrap on the release branch with and without the fix, both starting > from a warm file cache. With the fix, the build ran ~7% faster. Thanks, but at this point in the pretest I no longer worry about speeding up the build. > > If you are fixing a bug that exists in the current release, you should > > generally commit it to the release branch; it will be merged to the > > master branch later by the gitmerge function. However, when the > > release branch is for Emacs version NN.2 and later, or when it is for > > Emacs version NN.1 that is in the very last stages of its pretest, > > that branch is considered to be in a feature freeze: only bug fixes > > that are "safe" or are fixing major problems should go to the release > > branch, the rest should be committed to the master branch. This is so > > to avoid destabilizing the next Emacs release. If you are unsure > > whether your bug fix is "safe" enough for the release branch, ask on > > the emacs-devel mailing list. > > OK, thanks. I'm not sure I understand any more what "safe" means in > this context. That's right, at this point no change is "safe" a-priori. > > Please install this on master (or leave it on your branch), and we > > will revisit this when time comes for Emacs 28.2. > > I'll install it on master this evening. Thanks! Thanks. ^ permalink raw reply [flat|nested] 10+ messages in thread
* bug#53164: After an ELC+ELN build, don't load the source file into a buffer. 2022-01-10 16:50 bug#53164: After an ELC+ELN build, don't load the source file into a buffer Alan Mackenzie 2022-01-10 17:15 ` bug#53164: (After an ELC+ELN build, don't load the source file into a buffer.) fixed Alan Mackenzie @ 2022-01-10 18:01 ` Eli Zaretskii 1 sibling, 0 replies; 10+ messages in thread From: Eli Zaretskii @ 2022-01-10 18:01 UTC (permalink / raw) To: Alan Mackenzie; +Cc: 53164 > Date: Mon, 10 Jan 2022 16:50:56 +0000 > From: Alan Mackenzie <acm@muc.de> > > emacs-28 and master branches. > > In batch-byte+native-compile (lisp/emacs-lisp/comp.el), the code fails > to remove the source file argument from command-line-args-left, thus > causing Emacs (or bootstrap-emacs) to regard this argument as a file to > be visited after the compilation is complete. > > This can lead (and has led) to bugs. > > Remove this element from command-line-args-left at the end of > batch-byte+native-compile. Please show the smallest recipe to reproduce this. Because I don't think I follow your description, and in particular I saw no such problems when building Emacs. Thanks. ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2022-01-11 18:45 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2022-01-10 16:50 bug#53164: After an ELC+ELN build, don't load the source file into a buffer Alan Mackenzie 2022-01-10 17:15 ` bug#53164: (After an ELC+ELN build, don't load the source file into a buffer.) fixed Alan Mackenzie 2022-01-10 18:05 ` Eli Zaretskii 2022-01-10 19:34 ` Alan Mackenzie 2022-01-10 19:53 ` Eli Zaretskii 2022-01-10 20:21 ` Alan Mackenzie 2022-01-11 12:27 ` Eli Zaretskii 2022-01-11 18:03 ` Alan Mackenzie 2022-01-11 18:45 ` Eli Zaretskii 2022-01-10 18:01 ` bug#53164: After an ELC+ELN build, don't load the source file into a buffer Eli Zaretskii
Code repositories for project(s) associated with this external index https://git.savannah.gnu.org/cgit/emacs.git https://git.savannah.gnu.org/cgit/emacs/org-mode.git This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.