* scratch/accurate-warning-pos: next steps.
From: Alan Mackenzie @ 2018-12-10 18:00 UTC
To: emacs-devel

Hello, Emacs.

Here's my scheme for making further progress on the
scratch/accurate-warning-pos branch.

At the moment, it appears to display the requisite accurate source
positions in warning messages, but it slows Emacs down a little. Hence
it has not been accepted in its current form.

Following an idea from Paul, I propose to build an alternative byte-code
interpreter alongside the primary one. This second interpreter would
regard symbols with position as being EQ to the corresponding bare
symbols, just as the branch currently does when symbols-with-pos-enabled
is bound to non-nil.

C symbols in components of the second interpreter would be those of the
main one, prefixed by "BC_".

lisp.h would be modified to define alternative versions of EQ, NILP,
SYMBOLP, and XSYMBOL, and alternative versions of the INLINE functions
which call them. These would be called BC_EQ, BC_NILP, BC_SYMBOLP, and
BC_XSYMBOL.

Most of the C sources would, at build time, be fed to a preprocessor
which would analyse (almost every) C function, and write a temporary file
containing the functions foo and BC_foo next to each other. foo would be
unchanged from the C source, BC_foo would have calls to bar modified to
BC_bar, and invocations of EQ etc., modified to BC_EQ, etc.

These preprocessor outputs would be compiled into temacs in place of the
primary C sources. The resulting temacs would, of course, be bigger than
the current temacs.

In particular, the byte code interpreter exec_byte_code in bytecode.c
would have its alternative BC_exec_byte_code.

The struct Lisp_Subr would be amended to hold three Lisp_Functions - the
currently live one, the normal one, and the BC_... one. Also a next
pointer would be introduced, chaining all the subrs together.

The .el and .elc files would not require amendment (apart from
bytecomp.el, and so on, of course).

When a byte compilation is initiated, the compiler would replace the
current live function field with the corresponding BC_ function in every
Lisp_Subr, thus switching over to the BC_... interpreter. At termination
of the compiler, an unwind-protect would restore the Lisp_Subrs to their
standard settings.

There remains the question, which C functions would get a BC_... version?
To begin with, I propose almost every C function. Only those for which a
second version would be damaging (for example, the command loop) would
remain unique. Once the mechanism is working, we could steadily reduce
the number of BC_... functions from "as many as possible" to "what is
needed". For example, surely xdisp.c, and xterm.c would not need
duplication.

This scheme would allow accurate warning line numbers to be output,
whilst not slowing down the normal operation of Emacs. It would likely
slow down the operation of the byte compiler by several per cent, as has
been measured in the current scratch/accurate-warning-pos branch.

Comments?

--
Alan Mackenzie (Nuremberg, Germany).
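The Lisp_Subr switching described in this message could look roughly like the sketch below. It is only a toy model under stated assumptions: `struct toy_subr`, `subr_fn`, `register_subr`, `switch_to_bc_subrs`, `switch_to_normal_subrs` and the two `toy_length` functions are hypothetical names for illustration, not Emacs's real `struct Lisp_Subr` or any existing function.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a primitive's C function (hypothetical signature).  */
typedef int (*subr_fn) (int);

/* Simplified model of the amended subr: three function slots plus a
   pointer chaining all subrs together.  Not the real struct Lisp_Subr.  */
struct toy_subr
{
  subr_fn live;            /* the one actually called */
  subr_fn normal;          /* version used in normal operation */
  subr_fn bc;              /* BC_ version used during byte compilation */
  struct toy_subr *next;   /* next subr in the chain */
};

static struct toy_subr *all_subrs;   /* head of the chain */

/* Add S to the chain, starting out on the normal version.  */
static void
register_subr (struct toy_subr *s, subr_fn normal, subr_fn bc)
{
  assert (normal != NULL && bc != NULL);
  s->normal = normal;
  s->bc = bc;
  s->live = normal;
  s->next = all_subrs;
  all_subrs = s;
}

/* Walk the chain and make every subr use its BC_ version.  */
static void
switch_to_bc_subrs (void)
{
  for (struct toy_subr *s = all_subrs; s != NULL; s = s->next)
    s->live = s->bc;
}

/* Walk the chain and restore the normal versions.  */
static void
switch_to_normal_subrs (void)
{
  for (struct toy_subr *s = all_subrs; s != NULL; s = s->next)
    s->live = s->normal;
}

/* Two versions of one toy primitive; the BC_ one returns a visibly
   different value only so that the switch can be observed.  */
static int toy_length (int x) { return x; }
static int BC_toy_length (int x) { return x + 1000; }
```

A caller would register each primitive once, call `switch_to_bc_subrs` when compilation starts, and arrange for `switch_to_normal_subrs` to run on every exit path, which is the role the message gives to unwind-protect.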
* Re: scratch/accurate-warning-pos: next steps.
From: Eli Zaretskii @ 2018-12-10 18:15 UTC
To: Alan Mackenzie; +Cc: emacs-devel

> Date: Mon, 10 Dec 2018 18:00:33 +0000
> From: Alan Mackenzie <acm@muc.de>
>
> Following an idea from Paul, I propose to build an alternative byte-code
> interpreter alongside the primary one. This second interpreter would
> regard symbols with position as being EQ to the corresponding bare
> symbols, just as the branch currently does when symbols-with-pos-enabled
> is bound to non-nil.

I don't think I understood when will this alternative interpreter be
used, and when will the "primary" one be used. Can you elaborate on
that?

Thanks.
* Re: scratch/accurate-warning-pos: next steps.
From: Alan Mackenzie @ 2018-12-10 18:28 UTC
To: Eli Zaretskii; +Cc: emacs-devel

Hello, Eli.

On Mon, Dec 10, 2018 at 20:15:18 +0200, Eli Zaretskii wrote:
> > Date: Mon, 10 Dec 2018 18:00:33 +0000
> > From: Alan Mackenzie <acm@muc.de>

> > Following an idea from Paul, I propose to build an alternative byte-code
> > interpreter alongside the primary one. This second interpreter would
> > regard symbols with position as being EQ to the corresponding bare
> > symbols, just as the branch currently does when symbols-with-pos-enabled
> > is bound to non-nil.

> I don't think I understood when will this alternative interpreter be
> used, and when will the "primary" one be used. Can you elaborate on
> that?

Yes. The alternative interpreter would be used only for byte
compilation (and possibly other programs which want to use the symbols
with position mechanism), the primary one will be used at all other
times.

There would be a function switch-to-BC-subrs accessible from Lisp which
would switch to the alternative interpreter, and switch-to-normal-subrs
for the reverse. Or something like that. byte-compile-file and friends
would use these functions.

Any recursive-edit would "bind" to the normal interpreter. C-g, and any
other quit actions would restore the normal interpreter.

> Thanks.

--
Alan Mackenzie (Nuremberg, Germany).
* Re: scratch/accurate-warning-pos: next steps.
From: Eli Zaretskii @ 2018-12-10 18:39 UTC
To: Alan Mackenzie; +Cc: emacs-devel

> Date: Mon, 10 Dec 2018 18:28:30 +0000
> Cc: emacs-devel@gnu.org
> From: Alan Mackenzie <acm@muc.de>
>
> > I don't think I understood when will this alternative interpreter be
> > used, and when will the "primary" one be used. Can you elaborate on
> > that?
>
> Yes. The alternative interpreter would be used only for byte
> compilation (and possibly other programs which want to use the symbols
> with position mechanism), the primary one will be used at all other
> times.

Then how about invoking this alternative interpreter only if the prime
interpreter detected a warning or error while byte-compiling? You
could invoke the alternative interpreter only on the form where the
problem was detected, with the goal of "drilling down" to find the
exact position of the problematic symbol(s).

This would have the advantage of not only avoiding the slow-down in
the "prime" interpreter, but also avoiding slowing down byte
compilation of error-free sources.

Does this make sense?
* Re: scratch/accurate-warning-pos: next steps.
From: Alan Mackenzie @ 2018-12-10 19:35 UTC
To: Eli Zaretskii; +Cc: emacs-devel

Hello, Eli.

On Mon, Dec 10, 2018 at 20:39:56 +0200, Eli Zaretskii wrote:
> > Date: Mon, 10 Dec 2018 18:28:30 +0000
> > Cc: emacs-devel@gnu.org
> > From: Alan Mackenzie <acm@muc.de>

> > > I don't think I understood when will this alternative interpreter be
> > > used, and when will the "primary" one be used. Can you elaborate on
> > > that?

> > Yes. The alternative interpreter would be used only for byte
> > compilation (and possibly other programs which want to use the symbols
> > with position mechanism), the primary one will be used at all other
> > times.

> Then how about invoking this alternative interpreter only if the prime
> interpreter detected a warning or error while byte-compiling? You
> could invoke the alternative interpreter only on the form where the
> problem was detected, with the goal of "drilling down" to find the
> exact position of the problematic symbol(s).

That would mean starting the byte compilation with no position
information being gathered, and then when a warning occurs, aborting
the compilation and starting again from scratch with the position
information being gathered and the alternative interpreter being used.

The problem is, that we cannot use #<symbol nil at 666> in the normal
interpreter, since it is not EQ nil there.

> This would have the advantage of not only avoiding the slow-down in
> the "prime" interpreter, but also avoiding slowing down byte
> compilation of error-free sources.

This is an optimisation.

> Does this make sense?

I understand the idea, yes. But given the timings I measured in the
existing scratch/accurate-warning-pos (IIRC, around 11% - 12% for an
actual compilation) and the fact that in the alternative interpreter,
the slowdown will be somewhat less (one fewer flag comparison per EQ,
NILP, ...., and we can drop the traditional alist of symbols and
positions which is running alongside the new symbols with position) it
may not be worth the extra complexity.

--
Alan Mackenzie (Nuremberg, Germany).
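The difference between the comparisons discussed above (the normal EQ, the branch's flag-guarded EQ, and the proposed BC_EQ with "one fewer flag comparison") can be sketched on a toy object model. Everything here is an assumption made for illustration: `struct obj`, `strip_pos`, `plain_eq`, `flag_eq` and `bc_eq` are hypothetical and do not reproduce Emacs's Lisp_Object representation or the branch's actual macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy object model (hypothetical): a symbol is either bare, or a
   wrapper that pairs a bare symbol with a source position.  */
enum kind { BARE_SYMBOL, SYMBOL_WITH_POS };

struct obj
{
  enum kind kind;
  const struct obj *bare;   /* SYMBOL_WITH_POS only: the wrapped symbol */
  long pos;                 /* SYMBOL_WITH_POS only: source position */
};

/* Models the Lisp variable symbols-with-pos-enabled.  */
static bool symbols_with_pos_enabled = false;

/* Return the bare symbol behind O, or O itself if it has no position.  */
static const struct obj *
strip_pos (const struct obj *o)
{
  assert (o != NULL);
  return o->kind == SYMBOL_WITH_POS ? o->bare : o;
}

/* Normal interpreter: identity comparison only.  */
static bool
plain_eq (const struct obj *a, const struct obj *b)
{
  return a == b;
}

/* Single-interpreter approach: every failed identity test also has to
   consult the flag before looking through position wrappers.  */
static bool
flag_eq (const struct obj *a, const struct obj *b)
{
  return a == b
         || (symbols_with_pos_enabled && strip_pos (a) == strip_pos (b));
}

/* Alternative interpreter: positions are always ignored, so the flag
   test disappears.  */
static bool
bc_eq (const struct obj *a, const struct obj *b)
{
  return strip_pos (a) == strip_pos (b);
}
```

With a bare `nil` and a wrapper for it at position 666, `plain_eq` reports them as different, which is the problem stated above for `#<symbol nil at 666>` in the normal interpreter; `bc_eq` treats them as the same object.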
* Re: scratch/accurate-warning-pos: next steps.
From: Eli Zaretskii @ 2018-12-10 20:06 UTC
To: Alan Mackenzie; +Cc: emacs-devel

> Date: Mon, 10 Dec 2018 19:35:57 +0000
> Cc: emacs-devel@gnu.org
> From: Alan Mackenzie <acm@muc.de>
>
> > Then how about invoking this alternative interpreter only if the prime
> > interpreter detected a warning or error while byte-compiling? You
> > could invoke the alternative interpreter only on the form where the
> > problem was detected, with the goal of "drilling down" to find the
> > exact position of the problematic symbol(s).
>
> That would mean starting the byte compilation with no position
> information being gathered, and then when a warning occurs, aborting
> the compilation and starting again from scratch with the position
> information being gathered and the alternative interpreter being used.

Not necessarily. It could mean invocation of a special code whose
goal is to find the position of an error in a given form. The
position of the beginning of this form will have been known, as AFAIU
the existing byte compiler does collect that, or has means to
determine that.

> The problem is, that we cannot use #<symbol nil at 666> in the normal
> interpreter, since it is not EQ nil there.

I'm not sure you must use symbols with positions in the above
arrangement, you could simply invoke special-purpose code that
analyzed the problematic form. But if you do need to use symbols with
positions, you could do this only when looking for error position, so
other symbol comparisons will not be affected.

> I understand the idea, yes. But given the timings I measured in the
> existing scratch/accurate-warning-pos (IIRC, around 11% - 12% for an
> actual compilation) and the fact that in the alternative interpreter,
> the slowdown will be somewhat less (one fewer flag comparison per EQ,
> NILP, ...., and we can drop the traditional alist of symbols and
> positions which is running alongside the new symbols with position) it
> may not be worth the extra complexity.

Yes, but what you suggested as the implementation of the alternative
interpreter includes a heck of complexity of its own, IMO.

The idea I proposed doesn't even require changes in basic types, it
could hopefully be implemented with "normal" Lisp.
* Re: scratch/accurate-warning-pos: next steps.
From: Alan Mackenzie @ 2018-12-10 21:03 UTC
To: Eli Zaretskii; +Cc: emacs-devel

Hello, Eli.

On Mon, Dec 10, 2018 at 22:06:33 +0200, Eli Zaretskii wrote:
> > Date: Mon, 10 Dec 2018 19:35:57 +0000
> > Cc: emacs-devel@gnu.org
> > From: Alan Mackenzie <acm@muc.de>

> > > Then how about invoking this alternative interpreter only if the prime
> > > interpreter detected a warning or error while byte-compiling? You
> > > could invoke the alternative interpreter only on the form where the
> > > problem was detected, with the goal of "drilling down" to find the
> > > exact position of the problematic symbol(s).

> > That would mean starting the byte compilation with no position
> > information being gathered, and then when a warning occurs, aborting
> > the compilation and starting again from scratch with the position
> > information being gathered and the alternative interpreter being used.

> Not necessarily. It could mean invocation of a special code whose
> goal is to find the position of an error in a given form. The
> position of the beginning of this form will have been known, as AFAIU
> the existing byte compiler does collect that, or has means to
> determine that.

We know the position of the beginning of the form, yes. We need some way
of determining the source position of a symbol, cons, or vector on the
inside of this form.

The traditional alist of symbols and positions is one way, and it no
longer works well (if it ever did). Symbols with position is another
way, which appears to work well, in spite of the complexity.

You seem to be proposing a third way, but without giving away any
details.

> > The problem is, that we cannot use #<symbol nil at 666> in the normal
> > interpreter, since it is not EQ nil there.

> I'm not sure you must use symbols with positions in the above
> arrangement, you could simply invoke special-purpose code that
> analyzed the problematic form. But if you do need to use symbols with
> positions, you could do this only when looking for error position, so
> other symbol comparisons will not be affected.

I'm not sure how this special-purpose code would work. Say we find an
error or warning involving symbol foo as the car of some form, I can't
see any way of determining its source position that doesn't involve going
back to the position of the beginning of the form, and slogging through
the form, somehow.

Maybe, rather than reading the form, we could scan it a token at a time,
storing it in, say vectors, rather like a traditional non-lisp compiler
does. But this is hardly attractive, and would be a LOT of work.

> > I understand the idea, yes. But given the timings I measured in the
> > existing scratch/accurate-warning-pos (IIRC, around 11% - 12% for an
> > actual compilation) and the fact that in the alternative interpreter,
> > the slowdown will be somewhat less (one fewer flag comparison per EQ,
> > NILP, ...., and we can drop the traditional alist of symbols and
> > positions which is running alongside the new symbols with position) it
> > may not be worth the extra complexity.

> Yes, but what you suggested as the implementation of the alternative
> interpreter includes a heck of complexity of its own, IMO.

It does, yes. Partly imposed by external circumstances. ;-)

> The idea I proposed doesn't even require changes in basic types, it
> could hopefully be implemented with "normal" Lisp.

The symbols with positions idea doesn't conceptually change basic types
either - it just adds annotations to an existing type. Anything you
could do with symbols before, you can still do with
symbols-with-positions and get the same answers now.

If you have come up with a new way of getting a source position for a
symbol or a cons, please detail it here. I recognise the complexity in
what I'm proposing and what I've already implemented, and I don't think
it's good; it's just less bad than anything else that's come up, so far.

--
Alan Mackenzie (Nuremberg, Germany).
* Re: scratch/accurate-warning-pos: next steps.
From: Eli Zaretskii @ 2018-12-11 6:41 UTC
To: Alan Mackenzie; +Cc: emacs-devel

> Date: Mon, 10 Dec 2018 21:03:10 +0000
> Cc: emacs-devel@gnu.org
> From: Alan Mackenzie <acm@muc.de>
>
> > > That would mean starting the byte compilation with no position
> > > information being gathered, and then when a warning occurs, aborting
> > > the compilation and starting again from scratch with the position
> > > information being gathered and the alternative interpreter being used.
>
> > Not necessarily. It could mean invocation of a special code whose
> > goal is to find the position of an error in a given form. The
> > position of the beginning of this form will have been known, as AFAIU
> > the existing byte compiler does collect that, or has means to
> > determine that.
>
> We know the position of the beginning of the form, yes. We need some way
> of determining the source position of a symbol, cons, or vector on the
> inside of this form.
>
> The traditional alist of symbols and positions is one way, and it no
> longer works well (if it ever did). Symbols with position is another
> way, which appears to work well, in spite of the complexity.
>
> You seem to be proposing a third way, but without giving away any
> details.

I don't have any details, just an idea. I hope it could be helpful,
because implementing it would side-step all the problems you
discovered with the other approaches:

  . it doesn't slow down the Lisp interpreter
  . it doesn't slow down byte compilation when there are no
    errors/warnings to report
  . it probably doesn't require introduction of new low-level
    facilities, like annotating symbols with positions or redirecting
    Lisp subroutines to alternative versions

> I'm not sure how this special-purpose code would work. Say we find an
> error or warning involving symbol foo as the car of some form, I can't
> see any way of determining its source position that doesn't involve going
> back to the position of the beginning of the form, and slogging through
> the form, somehow.

Yes, I was proposing something like that. Why is that a problem?

> Maybe, rather than reading the form, we could scan it a token at a time,
> storing it in, say vectors, rather like a traditional non-lisp compiler
> does. But this is hardly attractive, and would be a LOT of work.

I'm not sure why is it less attractive than the other alternatives.
But if my idea doesn't sound helpful, feel free to disregard it.
* Re: scratch/accurate-warning-pos: next steps.
From: Stefan Monnier @ 2018-12-11 19:21 UTC
To: emacs-devel

>> I'm not sure how this special-purpose code would work. Say we find an
>> error or warning involving symbol foo as the car of some form, I can't
>> see any way of determining its source position that doesn't involve going
>> back to the position of the beginning of the form, and slogging through
>> the form, somehow.
> Yes, I was proposing something like that. Why is that a problem?

IIUC what you're suggesting here is to add a heuristic which takes an
arbitrary chunk of code (can be a single symbol but not necessarily), an
approximate source location, and then tries to compute a better source
location from it.

I think making this 100% reliable (either in the sense of "return the
*right* location" or just "return a location that's sometimes/often
better and never worse") is somewhere between very hard and impossible.

But maybe a few well chosen heuristics could indeed give a significant
improvement (i.e. return a location that's sometimes/often better and
rarely worse). To help the heuristic, we could pass it some indication
of the error (e.g. so it knows whether the symbol (or chunk of code)
we're looking for is expected to be in the position of a normal
expression, a let-binding, a var definition, a function definition, a
function call, ...).

Oh wait: I think if we return a range rather than a single position, we
could make it reliable in the sense that the actual position is within
the range (but sometimes the range will degenerate to cover a whole
top-level definition).

        Stefan
* Re: scratch/accurate-warning-pos: next steps.
From: Stefan Monnier @ 2018-12-11 19:07 UTC
To: emacs-devel

> The traditional alist of symbols and positions is one way, and it no
> longer works well (if it ever did).

It works as well as ever, AFAICT. It's still a significant improvement
over the previous arrangement. But as things get better, we get used to
the improvement and start being annoyed by other details.
IOW we get greedy ;-)

        Stefan
* Re: scratch/accurate-warning-pos: next steps.
From: Paul Eggert @ 2018-12-10 23:54 UTC
To: Alan Mackenzie; +Cc: emacs-devel

On 12/10/18 10:00 AM, Alan Mackenzie wrote:
> lisp.h would be modified to define alternative versions of EQ, NILP,
> SYMBOLP, and XSYMBOL, and alternative versions of the INLINE functions
> which call them. These would be called BC_EQ, BC_NILP, BC_SYMBOLP, and
> BC_XSYMBOL.
>
> Most of the C sources would, at build time, be fed to a preprocessor
> which would analyse (almost every) C function, and write a temporary file
> containing the functions foo and BC_foo next to each other.

This preprocessor would be a separate program that we'd write? If so,
that sounds error-prone. C is notoriously tricky to preprocess, and
Emacs already uses the C preprocessor aggressively. Instead, why not use
the C preprocessor itself, rather than writing another preprocessor for
C? In other words, compile each file twice, once with one -D option and
once with another.

Even with this suggestion, though, I'm leery of multiple interpreters.
Although it'd be better to have multiple interpreters (one faster, one
slower) than to have just a single, slower interpreter, it'd be better
yet to have just a single, faster interpreter.

Instead, I suggest looking into Stefan's suggestion to use edebug info
<https://lists.gnu.org/archive/html/emacs-devel/2018-11/msg00526.html>,
which should be a much less-drastic way to address the problem; for more
info, see Gemini's followup
<https://lists.gnu.org/r/emacs-devel/2018-12/msg00043.html>.
Alternatively, Yuri's suggestion of an opt-in property for macros
<https://lists.gnu.org/r/emacs-devel/2018-12/msg00023.html> also seems
like a much-simpler approach that should work just as well in the long
run as multiple interpreters would.
* Re: scratch/accurate-warning-pos: next steps.
From: Alan Mackenzie @ 2018-12-11 11:34 UTC
To: Paul Eggert; +Cc: emacs-devel

Hello, Paul.

On Mon, Dec 10, 2018 at 15:54:02 -0800, Paul Eggert wrote:
> On 12/10/18 10:00 AM, Alan Mackenzie wrote:
> > lisp.h would be modified to define alternative versions of EQ, NILP,
> > SYMBOLP, and XSYMBOL, and alternative versions of the INLINE functions
> > which call them. These would be called BC_EQ, BC_NILP, BC_SYMBOLP, and
> > BC_XSYMBOL.

> > Most of the C sources would, at build time, be fed to a preprocessor
> > which would analyse (almost every) C function, and write a temporary file
> > containing the functions foo and BC_foo next to each other.

> This preprocessor would be a separate program that we'd write?

Yes.

> If so, that sounds error-prone. C is notoriously tricky to
> preprocess, ...

You don't need to tell me that. ;-) However, all this preprocessor
would do would be to recognise starts and ends of functions from a list
of known functions, and textually substitute BC_foo for foo, again from
that list of known substitutions. It would need to parse comments and
strings. The list of known functions can be reliably generated by
objdump (from binutils).

This preprocessor would be tedious rather than difficult to write.

> .... and Emacs already uses the C preprocessor aggressively. Instead,
> why not use the C preprocessor itself, rather than writing another
> preprocessor for C? In other words, compile each file twice, once with
> one -D option and once with another.

Because the two interpreters will need to share file static data, of
which there must be only one copy. So the two versions of each function
need to be in the same "source" file.

The approach has the advantage that only minimal amendment of the C
source, if that, will be needed.

> Even with this suggestion, though, I'm leery of multiple interpreters.
> Although it'd be better to have multiple interpreters (one faster, one
> slower) than to have just a single, slower interpreter, it'd be better
> yet to have just a single, faster interpreter.

Yes, we'd all like that, but several weeks of exploring alternatives has
failed to produce any workable solutions on these lines.

> Instead, I suggest looking into Stefan's suggestion to use edebug info
> <https://lists.gnu.org/archive/html/emacs-devel/2018-11/msg00526.html>,
> which should be a much less-drastic way to address the problem;

Not really, no. To recap, that would involve the reader adding
annotations to every Lisp element, turning it into a list looking like:

    (locinfo FILE POS (foo (locinfo FILE POS a) (locinfo FILE POS 4)))

in place of

    (foo a 4)

. The form Stefan suggested is MUCH bigger than the plain form, having,
perhaps four times the number of conses (I haven't counted them). A
large part of the compiler would need to be amended to cope with the new
format, even supposing it could work with macros (which I don't think it
could). This amendment would be uninspiring and tedious in the extreme.
I seriously doubt this would run faster than the symbols-with-position
approach (which has already been implemented), even if it could be made
to work.

> for more info, see Gemini's followup
> <https://lists.gnu.org/r/emacs-devel/2018-12/msg00043.html>.

I've read this several times. It suffers the same drawbacks as Stefan's
idea. In particular it doesn't give any idea how the compiler would
operate on the proposed forms.

> Alternatively, Yuri's suggestion of an opt-in property for macros
> <https://lists.gnu.org/r/emacs-devel/2018-12/msg00023.html> also seems
> like a much-simpler approach that should work just as well in the long
> run as multiple interpreters would.

I don't think it would work, either. That idea is for macros' uses of
eq to be replaced by BC-eq inside the macro. The trouble is, many uses
of eq are actually expansions of EQ in the C code (e.g. in Fequal,
Fassq, ....) and they would all need modifying too, and we're back in
the same situation of having an alternative interpreter.

--
Alan Mackenzie (Nuremberg, Germany).
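The textual substitution step described in this message (rename identifiers from a known list, leaving comments and strings alone) might be sketched as follows. This is a rough illustration under several assumptions: the `known` list is hard-coded here rather than generated from objdump output, the function names `is_known`, `put` and `bc_rewrite` are hypothetical, only `/* ... */` comments are recognised, and the sketch does not locate function starts and ends or emit the foo/BC_foo pair.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical substitution list; the proposal would generate it.  */
static const char *const known[] = { "EQ", "NILP", "SYMBOLP", "XSYMBOL" };

/* Return 1 if the LEN bytes at S spell an identifier in the list.  */
static int
is_known (const char *s, size_t len)
{
  for (size_t i = 0; i < sizeof known / sizeof known[0]; i++)
    if (strlen (known[i]) == len && memcmp (known[i], s, len) == 0)
      return 1;
  return 0;
}

/* Append N bytes of S to DST at offset *O, keeping DST NUL-terminated.
   Return 0 on success, -1 if DST (capacity CAP) is too small.  */
static int
put (char *dst, size_t cap, size_t *o, const char *s, size_t n)
{
  if (*o + n >= cap)
    return -1;
  memcpy (dst + *o, s, n);
  *o += n;
  dst[*o] = '\0';
  return 0;
}

/* Copy SRC to DST, prefixing known identifiers with "BC_".  Comments
   and string or character literals are copied unchanged.  */
static int
bc_rewrite (const char *src, char *dst, size_t cap)
{
  size_t o = 0;
  const char *p = src;

  assert (src != NULL && dst != NULL);
  if (cap == 0)
    return -1;
  dst[0] = '\0';
  while (*p)
    {
      const char *q = p;   /* will point just past the current token */
      if (p[0] == '/' && p[1] == '*')
        {
          q = strstr (p + 2, "*/");
          q = q ? q + 2 : p + strlen (p);
        }
      else if (*p == '"' || *p == '\'')
        {
          for (q = p + 1; *q && *q != *p; q++)
            if (*q == '\\' && q[1])
              q++;         /* skip the escaped character */
          if (*q)
            q++;           /* include the closing quote */
        }
      else if (isalpha ((unsigned char) *p) || *p == '_')
        {
          while (isalnum ((unsigned char) *q) || *q == '_')
            q++;
          if (is_known (p, (size_t) (q - p))
              && put (dst, cap, &o, "BC_", 3) != 0)
            return -1;
        }
      else
        q = p + 1;
      if (put (dst, cap, &o, p, (size_t) (q - p)) != 0)
        return -1;
      p = q;
    }
  return 0;
}
```

Matching whole identifiers rather than substrings is what keeps a name such as `NEQ` from being rewritten; a real tool would also have to cope with `//` comments and with preprocessor directives.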
* Re: scratch/accurate-warning-pos: next steps.
From: Paul Eggert @ 2018-12-11 18:05 UTC
To: Alan Mackenzie; +Cc: emacs-devel

On 12/11/18 3:34 AM, Alan Mackenzie wrote:
> The list of known functions can be reliably generated by objdump (from
> binutils).

Would objdump be run on every build that compiles a .c file that goes
into the Emacs executable? If so, aren't we limiting builds to platforms
that have binutils, which would be a new restriction? And if not,
wouldn't the objdump use be a hand-done process that would need to be
redone on a binutils platform (with output committed to Git) whenever a
significant-enough change is made to src/*? Either way, this sounds like
a hassle.

>> .... and Emacs already uses the C preprocessor aggressively. Instead,
>> why not use the C preprocessor itself, rather than writing another
>> preprocessor for C? In other words, compile each file twice, once with
>> one -D option and once with another.
>
> Because the two interpreters will need to share file static data, of
> which there must be only one copy. So the two versions of each function
> need to be in the same "source" file.

It should be easy enough to move shared file-static data into another
file, that would be compiled only once. We don't have so much shared
file-static data that this would be a major obstacle.

> The form Stefan suggested is MUCH bigger than the plain form, having,
> perhaps four times the number of conses (I haven't counted them).

This overhead would occur only when byte-compiling the form, which
shouldn't be much of a problem in practice.

> This preprocessor would be tedious rather than difficult to write.
> ...
>
> A large part of the compiler would need to be amended to cope with the
> new format, even supposing it could work with macros (which I don't
> think it could). This amendment would be uninspiring and tedious in the
> extreme.

I agree that either approach would be tedious. :-) However, a tedious
approach that is limited to reading and byte-compilation is better than
a tedious approach that affects all of execution.

>> for more info, see Gemini's followup
>> <https://lists.gnu.org/r/emacs-devel/2018-12/msg00043.html>.
>
> I've read this several times. It suffers the same drawbacks as Stefan's
> idea. In particular it doesn't give any idea how the compiler would
> operate on the proposed forms.

As I understand it, Gemini and Stefan are thinking of essentially the
same thing: have the reader optionally generate symbols-with-positions,
have the compiler deal with symbols-with-positions, and have the
compiler strip positions before passing forms to macro arguments that
are not annotated to accept positions. Although (as you mention) this
would require amending a large part of the compiler in a tedious way, it
should solve the problem for macro arguments that accept positions.

>> <https://lists.gnu.org/r/emacs-devel/2018-12/msg00023.html>
>
> I don't think it would work, either. That idea is for macros' uses of
> eq to be replaced by BC-eq inside the macro. The trouble is, many uses
> of eq are actually expansions of EQ in the C code (e.g. in Fequal,
> Fassq, ....) and they would all need modifying too, and we're back in
> the same situation of having an alternative interpreter.

No, the idea is that the onus of doing comparisons correctly is on the
writer of any macro annotated to understand symbols-with-positions. That
is, it's the macro's responsibility to use appropriate comparison
operations, and this responsibility extends to comparison operations
like EQ that are executed in C code. For example, I suggested that
'equal' should ignore symbol positions, as this would let these macros
use 'equal' instead of 'eq', 'assoc' instead of 'assq', etc.
<https://lists.gnu.org/r/emacs-devel/2018-12/msg00033.html>. Although
'assoc' is written in C and its source code uses EQ, the source code
would not need to be changed nor would it need to be compiled twice, as
'assoc' defers to 'equal' (i.e., Fequal in C) to do the tricky work and
'equal' would do the right thing.
* Re: scratch/accurate-warning-pos: next steps. 2018-12-11 18:05 ` Paul Eggert @ 2018-12-11 19:20 ` Alan Mackenzie 2018-12-11 19:59 ` Paul Eggert 0 siblings, 1 reply; 20+ messages in thread From: Alan Mackenzie @ 2018-12-11 19:20 UTC (permalink / raw) To: Paul Eggert; +Cc: emacs-devel Hello, Paul. On Tue, Dec 11, 2018 at 10:05:49 -0800, Paul Eggert wrote: > On 12/11/18 3:34 AM, Alan Mackenzie wrote: > > The list of known functions can be reliably generated by objdump > > (from binutils). > Would objdump be run on every build that compiles a .c file that goes > into the Emacs executable? I don't know, at this stage. Probably not. > If so, aren't we limiting builds to platforms that have binutils, > which would be a new restriction? Well, we use ld, which also belongs to binutils, and that doesn't seem to restrict the platforms. Other platforms surely have equivalents to both objdump and ld, and they are/would be used appropriately. > And if not, wouldn't the objdump use be a hand-done process that would > need to be redone on a binutils platform (with output committed to > Git) whenever a significant-enough change is made to src/*? Either > way, this sounds like a hassle. globals.h seems to manage. The objdump output could be generated analogously. Somehow. Probably. > >> .... and Emacs already uses the C preprocessor aggressively. > >> Instead, why not use the C preprocessor itself, rather than > >> writing another preprocessor for C? In other words, compile each > >> file twice, once with one -D option and once with another. > > Because the two interpreters will need to share file static data, > > of which there must be only one copy. So the two versions of each > > function need to be in the same "source" file. > It should be easy enough to move shared file-static data into another > file, that would be compiled only once. We don't have so much shared > file-static data that this would be a major obstacle. Possibly not. 
The same would have to be done with file global data, too. But doing it that way would involve a great deal of change to the source code (testing for the -D option) which would not be popular. > > The form Stefan suggested is MUCH bigger than the plain form, having, > > perhaps four times the number of conses (I haven't counted them). > This overhead would occur only when byte-compiling the form, which > shouldn't be much of a problem in practice. It would likely slow down the compilation by a very great deal. > > This preprocessor would be tedious rather than difficult to write. > ... > > A large part of the compiler would need to be amended to cope with > > the new format, even supposing it could work with macros (which I > > don't think it could). This amendment would be uninspiring and > > tedious in the extreme. > I agree that either approach would be tedious. :-) No. You're conflating "tedious" with "tedious in the extreme". They're different. The first is several days of work. The second is many weeks of work. I tried an approach two years ago which involved amending most of the compiler. Although I spent a fair amount of time on it, I didn't get very far, and gave up. Were the reader to produce "Stefan's form", the work to amend the compiler would be even more than what I gave up on. > However, a tedious approach that is limited to reading and > byte-compilation is better than a tedious approach that affects all of > execution. My proposed approach would affect only byte compilation (and the build process, of course). That's the whole point. Besides, my approach will work. The competing half-baked ideas are not even fully formulated, and likely wouldn't work. > >> for more info, see Gemini's followup > >> <https://lists.gnu.org/r/emacs-devel/2018-12/msg00043.html>. > > I've read this several times. It suffers the same drawbacks as > > Stefan's idea. In particular it doesn't give any idea how the > > compiler would operate on the proposed forms.
> As I understand it, Gemini and Stefan are thinking of essentially the > same thing: have the reader optionally generate symbols-with-positions, > have the compiler deal with symbols-with-positions, and have the > compiler strip positions before passing forms to macro arguments that > are not annotated to accept positions. Although (as you mention) this > would require amending a large part of the compiler in a tedious way, it > should solve the problem for macro arguments that accept positions. This would place onerous restrictions on what macros were allowed to do, and likely be incompatible with a vast proportion of existing macros. > >> <https://lists.gnu.org/r/emacs-devel/2018-12/msg00023.html> > > I don't think it would work, either. That idea is for macros' uses > > of eq to be replaced by BC-eq inside the macro. The trouble is, > > many uses of eq are actually expansions of EQ in the C code (e.g. > > in Fequal, Fassq, ....) and they would all need modifying too, and > > we're back in the same situation of having an alternative > > interpreter. > No, the idea is that the onus of doing comparisons correctly is on the > writer of any macro annotated to understand symbols-with-positions. That > is, it's the macro's responsibility to use appropriate comparison > operations, and this responsibility extends to comparison operations > like EQ that are executed in C code. I think that is an unacceptable change in Emacs. Macros are already difficult enough to write. Such restrictions could make macros impossible, except for "experts". Besides, we want to maintain compatibility with the vast body of existing macros. > For example, I suggested that 'equal' should ignore symbol positions, as > this would let these macros use 'equal' instead of 'eq', 'assoc' instead > of 'assq', etc. > <https://lists.gnu.org/r/emacs-devel/2018-12/msg00033.html>. 
Although > 'assoc' is written in C and its source code uses EQ, the source code > would not need to be changed nor would it need to be compiled twice, as > 'assoc' defers to 'equal' (i.e., Fequal in C) to do the tricky work and > 'equal' would do the right thing. The trouble is, macros DO use eq. And why not? The contortions you're envisaging contrast horribly with the simplicity of scratch/accurate-warning-pos, which simply works. It works because it has not changed any of the basic types or interactions between them. -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 20+ messages in thread
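A rough sketch of the file-static-data point in the message above: two variants of one function, generated in the same translation unit, share a single copy of a file-static variable. This uses an ordinary C macro as a stand-in for the separate build-time preprocessor proposed in the thread, which is an assumption made to keep the example short. The names toy_sym, toy_EQ, toy_BC_EQ, toy_memq, BC_toy_memq and lookup_calls are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy value: a symbol id plus a source position (-1 means "bare"). */
struct toy_sym { int id; long pos; };

/* Stand-in for EQ: a positioned symbol differs from the bare one. */
bool toy_EQ(struct toy_sym a, struct toy_sym b)
{
  return a.id == b.id && a.pos == b.pos;
}

/* Stand-in for BC_EQ: positions are ignored. */
bool toy_BC_EQ(struct toy_sym a, struct toy_sym b)
{
  return a.id == b.id;
}

/* File-static data: exactly one copy, seen by both variants below
   because both are emitted into this one translation unit.  */
static unsigned long lookup_calls;

/* Emit one membership test per comparison function.  The bodies are
   identical apart from which comparison they call.  */
#define DEFINE_MEMQ(NAME, EQFN)                                      \
  bool NAME(struct toy_sym key, const struct toy_sym *v, size_t n)   \
  {                                                                  \
    assert(v != NULL || n == 0);                                     \
    lookup_calls++;                                                  \
    for (size_t i = 0; i < n; i++)                                   \
      if (EQFN(key, v[i]))                                           \
        return true;                                                 \
    return false;                                                    \
  }

DEFINE_MEMQ(toy_memq, toy_EQ)
DEFINE_MEMQ(BC_toy_memq, toy_BC_EQ)

/* Accessor so callers can observe the shared counter. */
unsigned long toy_lookup_calls(void)
{
  return lookup_calls;
}
```

Compiling the file twice with different -D options would instead give each object file its own lookup_calls unless the variable were moved into a third file compiled once, which is the alternative Paul describes. The sketch does not settle which arrangement is less intrusive for the real sources.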
* Re: scratch/accurate-warning-pos: next steps. 2018-12-11 19:20 ` Alan Mackenzie @ 2018-12-11 19:59 ` Paul Eggert 2018-12-11 20:51 ` Alan Mackenzie 0 siblings, 1 reply; 20+ messages in thread From: Paul Eggert @ 2018-12-11 19:59 UTC (permalink / raw) To: Alan Mackenzie; +Cc: emacs-devel On 12/11/18 11:20 AM, Alan Mackenzie wrote: >> If so, aren't we limiting builds to platforms that have binutils, >> which would be a new restriction? > Well, we use ld, which also belongs to binutils, and that doesn't seem > to restrict the platforms. Other platforms surely have equivalents to > both objdump and ld, and they are/would be used appropriately. We don't use ld, at least not directly. The makefile uses $(CC), just as it uses $(CC) to compile. On GNU/Linux hosts this eventually uses ld (or gold or whatever), but those details are largely immaterial. If we required objdump or similar utilities, that would be yet another porting hassle. And we might run into platforms where there is no objdump-like utility and we have to write one ourselves. This doesn't sound good at all. > globals.h seems to manage. globals.h manages because we decorate every symbol it needs to find. If we have to decorate every C function that might call EQ (either directly or indirectly), that would also work but it would be a lot more intrusive than globals.h is. And the proposal to use objdump seems to acknowledge this, by proposing a method that wouldn't require such decoration but would have significant portability problems. > >> It should be easy enough to move shared file-static data into another >> file, that would be compiled only once. > Possibly not. The same would have to be done with file global data, > too. But doing it that way would involve a great deal of change to the > source code (testing for the -D option) which would not be popular. It'd be less change than having to decorate every function that might call EQ. > It would likely slow down the compilation by a very great deal. 
That's OK, if the cost is borne only by people who want accurate diagnostics. People who want compilation speed can simply turn off the accurate-diagnostics flag. > You're conflating "tedious" with "tedious in the extreme". We're estimating how much work would be needed. Even if there would be more work in changing the byte compiler, it shouldn't be so much more work that we need to contort all the rest of Emacs. It's better to localize such changes when possible. > This would place onerous restrictions on what macros were allowed to do, > and likely be incompatible with a vast proportion of existing macros. But under both proposals I mentioned, existing macros would work just fine with no new restrictions. So what I think you're saying is that if people want to write macros that allow for more-accurate diagnostics, they'll find that they can't easily do it for some reason. What reason would that be? Can you give an example based on some macro already defined in Emacs? ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: scratch/accurate-warning-pos: next steps. 2018-12-11 19:59 ` Paul Eggert @ 2018-12-11 20:51 ` Alan Mackenzie 2018-12-11 21:11 ` Stefan Monnier 2018-12-11 21:43 ` Paul Eggert 0 siblings, 2 replies; 20+ messages in thread From: Alan Mackenzie @ 2018-12-11 20:51 UTC (permalink / raw) To: Paul Eggert; +Cc: emacs-devel Hello, Paul. On Tue, Dec 11, 2018 at 11:59:27 -0800, Paul Eggert wrote: > On 12/11/18 11:20 AM, Alan Mackenzie wrote: > If we required objdump or similar utilities, that would be yet another > porting hassle. And we might run into platforms where there is no > objdump-like utility and we have to write one ourselves. This doesn't > sound good at all. Let's just assume we can get a list of functions from somewhere. Exactly how is a minor implementation detail. I only suggested objdump to demonstrate it was possible and easy. I think the C compiler, any C compiler, can generate cross references. That would be another source of the info. [ .... ] > >> It should be easy enough to move shared file-static data into another > >> file, that would be compiled only once. > > Possibly not. The same would have to be done with file global data, > > too. But doing it that way would involve a great deal of change to the > > source code (testing for the -D option) which would not be popular. > It'd be less change than having to decorate every function that might > call EQ. True, but an irrelevant diversion. Nobody but you is suggesting decorating every function. > > It would likely slow down the compilation by a very great deal. > That's OK, if the cost is borne only by people who want accurate > diagnostics. People who want compilation speed can simply turn off the > accurate-diagnostics flag. WHAT???? There is no such flag, will be no such flag, MUST be no such flag. We give accurate diagnostics to EVERYBODY, and we do this FAST. > > You're conflating "tedious" with "tedious in the extreme". > We're estimating how much work would be needed. 
Even if there would be > more work in changing the byte compiler, it shouldn't be so much more > work that we need to contort all the rest of Emacs. Nobody but you is talking about "contorting all the rest of Emacs". The byte compiler is well over 8000 lines of code, much, possibly most, of which would need to be rewritten. Writing the aforementioned preprocessor is MUCH less work. It is something I can do and intend to do. As for amending the reader and byte compiler to work with "Stefan's format", I know that that is beyond my hacking capacity, even if it could work. I suggest you take on the task yourself, or organise a team to do it. If at the end you have a working solution to the bug, we can compare approaches and merge the better one into the master branch. [ .... ] > > This would place onerous restrictions on what macros were allowed to do, > > and likely be incompatible with a vast proportion of existing macros. > But under both proposals I mentioned, existing macros would work just > fine with no new restrictions. So what I think you're saying is that if > people want to write macros that allow for more-accurate diagnostics, > they'll find that they can't easily do it for some reason. No, I'm saying that people writing macros don't have to and mustn't have to care about diagnostic mechanisms. Lisp hackers deserve to get the best diagnostics without any such ugly compromises being made. -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: scratch/accurate-warning-pos: next steps. 2018-12-11 20:51 ` Alan Mackenzie @ 2018-12-11 21:11 ` Stefan Monnier 2018-12-11 21:35 ` Alan Mackenzie 2018-12-11 21:43 ` Paul Eggert 1 sibling, 1 reply; 20+ messages in thread From: Stefan Monnier @ 2018-12-11 21:11 UTC (permalink / raw) To: emacs-devel > As for amending the reader and byte compiler to work with "Stefan's > format", BTW, I have suggested various approaches, not just one. And all of them have been just rough sketches. Whether and how they'd work is an open question. Stefan ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: scratch/accurate-warning-pos: next steps. 2018-12-11 21:11 ` Stefan Monnier @ 2018-12-11 21:35 ` Alan Mackenzie 2018-12-11 22:58 ` Stefan Monnier 0 siblings, 1 reply; 20+ messages in thread From: Alan Mackenzie @ 2018-12-11 21:35 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel Hello, Stefan. On Tue, Dec 11, 2018 at 16:11:21 -0500, Stefan Monnier wrote: > > As for amending the reader and byte compiler to work with "Stefan's > > format", > BTW, I have suggested various approaches, not just one. Apologies, this is true. And you also expressed a desire not to work on the problem. This is accepted and respected. > And all of them have been just rough sketches. One of them was what I've implemented as scratch/accurate-warning-pos. :-) > Whether and how they'd work is an open question. The one Paul and I have been referring to was the one where the reader would return extended lists containing location info alongside the actual Lisp Object. > Stefan -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: scratch/accurate-warning-pos: next steps. 2018-12-11 21:35 ` Alan Mackenzie @ 2018-12-11 22:58 ` Stefan Monnier 0 siblings, 0 replies; 20+ messages in thread From: Stefan Monnier @ 2018-12-11 22:58 UTC (permalink / raw) To: emacs-devel > The one Paul and I have been referring to was the one where the reader > would return extended lists containing location info alongside the > actual Lisp Object. This is probably a good option in the long run. But let there be no doubt: while it should not impact performance of compiled code, it will likely slow down compilation significantly, just because of the increased size of the representation of the code that the compiler needs to manipulate. Stefan ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: scratch/accurate-warning-pos: next steps. 2018-12-11 20:51 ` Alan Mackenzie 2018-12-11 21:11 ` Stefan Monnier @ 2018-12-11 21:43 ` Paul Eggert 1 sibling, 0 replies; 20+ messages in thread From: Paul Eggert @ 2018-12-11 21:43 UTC (permalink / raw) To: Alan Mackenzie; +Cc: emacs-devel On 12/11/18 12:51 PM, Alan Mackenzie wrote: > Let's just assume we can get a list of functions from somewhere. > Exactly how is a minor implementation detail. I only suggested objdump > to demonstrate it was possible and easy. I think the C compiler, any C > compiler, can generate cross references. That would be another source > of the info. I'm afraid that many C compilers don't generate cross references, and that this is not a minor implementation detail that we can assume away. >> That's OK, if the cost is borne only by people who want accurate >> diagnostics. People who want compilation speed can simply turn off the >> accurate-diagnostics flag. > WHAT???? There is no such flag, will be no such flag, MUST be no such > flag. We give accurate diagnostics to EVERYBODY, and we do this FAST. That would be nice, but if we can't do it quickly (without significant slowdowns elsewhere, or major contortions to the code) then perhaps we'll have to settle for accurate diagnostics as an option. > The byte compiler is well over 8000 lines of code, much, possibly > most, of which would need to be rewritten Although it would not be trivial to modify 8000 lines of code in a tedious but mostly-systematic way, that is not what I would call an enormous project. For perspective, the Emacs patch I'm currently hacking on (in a different area) is currently about 3000 lines and I wouldn't be surprised if it doubled before it's done. Sometimes even reasonably-minor conceptual changes require many tedious changes to the source code; that's just life when hacking. > I suggest you take on the task yourself, or organise a team Thanks, but this issue is not that high on my priority list. 
> people writing macros don't have to and mustn't have > to care about diagnostic mechanisms. Lisp hackers deserve to get the > best diagnostics without any such ugly compromises being made. As I understand it, these annotations are not simply concessions to limitations of our location implementation; they also provide information that is useful for other reasons. I'm still not seeing examples of why it would be hard for users to provide the optional annotations, if they want the corresponding advantages. ^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2018-12-11 22:58 UTC | newest]

Thread overview: 20+ messages
2018-12-10 18:00 scratch/accurate-warning-pos: next steps Alan Mackenzie
2018-12-10 18:15 ` Eli Zaretskii
2018-12-10 18:28 ` Alan Mackenzie
2018-12-10 18:39 ` Eli Zaretskii
2018-12-10 19:35 ` Alan Mackenzie
2018-12-10 20:06 ` Eli Zaretskii
2018-12-10 21:03 ` Alan Mackenzie
2018-12-11 6:41 ` Eli Zaretskii
2018-12-11 19:21 ` Stefan Monnier
2018-12-11 19:07 ` Stefan Monnier
2018-12-10 23:54 ` Paul Eggert
2018-12-11 11:34 ` Alan Mackenzie
2018-12-11 18:05 ` Paul Eggert
2018-12-11 19:20 ` Alan Mackenzie
2018-12-11 19:59 ` Paul Eggert
2018-12-11 20:51 ` Alan Mackenzie
2018-12-11 21:11 ` Stefan Monnier
2018-12-11 21:35 ` Alan Mackenzie
2018-12-11 22:58 ` Stefan Monnier
2018-12-11 21:43 ` Paul Eggert