* GNU is looking for Google Summer of Code Projects
From: Rocky Bernstein @ 2020-03-19 15:10 UTC
To: emacs-devel

In another list I see that GNU has been accepted for Summer of Code and
is looking for projects.

My own favorite ones regarding GNU Emacs have to do with beefing up the
Emacs Lisp runtime and bytecode system. In particular, giving proper
callback information from bytecode (bytecode offset, mapping information
from bytecode to line numbers). The bytecode decompiler I started works
on simple examples, and I think I could get it going in a much more
solid and reliable way.

And of course on the elisp package side, realgud has always been hurting
for help, multi-display windows in the debugger.

But enough about me. What is most in need of help in GNU Emacs that a
summer student might reasonably make progress on?

Discuss the ideas here (please cc me since I don't regularly follow) and
contact me offline and I'll forward the GNU contacts. (Or you probably
can look them up for yourself if so inclined. I am not a coordinator, I
am just a backup mentor).
* Re: GNU is looking for Google Summer of Code Projects 2020-03-19 15:10 GNU is looking for Google Summer of Code Projects Rocky Bernstein @ 2020-03-19 17:35 ` Stefan Monnier 2020-03-19 17:56 ` Andrea Corallo 2020-03-19 20:34 ` Correct line/column numbers in byte compiler messages [Was: GNU is looking for Google Summer of Code Projects] Alan Mackenzie 0 siblings, 2 replies; 29+ messages in thread From: Stefan Monnier @ 2020-03-19 17:35 UTC (permalink / raw) To: Rocky Bernstein; +Cc: emacs-devel > My own favorite ones regarding GNU Emacs have to do with beefing up the > Emacs Lisp runtime and bytecode system. In particular giving proper > callback information from bytecode (bytecode offset, mapping information > from bytecode to line numbers). The bytecode decompiler I started, while it > works on simple examples, I think I could get going in a much more solid > and reliable way. It should be easy (much smaller than a summer project) to change the C code so that a bytecode offset can be extracted from the backtrace. The harder and more interesting part is how to propagate source information (line numbers and/or lexical variable names and location) to byte-code. There are many parts to this, so it's definitely possible to get some summer project(s) out of it. E.g. one such project is to change the reader so it outputs "fat cons cells" (i.e. cons-cells with line-num info), then arrange for that info to survive `macroexpand-all` and `cconv.el`. That could already be used to give more precise line numbers in bytecompiler warnings. Another is to devise a way to annotate bytecode objects with a map from byte-offsets to information about the lexical vars in-scope at that point and their location (i.e. position in the stack or in the closure). And then teach Emacs's debugger to use that info. > But enough about me. What is most in need of help in GNU Emacs that a > summer student might reasonably make progress on? I'm sure there are lots of desires. 
One I'd suggest is to introduce an "object description" that can be used
both by the GC and pdump code (and maybe also by `equal` and
`print--preprocess`?), so that when changing the representation of
objects or introducing new types we don't have to make corresponding
changes in so many different places. XEmacs had such a thing, so there's
previous experience on which we can build.

It could also be a step towards replacing our GC with one that's
incremental, such as the one in XEmacs (or even better: concurrent,
unlike that of XEmacs).

        Stefan
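[Editorial illustration] The "map from byte-offsets" to source information mentioned in the message above could, in its simplest form, be a sorted table searched by bytecode offset. The following is only a sketch under stated assumptions: `pc_line_entry` and `pc_to_line` are hypothetical names, and nothing here reflects how Emacs actually stores or would store such data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table entry: bytecode starting at `offset` came from
   source line `line`, until the next entry takes over.  */
struct pc_line_entry {
  size_t offset;   /* first bytecode offset covered by this entry */
  int line;        /* source line number */
};

/* Return the source line for bytecode offset PC, or -1 if PC lies
   before the first entry.  TABLE must be sorted by ascending offset.  */
int
pc_to_line (const struct pc_line_entry *table, size_t n, size_t pc)
{
  assert (table != NULL || n == 0);

  size_t lo = 0, hi = n;        /* binary search in [lo, hi) */
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (table[mid].offset <= pc)
        lo = mid + 1;
      else
        hi = mid;
    }

  /* lo is now the number of entries whose offset is <= pc.  */
  return lo == 0 ? -1 : table[lo - 1].line;
}
```

A backtrace printer that already knows the offset of the failing instruction could call `pc_to_line` once per frame; the harder part discussed in this thread, getting correct line information to survive macro expansion and compilation, is not addressed by this sketch.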
* Re: GNU is looking for Google Summer of Code Projects 2020-03-19 17:35 ` Stefan Monnier @ 2020-03-19 17:56 ` Andrea Corallo 2020-03-19 18:05 ` Andrea Corallo ` (2 more replies) 2020-03-19 20:34 ` Correct line/column numbers in byte compiler messages [Was: GNU is looking for Google Summer of Code Projects] Alan Mackenzie 1 sibling, 3 replies; 29+ messages in thread From: Andrea Corallo @ 2020-03-19 17:56 UTC (permalink / raw) To: Stefan Monnier; +Cc: Rocky Bernstein, emacs-devel Stefan Monnier <monnier@iro.umontreal.ca> writes: > It should be easy (much smaller than a summer project) to change the > C code so that a bytecode offset can be extracted from the backtrace. > > The harder and more interesting part is how to propagate source > information (line numbers and/or lexical variable names and location) to > byte-code. There are many parts to this, so it's definitely possible to > get some summer project(s) out of it. E.g. one such project is to change > the reader so it outputs "fat cons cells" (i.e. cons-cells with line-num > info), then arrange for that info to survive `macroexpand-all` and > `cconv.el`. That could already be used to give more precise line > numbers in bytecompiler warnings. > > Another is to devise a way to annotate bytecode objects with a map from > byte-offsets to information about the lexical vars in-scope at that point > and their location (i.e. position in the stack or in the closure). > And then teach Emacs's debugger to use that info. > >> But enough about me. What is most in need of help in GNU Emacs that a >> summer student might reasonably make progress on? > > I'm sure there are lots of desires. One I'd suggest is to introduce an > "object description" that can be used both by the GC and pdump code (and > maybe also by `equal` and `print--preprocess`?), so that when changing > the representation of objects or introducing new types we don't have to > make corresponding changes in so many different places. 
XEmacs had such > a thing, so there's previous experience on which we can build. > It could also be a step towards replacing our GC with one that's > incremental such the one in XEmacs (or even better: concurrent, unlike > that of XEmacs). > > > Stefan It's probably definitely early to discuss but can't resist. Do we really need some dedicated low level object? This should be all overhead that disappears with compilation anyway. Also wanted to ask, am I wrong or something has been attempted in this field? I'm quite curious on this because the day we get source locations crossing byte-code we could use the native compiler also as a diagnostic tool. Andrea -- akrl@sdf.org ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: GNU is looking for Google Summer of Code Projects 2020-03-19 17:56 ` Andrea Corallo @ 2020-03-19 18:05 ` Andrea Corallo 2020-03-19 18:19 ` Rocky Bernstein 2020-03-19 21:26 ` Stefan Monnier 2 siblings, 0 replies; 29+ messages in thread From: Andrea Corallo @ 2020-03-19 18:05 UTC (permalink / raw) To: Stefan Monnier; +Cc: Rocky Bernstein, emacs-devel Andrea Corallo <akrl@sdf.org> writes: > Also wanted to ask, am I wrong or something has been attempted on this ^^^ already -- akrl@sdf.org ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: GNU is looking for Google Summer of Code Projects 2020-03-19 17:56 ` Andrea Corallo 2020-03-19 18:05 ` Andrea Corallo @ 2020-03-19 18:19 ` Rocky Bernstein 2020-03-19 21:26 ` Stefan Monnier 2 siblings, 0 replies; 29+ messages in thread From: Rocky Bernstein @ 2020-03-19 18:19 UTC (permalink / raw) To: Andrea Corallo; +Cc: Stefan Monnier, emacs-devel [-- Attachment #1: Type: text/plain, Size: 2836 bytes --] On Thu, Mar 19, 2020 at 1:56 PM Andrea Corallo <akrl@sdf.org> wrote: > Stefan Monnier <monnier@iro.umontreal.ca> writes: > > > It should be easy (much smaller than a summer project) to change the > > C code so that a bytecode offset can be extracted from the backtrace. > > > > The harder and more interesting part is how to propagate source > > information (line numbers and/or lexical variable names and location) to > > byte-code. There are many parts to this, so it's definitely possible to > > get some summer project(s) out of it. E.g. one such project is to change > > the reader so it outputs "fat cons cells" (i.e. cons-cells with line-num > > info), then arrange for that info to survive `macroexpand-all` and > > `cconv.el`. That could already be used to give more precise line > > numbers in bytecompiler warnings. > > > > Another is to devise a way to annotate bytecode objects with a map from > > byte-offsets to information about the lexical vars in-scope at that point > > and their location (i.e. position in the stack or in the closure). > > And then teach Emacs's debugger to use that info. > > > >> But enough about me. What is most in need of help in GNU Emacs that a > >> summer student might reasonably make progress on? > > > > I'm sure there are lots of desires. 
One I'd suggest is to introduce an
> > "object description" that can be used both by the GC and pdump code (and
> > maybe also by `equal` and `print--preprocess`?), so that when changing
> > the representation of objects or introducing new types we don't have to
> > make corresponding changes in so many different places. XEmacs had such
> > a thing, so there's previous experience on which we can build.
> > It could also be a step towards replacing our GC with one that's
> > incremental such the one in XEmacs (or even better: concurrent, unlike
> > that of XEmacs).
> >
> >
> > Stefan
>
> It's probably definitely early to discuss but can't resist.
>
> Do we really need some dedicated low level object? This should be all
> overhead that disappears with compilation anyway.
>
> Also wanted to ask, am I wrong or something has been attempted in this
> field?

In the bit that I have come across looking over bytecode work and
history, e.g. see http://rocky.github.io/elisp-bytecode.pdf, it has
become extremely clear that there are precious few who understand how
the bytecode and runtime system work. And the people who wrote this
initially, e.g. rms, and later jwz, no longer do so.

No slight to Stefan, Jim Blandy, Paul Eggert or Tom Tromey, but if
nothing else we need a new generation of people to pick up the torch and
carry on.

> I'm quite curious on this because the day we get source locations
> crossing byte-code we could use the native compiler also as a diagnostic
> tool.
>
> Andrea
>
> --
> akrl@sdf.org
* Re: GNU is looking for Google Summer of Code Projects 2020-03-19 17:56 ` Andrea Corallo 2020-03-19 18:05 ` Andrea Corallo 2020-03-19 18:19 ` Rocky Bernstein @ 2020-03-19 21:26 ` Stefan Monnier 2020-03-19 21:45 ` Andrea Corallo 2 siblings, 1 reply; 29+ messages in thread From: Stefan Monnier @ 2020-03-19 21:26 UTC (permalink / raw) To: Andrea Corallo; +Cc: Rocky Bernstein, emacs-devel > Do we really need some dedicated low level object? I don't know what you mean, sorry. > This should be all overhead that disappears with compilation anyway. I get the impression that you were referring to the part where I talked about the "object description" for the runtime system. Compilation is of no help here. It's already all happening in C code. Maybe rewriting in a language with a bit more introspection might make an "object description" more-or-less readily available (maybe the Remacs work might qualify), but we'd still need to connect that with a GC and with pdump etc... Stefan ^ permalink raw reply [flat|nested] 29+ messages in thread
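[Editorial illustration] One way to picture the "object description" idea discussed above is a per-type table listing where an object's Lisp-valued slots live, so that a single generic walker can serve several consumers (marking, dumping, structural comparison). This is a minimal sketch under assumptions: `lisp_value`, `object_description`, `visit_slots`, and `demo_marker` are all hypothetical names invented here, not Emacs or XEmacs definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a Lisp value; not Emacs's Lisp_Object.  */
typedef void *lisp_value;

/* Hypothetical description of one object type: which byte offsets
   inside the object hold Lisp values.  */
struct object_description {
  const char *name;
  size_t size;
  size_t n_slots;
  const size_t *slot_offsets;
};

typedef void (*slot_visitor) (lisp_value *slot, void *ctx);

/* Generic walker: call VISIT on every Lisp-valued slot of OBJ.  A GC
   mark phase or a dumper could each supply their own visitor.  */
void
visit_slots (void *obj, const struct object_description *desc,
             slot_visitor visit, void *ctx)
{
  assert (obj != NULL && desc != NULL);
  for (size_t i = 0; i < desc->n_slots; i++)
    {
      lisp_value *slot
        = (lisp_value *) ((char *) obj + desc->slot_offsets[i]);
      visit (slot, ctx);
    }
}

/* Example type with two Lisp slots and one plain C field.  */
struct demo_marker {
  lisp_value buffer;
  long charpos;        /* not a Lisp value, so not described */
  lisp_value next;
};

static const size_t demo_marker_slots[] = {
  offsetof (struct demo_marker, buffer),
  offsetof (struct demo_marker, next),
};

const struct object_description demo_marker_description = {
  "demo_marker", sizeof (struct demo_marker), 2, demo_marker_slots
};

/* Trivial visitor: count the slots that are not null.  */
void
count_non_null (lisp_value *slot, void *ctx)
{
  if (*slot != NULL)
    ++*(size_t *) ctx;
}
```

With this shape, adding a new object type means writing one description rather than touching each consumer; whether that pays off in the real code base is exactly the open question raised in the thread.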
* Re: GNU is looking for Google Summer of Code Projects
From: Andrea Corallo @ 2020-03-19 21:45 UTC
To: Stefan Monnier; +Cc: Rocky Bernstein, emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> Do we really need some dedicated low level object?
>
> I don't know what you mean, sorry.
>
>> This should be all overhead that disappears with compilation anyway.
>
> I get the impression that you were referring to the part where I talked
> about the "object description" for the runtime system. Compilation is
> of no help here. It's already all happening in C code.
>
> Maybe rewriting in a language with a bit more introspection might make
> an "object description" more-or-less readily available (maybe the
> Remacs work might qualify), but we'd still need to connect that with
> a GC and with pdump etc...

Oops, I now understand, we are talking about 4 different problems:

1 source location going through the compilation pipeline
2 debug information into bytecode to debug
3 autogenerate GC and pdumper code from obj description
4 GC

Clear to me, thanks.

  Andrea

--
akrl@sdf.org
* Re: GNU is looking for Google Summer of Code Projects
From: Rocky Bernstein @ 2020-03-19 23:07 UTC
Cc: Stefan Monnier, emacs-devel

On Thu, Mar 19, 2020 at 5:45 PM Andrea Corallo <akrl@sdf.org> wrote:

> Stefan Monnier <monnier@iro.umontreal.ca> writes:
>
> >> Do we really need some dedicated low level object?
> >
> > I don't know what you mean, sorry.
> >
> >> This should be all overhead that disappears with compilation anyway.
> >
> > I get the impression that you were referring to the part where I talked
> > about the "object description" for the runtime system. Compilation is
> > of no help here. It's already all happening in C code.
> >
> > Maybe rewriting in a language with a bit more introspection might make
> > an "object description" more-or-less readily available (maybe the
> > Remacs work might qualify), but we'd still need to connect that with
> > a GC and with pdump etc...
>
> Ops I now understand, we are talking about 4 different problems:
>
> 1 source location going through the compilation pipeline
> 2 debug information into bytecode to debug

The above two I think a summer student could do.

Clarification of item 2: there is *reporting* location information,
especially in traceback information on an error, which I suppose could
be considered "to debug".

> 3 autogenerate GC and pdumper code from obj description
> 4 GC
>
> Clear to me thanks.
>
> Andrea
>
> --
> akrl@sdf.org
* Correct line/column numbers in byte compiler messages [Was: GNU is looking for Google Summer of Code Projects] 2020-03-19 17:35 ` Stefan Monnier 2020-03-19 17:56 ` Andrea Corallo @ 2020-03-19 20:34 ` Alan Mackenzie 2020-03-19 20:43 ` Andrea Corallo ` (2 more replies) 1 sibling, 3 replies; 29+ messages in thread From: Alan Mackenzie @ 2020-03-19 20:34 UTC (permalink / raw) To: Stefan Monnier; +Cc: Rocky Bernstein, emacs-devel Hello, Stefan. On Thu, Mar 19, 2020 at 13:35:08 -0400, Stefan Monnier wrote: [ .... ] > It should be easy (much smaller than a summer project) to change the C > code so that a bytecode offset can be extracted from the backtrace. > The harder and more interesting part is how to propagate source > information (line numbers and/or lexical variable names and location) > to byte-code. There are many parts to this, so it's definitely > possible to get some summer project(s) out of it. E.g. one such > project is to change the reader so it outputs "fat cons cells" (i.e. > cons-cells with line-num info), then arrange for that info to survive > `macroexpand-all` and `cconv.el`. That could already be used to give > more precise line numbers in bytecompiler warnings. "More precise line numbers" is a misconstruction, even though I've used such language myself in the past. Line numbers don't come from a physical instrument which measures with, say +-1% accuracy. CORRECT line (and column) numbers are what we need. You will recall that the output of correct line/column numbers for byte compiler messages is a solved problem. I solved it and presented the fix in December 2018. This fix was rejected because it made Emacs slightly slower. In the 3½ years I've been grappling with this problem, I've tried all sorts of things like "fat cons cells". They don't work, and can't work. They can't work because large chunks of our software chew up and spit out cons cells with gay abandon (I'm talking about the byte compiler and things like cconv.el here). 
More to the point, users' macros chew up and spit out cons cells, and we have no control over them. So whilst we could, with a lot of tedious effort, clean up our own software to preserve cons cells (believe me, I've tried), this would fail in users' macros. Since then I've worked a fair bit on creating a "double" Emacs core, one core being for normal use, the other for byte compiling. There's a fair amount of work still to do on this, but I know how to do it. The problem is that I have been discouraged by the prospect of having this solution vetoed too, since it will make Emacs quite a bit bigger. I don't think it is fair to give this problem to a group of summer coders. It is too hard a problem, both technically and politically. [ .... ] > Stefan -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Correct line/column numbers in byte compiler messages [Was: GNU is looking for Google Summer of Code Projects] 2020-03-19 20:34 ` Correct line/column numbers in byte compiler messages [Was: GNU is looking for Google Summer of Code Projects] Alan Mackenzie @ 2020-03-19 20:43 ` Andrea Corallo 2020-03-20 19:18 ` Alan Mackenzie 2020-03-19 20:56 ` Correct line/column numbers in byte compiler messages [Was: GNU is looking for Google Summer of Code Projects] Rocky Bernstein 2020-03-19 21:41 ` Stefan Monnier 2 siblings, 1 reply; 29+ messages in thread From: Andrea Corallo @ 2020-03-19 20:43 UTC (permalink / raw) To: Alan Mackenzie; +Cc: Rocky Bernstein, Stefan Monnier, emacs-devel Alan Mackenzie <acm@muc.de> writes: > "More precise line numbers" is a misconstruction, even though I've used > such language myself in the past. Line numbers don't come from a > physical instrument which measures with, say +-1% accuracy. CORRECT > line (and column) numbers are what we need. > > You will recall that the output of correct line/column numbers for byte > compiler messages is a solved problem. I solved it and presented the > fix in December 2018. This fix was rejected because it made Emacs > slightly slower. > > In the 3½ years I've been grappling with this problem, I've tried all > sorts of things like "fat cons cells". They don't work, and can't work. > They can't work because large chunks of our software chew up and spit > out cons cells with gay abandon (I'm talking about the byte compiler and > things like cconv.el here). More to the point, users' macros chew up and > spit out cons cells, and we have no control over them. So whilst we > could, with a lot of tedious effort, clean up our own software to > preserve cons cells (believe me, I've tried), this would fail in users' > macros. > > Since then I've worked a fair bit on creating a "double" Emacs core, one > core being for normal use, the other for byte compiling. 
There's a fair > amount of work still to do on this, but I know how to do it. The problem > is that I have been discouraged by the prospect of having this solution > vetoed too, since it will make Emacs quite a bit bigger. > > I don't think it is fair to give this problem to a group of summer > coders. It is too hard a problem, both technically and politically. > Hi Alan, Sorry I'm new to Emacs development, where can be found the code of your attempt? Is it in a feature branch? Thanks Andrea -- akrl@sdf.org ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Correct line/column numbers in byte compiler messages [Was: GNU is looking for Google Summer of Code Projects] 2020-03-19 20:43 ` Andrea Corallo @ 2020-03-20 19:18 ` Alan Mackenzie 2020-03-21 11:22 ` Andrea Corallo 0 siblings, 1 reply; 29+ messages in thread From: Alan Mackenzie @ 2020-03-20 19:18 UTC (permalink / raw) To: Andrea Corallo; +Cc: Rocky Bernstein, Stefan Monnier, emacs-devel Hello, Andrea. On Thu, Mar 19, 2020 at 20:43:15 +0000, Andrea Corallo wrote: > Alan Mackenzie <acm@muc.de> writes: > > "More precise line numbers" is a misconstruction, even though I've used > > such language myself in the past. Line numbers don't come from a > > physical instrument which measures with, say +-1% accuracy. CORRECT > > line (and column) numbers are what we need. > > You will recall that the output of correct line/column numbers for byte > > compiler messages is a solved problem. I solved it and presented the > > fix in December 2018. This fix was rejected because it made Emacs > > slightly slower. > > In the 3½ years I've been grappling with this problem, I've tried all > > sorts of things like "fat cons cells". They don't work, and can't work. > > They can't work because large chunks of our software chew up and spit > > out cons cells with gay abandon (I'm talking about the byte compiler and > > things like cconv.el here). More to the point, users' macros chew up and > > spit out cons cells, and we have no control over them. So whilst we > > could, with a lot of tedious effort, clean up our own software to > > preserve cons cells (believe me, I've tried), this would fail in users' > > macros. > > Since then I've worked a fair bit on creating a "double" Emacs core, one > > core being for normal use, the other for byte compiling. There's a fair > > amount of work still to do on this, but I know how to do it. The problem > > is that I have been discouraged by the prospect of having this solution > > vetoed too, since it will make Emacs quite a bit bigger. 
> > I don't think it is fair to give this problem to a group of summer
> > coders. It is too hard a problem, both technically and politically.

> Hi Alan,

> Sorry I'm new to Emacs development, where can be found the code of your
> attempt? Is it in a feature branch?

It's in the branch scratch/accurate-warning-pos. The commit which
converted the unfinished work to a bug fix was:

    commit 2e04ddadab266d245a3bd0f6c19223ea515bdb90
    Author: Alan Mackenzie <acm@muc.de>
    Date:   Fri Nov 30 14:55:48 2018 +0000

        Sundry amendments to branch scratch/accurate-warning-pos.

(except, I think it still outputs two positions for each warning
message: the traditional one, and the new correct one).

> Thanks

> Andrea

> --
> akrl@sdf.org

--
Alan Mackenzie (Nuremberg, Germany).
* Re: Correct line/column numbers in byte compiler messages [Was: GNU is looking for Google Summer of Code Projects]
From: Andrea Corallo @ 2020-03-21 11:22 UTC
To: Alan Mackenzie; +Cc: Rocky Bernstein, Stefan Monnier, emacs-devel

Alan Mackenzie <acm@muc.de> writes:

> It's in the branch scratch/accurate-warning-pos. The commit which
> converted the unfinished work to a bug fix was:
>
> commit 2e04ddadab266d245a3bd0f6c19223ea515bdb90
> Author: Alan Mackenzie <acm@muc.de>
> Date: Fri Nov 30 14:55:48 2018 +0000
>
> Sundry amendments to branch scratch/accurate-warning-pos.
>
> (except, I think it still outputs two positions for each warning
> message: the traditional one, and the new correct one).
>

Hi all,

I took a very quick look at the accurate-warning-pos branch and did some
measurements.

I measured the bootstrap time and ran elisp-benchmarks (dhrystone taken
out because it is broken on both branches), comparing
accurate-warning-pos against the last in-tree commit it's based on.
Here is what I see on my dev machine:

* b071398ba3 @ scratch/accurate-warning-pos
** bootstrap
real    2m31.076s
user    15m8.049s
sys     0m38.087s

** elisp-benchmarks
| test           | non-gc avg (s) | gc avg (s) | gcs avg | tot avg (s) | tot avg err (s) |
|----------------+----------------+------------+---------+-------------+-----------------|
| bubble-no-cons |          11.53 |       0.04 |       4 |       11.57 |            0.01 |
| bubble         |           4.74 |       3.81 |     484 |        8.55 |            0.00 |
| fibn-rec       |           6.35 |       0.00 |       0 |        6.35 |            0.00 |
| fibn-tc        |           5.59 |       0.00 |       0 |        5.59 |            0.02 |
| fibn           |          11.90 |       0.00 |       0 |       11.90 |            0.01 |
| inclist        |          17.86 |       0.01 |       1 |       17.87 |            0.01 |
| listlen-tc     |           6.48 |       0.00 |       0 |        6.48 |            0.01 |
| nbody          |           3.58 |       6.70 |     839 |       10.28 |            0.01 |
| pidigits       |           5.60 |       5.68 |     457 |       11.28 |            0.03 |
|----------------+----------------+------------+---------+-------------+-----------------|
| total          |          73.62 |      16.24 |    1785 |       89.86 |            0.04 |

* b619777dd6 (baseline)
** bootstrap
real    2m20.762s
user    13m35.418s
sys     0m37.349s

** elisp-benchmarks
| test           | non-gc avg (s) | gc avg (s) | gcs avg | tot avg (s) | tot avg err (s) |
|----------------+----------------+------------+---------+-------------+-----------------|
| bubble-no-cons |          11.43 |       0.04 |       4 |       11.47 |            0.00 |
| bubble         |           4.67 |       3.58 |     487 |        8.25 |            0.01 |
| fibn-rec       |           6.21 |       0.00 |       0 |        6.21 |            0.00 |
| fibn-tc        |           5.68 |       0.00 |       0 |        5.68 |            0.00 |
| fibn           |          11.47 |       0.00 |       0 |       11.47 |            0.00 |
| inclist        |          17.37 |       0.01 |       1 |       17.38 |            0.00 |
| listlen-tc     |           6.46 |       0.00 |       0 |        6.46 |            0.00 |
| nbody          |           3.36 |       6.24 |     839 |        9.60 |            0.01 |
| pidigits       |           5.66 |       5.53 |     457 |       11.19 |            0.03 |
|----------------+----------------+------------+---------+-------------+-----------------|
| total          |          72.32 |      15.39 |    1788 |       87.71 |            0.03 |

The outcome as I see it is that total bootstrap time gets about 1.1x
bigger while normal runtime appears not affected.

From my quick understanding of how it works this is expected.
The additional branch and compare against symbols_with_pos_enabled in
`eq' is a kind of branch that is very easily predictable by any modern
CPU, therefore when the feature is off (not compiling) it becomes
transparent (I'd see a compiler branch hint there too).

elisp-benchmarks are not completely representative for now but again...
better than nothing.

Am I missing something else here, or are we trading away the exact
solution for something like ~15% of the byte-compile time? I think this
feature would be a big step forward for our toolchain, opening many
possibilities. I suspect fat conses will require more modifications
across the whole compilation pipeline (including macros?), bringing a
less accurate result, and they still have to prove the smaller overhead.

At this point I start suspecting I'm missing something very big here, am
I?

Anyway thanks Alan for this.

  Andrea

--
akrl@sdf.org
* Re: Correct line/column numbers in byte compiler messages 2020-03-21 11:22 ` Andrea Corallo @ 2020-03-21 15:30 ` Alan Mackenzie 2020-03-21 16:28 ` Andrea Corallo 0 siblings, 1 reply; 29+ messages in thread From: Alan Mackenzie @ 2020-03-21 15:30 UTC (permalink / raw) To: Andrea Corallo; +Cc: Rocky Bernstein, Stefan Monnier, emacs-devel Hello, Andrea. On Sat, Mar 21, 2020 at 11:22:03 +0000, Andrea Corallo wrote: > Alan Mackenzie <acm@muc.de> writes: > > It's in the branch scratch/accurate-warning-pos. The commit which > > converted the unfinished work to a bug fix was: > > commit 2e04ddadab266d245a3bd0f6c19223ea515bdb90 > > Author: Alan Mackenzie <acm@muc.de> > > Date: Fri Nov 30 14:55:48 2018 +0000 > > Sundry amendments to branch scratch/accurate-warning-pos. > > (except, I think it still outputs two positions for each warning > > message: the traditional one, and the new correct one). > I all, > I've took a very quick look to the accurate-warning-pos and did some > measures. Thanks, that's appreciated. > I've measured the bootstrap time and run elisp-benchmarks (dhrystone > take out cause broken on both branches) comparing accurate-warning-pos > against the last in-tree commit it's based on. 
Here what I see on my
> dev machine:

> * b071398ba3 @ scratch/accurate-warning-pos
> ** bootstrap
> real    2m31.076s
> user    15m8.049s
> sys     0m38.087s

> ** elisp-benckmarks
> | test           | non-gc avg (s) | gc avg (s) | gcs avg | tot avg (s) | tot avg err (s) |
> |----------------+----------------+------------+---------+-------------+-----------------|
> | bubble-no-cons |          11.53 |       0.04 |       4 |       11.57 |            0.01 |
> | bubble         |           4.74 |       3.81 |     484 |        8.55 |            0.00 |
> | fibn-rec       |           6.35 |       0.00 |       0 |        6.35 |            0.00 |
> | fibn-tc        |           5.59 |       0.00 |       0 |        5.59 |            0.02 |
> | fibn           |          11.90 |       0.00 |       0 |       11.90 |            0.01 |
> | inclist        |          17.86 |       0.01 |       1 |       17.87 |            0.01 |
> | listlen-tc     |           6.48 |       0.00 |       0 |        6.48 |            0.01 |
> | nbody          |           3.58 |       6.70 |     839 |       10.28 |            0.01 |
> | pidigits       |           5.60 |       5.68 |     457 |       11.28 |            0.03 |
> |----------------+----------------+------------+---------+-------------+-----------------|
> | total          |          73.62 |      16.24 |    1785 |       89.86 |            0.04 |

> * b619777dd6 (baseline)
> ** bootstrap
> real    2m20.762s
> user    13m35.418s
> sys     0m37.349s

> ** elisp-benckmarks
> | test           | non-gc avg (s) | gc avg (s) | gcs avg | tot avg (s) | tot avg err (s) |
> |----------------+----------------+------------+---------+-------------+-----------------|
> | bubble-no-cons |          11.43 |       0.04 |       4 |       11.47 |            0.00 |
> | bubble         |           4.67 |       3.58 |     487 |        8.25 |            0.01 |
> | fibn-rec       |           6.21 |       0.00 |       0 |        6.21 |            0.00 |
> | fibn-tc        |           5.68 |       0.00 |       0 |        5.68 |            0.00 |
> | fibn           |          11.47 |       0.00 |       0 |       11.47 |            0.00 |
> | inclist        |          17.37 |       0.01 |       1 |       17.38 |            0.00 |
> | listlen-tc     |           6.46 |       0.00 |       0 |        6.46 |            0.00 |
> | nbody          |           3.36 |       6.24 |     839 |        9.60 |            0.01 |
> | pidigits       |           5.66 |       5.53 |     457 |       11.19 |            0.03 |
> |----------------+----------------+------------+---------+-------------+-----------------|
> | total          |          72.32 |      15.39 |    1788 |       87.71 |            0.03 |

> The outcome as I see it is that total bootstrap time gets bigger 1.1x
> while normal runtime appears not affected.
Well, it looks like the normal runtime is around 2.x% slower for scratch/accurate-warning-pos. > For my quick understanding of how it works this is expected. The > additional branch and compare against symbols_with_pos_enabled in `eq' > is a kind of branch that is very easily predictable by any modern CPU, > therefore when the feature is off (not compiling) it becomes transparent > (I'd see a compiler branch hit there too). In other words, the processor will test symbols_with_pos_enabled simultaneously with starting the continuation for the "not" case. This extra test in the EQ code was always the main thing in the slowdown occurring in this git branch. When I timed things back in 2018, I got a slowdown of somewhat more than 2.x%. May I ask what sort of processor you're using? Mine (unchanged since then) is an AMD Ryzen. > elisp-benchmarks are not completely rapresentative for now but > again... better than nothing. > Am I missing something else here or we are trading out the exact > solution for like ~15% off the byte compile-time? I think this feature > would be a big step forward for our toolchain opening many > possibilities. I suspect fat conses will requires more modifications > across the whole compilation pipeline (including macros?) bringing a > less accurate result and still they have to prove the smaller overhead. > At this point I start suspecting I'm missing something very big here, am > I? > Anyway thanks Alan for this. Thanks! > Andrea > -- > akrl@sdf.org -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Correct line/column numbers in byte compiler messages
From: Andrea Corallo @ 2020-03-21 16:28 UTC
To: Alan Mackenzie; +Cc: Rocky Bernstein, Stefan Monnier, emacs-devel

Alan Mackenzie <acm@muc.de> writes:

> Hello, Andrea.
>
> On Sat, Mar 21, 2020 at 11:22:03 +0000, Andrea Corallo wrote:
>> The outcome as I see it is that total bootstrap time gets bigger 1.1x
>> while normal runtime appears not affected.
>
> Well, it looks like the normal runtime is around 2.x% slower for
> scratch/accurate-warning-pos.

Well I studied physics so for me 2% is pretty much zero :) :)

Joking apart, I'm not sure this is really sufficient to conclude whether
it is noise or not.

>> For my quick understanding of how it works this is expected. The
>> additional branch and compare against symbols_with_pos_enabled in `eq'
>> is a kind of branch that is very easily predictable by any modern CPU,
>> therefore when the feature is off (not compiling) it becomes transparent
>> (I'd see a compiler branch hit there too).
>
> In other words, the processor will test symbols_with_pos_enabled
> simultaneously with starting the continuation for the "not" case.

The processor will just speculate, guessing the target branch without
having to wait for the symbols_with_pos_enabled value to be loaded.
Given that this changes rarely, speculation there should be pretty much
always correct.

I'd wrap symbols_with_pos_enabled into something like:

  #define SYMBOLS_WITH_POS_ENABLED \
    __builtin_expect(symbols_with_pos_enabled, 0)

to make sure we minimize instruction cache overhead too.

> This extra test in the EQ code was always the main thing in the slowdown
> occurring in this git branch.

Is the EQ overhead the main/only one? Also GC seems marginally affected.
I think would be interesting to write a nano benchmark EQ focused to test this accurately. > When I timed things back in 2018, I got a slowdown of somewhat more than > 2.x%. May I ask what sort of processor you're using? Mine (unchanged > since then) is an AMD Ryzen. I did the test on a "Xeon E5-1660 v3". I think we can classify it as a good system from few (6?) years ago. Not very fast by today's standards but still quite beefy in terms of caches. Bests Andrea -- akrl@sdf.org ^ permalink raw reply [flat|nested] 29+ messages in thread
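A minimal sketch of the hint described above, assuming GCC or Clang. The names feature_enabled, slow_equal, fast_eq and count_eq are hypothetical stand-ins, not Emacs code; feature_enabled plays the role of symbols_with_pos_enabled, and the macro falls back to the plain flag on compilers without __builtin_expect.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a rarely-set global such as
   symbols_with_pos_enabled.  */
static bool feature_enabled = false;

/* Tell the compiler the flag is almost always false, so the common
   path can be laid out as straight-line code.  */
#if defined __GNUC__ || defined __clang__
# define FEATURE_ENABLED __builtin_expect (feature_enabled, 0)
#else
# define FEATURE_ENABLED feature_enabled
#endif

/* Hypothetical slow-path comparison: values are considered equal if
   they differ only in their lowest bit.  */
static bool
slow_equal (long x, long y)
{
  return (x | 1) == (y | 1);
}

/* Fast identity test first; the flag is consulted only on a miss.  */
static bool
fast_eq (long x, long y)
{
  return x == y || (FEATURE_ENABLED && slow_equal (x, y));
}

/* Count the elements of V[0..N-1] that compare equal to KEY.  */
static size_t
count_eq (const long *v, size_t n, long key)
{
  size_t hits = 0;
  for (size_t i = 0; i < n; i++)
    if (fast_eq (v[i], key))
      hits++;
  return hits;
}
```

The hint changes only how the compiler lays out the two paths; the result of fast_eq is the same with or without it, so any gain has to be measured rather than assumed.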
* Re: Correct line/column numbers in byte compiler messages 2020-03-21 16:28 ` Andrea Corallo @ 2020-03-21 18:37 ` Andrea Corallo 2020-03-21 20:19 ` Alan Mackenzie 0 siblings, 1 reply; 29+ messages in thread From: Andrea Corallo @ 2020-03-21 18:37 UTC (permalink / raw) To: Alan Mackenzie; +Cc: Rocky Bernstein, Stefan Monnier, emacs-devel

I have to apologize, this is probably the quarantine effect, but I
couldn't resist testing this:

#+BEGIN_SRC lisp
;; -*- lexical-binding: t; -*-
(require 'cl-lib)

(defvar elb-list (cl-loop for i from 0 to 1500000
                          if (cl-oddp i)
                          collect 'a
                          else
                          collect 'b))

(defun elb-eq ()
  (let ((n 0))
    (dolist (l elb-list n)
      (when (eq 'b l)
        (cl-incf n)))))

(defun elb-eq-entry ()
  (dotimes (_ 1000)
    (elb-eq)))
#+END_SRC

Results:

b619777dd6 (baseline)   50.09s
accurate-warning-pos    51.28s

This is about a 2% perf penalty.

Interestingly, with the __builtin_expect trick applied exec time gets
back to 50.65s.

We could probably find a benchmark that better highlights the difference
(this is potentially dominated by cache misses while pointer chasing the
list) but is it worth it?

Regards

  Andrea

--
akrl@sdf.org

^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Correct line/column numbers in byte compiler messages 2020-03-21 18:37 ` Andrea Corallo @ 2020-03-21 20:19 ` Alan Mackenzie 2020-03-21 21:08 ` Andrea Corallo 2020-03-22 11:26 ` Alan Mackenzie 0 siblings, 2 replies; 29+ messages in thread From: Alan Mackenzie @ 2020-03-21 20:19 UTC (permalink / raw) To: Andrea Corallo; +Cc: Rocky Bernstein, Stefan Monnier, emacs-devel

Hello, Andrea.

On Sat, Mar 21, 2020 at 18:37:13 +0000, Andrea Corallo wrote:
> Have to apologize this is probably the quarantine effect ....

As of today, we're under quarantine, too. :-(

> .... but I couldn't resist testing this:

> #+BEGIN_SRC lisp
> ;; -*- lexical-binding: t; -*-
> (require 'cl-lib)

> (defvar elb-list (cl-loop for i from 0 to 1500000
>                           if (cl-oddp i)
>                           collect 'a
>                           else
>                           collect 'b))

> (defun elb-eq ()
>   (let ((n 0))
>     (dolist (l elb-list n)
>       (when (eq 'b l)
>         (cl-incf n)))))

> (defun elb-eq-entry ()
>   (dotimes (_ 1000)
>     (elb-eq)))
> #+END_SRC

> Results:

> b619777dd6 (baseline) 50.09s
> accurate-warning-pos 51.28s

> This is about 2% perf penalty.

On my Ryzen, I'm seeing a 50% penalty. :-( (Admittedly that's
comparing the year old branch to current master. I suppose I should
build the correct comparable revision and try again.) This suggests
that the branch prediction logic isn't present (or isn't active) on the
Ryzen.

> Interestingly with the __builtin_expect trick applied exec time gets
> back to 50.65s.

How do you do this? I couldn't make much sense of the documentation of
__builtin_expect. :-(

> We could probably find a benchmark that better highlights the difference
> (this is potentially dominated by cache misses while pointer chasing the
> list) but is it worth?

Could I ask you to do the following timing.

Evaluate the following (e.g. in *scratch*):

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
(defmacro time-it (&rest forms)
  "Time the running of a sequence of forms using `float-time'.
Call like this: \"M-: (time-it (foo ...) (bar ...) ...)\"."
  `(let ((start (float-time)))
    ,@forms
    (- (float-time) start)))

(defun time-scroll (&optional arg)
  (interactive "P")
  (message "%s"
           (time-it
            (condition-case nil
                (while t
                  (if arg (scroll-down) (scroll-up))
                  (sit-for 0))
              (error nil)))))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

, visit .../emacs/src/xdisp.c, and do M-: (time-scroll). This scrolls
through the buffer and prints a timing in the minibuffer. (N.B. to run
this again, type something at BOB and undo it, thus marking the
fontification as stale.)

I'm seeing 19.4s vs. 22.2s, which is around 15% difference. :-(

> Regards

> Andrea

> --
> akrl@sdf.org

--
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Correct line/column numbers in byte compiler messages 2020-03-21 20:19 ` Alan Mackenzie @ 2020-03-21 21:08 ` Andrea Corallo 2020-03-21 23:39 ` Andrea Corallo 2020-03-22 11:26 ` Alan Mackenzie 1 sibling, 1 reply; 29+ messages in thread From: Andrea Corallo @ 2020-03-21 21:08 UTC (permalink / raw) To: Alan Mackenzie; +Cc: Rocky Bernstein, Stefan Monnier, emacs-devel
[-- Attachment #1: Type: text/plain, Size: 2600 bytes --]

Alan Mackenzie <acm@muc.de> writes:

> On my Ryzen, I'm seeing a 50% penalty. :-( (Admittedly that's
> comparing the year old branch to current master. I suppose I should
> build the correct comparable revision and try again.) This suggests
> that the branch prediction logic isn't present (or isn't active) on the
> Ryzen.

This is very strange. You certainly have to compare branches from the
same epoch. I'm pretty sure that in the last year Paul pushed changes to
the inline policy with some measurable effect on performance.

>> Interestingly with the __builtin_expect trick applied exec time gets
>> back to 50.65s.
>
> How do you do this? I couldn't make much sense of the documentation of
> __builtin_expect. :-(

I attach the very simple patch I tried. Basically the compiler has a
heuristic branch predictor (in GCC, predict.c) that is used to order the
final basic block output. The desired outcome is to have the most likely
execution path laid out sequentially, which on modern CPUs maximizes the
front-end bandwidth. "__builtin_expect" is just a strong hint to this
predictor.

>> We could probably find a benchmark that better highlights the difference
>> (this is potentially dominated by cache misses while pointer chasing the
>> list) but is it worth?
>
> Could I ask you to do the following timing.
>
> Evaluate the following (e.g. in *scratch*):
>
> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> (defmacro time-it (&rest forms)
> "Time the running of a sequence of forms using `float-time'.
> Call like this: \"M-: (time-it (foo ...)
(bar ...) ...)\"." > `(let ((start (float-time))) > ,@forms > (- (float-time) start))) > > (defun time-scroll (&optional arg) > (interactive "P") > (message "%s" > (time-it > (condition-case nil > (while t > (if arg (scroll-down) (scroll-up)) > (sit-for 0)) > (error nil))))) > ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; > > , visit .../emacs/src/xdisp.c, and do M-: (time-scroll). This scrolls > through the buffer and prints a timing in the minibuffer. (N.B. to run > this again, type something at BOB and undo it, thus marking the > fontification as stale.) > > I'm seeing 19.4s vs. 22.2s, which is around 15% difference. :-( I get 19.30 sec against 16.65 that is 15% difference here too. This is extremely interesting and would be worth profiling. I bet on the GC for this! (Note I'm notoriously wrong when speculating on benchmarks :) Regards Andrea -- akrl@sdf.org [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: comp-hint.patch --] [-- Type: text/x-diff, Size: 1572 bytes --] diff --git a/src/lisp.h b/src/lisp.h index a22043026a..6e3cca1bbc 100644 --- a/src/lisp.h +++ b/src/lisp.h @@ -394,8 +394,12 @@ typedef EMACS_INT Lisp_Word; /* #define lisp_h_EQ(x, y) (XLI (x) == XLI (y)) */ /* verify (NIL_IS_ZERO) */ + +#define SYMBOLS_WITH_POS_ENABLED \ + __builtin_expect(symbols_with_pos_enabled, 0) + #define lisp_h_EQ(x, y) ((XLI ((x)) == XLI ((y))) \ - || (symbols_with_pos_enabled \ + || (SYMBOLS_WITH_POS_ENABLED \ && (SYMBOL_WITH_POS_P ((x)) \ ? BARE_SYMBOL_P ((y)) \ ? (XSYMBOL_WITH_POS((x)))->sym == (y) \ @@ -424,7 +428,7 @@ typedef EMACS_INT Lisp_Word; #define lisp_h_BARE_SYMBOL_P(x) TAGGEDP ((x), Lisp_Symbol) /* verify (NIL_IS_ZERO) */ #define lisp_h_SYMBOLP(x) ((BARE_SYMBOL_P ((x)) || \ - (symbols_with_pos_enabled && (SYMBOL_WITH_POS_P ((x)))))) + (SYMBOLS_WITH_POS_ENABLED && (SYMBOL_WITH_POS_P ((x)))))) #define lisp_h_TAGGEDP(a, tag) \ (! (((unsigned) (XLI (a) >> (USE_LSB_TAG ? 
0 : VALBITS)) \ - (unsigned) (tag)) \ @@ -463,7 +467,7 @@ typedef EMACS_INT Lisp_Word; /* verify (NIL_IS_ZERO) */ # define lisp_h_XSYMBOL(a) \ (eassert (SYMBOLP ((a))), \ - (!symbols_with_pos_enabled \ + (!SYMBOLS_WITH_POS_ENABLED \ ? (XBARE_SYMBOL ((a))) \ : (BARE_SYMBOL_P ((a))) \ ? (XBARE_SYMBOL ((a))) \ ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: Correct line/column numbers in byte compiler messages 2020-03-21 21:08 ` Andrea Corallo @ 2020-03-21 23:39 ` Andrea Corallo 0 siblings, 0 replies; 29+ messages in thread From: Andrea Corallo @ 2020-03-21 23:39 UTC (permalink / raw) To: Alan Mackenzie; +Cc: Rocky Bernstein, Stefan Monnier, emacs-devel
[-- Attachment #1: Type: text/plain, Size: 1155 bytes --]

Andrea Corallo <akrl@sdf.org> writes:

> Alan Mackenzie <acm@muc.de> writes:
>
>>
>> I'm seeing 19.4s vs. 22.2s, which is around 15% difference. :-(
>
> I get 19.30 sec against 16.65 that is 15% difference here too. This is
> extremely interesting and would be worth profiling.
>
> I bet on the GC for this! (Note I'm notoriously wrong when speculating
> on benchmarks :)

At this point the evening has been dedicated to this.

Apparently part of the issue is that GCC is quite conservative in its
inlining policy and, because of the more complex condition in EQ, decides
not to inline it (at least, I see that it is not always inlined). This
is true for EQ and some of its friends defined in lisp.h. Part of the
cost is not the branch itself but the additional procedure activations.

Pushing GCC a little harder towards inlining with the rough attached
patch, I got the mackenzie-test down to 18.17s, which is still/just 9%
off.

The remaining part is probably a little harder to investigate, but I
still suspect 'side reasons' rather than a fundamental one.

I suggest we try the fine tuning once rebased on 28, if that is not too
hard.
Regards Andrea -- akrl@sdf.org [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: tmp.patch --] [-- Type: text/x-diff, Size: 3857 bytes --] diff --git a/src/lisp.h b/src/lisp.h index a22043026a..d0c56d7bbb 100644 --- a/src/lisp.h +++ b/src/lisp.h @@ -394,8 +394,12 @@ typedef EMACS_INT Lisp_Word; /* #define lisp_h_EQ(x, y) (XLI (x) == XLI (y)) */ /* verify (NIL_IS_ZERO) */ + +#define SYMBOLS_WITH_POS_ENABLED \ + __builtin_expect(symbols_with_pos_enabled, 0) + #define lisp_h_EQ(x, y) ((XLI ((x)) == XLI ((y))) \ - || (symbols_with_pos_enabled \ + || (SYMBOLS_WITH_POS_ENABLED \ && (SYMBOL_WITH_POS_P ((x)) \ ? BARE_SYMBOL_P ((y)) \ ? (XSYMBOL_WITH_POS((x)))->sym == (y) \ @@ -424,7 +428,7 @@ typedef EMACS_INT Lisp_Word; #define lisp_h_BARE_SYMBOL_P(x) TAGGEDP ((x), Lisp_Symbol) /* verify (NIL_IS_ZERO) */ #define lisp_h_SYMBOLP(x) ((BARE_SYMBOL_P ((x)) || \ - (symbols_with_pos_enabled && (SYMBOL_WITH_POS_P ((x)))))) + (SYMBOLS_WITH_POS_ENABLED && (SYMBOL_WITH_POS_P ((x)))))) #define lisp_h_TAGGEDP(a, tag) \ (! (((unsigned) (XLI (a) >> (USE_LSB_TAG ? 0 : VALBITS)) \ - (unsigned) (tag)) \ @@ -463,7 +467,7 @@ typedef EMACS_INT Lisp_Word; /* verify (NIL_IS_ZERO) */ # define lisp_h_XSYMBOL(a) \ (eassert (SYMBOLP ((a))), \ - (!symbols_with_pos_enabled \ + (!SYMBOLS_WITH_POS_ENABLED \ ? (XBARE_SYMBOL ((a))) \ : (BARE_SYMBOL_P ((a))) \ ? 
(XBARE_SYMBOL ((a))) \ @@ -1137,38 +1141,38 @@ enum More_Lisp_Bits #define MOST_POSITIVE_FIXNUM (EMACS_INT_MAX >> INTTYPEBITS) #define MOST_NEGATIVE_FIXNUM (-1 - MOST_POSITIVE_FIXNUM) \f -INLINE bool +INLINE bool __attribute__ ((always_inline)) PSEUDOVECTORP (Lisp_Object a, int code) { return lisp_h_PSEUDOVECTORP (a, code); } -INLINE bool +INLINE bool __attribute__ ((always_inline)) (BARE_SYMBOL_P) (Lisp_Object x) { return lisp_h_BARE_SYMBOL_P (x); } -INLINE bool +INLINE bool __attribute__ ((always_inline)) (SYMBOL_WITH_POS_P) (Lisp_Object x) { return lisp_h_SYMBOL_WITH_POS_P (x); } -INLINE bool +INLINE bool __attribute__ ((always_inline)) (SYMBOLP) (Lisp_Object x) { return lisp_h_SYMBOLP (x); } -INLINE struct Lisp_Symbol_With_Pos * +INLINE struct Lisp_Symbol_With_Pos * __attribute__ ((always_inline)) XSYMBOL_WITH_POS (Lisp_Object a) { eassert (SYMBOL_WITH_POS_P (a)); return XUNTAG (a, Lisp_Vectorlike, struct Lisp_Symbol_With_Pos); } -INLINE struct Lisp_Symbol * ATTRIBUTE_NO_SANITIZE_UNDEFINED +INLINE struct Lisp_Symbol * __attribute__ ((always_inline)) ATTRIBUTE_NO_SANITIZE_UNDEFINED (XBARE_SYMBOL) (Lisp_Object a) { #if USE_LSB_TAG @@ -1186,7 +1190,7 @@ INLINE struct Lisp_Symbol * ATTRIBUTE_NO_SANITIZE_UNDEFINED #endif } -INLINE struct Lisp_Symbol * ATTRIBUTE_NO_SANITIZE_UNDEFINED +INLINE struct Lisp_Symbol * __attribute__ ((always_inline)) ATTRIBUTE_NO_SANITIZE_UNDEFINED (XSYMBOL) (Lisp_Object a) { return lisp_h_XSYMBOL (a); @@ -1336,7 +1340,7 @@ INLINE bool /* Return true if X and Y are the same object, reckoning a symbol with position as being the same as the bare symbol. 
*/ -INLINE bool +inline bool __attribute__ ((always_inline)) (EQ) (Lisp_Object x, Lisp_Object y) { return lisp_h_EQ (x, y); @@ -2690,7 +2694,7 @@ XOVERLAY (Lisp_Object a) return XUNTAG (a, Lisp_Vectorlike, struct Lisp_Overlay); } -INLINE Lisp_Object +INLINE Lisp_Object __attribute__ ((always_inline)) SYMBOL_WITH_POS_SYM (Lisp_Object a) { if (!SYMBOL_WITH_POS_P (a)) @@ -2698,7 +2702,7 @@ SYMBOL_WITH_POS_SYM (Lisp_Object a) return XSYMBOL_WITH_POS (a)->sym; } -INLINE Lisp_Object +INLINE Lisp_Object __attribute__ ((always_inline)) SYMBOL_WITH_POS_POS (Lisp_Object a) { if (!SYMBOL_WITH_POS_P (a)) ^ permalink raw reply related [flat|nested] 29+ messages in thread
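For readers unfamiliar with the attribute used in the patch above, here is a small standalone sketch, assuming GCC or Clang. All names (ALWAYS_INLINE, wide_compare_enabled, is_tagged, strip_tag, eq_words) are hypothetical and only mimic the shape of a cheap test guarding a rarer, more complex comparison; this is not the code from lisp.h.

```c
#include <stdbool.h>

/* always_inline asks GCC/Clang to inline regardless of their size
   heuristics; elsewhere the macro expands to nothing.  */
#if defined __GNUC__ || defined __clang__
# define ALWAYS_INLINE __attribute__ ((always_inline))
#else
# define ALWAYS_INLINE
#endif

/* Hypothetical global switch, off in normal operation.  */
static bool wide_compare_enabled = false;

/* A word is "tagged" if its lowest bit is set (made-up encoding).  */
static inline ALWAYS_INLINE bool
is_tagged (unsigned long w)
{
  return (w & 1ul) != 0;
}

/* Drop the tag bit to recover the underlying value.  */
static inline ALWAYS_INLINE unsigned long
strip_tag (unsigned long w)
{
  return w & ~1ul;
}

/* Identity first; only when the switch is on and a tag is present do
   we compare the underlying values.  */
static inline ALWAYS_INLINE bool
eq_words (unsigned long x, unsigned long y)
{
  return x == y
         || (wide_compare_enabled
             && (is_tagged (x) || is_tagged (y))
             && strip_tag (x) == strip_tag (y));
}
```

Forcing inlining removes the call overhead but can grow the code at every call site, so it trades one cost for another and is worth timing on more than one machine.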
* Re: Correct line/column numbers in byte compiler messages 2020-03-21 20:19 ` Alan Mackenzie 2020-03-21 21:08 ` Andrea Corallo @ 2020-03-22 11:26 ` Alan Mackenzie 1 sibling, 0 replies; 29+ messages in thread From: Alan Mackenzie @ 2020-03-22 11:26 UTC (permalink / raw) To: Andrea Corallo; +Cc: Rocky Bernstein, Stefan Monnier, emacs-devel

Hello, Andrea.

On Sat, Mar 21, 2020 at 20:19:54 +0000, Alan Mackenzie wrote:
> On Sat, Mar 21, 2020 at 18:37:13 +0000, Andrea Corallo wrote:
> > Have to apologize this is probably the quarantine effect ....

> As of today, we're under quarantine, too. :-(

> > .... but I couldn't resist testing this:

> > #+BEGIN_SRC lisp
> > ;; -*- lexical-binding: t; -*-
> > (require 'cl-lib)

> > (defvar elb-list (cl-loop for i from 0 to 1500000
> >                           if (cl-oddp i)
> >                           collect 'a
> >                           else
> >                           collect 'b))

> > (defun elb-eq ()
> >   (let ((n 0))
> >     (dolist (l elb-list n)
> >       (when (eq 'b l)
> >         (cl-incf n)))))

> > (defun elb-eq-entry ()
> >   (dotimes (_ 1000)
> >     (elb-eq)))
> > #+END_SRC

> > Results:

> > b619777dd6 (baseline) 50.09s
> > accurate-warning-pos 51.28s

> > This is about 2% perf penalty.

> On my Ryzen, I'm seeing a 50% penalty. :-( (Admittedly that's
> comparing the year old branch to current master. I suppose I should
> build the correct comparable revision and try again.) This suggests
> that the branch prediction logic isn't present (or isn't active) on the
> Ryzen.

OK, I've done just that (with revision
b619777dd67e271d639c6fb1d031650af8fd79e6 from 2019-03-30) and I now see
what you see:

b619777: 76.067s
scratch/accurate-warning-pos: 77.656s.
master: 52.423s

So, clearly, optimisations to Emacs in the last year have borne fruit.
Maybe that optimisation would be useful in s/a-w-p.

> > Interestingly with the __builtin_expect trick applied exec time gets
> > back to 50.65s.

> How do you do this? I couldn't make much sense of the documentation of
> __builtin_expect. :-(

I've read your patch in your other mail, and I will apply it and try it
out.

[ .... ]

> > Regards

> > Andrea

> > --
> > akrl@sdf.org

--
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Correct line/column numbers in byte compiler messages [Was: GNU is looking for Google Summer of Code Projects] 2020-03-19 20:34 ` Correct line/column numbers in byte compiler messages [Was: GNU is looking for Google Summer of Code Projects] Alan Mackenzie 2020-03-19 20:43 ` Andrea Corallo @ 2020-03-19 20:56 ` Rocky Bernstein 2020-03-19 22:05 ` Stefan Monnier 2020-03-20 19:25 ` Alan Mackenzie 2020-03-19 21:41 ` Stefan Monnier 2 siblings, 2 replies; 29+ messages in thread From: Rocky Bernstein @ 2020-03-19 20:56 UTC (permalink / raw) To: Alan Mackenzie; +Cc: Stefan Monnier, emacs-devel
[-- Attachment #1: Type: text/plain, Size: 3508 bytes --]

On Thu, Mar 19, 2020 at 4:35 PM Alan Mackenzie <acm@muc.de> wrote:

> Hello, Stefan.
>
> On Thu, Mar 19, 2020 at 13:35:08 -0400, Stefan Monnier wrote:
>
> [ .... ]
>
> > It should be easy (much smaller than a summer project) to change the C
> > code so that a bytecode offset can be extracted from the backtrace.
>
> > The harder and more interesting part is how to propagate source
> > information (line numbers and/or lexical variable names and location)
> > to byte-code. There are many parts to this, so it's definitely
> > possible to get some summer project(s) out of it. E.g. one such
> > project is to change the reader so it outputs "fat cons cells" (i.e.
> > cons-cells with line-num info), then arrange for that info to survive
> > `macroexpand-all` and `cconv.el`. That could already be used to give
> > more precise line numbers in bytecompiler warnings.
>
> "More precise line numbers" is a misconstruction, even though I've used
> such language myself in the past. Line numbers don't come from a
> physical instrument which measures with, say +-1% accuracy. CORRECT
> line (and column) numbers are what we need.
>

A bytecode offset is exact and accurate. Right now this information is
unavailable. I think the interpreter uses C pointers stored in a
register. So just recording the bytecode offset is a little bit of a
slowdown, but not that much. I doubt it would even register as 1%
slower.

But just that would open the way for improvements. This is doable by a
Summer student - Stefan thinks it trivial. But as you point out, there
is overhead in getting it accepted and into GNU Emacs.

Once the bytecode offset is available in a traceback, there are several
options for what comes next. At the lowest level there is just showing
that along with a disassembly of the bytecode.

And that, I believe, is also doable by a summer student.

Going further, there are a number of options that folks have mentioned,
so I won't expand on them.

> You will recall that the output of correct line/column numbers for byte
> compiler messages is a solved problem. I solved it and presented the
> fix in December 2018. This fix was rejected because it made Emacs
> slightly slower.
>
> In the 3½ years I've been grappling with this problem, I've tried all
> sorts of things like "fat cons cells". They don't work, and can't work.
> They can't work because large chunks of our software chew up and spit
> out cons cells with gay abandon (I'm talking about the byte compiler and
> things like cconv.el here). More to the point, users' macros chew up and
> spit out cons cells, and we have no control over them. So whilst we
> could, with a lot of tedious effort, clean up our own software to
> preserve cons cells (believe me, I've tried), this would fail in users'
> macros.
>
> Since then I've worked a fair bit on creating a "double" Emacs core, one
> core being for normal use, the other for byte compiling. There's a fair
> amount of work still to do on this, but I know how to do it. The problem
> is that I have been discouraged by the prospect of having this solution
> vetoed too, since it will make Emacs quite a bit bigger.
>
> I don't think it is fair to give this problem to a group of summer
> coders. It is too hard a problem, both technically and politically.
>

Ok. So do you have a suggestion for what a summer student might do?

> > [ .... ]
>
> > Stefan
>
> --
> Alan Mackenzie (Nuremberg, Germany).
>

[-- Attachment #2: Type: text/html, Size: 4502 bytes --]

^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Correct line/column numbers in byte compiler messages [Was: GNU is looking for Google Summer of Code Projects] 2020-03-19 20:56 ` Correct line/column numbers in byte compiler messages [Was: GNU is looking for Google Summer of Code Projects] Rocky Bernstein @ 2020-03-19 22:05 ` Stefan Monnier 2020-03-20 19:25 ` Alan Mackenzie 1 sibling, 0 replies; 29+ messages in thread From: Stefan Monnier @ 2020-03-19 22:05 UTC (permalink / raw) To: Rocky Bernstein; +Cc: Alan Mackenzie, emacs-devel

>> "More precise line numbers" is a misconstruction, even though I've used
>> such language myself in the past. Line numbers don't come from a
>> physical instrument which measures with, say +-1% accuracy. CORRECT
>> line (and column) numbers are what we need.
> A bytecode offset is exact and accurate.

I think he was talking about line-number info in byte-compiler warnings
(where the info comes from the source code, not from the backtrace).
Currently we use a hack that gives us approximate locations which can be
wildly off-the-mark.

> Right now this information is unavailable. I think the interpreter uses
> C pointers stored in a register. So just recording the bytecode
> offset is a little bit of a slowdown, but not that much.

Indeed. If it's too high we could make it conditional on
a boolean variable.

> I doubt it would even register as 1% slower.

Reminds me that another project could be to try and speed up function
calls. The difficulty here is that we don't really know what's the main
source of the cost, so there's a good chance that any specific attempt
will give disappointing results. It'd still be useful in helping us get
a better idea of what it is that takes time.

> But just that would open the way for improvements. This is doable by a
> Summer student - Stefan thinks it trivial.

Just recording this info in the backtrace (at a minor performance cost)
is indeed very easy.

> But as you point out, there is overhead in getting it accepted and
> into GNU Emacs.

Right. Until this info is actually usable by tools like the debugger,
the code would inevitably be #ifdef'd out unless it has zero-cost which
seems unlikely.

> Once the bytecode offset is available in a traceback, there are several
> options for what comes next. At the lowest level there is just showing
> that along with a disassembly of the bytecode.
> And that, I believe, is also doable by a summer student.

Agreed.

        Stefan

^ permalink raw reply [flat|nested] 29+ messages in thread
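A rough sketch of what recording a bytecode offset could look like, using a toy stack machine rather than the real Emacs interpreter. The opcodes, struct frame and run are all made up for illustration; the only point is the single extra store per instruction that turns the program-counter pointer into an offset a backtrace could print.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy instruction set (hypothetical, not Emacs bytecode).  */
enum { OP_PUSH, OP_ADD, OP_FAIL, OP_RET };

/* Hypothetical backtrace frame: which code vector is running and the
   offset of the instruction currently being executed.  */
struct frame
{
  const uint8_t *code;
  ptrdiff_t offset;
};

/* Execute CODE.  Return 0 and store the top of stack in *RESULT on
   OP_RET; return -1 on OP_FAIL or on a malformed program.  Either way
   FR->offset names the last instruction that was started.  */
static int
run (const uint8_t *code, struct frame *fr, int *result)
{
  int stack[16];
  int sp = 0;
  const uint8_t *pc = code;   /* the "C pointer in a register" */

  fr->code = code;
  for (;;)
    {
      fr->offset = pc - code; /* the extra store under discussion */
      switch (*pc++)
        {
        case OP_PUSH:
          if (sp >= 16)
            return -1;
          stack[sp++] = *pc++;
          break;
        case OP_ADD:
          if (sp < 2)
            return -1;
          sp--;
          stack[sp - 1] += stack[sp];
          break;
        case OP_RET:
          if (sp < 1)
            return -1;
          *result = stack[sp - 1];
          return 0;
        case OP_FAIL:
        default:
          return -1;
        }
    }
}
```

With the offset in the frame, a debugger could mark the corresponding instruction in a disassembly, which is the lowest-level use mentioned above. Whether the store is cheap enough in the real interpreter is exactly the open question in this thread.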
* Re: Correct line/column numbers in byte compiler messages [Was: GNU is looking for Google Summer of Code Projects] 2020-03-19 20:56 ` Correct line/column numbers in byte compiler messages [Was: GNU is looking for Google Summer of Code Projects] Rocky Bernstein 2020-03-19 22:05 ` Stefan Monnier @ 2020-03-20 19:25 ` Alan Mackenzie 1 sibling, 0 replies; 29+ messages in thread From: Alan Mackenzie @ 2020-03-20 19:25 UTC (permalink / raw) To: Rocky Bernstein; +Cc: Stefan Monnier, emacs-devel Hello, Rocky. On Thu, Mar 19, 2020 at 16:56:45 -0400, Rocky Bernstein wrote: > On Thu, Mar 19, 2020 at 4:35 PM Alan Mackenzie <acm@muc.de> wrote: [ .... ] > > I don't think it is fair to give this problem to a group of summer > > coders. It is too hard a problem, both technically and politically. > Ok. So do you have a suggestion for what a summer student might do? Sorry, no I don't. It would need to be something in that sweet spot between being dull and tedious and being too challenging and difficult. -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Correct line/column numbers in byte compiler messages [Was: GNU is looking for Google Summer of Code Projects] 2020-03-19 20:34 ` Correct line/column numbers in byte compiler messages [Was: GNU is looking for Google Summer of Code Projects] Alan Mackenzie 2020-03-19 20:43 ` Andrea Corallo 2020-03-19 20:56 ` Correct line/column numbers in byte compiler messages [Was: GNU is looking for Google Summer of Code Projects] Rocky Bernstein @ 2020-03-19 21:41 ` Stefan Monnier 2020-03-19 22:09 ` Stefan Monnier 2020-03-20 20:10 ` Alan Mackenzie 2 siblings, 2 replies; 29+ messages in thread From: Stefan Monnier @ 2020-03-19 21:41 UTC (permalink / raw) To: Alan Mackenzie; +Cc: Rocky Bernstein, emacs-devel > things like cconv.el here). More to the point, users' macros chew up and > spit out cons cells, and we have no control over them. So whilst we > could, with a lot of tedious effort, clean up our own software to > preserve cons cells (believe me, I've tried), this would fail in users' > macros. I think fat-cons cells are cheap to implement (with (hopefully) no performance impact when not used or weird semantic artifacts like the fat-symbol approach you tried) and can work 99.9% right in the long term with an incremental way to get there. Furthermore it matches the "usual" way to deal with this problem, so there's very little doubt about whether it can work or not. > Since then I've worked a fair bit on creating a "double" Emacs core, one > core being for normal use, the other for byte compiling. There's a fair > amount of work still to do on this, but I know how to do it. The problem > is that I have been discouraged by the prospect of having this solution > vetoed too, since it will make Emacs quite a bit bigger. I'd probably try to veto it, indeed. It might be a good solution in the short-term but it'd just slow down our progress in the long term. Stefan ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Correct line/column numbers in byte compiler messages [Was: GNU is looking for Google Summer of Code Projects] 2020-03-19 21:41 ` Stefan Monnier @ 2020-03-19 22:09 ` Stefan Monnier 2020-03-20 20:10 ` Alan Mackenzie 1 sibling, 0 replies; 29+ messages in thread From: Stefan Monnier @ 2020-03-19 22:09 UTC (permalink / raw) To: Alan Mackenzie; +Cc: Rocky Bernstein, emacs-devel > I think fat-cons cells are cheap to implement (with (hopefully) no > performance impact when not used or weird semantic artifacts like the > fat-symbol approach you tried) and can work 99.9% right in the long term > with an incremental way to get there. Reminds me that another project could be to provide something like Scheme's `syntax-rules` or `syntax-case`. These could be attractive on their own while also making it easier to correctly propagate source-level line-number information. Stefan ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Correct line/column numbers in byte compiler messages [Was: GNU is looking for Google Summer of Code Projects] 2020-03-19 21:41 ` Stefan Monnier 2020-03-19 22:09 ` Stefan Monnier @ 2020-03-20 20:10 ` Alan Mackenzie 2020-03-20 21:23 ` Rocky Bernstein ` (2 more replies) 1 sibling, 3 replies; 29+ messages in thread From: Alan Mackenzie @ 2020-03-20 20:10 UTC (permalink / raw) To: Stefan Monnier; +Cc: Rocky Bernstein, emacs-devel Hello, Stefan. On Thu, Mar 19, 2020 at 17:41:30 -0400, Stefan Monnier wrote: > > things like cconv.el here). More to the point, users' macros chew up and > > spit out cons cells, and we have no control over them. So whilst we > > could, with a lot of tedious effort, clean up our own software to > > preserve cons cells (believe me, I've tried), this would fail in users' > > macros. > I think fat-cons cells are cheap to implement (with (hopefully) no > performance impact when not used ..... They may be cheap to implement in themselves, but adapting the entire byte compiler and all our macros to the heavily restricted semantics they would impose would be an enormous job. I've tried something similar, and gave up in exhaustion. > or weird semantic artifacts like the fat-symbol approach you tried), Er, not "tried" but "implemented", please. The implementation was complete, and was capable of bootstrapping Emacs with correct positions for all the (then plentiful) warning messages. > and can work 99.9% right in the long term with an incremental way to > get there. Where does this 99.9% come from? How is this cons tracking you're proposing supposed to work, when there are an infinite number of occurrences of the likes of (cons (car form) (cdr form)) in our code? > Furthermore it matches the "usual" way to deal with this problem, so > there's very little doubt about whether it can work or not. Are you saying that this is how other Lisp compilers deal with source code positions? How do they deal with the difficult problem of user macros? 
Could you give me an example of a free Lisp system which works this way? I'd be interested in having a look at it. I think there's quite a bit of doubt as to whether this could work effectively in Emacs. The way to dispel this doubt is for Somebody (tm) to implement it. > > Since then I've worked a fair bit on creating a "double" Emacs core, > > one core being for normal use, the other for byte compiling. > > There's a fair amount of work still to do on this, but I know how to > > do it. The problem is that I have been discouraged by the prospect > > of having this solution vetoed too, since it will make Emacs quite a > > bit bigger. > I'd probably try to veto it, indeed. It might be a good solution in > the short-term but it'd just slow down our progress in the long term. Fixing bugs slows down our progress? To which the answer is to install the working solution pending the implementation of something better, after which it can be superseded. Somehow, even that strategy tends to get vetoed. > Stefan -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Correct line/column numbers in byte compiler messages [Was: GNU is looking for Google Summer of Code Projects] 2020-03-20 20:10 ` Alan Mackenzie @ 2020-03-20 21:23 ` Rocky Bernstein 2020-03-20 21:27 ` Clément Pit-Claudel 2020-03-20 21:30 ` Stefan Monnier 2 siblings, 0 replies; 29+ messages in thread From: Rocky Bernstein @ 2020-03-20 21:23 UTC (permalink / raw) To: Alan Mackenzie; +Cc: Stefan Monnier, emacs-devel
[-- Attachment #1: Type: text/plain, Size: 6080 bytes --]

Before I begin, as has been pointed out, let us be clear that the
discussion has changed. Originally I was interested in better call stack
and traceback information, a *run-time* thing, which I was proposing as a
Summer of Code project. The discussion is now about compiler locations
at *compile* time. So be it.

The problems, however, do have one thing in common: how to represent a
location.

Let me also correct one earlier correction, in which the "better"
location was construed to be a single line and column number. A better
way, I believe, to think of a location is as:

1. a *container*, where *container* is defined to be something,
2. an offset into that container, in some kind of "units" that are
   defined to be something, and
3. an optional length in those units. When the length isn't given, it is
   assumed to be one.

For example, if you are interested only in representing a line and
column number, one value, an offset, would do it.

Note that this abstraction works equally well for other kinds of things
like bytecode, where the offset would be the bytecode offset. Often a
contiguous sequence of bytecode maps to a contiguous sequence in the
source code. Of course that's not necessarily *always* the case, but
describing how to deal with that would wander too far from the proposal
that follows. But let me say again: if you just care about a single
bytecode instruction, set the length to 1 or leave out the length field.

I know this might not be satisfying to some, but here is an extremely
simple but accurate proposal that doesn't incur a lot of overhead and
can deal with a lot of generality.

A unit of compilation, I think, is a *function*. That is the container
part. Attach to the function its location information in some other way
(e.g. its container might be a file name if that is appropriate, or it
might be defined inside a macro...)

A function, before the byte compiler compiles it, is a kind of lambda,
which is a kind of S-expression. A location inside that could simply be
a tree node's preorder number. Or the preorder number and a number of
successor nodes in preorder traversal.

As with the simple-minded run-time error location proposal (when we have
a bytecode offset, mark that position in a disassembly), the same thing
can be done here: show the position or range of nodes in the
S-expression that you've got.

What if the bytecode compiler has done some wild and weird optimization
changes? Just show what S-exp you were working on and mark where you
were. I know for some or many it may not be satisfying, but it is the
honest truth and I'd rather have that than nothing or the wrong guess.

Having done this first step, the problem is divided a little bit, so
carry on: discuss and conquer.

A separate tool outside of the compiler proper can be written to take
this and, given pointers to where the source might be located, figure
out where in the source code that might be. Maybe pattern matching would
work, dunno, but let me not try to speculate too much.

Finally, in this proposal I am not suggesting changing the current
behavior: by default the additional precise geeky information might be
shown only in some sort of "super hacker" verbose compilation mode.

On Fri, Mar 20, 2020 at 4:10 PM Alan Mackenzie <acm@muc.de> wrote:

> Hello, Stefan.
>
> On Thu, Mar 19, 2020 at 17:41:30 -0400, Stefan Monnier wrote:
> > > things like cconv.el here).
More to the point, users' macros chew up > and > > > spit out cons cells, and we have no control over them. So whilst we > > > could, with a lot of tedious effort, clean up our own software to > > > preserve cons cells (believe me, I've tried), this would fail in users' > > > macros. > > > I think fat-cons cells are cheap to implement (with (hopefully) no > > performance impact when not used ..... > > They may be cheap to implement in themselves, but adapting the entire > byte compiler and all our macros to the heavily restricted semantics > they would impose would be an enormous job. I've tried something > similar, and gave up in exhaustion. > > > or weird semantic artifacts like the fat-symbol approach you tried), > > Er, not "tried" but "implemented", please. The implementation was > complete, and was capable of bootstrapping Emacs with correct positions > for all the (then plentiful) warning messages. > > > and can work 99.9% right in the long term with an incremental way to > > get there. > > Where does this 99.9% come from? How is this cons tracking you're > proposing supposed to work, when there are an infinite number of > occurrences of the likes of > > (cons (car form) (cdr form)) > > in our code? > > > Furthermore it matches the "usual" way to deal with this problem, so > > there's very little doubt about whether it can work or not. > > Are you saying that this is how other Lisp compilers deal with source > code positions? How do they deal with the difficult problem of user > macros? Could you give me an example of a free Lisp system which works > this way? I'd be interested in having a look at it. > > I think there's quite a bit of doubt as to whether this could work > effectively in Emacs. The way to dispel this doubt is for Somebody (tm) > to implement it. > > > > Since then I've worked a fair bit on creating a "double" Emacs core, > > > one core being for normal use, the other for byte compiling. 
> > > There's a fair amount of work still to do on this, but I know how to > > > do it. The problem is that I have been discouraged by the prospect > > > of having this solution vetoed too, since it will make Emacs quite a > > > bit bigger. > > > I'd probably try to veto it, indeed. It might be a good solution in > > the short-term but it'd just slow down our progress in the long term. > > Fixing bugs slows down our progress? > > To which the answer is to install the working solution pending the > implementation of something better, after which it can be superseded. > Somehow, even that strategy tends to get vetoed. > > > Stefan > > -- > Alan Mackenzie (Nuremberg, Germany). > [-- Attachment #2: Type: text/html, Size: 7257 bytes --] ^ permalink raw reply [flat|nested] 29+ messages in thread
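Rocky Bernstein's proposal in the message above addresses a position inside a function by the preorder number of a node in its S-expression, optionally with a count of successor nodes. The following is a minimal sketch of that numbering, not code from the thread or from Emacs. The names `sketch-preorder-nodes` and `sketch-nodes-at` are hypothetical, and it assumes that a cons counts as one node followed by the elements along its list spine, with everything else treated as a leaf.

```elisp
(require 'seq)

;; Hypothetical helper: not part of Emacs or of the proposal's code.
(defun sketch-preorder-nodes (form)
  "Return the subforms of FORM in preorder.
A cons counts as one node followed by the elements along its list
spine; any non-cons object is a leaf.  A dotted tail is ignored."
  (if (not (consp form))
      (list form)
    (let ((nodes (list form)))
      (while (consp form)
        (setq nodes (nconc nodes (sketch-preorder-nodes (pop form)))))
      nodes)))

;; Hypothetical helper: the (offset, length) part of a location.
(defun sketch-nodes-at (form n &optional len)
  "Return LEN nodes (default 1) of FORM, starting at preorder number N."
  (seq-subseq (sketch-preorder-nodes form) n (+ n (or len 1))))

;; Node 0 is the whole form, node 1 is `if', node 2 is (> x 0), ...
(sketch-nodes-at '(if (> x 0) (message "pos") (error "neg")) 6)
;; => ((message "pos"))
(sketch-nodes-at '(if (> x 0) (message "pos") (error "neg")) 9)
;; => ((error "neg"))
```

Under these assumptions a location is just the function (the container) plus one or two small integers, so it needs no change to cons cells or symbols. Whether the numbers still identify the right node after macro expansion and optimization is exactly the question the rest of the thread debates.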
* Re: Correct line/column numbers in byte compiler messages [Was: GNU is looking for Google Summer of Code Projects] 2020-03-20 20:10 ` Alan Mackenzie 2020-03-20 21:23 ` Rocky Bernstein @ 2020-03-20 21:27 ` Clément Pit-Claudel 2020-03-20 23:46 ` Stefan Monnier 2020-03-20 21:30 ` Stefan Monnier 2 siblings, 1 reply; 29+ messages in thread From: Clément Pit-Claudel @ 2020-03-20 21:27 UTC (permalink / raw) To: emacs-devel On 20/03/2020 16.10, Alan Mackenzie wrote: > Are you saying that this is how other Lisp compilers deal with > source code positions? How do they deal with the difficult problem > of user macros? Could you give me an example of a free Lisp system > which works this way? I'd be interested in having a look at it. Not sure if it counts as a Lisp compiler, but Racket does this; the "fat cons cells" are called syntax objects. See https://blog.racket-lang.org/2011/04/writing-syntax-case-macros.html for a good explanation, including this intro: > The main idea with Racket’s macro system (and with other syntax-case > systems) is that macros are syntax-to-syntax functions, just like the > case of defmacro, except that instead of raw S-expressions you’re > dealing with syntax objects. This becomes very noticeable when > identifiers are handled: instead of dealing with plain symbols, > you’re dealing with these syntax values (called “identifiers” in this > case) that are essentially a symbol and some opaque information that > represents the lexical scope for its source. In several syntax-case > systems this is the only difference from defmacro macros, but in the > Racket case this applies to everything — identifiers, numbers, other > immediate constants, and even function applications, etc — they are > all the same S-expression values that you’re used to, except wrapped > with additional information. 
Another thing that is unique to Racket > is the extra information: in addition to the opaque lexical context, > there is also source information and arbitrary properties (there are > also certificates, but that’s ignorable for this text). It would be worth checking more closely what Guile does. Its syntax-manipulating functions automatically propagate "source properties", but from reading https://www.gnu.org/software/guile/manual/html_node/Source-Properties.html it seems that it might use something similar to your approach? Clément. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: Correct line/column numbers in byte compiler messages [Was: GNU is looking for Google Summer of Code Projects] 2020-03-20 21:27 ` Clément Pit-Claudel @ 2020-03-20 23:46 ` Stefan Monnier 0 siblings, 0 replies; 29+ messages in thread From: Stefan Monnier @ 2020-03-20 23:46 UTC (permalink / raw) To: Clément Pit-Claudel; +Cc: emacs-devel

> properties", but from reading
> https://www.gnu.org/software/guile/manual/html_node/Source-Properties.html
> it seems that it might use something similar to your approach?

It seems that Guile does it along the lines of "fat cons cells" according to their example:

    scheme@(guile-user)> (xxx)
    <unnamed port>:4:1: In procedure module-lookup:
    <unnamed port>:4:1: Unbound variable: xxx

    scheme@(guile-user)> xxx
    ERROR: In procedure module-lookup:
    ERROR: Unbound variable: xxx

where only the code with a cons-cell gets location information. That's also what the earlier text says:

    The way that source properties are stored means that Guile cannot
    associate source properties with individual symbols, keywords,
    characters, booleans, or small integers.

Tho, IIUC it seems that rather than "fat cons cells" they may be using a hash-table indexed with the object (cons-cells or otherwise).

        Stefan

^ permalink raw reply [flat|nested] 29+ messages in thread
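The hash-table variant Stefan Monnier guesses at in the message above can be sketched in Emacs Lisp as a side table keyed by object identity. This is an illustration only, not Guile's implementation and not an existing Emacs facility: `sketch-source-props`, `sketch-set-source` and `sketch-source` are hypothetical names. It shows why such a table can hold a position for each cons cell but not for each occurrence of a symbol.

```elisp
;; Hypothetical side table; `eq' keys mean identity, and weak keys
;; let entries disappear once the annotated object is garbage.
(defvar sketch-source-props (make-hash-table :test 'eq :weakness 'key)
  "Map an object to its (LINE . COLUMN) source position.")

(defun sketch-set-source (obj line col)
  "Record that OBJ was read at LINE and COL; return OBJ."
  (puthash obj (cons line col) sketch-source-props)
  obj)

(defun sketch-source (obj)
  "Return the (LINE . COLUMN) recorded for OBJ, or nil."
  (gethash obj sketch-source-props))

;; Two textually identical calls are distinct cons cells, so each
;; keeps its own position.
(let ((call-1 (sketch-set-source (list 'xxx) 4 1))
      (call-2 (sketch-set-source (list 'xxx) 9 3)))
  (list (sketch-source call-1)         ; (4 . 1)
        (sketch-source call-2)         ; (9 . 3)
        (sketch-source (car call-1)))) ; nil, the symbol was never annotated
```

Every occurrence of `xxx` is the same interned symbol, so annotating the symbol itself would record one position for all of its uses. That matches the Guile behavior quoted above, where the bare `xxx` gets no location, and it is the gap that the "fat symbol" approach mentioned earlier in the thread was aimed at.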
* Re: Correct line/column numbers in byte compiler messages [Was: GNU is looking for Google Summer of Code Projects] 2020-03-20 20:10 ` Alan Mackenzie 2020-03-20 21:23 ` Rocky Bernstein 2020-03-20 21:27 ` Clément Pit-Claudel @ 2020-03-20 21:30 ` Stefan Monnier 2 siblings, 0 replies; 29+ messages in thread From: Stefan Monnier @ 2020-03-20 21:30 UTC (permalink / raw) To: Alan Mackenzie; +Cc: Rocky Bernstein, emacs-devel >> I think fat-cons cells are cheap to implement (with (hopefully) no >> performance impact when not used ..... > They may be cheap to implement in themselves, but adapting the entire > byte compiler and all our macros to the heavily restricted semantics > they would impose would be an enormous job. The idea is that you want to make it work acceptably even if only some of the cons-cells are fat. This way, as you adapt the existing code to pay attention/preserve fat-cons-cells, your location information gets more and more precise, but even before you've done this enormous job, you already get some of the benefit. > I've tried something similar, and gave up in exhaustion. If you want "exact" results, then you'll get tired long before getting there, yes. But it's not needed. > Where does this 99.9% come from? How is this cons tracking you're > proposing supposed to work, when there are an infinite number of > occurrences of the likes of > > (cons (car form) (cdr form)) > > in our code? This still preserves info inside the fat-cons-cells contained in (car form) and (cdr form), so it's not as bad as it looks. Of course, when such code is applied recursively on all sub-expressions (i.e. in a code-walker such as macroexpand-all, cconv, and byte-opt) then we lose all the info, so we do need to change those before we can benefit, but AFAICT those 3 are the only crucial ones (there are a few other code-walkers around, such as generator.el) and hopefully some of that rewrite can be made fairly mechanically. 
> Are you saying that this is how other Lisp compilers deal with source > code positions? How do they deal with the difficult problem of user > macros? Not sure about Common-Lisp, but Scheme systems deal with it by distinguishing "sexp" from "syntax objects" where syntax objects are basically sexps wrapped (recursively) within location wrappers. > I think there's quite a bit of doubt as to whether this could work > effectively in Emacs. I have no doubt that it can work. I am not sure it'll be acceptable, OTOH, because it will depend on the overhead it will impose on the execution of the byte-compiler. > The way to dispel this doubt is for Somebody (tm) to implement it. Exactly. > To which the answer is to install the working solution pending the > implementation of something better, after which it can be superseded. Ever heard of temporary hacks that end up permanent? Take for example the issue of .... oh, I don't know ... line numbers in error messages? ;-) To a large extent the reason we don't have better line-numbers right now is because of the hack we accepted some years ago, so now instead of working on "giving line-numbers in error messages", we're reduced to "improve the precision of line-numbers in error messages" which is not nearly as pressing an issue. Stefan ^ permalink raw reply [flat|nested] 29+ messages in thread
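Stefan Monnier's reply above argues that `(cons (car form) (cdr form))` drops only the position of the outer cell, while a recursive code walker drops all of them unless it is adapted. The sketch below illustrates that argument under a loud assumption: real fat cons cells do not exist in Emacs, so a hash table keyed by cell identity stands in for the position slot. All `sketch-` names are hypothetical.

```elisp
;; Assumption: this table stands in for the position slot of a
;; "fat cons cell"; it is not how Emacs stores anything today.
(defvar sketch-pos (make-hash-table :test 'eq)
  "Map a cons cell to a fake source position.")

(defun sketch-walk (form)
  "Rebuild FORM from fresh cells, as an unadapted code walker would."
  (if (consp form)
      (cons (sketch-walk (car form)) (sketch-walk (cdr form)))
    form))

(defun sketch-walk-keeping-pos (form)
  "Like `sketch-walk', but copy each cell's position to its replacement."
  (if (consp form)
      (let ((new (cons (sketch-walk-keeping-pos (car form))
                       (sketch-walk-keeping-pos (cdr form))))
            (pos (gethash form sketch-pos)))
        (when pos (puthash new pos sketch-pos))
        new)
    form))

(let* ((inner (list 'car 'x))
       (form  (list 'foo inner)))
  (puthash form 10 sketch-pos)
  (puthash inner 15 sketch-pos)
  (let ((rebuilt (cons (car form) (cdr form)))
        (walked  (sketch-walk form))
        (kept    (sketch-walk-keeping-pos form)))
    (list (gethash rebuilt sketch-pos)          ; nil, the outer cell is new
          (gethash (cadr rebuilt) sketch-pos)   ; 15, the inner cell is shared
          (gethash (cadr walked) sketch-pos)    ; nil, everything was copied
          (gethash kept sketch-pos)             ; 10
          (gethash (cadr kept) sketch-pos))))   ; 15
```

The one-level rebuild degrades gracefully because the sub-forms are shared, which is the incremental benefit claimed in the reply. The full walker is the case that needs the per-walker change, here reduced to copying the position when a cell is replaced.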