Thanks, this is an extremely worthwhile project.

When I built elisp-refs, I really missed having a reader that could
report the positions of sexps. I ended up using scan-sexps:
https://github.com/Wilfred/elisp-refs/blob/master/elisp-refs.el#L60-L98
and taking advantage of the fact that read moves point.

Solving this properly would open up lots of opportunities for better
elisp tools.

On 15 October 2017 at 01:17, John Williams wrote:
> Elisp is a fun language to work in, for the most part, but one thing I
> find very irritating compared to other languages is that there's no
> way to get a stack trace with line numbers. I'm wondering if others
> feel the same way and would be open to accepting a change to add
> better support for line numbers. Here's my plan:
>
> 1. Revise the reader to attach source references (i.e. filename, line
> number, and column number) to forms as they are read.
> 2. Update the byte compiler to preserve source references in compiled code.
> 3. Update the debugger to display source references in backtraces
> whenever possible.
> 4. Add a simple API for users to retrieve a stack trace suitable for
> writing to logs, etc. (There's already a stack trace API, but the
> information you can get from it isn't all that useful.)
> 5. Possibly add some facilities for macro authors to control the
> source refs in macro expansions. I'm not sure about that part because
> I believe most macros will propagate source information in a
> reasonable way simply by virtue of embedding their arguments in the
> expansions they generate.
>
> I already have a working proof of concept for the first part. What it
> does is attach a vector of (file name, line number, column number) to
> the head of each list as it is read. The information is "attached"
> using cons cells as keys in a weak-key hash table. I also added a
> little function to fetch data from the hash table so the
> representation is abstracted a little bit.
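
For readers following along, the weak-key table described in the quoted
paragraph above could look roughly like the sketch below. This is only
an illustration written at the Lisp level; the proof of concept itself
is not shown in this thread, and every name here (my-source-refs,
my-source-ref-put, my-source-ref-get) is hypothetical.

```elisp
;; Hypothetical names throughout; this is not the proof-of-concept's API.

(defvar my-source-refs (make-hash-table :test 'eq :weakness 'key)
  "Map the head cons cell of a list to a [FILE LINE COLUMN] vector.
Keys are held weakly, so an entry can be dropped once its form is
garbage collected.")

(defun my-source-ref-put (form file line column)
  "Attach a source reference to FORM if it is a cons cell; return FORM."
  (when (consp form)
    (puthash form (vector file line column) my-source-refs))
  form)

(defun my-source-ref-get (form)
  "Return the [FILE LINE COLUMN] vector attached to FORM, or nil."
  (gethash form my-source-refs))

;; Usage: only the exact cons cell that was registered has a reference.
(let ((form (list 'message "hi")))
  (my-source-ref-put form "foo.el" 12 2)
  (my-source-ref-get form))
```

Because the table uses `eq`, a structurally equal but separately
allocated list has no source reference, which seems to be the behaviour
you would want for forms read from different places.
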
>
> Here's my rationale for the engineering decisions I've made so far:
>
> - I'm using a hash table because the other alternatives I looked at
> involved changing the representation of (some) cons cells, which
> doesn't sound so bad until you start looking at all the
> performance-critical code paths that would need to change, and all the
> parts of Emacs (e.g. the garbage collector) where the low-level
> representation of cons cells is handled as a special case.
>
> - I'm storing the information in vectors because it seems like a
> reasonably efficient use of memory. Certainly better than a list. It
> would be easy enough to encode all the relevant information in a
> string, but then the reader would be spending time building strings
> that will need to be decoded later, and I'm not sure it would help
> anyway, because each string would be unique, whereas with a vector,
> the same string object can be used for every reference in a file.
> Adding a new primitive type would also be an option, but it hardly
> seems worth the complexity to save a couple of words per source ref
> when 99% of them will probably only be retained long enough to
> byte-compile the code.
>
> - I'm saving line and column numbers rather than just byte/character
> offsets, because that's what developers need, and if it wasn't saved
> in that format, displaying a stack trace would involve opening the
> original source code to compute that information from the file
> contents. If I dropped the column numbers I could store a source ref
> in a cons cell rather than a vector, but it seems like a shame to
> throw away that kind of information when it's so easy to collect. (I
> could even pack the line and column number into a single integer,
> since I don't think it would be a big deal if there was an overflow
> for an incredibly large file, or a file with very long lines, but
> again, that seems like unnecessary complexity to me.)
>
> - I'm only attaching information to lists because only lists can be
> function calls, and attaching information to things like symbols would
> be problematic because every occurrence of a given symbol is
> represented by the same Lisp object. Of course some lists aren't
> function calls, but attaching a source ref to every list is a lot
> simpler and more reliable than trying to guess which lists are
> ultimately going to become function calls.
>
> - I'm only attaching information to the head of each list purely as a
> memory-saving measure. I can't think of a scenario where you'd need a
> source reference for a list without having its head available, except
> maybe in the expansion of a macro that disassembles its arguments and
> puts them back together in a new list. If it's an issue in practice, I
> think a better solution would be for the macro expander to propagate
> source refs to every cons cell in a macro argument at the point where
> macro expansion takes place.
>
> Thoughts?
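
On the point about symbols versus lists: a quick way to see why an
identity-keyed table works for lists but not for symbols is to read the
same text twice and count how many distinct keys result. This is a
sketch only; my-count-distinct-refs is a made-up helper, and it uses a
plain (non-weak) table so that the count is not affected by garbage
collection.

```elisp
;; Hypothetical helper for illustration; not part of any proposed patch.

(defun my-count-distinct-refs (text)
  "Read TEXT twice, key a source ref on each result, and count the keys.
Return 1 if both reads produce the same Lisp object, otherwise 2."
  (let ((refs (make-hash-table :test 'eq)))
    ;; The file name and positions are placeholder values.
    (puthash (read text) (vector "a.el" 1 0) refs)
    (puthash (read text) (vector "a.el" 7 4) refs)
    (hash-table-count refs)))

;; An interned symbol is one shared object, so the second ref
;; overwrites the first.
(my-count-distinct-refs "foo")      ; => 1

;; Each read of a list allocates fresh cons cells, so both refs survive.
(my-count-distinct-refs "(foo 1)")  ; => 2
```
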