Thanks, this is an extremely worthwhile project.

When I built elisp-refs, I really missed having a reader that could report
the positions of sexps. I ended up using scan-sexps and taking advantage of
the fact that read moves point:
https://github.com/Wilfred/elisp-refs/blob/master/elisp-refs.el#L60-L98
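
Roughly, the trick looks like the sketch below. This is a simplified
illustration rather than the actual elisp-refs code: the function name
my/read-with-positions is made up for this email, and it only records
top-level forms. It calls read on the buffer, notes where point ends up,
and then asks scan-sexps to walk back one sexp to find the start.

```elisp
;; Hypothetical helper for illustration; not part of elisp-refs.
(defun my/read-with-positions (string)
  "Read every top-level form in STRING.
Return a list of (FORM START END), where START and END are buffer
positions in a temporary buffer holding STRING."
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    (with-syntax-table emacs-lisp-mode-syntax-table
      (let ((parse-sexp-ignore-comments t)
            forms)
        (condition-case nil
            (while t
              ;; `read' leaves point just after the form it consumed.
              (let* ((form (read (current-buffer)))
                     (end (point))
                     ;; Walk back over one balanced expression to find
                     ;; where that form started.
                     (start (scan-sexps end -1)))
                (push (list form start end) forms)))
          ;; `read' signals `end-of-file' once no forms remain (and also
          ;; on an unterminated form, which this sketch treats the same).
          (end-of-file nil))
        (nreverse forms)))))
```

For example, (my/read-with-positions "(foo 1)\n(bar)") returns
(((foo 1) 1 8) ((bar) 9 14)). Getting positions for nested forms needs
more bookkeeping than this, which is why reader support would be so much
nicer.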

Solving this properly would open up lots of opportunities for better elisp
tools.
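
For anyone following along, the weak-key hash table idea described in the
quoted proposal can be approximated in plain elisp. This is only a sketch
under my own assumptions: the real proof of concept lives in the reader
itself, and the names my/source-refs, my/attach-source-ref and
my/source-ref are invented here.

```elisp
;; Hypothetical names throughout; this only mimics the described design.
(defvar my/source-refs (make-hash-table :test 'eq :weakness 'key)
  "Map from cons cells to [FILE LINE COLUMN] vectors.
Keys are held weakly, so an entry disappears once its cons cell is
garbage collected.")

(defun my/attach-source-ref (cell file line column)
  "Record that CELL was read from FILE at LINE and COLUMN.  Return CELL."
  (puthash cell (vector file line column) my/source-refs)
  cell)

(defun my/source-ref (cell)
  "Return the [FILE LINE COLUMN] vector attached to CELL, or nil."
  (gethash cell my/source-refs))
```

Because the table compares keys with eq, a reference follows one
particular cons cell: a freshly built list that merely looks the same has
no source ref attached.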

On 15 October 2017 at 01:17, John Williams <jrw@pobox.com> wrote:

> Elisp is a fun language to work in, for the most part, but one thing I
> find very irritating compared to other languages is that there's no
> way to get a stack trace with line numbers. I'm wondering if others
> feel the same way and would be open to accepting a change to add
> better support for line numbers. Here's my plan:
>
> 1. Revise the reader to attach source references (i.e. filename, line
> number, and column number) to forms as they are read.
> 2. Update the byte compiler to preserve source references in compiled code.
> 3. Update the debugger to display source references in backtraces
> whenever possible.
> 4. Add a simple API for users to retrieve a stack trace suitable for
> writing to logs, etc. (There's already a stack trace API, but the
> information you can get from it isn't all that useful.)
> 5. Possibly add some facilities for macro authors to control the
> source refs in macro expansions. I'm not sure about that part because
> I believe most macros will propagate source information in a
> reasonable way simply by virtue of embedding their arguments in the
> expansions they generate.
>
> I already have a working proof of concept for the first part. What it
> does is attach a vector of (file name, line number, column number) to
> the head of each list as it is read. The information is "attached"
> using cons cells as keys in a weak-key hash table. I also added a
> little function to fetch data from the hash table so the
> representation is abstracted a little bit.
>
> Here's my rationale for the engineering decisions I've made so far:
>
> - I'm using a hash table because the other alternatives I looked at
> involved changing the representation of (some) cons cells, which
> doesn't sound so bad until you start looking at all the
> performance-critical code paths that would need to change, and all the
> parts of Emacs (e.g. the garbage collector) where the low-level
> representation of cons cells is handled as a special case.
>
> - I'm storing the information in vectors because it seems like a
> reasonably efficient use of memory. Certainly better than a list. It
> would be easy enough to encode all the relevant information in a
> string, but then the reader would be spending time building strings
> that will need to be decoded later, and I'm not sure it would help
> anyway, because each string would be unique, whereas with a vector,
> the same string object can be used for every reference in a file.
> Adding a new primitive type would also be an option, but it hardly
> seems worth the complexity to save a couple of words per source ref
> when 99% of them will probably only be retained long enough to
> byte-compile the code.
>
> - I'm saving line and column numbers rather than just byte/character
> offsets, because that's what developers need, and if it wasn't saved
> in that format, displaying a stack trace would involve opening the
> original source code to compute that information from the file
> contents. If I dropped the column numbers I could store a source ref
> in a cons cell rather than a vector, but it seems like a shame to
> throw away that kind of information when it's so easy to collect. (I
> could even pack the line and column number into a single integer,
> since I don't think it would be a big deal if there was an overflow
> for an incredibly large file, or a file with very long lines, but
> again, that seems like unnecessary complexity to me.)
>
> - I'm only attaching information to lists because only lists can be
> function calls, and attaching information to things like symbols would
> be problematic because every occurrence of a given symbol is
> represented by the same Lisp object. Of course some lists aren't
> function calls, but attaching a source ref to every list is a lot
> simpler and more reliable than trying to guess which lists are
> ultimately going to become function calls.
>
> - I'm only attaching information to the head of each list purely as a
> memory-saving measure. I can't think of a scenario where you'd need a
> source reference for a list without having its head available, except
> maybe in the expansion of a macro that disassembles its arguments and
> puts them back together in a new list. If it's an issue in practice, I
> think a better solution would be for the macro expander to propagate
> source refs to every cons cell in a macro argument at the point where
> macro expansion takes place.
>
> Thoughts?
>
>