From mboxrd@z Thu Jan 1 00:00:00 1970
From: Wilfred Hughes
Newsgroups: gmane.emacs.devel
Subject: Re: Proposal: stack traces with line numbers
Date: Mon, 16 Oct 2017 22:51:13 +0100
To: John Williams
Cc: emacs-devel
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary="94eb2c04382694c333055bb10204"
List-Id: "Emacs development discussions."
Xref: news.gmane.org gmane.emacs.devel:219592

--94eb2c04382694c333055bb10204
Content-Type: text/plain; charset="UTF-8"

Thanks, this is an extremely worthwhile project.

When I built elisp-refs, I really missed having a reader that could
report the positions of sexps. I ended up using scan-sexps:
https://github.com/Wilfred/elisp-refs/blob/master/elisp-refs.el#L60-L98
and taking advantage of the fact that read moves point.
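
For anyone unfamiliar with the trick, here is a minimal sketch of the
idea. It is not the actual elisp-refs code, and the function name is
made up for illustration. It assumes the current buffer uses the Emacs
Lisp syntax table, and it only records top-level forms. Forms written
with a leading quote or backquote may need extra handling, which I have
left out.

```elisp
(defun my-read-forms-with-positions ()
  "Return a list of (FORM START END) for each top-level form after point.
Hypothetical helper for illustration; not part of Emacs or elisp-refs."
  (let (forms)
    (condition-case nil
        (while t
          ;; `read' on a buffer parses one form and leaves point after it.
          (let* ((form (read (current-buffer)))
                 (end (point))
                 ;; Scan back over one sexp to find where the form began.
                 (start (scan-sexps end -1)))
            (push (list form start end) forms)))
      ;; `read' signals `end-of-file' once no forms remain.
      (end-of-file nil))
    (nreverse forms)))

;; Example use in a scratch buffer:
(with-temp-buffer
  (emacs-lisp-mode)
  (insert "(foo 1)\n(bar (baz))")
  (goto-char (point-min))
  (my-read-forms-with-positions))
;; => (((foo 1) 1 8) ((bar (baz)) 9 20))
```

This works, but it means parsing everything twice and it tells you
nothing about the positions of nested forms unless you recurse by hand.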

Solving this properly would open up lots of opportunities for better
elisp tools.

On 15 October 2017 at 01:17, John Williams <jrw@pobox.com> wrote:
> Elisp is a fun language to work in, for the most part, but one thing I
> find very irritating compared to other languages is that there's no
> way to get a stack trace with line numbers. I'm wondering if others
> feel the same way and would be open to accepting a change to add
> better support for line numbers. Here's my plan:
>
> 1. Revise the reader to attach source references (i.e. filename, line
> number, and column number) to forms as they are read.
> 2. Update the byte compiler to preserve source references in compiled code.
> 3. Update the debugger to display source references in backtraces
> whenever possible.
> 4. Add a simple API for users to retrieve a stack trace suitable for
> writing to logs, etc. (There's already a stack trace API, but the
> information you can get from it isn't all that useful.)
> 5. Possibly add some facilities for macro authors to control the
> source refs in macro expansions. I'm not sure about that part because
> I believe most macros will propagate source information in a
> reasonable way simply by virtue of embedding their arguments in the
> expansions they generate.
>
> I already have a working proof of concept for the first part. What it
> does is attach a vector of (file name, line number, column number) to
> the head of each list as it is read. The information is "attached"
> using cons cells as keys in a weak-key hash table. I also added a
> little function to fetch data from the hash table so the
> representation is abstracted a little bit.
>
> Here's my rationale for the engineering decisions I've made so far:
>
> - I'm using a hash table because the other alternatives I looked at
> involved changing the representation of (some) cons cells, which
> doesn't sound so bad until you start looking at all the
> performance-critical code paths that would need to change, and all the
> parts of Emacs (e.g. the garbage collector) where the low-level
> representation of cons cells is handled as a special case.
>
> - I'm storing the information in vectors because it seems like a
> reasonably efficient use of memory. Certainly better than a list. It
> would be easy enough to encode all the relevant information in a
> string, but then the reader would be spending time building strings
> that will need to be decoded later, and I'm not sure it would help
> anyway, because each string would be unique, whereas with a vector,
> the same string object can be used for every reference in a file.
> Adding a new primitive type would also be an option, but it hardly
> seems worth the complexity to save a couple of words per source ref
> when 99% of them will probably only be retained long enough to
> byte-compile the code.
>
> - I'm saving line and column numbers rather than just byte/character
> offsets, because that's what developers need, and if it wasn't saved
> in that format, displaying a stack trace would involve opening the
> original source code to compute that information from the file
> contents. If I dropped the column numbers I could store a source ref
> in a cons cell rather than a vector, but it seems like a shame to
> throw away that kind of information when it's so easy to collect. (I
> could even pack the line and column number into a single integer,
> since I don't think it would be a big deal if there was an overflow
> for an incredibly large file, or a file with very long lines, but
> again, that seems like unnecessary complexity to me.)
>
> - I'm only attaching information to lists because only lists can be
> function calls, and attaching information to things like symbols would
> be problematic because every occurrence of a given symbol is
> represented by the same Lisp object. Of course some lists aren't
> function calls, but attaching a source ref to every list is a lot
> simpler and more reliable than trying to guess which lists are
> ultimately going to become function calls.
>
> - I'm only attaching information to the head of each list purely as a
> memory-saving measure. I can't think of a scenario where you'd need a
> source reference for a list without having its head available, except
> maybe in the expansion of a macro that disassembles its arguments and
> puts them back together in a new list. If it's an issue in practice, I
> think a better solution would be for the macro expander to propagate
> source refs to every cons cell in a macro argument at the point where
> macro expansion takes place.
>
> Thoughts?
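
To make the proof of concept concrete for other readers, the
bookkeeping described above could look roughly like this at the Lisp
level. This is only a sketch under my own assumptions: every name in it
is hypothetical, nothing here exists in Emacs today, and the real patch
presumably does the attaching inside the reader itself rather than in
Lisp.

```elisp
;; All names below are hypothetical and exist only for illustration.
(defvar my-source-refs (make-hash-table :test 'eq :weakness 'key)
  "Map the head cons cell of a list to a [FILE LINE COLUMN] vector.
Keys are held weakly, so an entry goes away once its list is collected.")

(defun my-attach-source-ref (form file line column)
  "Record that FORM was read from FILE at LINE and COLUMN; return FORM.
Only conses are recorded, because every occurrence of a symbol is the
same Lisp object and so cannot carry a position of its own."
  (when (consp form)
    ;; The vector stores a reference to FILE, so one string object can be
    ;; shared by every source ref that comes from the same file.
    (puthash form (vector file line column) my-source-refs))
  form)

(defun my-source-ref (form)
  "Return the [FILE LINE COLUMN] vector recorded for FORM, or nil."
  (gethash form my-source-refs))

;; Example use:
(let ((form (my-attach-source-ref (list 'message "hi") "foo.el" 12 2)))
  (my-source-ref form))
;; => ["foo.el" 12 2]
```

The `eq' test matters here: two lists that merely look alike are
different cons cells and keep separate source refs.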


--94eb2c04382694c333055bb10204--