unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
From: John Williams <jrw@pobox.com>
To: Helmut Eller <eller.helmut@gmail.com>
Cc: emacs-devel@gnu.org
Subject: Re: Proposal: stack traces with line numbers
Date: Wed, 18 Oct 2017 08:00:42 -0700	[thread overview]
Message-ID: <CAEdRJLCJE8E2gcgbiQnKnnNM=KeSdE_0P8nkuCZ_65H7yny2PA@mail.gmail.com> (raw)
In-Reply-To: <CAEdRJLBZ5QaJF+b05GdBrZK80ijQcaHW+FSAps0ME0iQ+5qhBA@mail.gmail.com>

I just remembered something else about weak hash maps that I forgot to
mention earlier. Most people who try to use weak references eventually
get burned by surprising nondeterministic behavior, but the design I'm
proposing has an interesting property I first learned about in the
context of the JavaScript WeakMap type: as long as the map's :test is
'eq and :weakness is 'key, and the map is only ever accessed using
gethash and puthash, nondeterministic behavior is impossible to
observe, because a given entry becomes eligible for garbage collection
precisely when it is no longer accessible using gethash. This property
allows you to effectively add a new field to certain instances of an
existing type without altering the type itself.

On Mon, Oct 16, 2017 at 3:43 PM, John Williams <jrw@pobox.com> wrote:
> On Sun, Oct 15, 2017 at 3:01 AM, Helmut Eller <eller.helmut@gmail.com> wrote:
>> On Sat, Oct 14 2017, John Williams wrote:
>>> The information is "attached"
>>> using cons cells as keys in a weak-key hash table.  [...]
>>
>> Unless you care about interpreted code, a non-weak hash-table should be
>> enough.  I think this hash table should work similar to
>> read-symbol-positions-list.
>
> What I had in mind was a single global hashtable, because that way
> it's easy to make it look as if the source refs are physically part of
> the annotated cons cells, and users of the API don't need to be aware
> that a supplementary data structure even exists. But of course using a
> global hashtable with strong keys would create a huge space leak in
> the reader.
>
> Is there any particular disadvantage to using weak keys?
>
>>> - I'm storing the information in vectors because it seems like a
>>> reasonably efficient use of memory. [...]
>>
>> It's debatable whether a [file line column] vector is an efficent
>> representation.  E.g. all lists in a source form come from the same file
>> (or buffer or string) so storing the same filename many times seems
>> redundant.  It might also be reasonable to use different representations
>> in the debug info than for the data-structures used by the reader or
>> compiler.
>
> The file name would be a single string object shared by every ref in a
> given file (or nil when there is no file), so we'd only be saving a
> few words per source ref (one for the string itself, plus one or two
> saved by using a cons cell instead of a two-element vector.)
>
>>> - I'm saving line and column numbers rather that just byte/character
>>> offsets [...]
>>
>> Line/column pairs have the (minor) advantage that line numbers have a
>> higher porbability to stay the same after small edits to the source.
>> But other than that, it seems to me that character offsets encode the
>> same information more compactly.
>
> It seems like we may be talking about different things. I'm speaking
> strictly about the in-memory representation produced by the reader,
> which will be quickly garbage-collected in most cases (assuming most
> elisp code is compiled, either as part of the Emacs build processes,
> or by the package manager). I haven't even thought about how to
> represent the same information in bytecode, but I assume it will be
> quite different, and more focused on compactness.
>
>>> - I'm only attaching information to the head of each list purely as a
>>> memory-saving measure. I can't think of scenario where you'd need a
>>> source reference for a list without having its head available, except
>>> maybe in the expansion of a macro that disassembles its arguments and
>>> puts them back together in a new list.  If it's an issue in practice,
>>
>> In Lisp almost everything is a macro, so I bet that this is an issue.
>
> Maybe. From what I can tell, most function calls in macro arguments
> are copied directly into the expansion, so no important information
> would be lost in the expansion process, except for a few outliers like
> iter-defun, which appears to completely re-assemble the code it's
> given. Attaching the same information to every cons cell wouldn't be
> difficult, though. Every cell in a given list could share the same
> source ref, so the main overhead would be the extra hash table
> entries. My guess is that doing so would roughly double or even triple
> the average memory footprint of a cons cell produced by the reader,
> but I don't think that would be a problem unless you're trying to run
> Emacs on an embedded platform, and it's a feature that could be easily
> compiled out or disabled at runtime if necessary.
>
>>> I think a better solution would be for the macro expander to propagate
>>> source refs to every cons cell in a macro argument at the point where
>>> macro expansion takes place.
>>
>> It's clearly desirable that source positions are propagated
>> automatically as often as possible.  That job will be easier if the
>> reader records more information.
>
> It would definitely be simpler. I'll defer to others' opinions
> regarding the relative merits of each approach.
>
>> So, I think the reader should, at least optionally, also record
>> positions of every cons cell not just the first in a list.  Also, in
>> addition to the start position the reader could/should also record the
>> end position.
>
> That would not be difficult to implement.



  parent reply	other threads:[~2017-10-18 15:00 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-15  0:17 Proposal: stack traces with line numbers John Williams
2017-10-15  1:54 ` Daniele Nicolodi
2017-10-15  2:42 ` raman
2017-10-15  3:20   ` Noam Postavsky
2017-10-15  3:40 ` Robert Weiner
2017-10-15 10:01 ` Helmut Eller
2017-10-15 16:20   ` Stefan Monnier
2017-10-16 22:43   ` John Williams
2017-10-17  0:00     ` Helmut Eller
2017-10-18 15:00     ` John Williams [this message]
2017-10-16  1:53 ` Richard Stallman
2017-10-16 21:51 ` Wilfred Hughes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAEdRJLCJE8E2gcgbiQnKnnNM=KeSdE_0P8nkuCZ_65H7yny2PA@mail.gmail.com' \
    --to=jrw@pobox.com \
    --cc=eller.helmut@gmail.com \
    --cc=emacs-devel@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).