Re: Update 1 on Bytecode Offset tracking

From: Stefan Monnier <monnier@iro.umontreal.ca>
To: Zach Shaftel <zshaftel@gmail.com>
Cc: emacs-devel@gnu.org
Subject: Re: Update 1 on Bytecode Offset tracking
Date: Fri, 17 Jul 2020 18:08:34 -0400
Message-ID: <jwvd04tltmh.fsf-monnier+emacs@gnu.org>
In-Reply-To: <87wo31sxmu.fsf@gmail.com> (Zach Shaftel's message of "Fri, 17 Jul 2020 16:19:05 -0400")

>> While waiting for the paperwork to go through, you can prepare the patch
>> and we can start discussing it.
> Sure, does that just mean the 'git format-patch -1' emailed to
> bug-gnu-emacs@gnu.org, as mentioned in CONTRIBUTE? If that's the gist of
> it then I can do that shortly.

Pretty much, yes.  You can add some text to give extra background on the
design, the motivation for some of the choices, or ask questions about
particular details, but that's not indispensable.

You can also send an email that just refers to a branch in emacs.git.
But for the discussion to work well, it's usually better to make sure
this branch is "small" so people aren't discouraged from reading a large
diff ;-)

> I was able to speed that function up to the point that it's about the
> same as one using `read`. Those functions are doing a whole lot of IO
> (reading and writing hundreds of files) so it's not really a fair
> comparison. I've done more tests with functions that just read a whole
> buffer, collecting what they read into a list. In a 9600 line file with
> just over 500 sexps, the `read` version took about .02-.04 seconds
> (according to `benchmark-run-compiled`), and the `source-map-read`
> version took ~.08 seconds when it didn't GC, but unlike with `read` it
> did cause a GC 10-20% of the time.

IME when the time is in the sub-second range the measurements are very
imprecise, so better measure the time to repeat the same `read` N times
so the total time is a few seconds (and since it's the same `read`,
it won't suffer from extra IO overhead).
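
A minimal sketch of that kind of measurement, assuming the text to read is
already in memory.  The names `my-read-all` and `my-bench-read` are
hypothetical helpers made up for this illustration; only `read` and
`benchmark-run-compiled` come from Emacs itself:

```elisp
(require 'benchmark)

(defun my-read-all ()
  "Read every sexp in the current buffer; return them as a list."
  (goto-char (point-min))
  (let (forms)
    ;; `read' signals `end-of-file' once only whitespace is left.
    (condition-case nil
        (while t
          (push (read (current-buffer)) forms))
      (end-of-file nil))
    (nreverse forms)))

(defun my-bench-read (text reps)
  "Time REPS passes of `my-read-all' over TEXT.
Return (ELAPSED GC-COUNT GC-ELAPSED), as `benchmark-run-compiled' does."
  (with-temp-buffer
    ;; The buffer is filled once, so no file IO happens inside the timing.
    (insert text)
    (benchmark-run-compiled reps (my-read-all))))
```

Raising REPS until the elapsed time reaches a few seconds should make the
comparison between two readers less sensitive to noise, and the GC count in
the result shows how much of the total is collection rather than reading.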

>> For macros, OTOH, it's really fundamentally hard (or impossible, in
>> general).
> Helmut Eller mentioned before that most macros do use at least some of
> the original code in their expansion.

We can definitely hope to use some heuristics that will preserve "most"
source info for "most" existing macros, yes.
But it's still a fundamentally impossible problem in general ;-)
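
One way to see why such heuristics can work for many macros: with a
plain `defmacro`, subforms passed through unchanged are the very same
objects in the expansion, so location info keyed on those conses would
survive.  A small sketch, where `my-when`, `foo-p` and `bar` are
hypothetical names used only for illustration:

```elisp
;; A bare-sexp macro in the style of the standard `when'.
(defmacro my-when (test &rest body)
  `(if ,test (progn ,@body)))

(defun my-test-subform-preserved-p ()
  "Return non-nil if the TEST cons of a `my-when' call survives expansion."
  (let* ((form (list 'my-when (list 'foo-p 'x) (list 'bar 'x)))
         (expansion (macroexpand-1 form)))
    ;; (nth 1 ...) is the TEST position in both the call and the `if'.
    (eq (nth 1 expansion) (nth 1 form))))
```

The `if` and `progn` conses, on the other hand, are freshly built by the
macro and have no counterpart in the source, which is where any heuristic
has to start guessing.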

>> We could/should introduce some new way to define macros which
>> knows about "source code annotated with locations".
> I've wondered about this too but don't know what the right approach
> would be.

The first step is to define a `defmacro2` which works like `defmacro`
but is defined to take as arguments (and to return) annotated-sexps
instead of "bare sexps".  It'll be less convenient to use, but

In Scheme "annotated sexps" are called "syntax objects".
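
To make "annotated sexp" concrete, here is one possible representation,
purely as an assumption for the sake of discussion (nothing like `my-stx`
exists in Emacs, and a real design may look quite different): each node
pairs a datum with a source position, and a list node holds annotated
children.

```elisp
(require 'cl-lib)

;; Hypothetical: DATUM is an atom or a proper list of `my-stx' nodes,
;; POS is the buffer position where the node was read.
(cl-defstruct (my-stx (:constructor my-stx-make (datum pos)))
  datum pos)

(defun my-stx-car (stx)
  "Return the first annotated child of the list node STX."
  (car (my-stx-datum stx)))

(defun my-stx-strip (stx)
  "Return the bare sexp for STX, dropping all annotations."
  (let ((d (my-stx-datum stx)))
    (if (consp d)
        (mapcar #'my-stx-strip d)
      d)))
```

With something like this, `my-stx-car` plays the role that `car` plays for
bare sexps, and `my-stx-strip` recovers what today's `defmacro` would see.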

> I doubt anyone would want to use something like macro-cons/list/append
> etc. functions,

Scheme avoids the problem by defining additional higher-level layers,
where macros are defined in a more restrictive way using templates, so
for most macros the programmer doesn't need to care very much about
the difference between bare sexps and syntax objects.

The main motivation for it was hygiene (the framework takes care of
adding the needed `gensym`s where applicable) rather than tracking
source-location, but fundamentally the issue is the same: an AST node is
not just some random sexp.

IOW "code and data aren't quite the same, after all" ;-)

See for example `syntax-case` https://www.gnu.org/software/guile/manual/html_node/Syntax-Case.html
Note that Scheme uses the #' notation for syntax objects.  Adapting the
example for `when` to an Elisp syntax could look like:

    (defmacro2 when (form)
      (elisp-case form
        ((_ test e e* ...) (elisp (if test (progn e e* ...))))))

[ Where I used `elisp` instead of Scheme's `syntax` since we already use
  the prefix "syntax-" for things related to syntax-tables.  ]

Notice how it's `elisp-case` which extracts `test`, `e`, and `e*` and
then it's `elisp` which builds the new chunk of code, so all the
replacement of `car` with `elisp-car` can be hidden within the definition
of `elisp-case` and `elisp`.
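
For comparison, the same pattern-plus-template shape can be approximated
on bare sexps with today's `pcase` and backquote.  This is only an
analogy: it is not the hypothetical `elisp-case`, it tracks no source
locations, and `my-when-expander` is a name made up for this sketch.

```elisp
(require 'pcase)

(defun my-when-expander (form)
  "Expand a bare-sexp FORM of the shape (_ TEST E E*...) into an `if'."
  (pcase form
    ;; The pattern destructures FORM; the template rebuilds the code.
    (`(,_ ,test ,e . ,e*)
     `(if ,test (progn ,e ,@e*)))))
```

Here `(my-when-expander '(when (p) a b))` yields `(if (p) (progn a b))`.
The destructuring and the rebuilding are both confined to the pattern and
the template, which is what would let an annotated-sexp variant swap in
different accessors without changing how the macro is written.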

>> There's a lot of work on Scheme macros we could leverage for that.
> Interesting, so far I've had some difficulty finding documentation about
> how other Lisps track source locations.

It's not really discussed, but the distinction between "sexp" and
"syntax object" is the key.  It's largely not discussed because Scheme
macros have never officially included the equivalent of `defmacro`
operating on raw sexps, so they've never really had to deal with the
issue (tho Gambit does provide a `define-macro` which operates like our
`defmacro` but it's rarely used so Gambit just punts on the
source-location issue in that case).

        Stefan
