Re: Update 1 on Bytecode Offset tracking

From: Zach Shaftel <zshaftel@gmail.com>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: emacs-devel@gnu.org
Subject: Re: Update 1 on Bytecode Offset tracking
Date: Thu, 16 Jul 2020 18:45:00 -0400
Message-ID: <87blkfoz9v.fsf@gmail.com>
In-Reply-To: <jwvpn8wt9w5.fsf-monnier+emacs@gnu.org>

Hi Stefan,

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> The second branch saves the offset only before a call. Therefore, the
>> traceback is accurate for all of the functions other than the current
>> one, but the current one is not accurate if the error happens in
>> a byte op.
>
> IIUC this has a negligible performance impact.  The info it provides is
> not 100% accurate, but I think it's a "sweet spot": it does provide the
> byte-offset info and is cheap enough to be acceptable into `master` with
> no real downside.

Great! I just followed up on my copyright assignment as I still haven't
finished that process. I don't know whether this could be exempt or if
Rocky's assignment is sufficient, but hopefully I will hear back from
copyright-clerk soon.

> I'd look at it as a "step" along the way: subsequent steps can be to
> make use of that info, or to improve the accuracy of that info.

Absolutely, it would be great to have that in place as a basis for
further improvement.
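
To make the second branch's trade-off concrete, here is a minimal sketch of
the idea on a toy stack machine. Nothing in it is taken from the branch or
from bytecode.c: the opcodes, the `toy_` names and the frame layout are all
hypothetical. The only point is that the offset is stored in the frame at
call ops and nowhere else.

```c
#include <assert.h>

/* Hypothetical opcodes for a toy stack machine (not Emacs's bytecodes). */
enum { OP_PUSH, OP_DIV, OP_CALL, OP_RETURN };

struct toy_function {
    const char *name;
    const unsigned char *code;
};

/* One backtrace entry: the function and the last offset recorded for it. */
struct toy_frame {
    const char *name;
    int saved_offset;            /* -1 until a call op stores an offset */
};

#define TOY_MAX_DEPTH 8
#define TOY_STACK_SIZE 16

static struct toy_frame toy_backtrace[TOY_MAX_DEPTH];
static int toy_depth;

/* Runs table[index].  Returns 0 and stores the value in *result on success.
   Returns -1 on error, leaving toy_backtrace intact for inspection. */
static int toy_exec(const struct toy_function *table, int index, int *result)
{
    const struct toy_function *fn = &table[index];
    struct toy_frame *frame;
    int stack[TOY_STACK_SIZE];
    int sp = 0, pc = 0;

    assert(result != 0);
    if (toy_depth >= TOY_MAX_DEPTH)
        return -1;
    frame = &toy_backtrace[toy_depth++];
    frame->name = fn->name;
    frame->saved_offset = -1;

    for (;;) {
        int op_offset = pc;      /* offset of the op about to run */
        switch (fn->code[pc++]) {
        case OP_PUSH:
            if (sp >= TOY_STACK_SIZE)
                return -1;
            stack[sp++] = fn->code[pc++];
            break;
        case OP_DIV:
            /* No store here: an error in this op reports a stale offset. */
            if (sp < 2 || stack[sp - 1] == 0)
                return -1;
            stack[sp - 2] /= stack[sp - 1];
            sp--;
            break;
        case OP_CALL: {
            int callee = fn->code[pc++];
            int value;
            frame->saved_offset = op_offset;   /* the only store */
            if (sp >= TOY_STACK_SIZE || toy_exec(table, callee, &value) != 0)
                return -1;
            stack[sp++] = value;
            break;
        }
        case OP_RETURN:
            if (sp < 1)
                return -1;
            *result = stack[--sp];
            toy_depth--;
            return 0;
        default:
            return -1;
        }
    }
}
```

For a function whose code is PUSH 8, CALL 1, DIV, RETURN, the call sits at
offset 2 and the division at offset 4. If the callee signals an error, the
caller's frame correctly reports 2. If the callee returns 0 and the division
then fails, the frame still reports 2, which is the inaccuracy described
above for the current function.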

>> The third branch bypasses invoking Ffuncall from within
>> exec_byte_code, and instead does essentially the same thing that
>> Ffuncall does, right in the Bcall ops.
>
> This would be useful in its own right.
> So I suggest you try and get this code into shape for `master` as well.

I will definitely continue work on this.

> I expect this will tend to suffer from some amount of code duplication.
> Maybe we can avoid it via refactoring, or maybe by "clever" macro
> tricks, but if the speedup is important enough, we can probably live
> with some amount of duplication.

That seems to be the case. I'll keep looking to see if there's any
low-hanging fruit in terms of splitting up the funcall logic without
slowing things down. More testing is necessary, but if a moderate chunk of
duplicated code is acceptable then there may not be as much work needed
on that branch as I had thought.
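
The shape of the duplication being discussed can be sketched roughly as
follows. This is a toy model, not the structure of Ffuncall or of the Bcall
ops: `toy_callable`, `toy_funcall` and `toy_call_op_inlined` are hypothetical
names, and a "bytecode" function is reduced to adding a constant. It only
shows that the call op repeats the dispatch the generic entry point already
contains, in exchange for skipping that entry point.

```c
#include <assert.h>

enum toy_kind { TOY_PRIMITIVE, TOY_BYTECODE };

struct toy_callable {
    enum toy_kind kind;
    int (*primitive)(int);   /* used when kind == TOY_PRIMITIVE */
    int addend;              /* stand-in "body" when kind == TOY_BYTECODE */
};

/* Counts trips through the generic entry point. */
static int toy_generic_calls;

/* Example primitive, so the sketch has something to call. */
static int toy_negate(int x)
{
    return -x;
}

/* Stand-in for running a byte-compiled function body. */
static int toy_run_bytecode(const struct toy_callable *f, int arg)
{
    return arg + f->addend;
}

/* Generic entry point: every caller goes through this dispatch. */
static int toy_funcall(const struct toy_callable *f, int arg)
{
    assert(f != 0);
    toy_generic_calls++;
    switch (f->kind) {
    case TOY_PRIMITIVE:
        return f->primitive(arg);
    case TOY_BYTECODE:
        return toy_run_bytecode(f, arg);
    }
    return 0;
}

/* Call op with the dispatch written out inline.  The two branches
   duplicate the switch in toy_funcall, but toy_funcall is never entered. */
static int toy_call_op_inlined(const struct toy_callable *f, int arg)
{
    assert(f != 0);
    if (f->kind == TOY_BYTECODE)
        return toy_run_bytecode(f, arg);
    if (f->kind == TOY_PRIMITIVE)
        return f->primitive(arg);
    return 0;
}
```

Both paths compute the same results; the difference is that the inlined call
op owns its own copy of the dispatch, which is where a refactoring or a macro
would have to step in to keep the two copies in sync.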

>> All of them print the offset next to function names in the backtrace like this:
>>
>>
>> Debugger entered--Lisp error: (wrong-type-argument stringp t)
>>        string-match(t t nil)
>>     13 test-condition-case()
>>        load("/home/zach/.repos/bench-compare.el/test/test-debug...")
>>     78 byte-recompile-file("/home/zach/.repos/bench-compare.el/test/test-debug..." nil 0 t)
>>     35 emacs-lisp-byte-compile-and-load()
>>        funcall-interactively(emacs-lisp-byte-compile-and-load)
>>        call-interactively(emacs-lisp-byte-compile-and-load record nil)
>>    101 command-execute(emacs-lisp-byte-compile-and-load record)
>
> Cool!
>
>> With respect to reporting offsets, using code from edebug we have
>> a Lisp-Expression reader that will track source-code locations and
>> store the information in a source-map-expression cl-struct.  The code
>> in progress is here.
>
> How does the performance of this code compare to that of the "native" `read`?

Rough tests indicate it's about three times slower. Running it on all
274 files in the `lisp` directory of the GNU Emacs sources takes ~11-12
seconds (after removing the string from the struct and not
pretty-printing). A similar function which just calls `read` takes ~4
seconds. There are probably ways to further improve the performance of
`source-map-read`, but I don't know how much more speed can realistically
be gained.

> And to put it into perspective, have you looked at the relative
> proportion of time spent in `read` during a "typical" byte compilation?

I have not yet, but I'll evaluate that and keep it in mind.

> There's no doubt that preserving source code information will slow down
> byte-compilation but depending on how slow it gets we may find it's not
> "worth it".
>
>> Information currently saved is:
>>
>> * The expression itself
>> * The exact string that was read
>> * Begin and end points of the sexp in the buffer
>> * source-map-expression children (for conses and vectors)
>
> Sounds like a lot of information, which in turn implies a potentially
> high overhead (e.g. the "exact string" sounds like it might cost O(N²)
> in corner cases, yet provides redundant info that can be recovered from
> begin+end points).

Removing the string did improve performance, but not by as much as I
expected. The function that constructs the tree of "children" may be
slower than it needs to be, so I'll look into improving that. It may not
be necessary to create the children for vectors since they're constants
(outside of backquote, at least).
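
The quadratic corner case and the begin+end alternative can be illustrated
with a small sketch. It is not the source-map reader: `toy_span`,
`toy_span_text` and `toy_bytes_if_strings_stored` are hypothetical names, and
the scanner only matches parentheses (it ignores strings, comments, character
literals and atoms). It counts how many bytes would be copied if every list
kept its own exact string, and shows that a pair of offsets is enough to
recover the same text from the buffer on demand.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical node payload: positions only, half-open [begin, end). */
struct toy_span {
    size_t begin;
    size_t end;
};

/* Copies the text covered by SPAN from BUFFER into OUT (NUL-terminated).
   Returns the text length, or 0 if it does not fit in OUT_SIZE bytes. */
static size_t toy_span_text(const char *buffer, struct toy_span span,
                            char *out, size_t out_size)
{
    size_t len;

    assert(span.begin <= span.end);
    len = span.end - span.begin;
    if (len + 1 > out_size)
        return 0;
    memcpy(out, buffer + span.begin, len);
    out[len] = '\0';
    return len;
}

#define TOY_MAX_NESTING 64

/* Sums the text length of every parenthesized list in SRC, i.e. the bytes
   copied if each list node stored its exact string.  Returns (size_t)-1 if
   the parentheses are unbalanced or nested deeper than TOY_MAX_NESTING. */
static size_t toy_bytes_if_strings_stored(const char *src)
{
    size_t open_positions[TOY_MAX_NESTING];
    size_t depth = 0, total = 0, i;

    for (i = 0; src[i] != '\0'; i++) {
        if (src[i] == '(') {
            if (depth == TOY_MAX_NESTING)
                return (size_t)-1;
            open_positions[depth++] = i;
        } else if (src[i] == ')') {
            if (depth == 0)
                return (size_t)-1;
            total += i + 1 - open_positions[--depth];
        }
    }
    return depth == 0 ? total : (size_t)-1;
}
```

For the 8-character input `(((())))` the per-list strings add up to
8 + 6 + 4 + 2 = 20 bytes, and the total grows quadratically with nesting
depth, whereas a `toy_span` costs the same two offsets per node no matter how
long the covered text is.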

> Note also that while `read` returns a sexp made exclusively of data
> coming from a particular buffer, the code after macro-expansion can
> include chunks coming from other buffers, so if we want to keep the
> same representation of "sexp with extra info" in both cases, we can't
> just assume "the buffer".

Yes, and it won't be easy to maintain the read locations across
macroexpansion, byte-opt and cconv. It's tough to say at this point how
much the final product will slow down compilation, but I suspect it will
be significant.

>         Stefan
