From: Joost Kremers <joostkremers@fastmail.fm>
To: Emacs development discussions. <emacs-devel@gnu.org>
Subject: Re: `format` slows down my function even though it shouldn't be called at all...
Date: Sun, 15 Dec 2024 23:31:20 +0100
Message-ID: <864j34wpc7.fsf@fastmail.fm>
In-Reply-To: <86r068wrqk.fsf@fastmail.fm> (Joost Kremers's message of "Sun, 15 Dec 2024 22:39:31 +0100")
I did it again... Right after posting my message, I found the cause of the
problem. Sigh... Please ignore.
(For anyone wondering: the parser actually signals many errors, but most of
them are caught using `condition-case`, so `format` really is called each
time...)
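
To illustrate the effect, here is a minimal sketch. All `demo-` names are
hypothetical and this is not the actual parsebib code; it only assumes that a
reader function is first tried with an error handler and then retried with
other arguments. The arguments to `signal` are evaluated before the error is
raised, so the message is built every time, even when a `condition-case`
handler throws it away:

```elisp
;; Hypothetical names throughout (demo-...); not parsebib's actual code.

(define-error 'demo-parse-error "Demo parse error")

(defvar demo-format-calls 0
  "Number of times the error message was built.")

(defun demo-read-char (chars)
  "Return the char at point if it is in CHARS and move past it, else signal."
  (if (memq (char-after) chars)
      (prog1 (char-after)
        (forward-char 1))
    ;; Count each time we are about to build the message.
    (setq demo-format-calls (1+ demo-format-calls))
    ;; `format' runs before `signal' is entered, whether or not a
    ;; handler further up will discard the resulting message.
    (signal 'demo-parse-error
            (list (format "Expected one of %S, got %c at position %d,%d"
                          chars (following-char)
                          (line-number-at-pos) (current-column))))))

(defun demo-try-read-char (chars)
  "Like `demo-read-char', but return nil instead of signalling."
  (condition-case nil
      (demo-read-char chars)
    (demo-parse-error nil)))

(defun demo-run ()
  "Read a buffer without real errors; return how often the message was built."
  (setq demo-format-calls 0)
  (with-temp-buffer
    (insert "abcabc")
    (goto-char (point-min))
    (while (not (eobp))
      ;; Try ?x first (always fails, silently), then fall back to a, b, c.
      (or (demo-try-read-char '(?x))
          (demo-read-char '(?a ?b ?c)))))
  demo-format-calls)
```

No error ever reaches the caller of `demo-run`, yet it returns 6: one message
built per character read. One possible way around this (again only a
suggestion, not necessarily what parsebib does) is to signal with the raw
data and build the message only where the error is actually reported.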
On Sun, Dec 15 2024, Joost Kremers wrote:
> Hi,
>
> I have a library for parsing `.bib` files[1] and today a user reported an
> issue[2] that led me to a weird discovery. Basically, a call to `format`
> that is never even executed slows down execution tremendously.
>
> The reader part of the parser consists of a couple of functions that all
> have the following structure:
>
> ```
> (defun parsebib--chars (chars &optional noerror)
>   "Read the character at point.
> CHARS is a list of characters. If the character at point matches
> a character in CHARS, return it and move point, otherwise signal
> an error, unless NOERROR is non-nil, in which case return nil."
>   (parsebib--skip-whitespace)
>   (if (memq (char-after) chars)
>       (prog1
>           (char-after)
>         (forward-char 1))
>     (unless noerror
>       (signal 'parsebib-error (list (format "Expected one of %S, got %c at position %d,%d"
>                                             chars
>                                             (following-char)
>                                             (line-number-at-pos) (current-column)))))))
> ```
>
> In short, after skipping whitespace, the reader functions try to read some
> element (character, keyword, etc. depending on the function), return it if
> it's found and signal an error if it's not found.
>
> The call to `signal` contains a call to `format` to provide a useful error
> message. It's this `format` that slows down parsing, even though no errors
> are ever signalled.
>
> This is an excerpt from a profiler report:
>
> ============================================================
>        17758  81%   - parsebib--chars
>        17755  80%    - if
>        17749  80%     - if
>        17737  80%      - signal
>        17737  80%       - list
>        17515  79%          format
> ============================================================
>
> This is from parsing a 28MB .bib file. The 79% of processing time spent in
> `format` seems very weird to me, given that the file contains no errors and
> no error is ever signalled.
>
> After removing the `format` calls, replacing them with a simple string, the
> parser runs much, much, much faster. To give an idea of the speed increase:
> without `format`, the 28MB .bib file I mentioned above is parsed in 1-2
> seconds (on my machine); with `format`, I don't even have enough patience
> to wait for parsing to finish... (I let it run for at least 20-30 seconds
> before interrupting it.)
>
> Anyone know what's going on here? Am I missing something, or could this be
> a bug in Emacs? (I'm running Emacs 29.4, BTW).
>
> TIA
>
> Joost
>
>
>
> Footnotes:
> [1] At https://github.com/joostkremers/parsebib
>
> [2] https://github.com/joostkremers/parsebib/issues/34
--
Joost Kremers
Life has its moments