From: Joost Kremers <joostkremers@fastmail.fm>
To: Emacs development discussions. <emacs-devel@gnu.org>
Subject: Re: `format` slows down my function even though it shouldn't be called at all...
Date: Sun, 15 Dec 2024 23:31:20 +0100
Message-ID: <864j34wpc7.fsf@fastmail.fm>
In-Reply-To: <86r068wrqk.fsf@fastmail.fm> (Joost Kremers's message of "Sun, 15 Dec 2024 22:39:31 +0100")

I did it again... Right after posting my message, I found the cause of the
problem. Sigh... Please ignore.

(For anyone wondering: the parser actually signals many errors, but most of
them are caught using `condition-case`...)
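
A minimal sketch of that effect, for illustration only: the arguments to
`signal` are evaluated before the error is raised, so the message is built
even when a `condition-case` handler discards it right away. All `my-demo-*`
names are hypothetical and not part of parsebib. Passing raw data to `signal`
is just one possible way to defer the formatting, and the assumption that
`line-number-at-pos` contributes much of the cost in a large buffer has not
been measured here.

```elisp
;; Hypothetical names throughout (my-demo-*); this is not parsebib's code.
(define-error 'my-demo-error "Demo parse error")

(defun my-demo-eager (chars)
  "Signal `my-demo-error', building the full message before signalling."
  (signal 'my-demo-error
          (list (format "Expected one of %S, got %c at position %d,%d"
                        chars (following-char)
                        (line-number-at-pos) (current-column)))))

(defun my-demo-lazy (chars)
  "Signal `my-demo-error' with raw data only; nothing is formatted here."
  (signal 'my-demo-error (list chars (point))))

(defun my-demo-compare (n)
  "Run both variants N times, catching and discarding every error.
Return a list of two `benchmark-run' results: eager first, then lazy."
  (with-temp-buffer
    (insert "x")
    (goto-char (point-min))
    (list (benchmark-run n
            (condition-case nil
                (my-demo-eager '(?a ?b))
              (my-demo-error nil)))
          (benchmark-run n
            (condition-case nil
                (my-demo-lazy '(?a ?b))
              (my-demo-error nil))))))
```

Evaluating `(my-demo-compare 10000)` returns the two timings side by side.
With the lazy variant, a handler that actually wants to report the error can
still format the message from the signalled data.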

On Sun, Dec 15 2024, Joost Kremers wrote:
> Hi,
>
> I have a library for parsing `.bib` files[1] and today a user reported an
> issue[2] that led me to a weird discovery. Basically, a call to `format`
> that is never even executed slows down execution tremendously.
>
> The reader part of the parser consists of a couple of functions that all
> have the following structure:
>
> ```
> (defun parsebib--chars (chars &optional noerror)
>   "Read the character at point.
> CHARS is a list of characters.  If the character at point matches
> a character in CHARS, return it and move point, otherwise signal
> an error, unless NOERROR is non-nil, in which case return nil."
>   (parsebib--skip-whitespace)
>   (if (memq (char-after) chars)
>       (prog1
>           (char-after)
>         (forward-char 1))
>     (unless noerror
>       (signal 'parsebib-error (list (format "Expected one of %S, got %c at position %d,%d"
>                                             chars
>                                             (following-char)
>                                             (line-number-at-pos) (current-column)))))))
> ```
>
> In short, after skipping whitespace, the reader functions try to read some
> element (character, keyword, etc. depending on the function), return it if
> it's found and signal an error if it's not found.
>
> The call to `signal` contains a call to `format` to provide a useful error
> message. It's this `format` that slows down parsing, even though no errors
> are ever signalled.
>
> This is an excerpt from a profiler report:
>
> ============================================================
>        17758  81%                     - parsebib--chars
>        17755  80%                      - if
>        17749  80%                       - if
>        17737  80%                        - signal
>        17737  80%                         - list
>        17515  79%                            format
> ============================================================
>
> This is from parsing a 28MB .bib file: The 79% of processing time spent in
> `format` seems very weird to me, given that the file contains no errors and
> no error is ever signalled.
>
> After removing the `format` calls, replacing them with a simple string, the
> parser runs much, much, much faster. To give an idea of the speed increase:
> without `format`, the 28MB .bib file I mentioned above is parsed in 1-2
> seconds (on my machine); with `format`, I don't even have enough patience
> to wait for parsing to finish... (I let it run for at least 20-30 seconds
> before interrupting it.)
>
> Anyone know what's going on here? Am I missing something, or could this be
> a bug in Emacs? (I'm running Emacs 29.4, BTW).
>
> TIA
>
> Joost
>
>
>
> Footnotes:
> [1]  At https://github.com/joostkremers/parsebib
>
> [2]  https://github.com/joostkremers/parsebib/issues/34

-- 
Joost Kremers
Life has its moments


