From: Filipp Gunbin <fgunbin@fastmail.fm>
To: 51620@debbugs.gnu.org
Subject: bug#51620: add-hook repeatedly adds same function into hook--depth-alist
Date: Sat, 06 Nov 2021 02:26:32 +0300
Message-ID: <m2wnlmi3pz.fsf@fastmail.fm>


I have the following case: in my "javaimp" package I have code which
scans Java files "in the same project":

(defun javaimp--get-directory-classes (dir)
  (when (file-accessible-directory-p dir)
    (seq-mapcat #'javaimp--get-file-classes
                (seq-filter (lambda (file)
                              (not (file-symlink-p file)))
                            (directory-files-recursively dir "\\.java\\'")))))

There are some performance problems to fix, so I started by measuring
time via M-x benchmark.  Surprisingly, I saw that with each run (in a
large project) the time increased by 8 seconds or so.  Profiling led
me to add-hook and the fix made in bug#46326.

In short, javaimp--get-file-classes visits each file in a temp buffer and
uses syntax-ppss to parse Java code.  On a large project, this is done
many times, and the next invocation of javaimp--get-directory-classes does
everything again (this is what I wanted to fix, as well as looking at
fewer things during parsing).  So I stumbled into a problem which
perhaps goes unnoticed in normal file editing, where you don't process
tens of thousands of files with syntax-ppss (which calls add-hook)
repeatedly.

For reference, syntax.el does this:
		(add-hook 'before-change-functions
			  #'syntax-ppss-flush-cache
                          ;; We should be either the very last function on
                          ;; before-change-functions or the very first on
                          ;; after-change-functions.


This is what I get when I run my test:

(length (get 'before-change-functions 'hook--depth-alist)) => 58063
<call javaimp--get-directory-classes on a large project>
(length (get 'before-change-functions 'hook--depth-alist)) => 65303

All elements are `(syntax-ppss-flush-cache . 99)'.


A simple reproducer:
- $ echo 'print("Hello world!");' > /tmp/hello.py
- emacs -Q
- C-x C-f /tmp/hello.py
- M-: (length (get 'before-change-functions 'hook--depth-alist)), observe number N
- revisit the same file via C-x C-v
- M-: (length (get 'before-change-functions 'hook--depth-alist)), observe number N+1
- on each revisit the number increments
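
The same growth can probably be observed without visiting files by
hand.  Below is a minimal sketch, assuming that syntax-ppss registers
syntax-ppss-flush-cache in every fresh buffer as described above.  The
my/ function names are hypothetical, and hook--depth-alist is an
internal property whose layout may differ between Emacs versions:

```elisp
;; Hypothetical helpers for illustration only; not part of Emacs.
(defun my/depth-alist-length ()
  "Return the number of entries recorded for `before-change-functions'."
  (length (get 'before-change-functions 'hook--depth-alist)))

(defun my/depth-alist-growth (n)
  "Call `syntax-ppss' in N fresh temp buffers.
Return how many entries the depth alist gained in the process."
  (let ((before (my/depth-alist-length)))
    (dotimes (_ n)
      (with-temp-buffer
        (insert "class Hello {}\n")
        ;; Parsing up to point is what triggers the buffer-local add-hook.
        (syntax-ppss)))
    (- (my/depth-alist-length) before)))
```

On an affected build I would expect (my/depth-alist-growth 100) to be
close to 100, and at most 1 once duplicates are no longer pushed.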


Next, to the code.
There was this change in the patch for bug#46326:

>         (when (or (get hook 'hook--depth-alist) (not (zerop depth)))
>           ;; Note: The main purpose of the above `when' test is to avoid running
>           ;; this `setf' before `gv' is loaded during bootstrap.
> -        (setf (alist-get function (get hook 'hook--depth-alist)
> -                         0 'remove #'equal)
> -              depth))
> +        (push (cons function depth) (get hook 'hook--depth-alist)))

(Probably the comment and the first test in "or" should have been
removed with that change, but I'm not proposing that, because I'm
proposing to restore setf instead.)

setf with an #'eq test would be a better option than push, because it
won't repeatedly add the same (as in "eq") element if we reach this
code again for a function that is already recorded.

In our case we reach this code for each new buffer, because the check
is:

    (unless (member function hook-value)

and `before-change-functions' is of course buffer-local.  So we keep
pushing elements into (get hook 'hook--depth-alist) for each new buffer.
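
To illustrate the difference on a plain alist, here is a minimal sketch.
It is not the actual add-hook code, and the my/ function names are
hypothetical:

```elisp
;; Hypothetical helpers for illustration only; not part of subr.el.
(defun my/record-depth-with-push (alist function depth times)
  "Record (FUNCTION . DEPTH) in ALIST TIMES times using `push'.
Every call adds a new cons, so duplicates accumulate."
  (dotimes (_ times)
    (push (cons function depth) alist))
  alist)

(defun my/record-depth-with-setf (alist function depth times)
  "Record (FUNCTION . DEPTH) in ALIST TIMES times using `setf'.
`alist-get' finds an existing `eq' key and updates it in place,
so FUNCTION appears at most once."
  (dotimes (_ times)
    (setf (alist-get function alist 0) depth))
  alist)
```

(length (my/record-depth-with-push nil 'syntax-ppss-flush-cache 99 3)) => 3
(length (my/record-depth-with-setf nil 'syntax-ppss-flush-cache 99 3)) => 1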

And, unrelated to this, I fail to understand why copy-sequence is here
in the code further down in add-hook, could someone please explain?

          (setq hook-value
                (sort (if (< 0 depth) hook-value (copy-sequence hook-value))


I suggest this one-liner, which fixes the problem for me.  However, I
certainly need someone (Stefan M.?) to look at this.

TIA, Filipp

diff --git a/lisp/subr.el b/lisp/subr.el
index 8ff403e113..2b8b6deeb0 100644
--- a/lisp/subr.el
+++ b/lisp/subr.el
@@ -1868,7 +1868,7 @@ add-hook
       (when (or (get hook 'hook--depth-alist) (not (zerop depth)))
         ;; Note: The main purpose of the above `when' test is to avoid running
         ;; this `setf' before `gv' is loaded during bootstrap.
-        (push (cons function depth) (get hook 'hook--depth-alist)))
+        (setf (alist-get function (get hook 'hook--depth-alist) 0) depth))
       (setq hook-value
 	    (if (< 0 depth)
 		(append hook-value (list function))



In GNU Emacs 28.0.60 (build 4, x86_64-apple-darwin20.6.0, NS appkit-2022.60 Version 11.6 (Build 20G165))
 of 2021-11-05 built on fgunbin.local
Repository revision: d8c9a9dc23e0c6f38c5138cb8fbb4109a5729a35
Repository branch: emacs-28
System Description:  macOS 11.6

Configured using:
 'configure --enable-check-lisp-object-type --with-file-notification=no'

Configured features:
ACL GLIB GNUTLS LCMS2 LIBXML2 MODULES NS PDUMPER PNG RSVG THREADS
TOOLKIT_SCROLL_BARS XIM ZLIB





