all messages for Emacs-related lists mirrored at yhetil.org
From: "Francesco Potortì" <pot@gnu.org>
To: Eli Zaretskii <eliz@gnu.org>
Cc: dmitry@gutov.dev, 73484@debbugs.gnu.org, spwhitton@spwhitton.name
Subject: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
Date: Thu, 10 Oct 2024 16:25:28 +0200
Message-ID: <874j5khw6f.fsf@tucano.isti.cnr.it>
In-Reply-To: <86y12w1hjp.fsf@gnu.org> (eliz@gnu.org)

>> I had a quick look at the whole code, and in fact the only place I can find where you have O^2 behaviour seems to be file name comparison, and it still looks so strange to me that this can in fact cause significant delay.
>
>We are using etags on a huge tree: about 375K files.  I think that's
>the reason, because non-linear behaviors are like that: they are
>insignificant with small sets, but huge with larger ones...
>
>Profiles don't lie...

Ok, makes sense.  I must have missed the number of files in your previous explanations, sorry.  The only other place where I found O^2 behaviour is when managing #line directives, but you already tried to disable them without much change.  So let's concentrate on file name comparison which is done in process_file_name at

  for (fdp = fdhead; fdp != NULL; fdp = fdp->next)
    {
      assert (fdp->infname != NULL);
      if (streq (uncompressed_name, fdp->infname))
	goto cleanup;
    }

This is a simple quadratic comparison: when the i-th file name is processed, it is compared against the i-1 names already in the list, so for N files streq is called up to sum(i=1..N, i-1) = N(N-1)/2 =~ N^2/2 times, which for ~375k files means ~70G comparisons.  If you can count the number of times streq is called and 70G is a substantial portion of that number, then we have the culprit.  To check, just remove the above test and see if the running time drops.
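
As a sanity check on the ~70G figure, here is a minimal sketch of the arithmetic.  The function names are hypothetical and nothing here comes from etags.c; it assumes all N names are distinct, so every scan walks the whole list.

```c
#include <assert.h>
#include <stdint.h>

/* Closed form: number of string comparisons made by a linear duplicate
   scan when N distinct names are added one at a time.  The i-th name
   (1-based) is compared against the i-1 names already in the list.  */
uint64_t
pairwise_comparisons (uint64_t n)
{
  /* Keep n * (n - 1) within 64 bits.  */
  assert (n <= UINT32_MAX);
  return n * (n - 1) / 2;
}

/* Same count obtained by summing the cost of each scan, one per name.  */
uint64_t
simulated_comparisons (uint64_t n)
{
  uint64_t count = 0;
  for (uint64_t i = 0; i < n; i++)
    count += i;		/* i names are already in the list */
  return count;
}
```

For N = 375000, pairwise_comparisons returns 70312312500, i.e. about 70G.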

In that case, using a hash table rather than a linear comparison would probably make sense.  Alternatively, rather than managing file names in a single loop, do a first loop on all file names to canonicalise them, but without searching for tags (essentially, remove the call to process_file from process_file_name), then uniquify the list of canonicalised file names, then run process_file on them.
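
To make the hash idea concrete, here is a rough sketch of a set of file names with chained buckets.  Everything in it (name_set, name_set_seen, the FNV-1a hash, the fixed bucket count) is an assumption made for illustration, not code from etags.c, and it simply aborts on allocation failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types; none of this is taken from etags.c.  */
struct name_node
{
  struct name_node *next;
  char *name;
};

struct name_set
{
  struct name_node **buckets;
  size_t nbuckets;		/* fixed at creation, must be > 0 */
};

/* FNV-1a string hash, kept to 32 bits.  */
static unsigned long
hash_name (const char *s)
{
  unsigned long h = 2166136261UL;
  for (; *s != '\0'; s++)
    {
      h ^= (unsigned char) *s;
      h = (h * 16777619UL) & 0xffffffffUL;
    }
  return h;
}

struct name_set *
name_set_new (size_t nbuckets)
{
  assert (nbuckets > 0);
  struct name_set *set = malloc (sizeof *set);
  if (set == NULL)
    return NULL;
  set->buckets = calloc (nbuckets, sizeof *set->buckets);
  if (set->buckets == NULL)
    {
      free (set);
      return NULL;
    }
  set->nbuckets = nbuckets;
  return set;
}

/* Return true if NAME was already in SET.  Otherwise store a copy of
   NAME and return false.  Only one bucket is scanned per call.  */
bool
name_set_seen (struct name_set *set, const char *name)
{
  struct name_node **slot = &set->buckets[hash_name (name) % set->nbuckets];
  for (struct name_node *n = *slot; n != NULL; n = n->next)
    if (strcmp (name, n->name) == 0)
      return true;

  size_t len = strlen (name) + 1;
  struct name_node *node = malloc (sizeof *node);
  char *copy = malloc (len);
  if (node == NULL || copy == NULL)
    abort ();
  memcpy (copy, name, len);
  node->name = copy;
  node->next = *slot;
  *slot = node;
  return false;
}

void
name_set_free (struct name_set *set)
{
  for (size_t i = 0; i < set->nbuckets; i++)
    for (struct name_node *n = set->buckets[i], *next; n != NULL; n = next)
      {
	next = n->next;
	free (n->name);
	free (n);
      }
  free (set->buckets);
  free (set);
}
```

With something like this, the loop in process_file_name could in principle become a single test such as "if (name_set_seen (seen, uncompressed_name)) goto cleanup;", where seen is a hypothetical set created once at startup.  With a bucket count on the order of the number of files, each lookup scans only a short chain on average, so the total work should grow roughly linearly with N instead of quadratically.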
