From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: "Dmitry Gutov" Newsgroups: gmane.emacs.bugs Subject: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions Date: Thu, 10 Oct 2024 12:17:52 +0200 Message-ID: <4f454dfa-1fab-4f97-ac19-0cc6914ca5de@app.fastmail.com> References: <87tteaznog.fsf@zephyr.silentflame.com> <87jzezzg87.fsf_-_@zephyr.silentflame.com> <37e4b3cd-6363-4f55-9921-92a1182679dc@gutov.dev> <86ttdy50ja.fsf@gnu.org> <75fe4289-da41-454d-ba92-22a92ea7002f@gutov.dev> <86frpe2186.fsf@gnu.org> <8e305b6d-8ca8-4437-990f-183ebc007d18@gutov.dev> <865xqa1ggi.fsf@gnu.org> <86ttdtzoof.fsf@gnu.org> <8d7dc133-9828-4023-821f-e4403f899f81@gutov.dev> <86ttdsxt6x.fsf@gnu.org> <52cb1caa-9e7e-45df-b328-d60948d397f6@gutov.dev> <864j5rxca1.fsf@gnu.org> <87a5fiijy9.fsf@tucano.isti.cnr.it> <86jzelvjh4.fsf@gnu.org> <8b6560a9-e2d6-42ae-ac1d-014700f21804@gutov.dev> <86wmiktzez.fsf@gnu.org> <86ldyzucdd.fsf@gnu.org> <021c625b-adc9-4e19-819c-fe929583e503@gutov.dev> <86ed4ru41x.fsf@gnu.org> <8d86f23e-fdc3-45a5-b3c8-cd4670813e21@gutov.dev> <86ploasq35.fsf@gnu.org> <3e63f532-c6af-4923-880b-01a32cc667ec@gutov.dev> <878quwix4c.fsf@tucano.isti.cnr.it> Mime-Version: 1.0 Content-Type: multipart/alternative; boundary=df510841c8144fd398cc60b60b5dcd22 Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="27212"; mail-complaints-to="usenet@ciao.gmane.io" Cc: Eli Zaretskii , 73484@debbugs.gnu.org, spwhitton@spwhitton.name To: Francesco =?UTF-8?Q?Potort=C3=AC?= Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Thu Oct 10 12:19:08 2024 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1syqGB-0006sC-7Y for geb-bug-gnu-emacs@m.gmane-mx.org; Thu, 10 Oct 2024 12:19:07 +0200 Original-Received: from localhost 
([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1syqFy-000673-Au; Thu, 10 Oct 2024 06:18:54 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1syqFv-00066r-Oc for bug-gnu-emacs@gnu.org; Thu, 10 Oct 2024 06:18:52 -0400 Original-Received: from debbugs.gnu.org ([2001:470:142:5::43]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1syqFv-0003Bt-Gt for bug-gnu-emacs@gnu.org; Thu, 10 Oct 2024 06:18:51 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=debbugs.gnu.org; s=debbugs-gnu-org; h=References:In-Reply-To:From:Date:MIME-Version:To:Subject; bh=f8KGo//n+aGXlD0U+2AaYpJFZcxPpp0nY4Gj6RFY7qw=; b=eNAonIg7TGpCqAxEgXHajgQzUhTNoJAwm8hp8Q3Gef4SYIIIsLLGG53nIABT6O/PknCvTDWrn6neh6fxt4W4hb/xC5XHqmSZnVpuHxlfdNTqKYNwYTPPQreKU2IJwJA7PqONH4t6uj7Wbg3fGon+0enqPgeK3sN1q75GLG1wKsFZNmU8h3wAqGvXhfZLH02uW7IInX4gfKtg5NBpqHBnhWYoY9mPe3PcARjMlzW06YWjjiiGNVX42xHQsW7Z8WoGSOqrck7eMv4q1vi+oSUW3g2q1e2VJs4vzXrWX/eKuLbG6nJxWkvTwLICJDgGaJFCMN9GXyBKAK8/lbjk1//cVQ==; Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1syqG6-0008Uf-Ay for bug-gnu-emacs@gnu.org; Thu, 10 Oct 2024 06:19:02 -0400 X-Loop: help-debbugs@gnu.org Resent-From: "Dmitry Gutov" Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Thu, 10 Oct 2024 10:19:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 73484 X-GNU-PR-Package: emacs Original-Received: via spool by 73484-submit@debbugs.gnu.org id=B73484.172855551632613 (code B ref 73484); Thu, 10 Oct 2024 10:19:02 +0000 Original-Received: (at 73484) by debbugs.gnu.org; 10 Oct 2024 10:18:36 +0000 Original-Received: from localhost ([127.0.0.1]:58757 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 
1syqFf-0008Tw-RK for submit@debbugs.gnu.org; Thu, 10 Oct 2024 06:18:36 -0400 Original-Received: from fout-a7-smtp.messagingengine.com ([103.168.172.150]:38959) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1syqFd-0008Ti-U6 for 73484@debbugs.gnu.org; Thu, 10 Oct 2024 06:18:35 -0400 Original-Received: from phl-compute-06.internal (phl-compute-06.phl.internal [10.202.2.46]) by mailfout.phl.internal (Postfix) with ESMTP id 8EE4E13805F1; Thu, 10 Oct 2024 06:18:17 -0400 (EDT) Original-Received: from phl-imap-04 ([10.202.2.82]) by phl-compute-06.internal (MEProxy); Thu, 10 Oct 2024 06:18:17 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gutov.dev; h=cc :cc:content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm1; t=1728555497; x=1728641897; bh=f8KGo//n+a GXlD0U+2AaYpJFZcxPpp0nY4Gj6RFY7qw=; b=U3cFNSgiL+trNptlK6WjX+sDfM yPIigvYBywznm+KXugN5seohY+S8RihSKF6vTkco9a50eD9MbTrNdEHhwgjn8RfP X7kg+9SixRxlK0RMIlP+jmJbOhXBfgZS2DT63+GA2GByArJHqWPczfwr6toJXGpW 6/y+nrhBE0pBJvWs+u+7m24TcK7c030MOSSSzvzggo6RP9woL+QiUSW/YBmAEB9u 8vrMsDfKwXNvCAvimVUGFbS00lIcFd8pehE/84wgo5w+VAYS01CPDyfHLfIzYuYs vyOIOw/x/Kb3GCBEEnq8cf35DNVcURaYxRkS2jBBVG3uV4Qe9yFCjYptsCmw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1728555497; x=1728641897; bh=f8KGo//n+aGXlD0U+2AaYpJFZcxP pp0nY4Gj6RFY7qw=; b=L4Vx30qQhQ3snUNbYJn9Yt/n9vBnvrjoazYiJqRXRd5i NmhjWfrufypjmmmjdetyZUOiedbpgrH5z00cniFQjOzpNM4I2eYRFrts4HusIuPe /095KiXPqkmLHwRQAadc9byv7PgaAANbePZBAC2QtN+PhTSzkmqGZiYfKQJLHtJW dDGXrGt1M6KXQGou4d6Wyy5DJYez8+0EV3I6W0BvN2c/S8TSf8mivCD1D0rqq3sE uiQeixCQKwIp0cV4PUTkYwkzsnCaFdJwGQcqn4rxgoOkKhdab1d3i9Cyh4q3QcZ3 
WunzPZjZHvGHpunXK6Yr09x0EyrZABHxL19XL1AOdA== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeeftddrvdefhedgvdekucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdggtfgfnhhsuhgsshgtrhhisggvpdfu rfetoffkrfgpnffqhgenuceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnh htshculddquddttddmnecujfgurhepofggfffhvfevkfgjfhfutgesrgdtreerredtjeen ucfhrhhomhepfdffmhhithhrhicuifhuthhovhdfuceoughmihhtrhihsehguhhtohhvrd guvghvqeenucggtffrrghtthgvrhhnpeekveejuddvlefgueelvefhffekudetgfelvdeu leeghfelieeghfevteegfffhffenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmh epmhgrihhlfhhrohhmpegumhhithhrhiesghhuthhovhdruggvvhdpnhgspghrtghpthht ohepgedpmhhouggvpehsmhhtphhouhhtpdhrtghpthhtohepjeefgeekgeesuggvsggsuh hgshdrghhnuhdrohhrghdprhgtphhtthhopegvlhhiiiesghhnuhdrohhrghdprhgtphht thhopehpohhtsehgnhhurdhorhhgpdhrtghpthhtohepshhpfihhihhtthhonhesshhpfi hhihhtthhonhdrnhgrmhgv X-ME-Proxy: Feedback-ID: i07de48aa:Fastmail Original-Received: by mailuser.phl.internal (Postfix, from userid 501) id 26D982E60084; Thu, 10 Oct 2024 06:18:17 -0400 (EDT) X-Mailer: MessagingEngine.com Webmail Interface In-Reply-To: <878quwix4c.fsf@tucano.isti.cnr.it> X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Original-Sender: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Xref: news.gmane.io gmane.emacs.bugs:293265 Archived-At: --df510841c8144fd398cc60b60b5dcd22 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable

On Thu, Oct 10, 2024, at 3:07 AM, Francesco Potortì wrote:
> >Here is the nested loop, which if I comment out, makes the parse finish
> >in ~20 seconds, with all the extra files (except *.js), or in 15s when
> >using with new flags.
> >
> >diff --git a/lib-src/etags.c b/lib-src/etags.c
> >index a822a823a90..331e3ffe816 100644
> >--- a/lib-src/etags.c
> >+++ b/lib-src/etags.c
> >@@ -1697,14 +1697,14 @@ process_file_name (char *file, language *lang)
> >         uncompressed_name = file;
> >       }
> >
> >-  /* If the canonicalized uncompressed name
> >-     has already been dealt with, skip it silently.  */
> >-  for (fdp = fdhead; fdp != NULL; fdp = fdp->next)
> >-    {
> >-      assert (fdp->infname != NULL);
> >-      if (streq (uncompressed_name, fdp->infname))
> >-        goto cleanup;
> >-    }
> >+  /* /\* If the canonicalized uncompressed name */
> >+  /*    has already been dealt with, skip it silently.  *\/ */
> >+  /* for (fdp = fdhead; fdp != NULL; fdp = fdp->next) */
> >+  /*   { */
> >+  /*     assert (fdp->infname != NULL); */
> >+  /*     if (streq (uncompressed_name, fdp->infname)) */
> >+  /*       goto cleanup; */
> >+  /*   } */
> >
> >    inf = fopen (file, "r" FOPEN_BINARY);
> >    if (inf)
> >
> >This is basically a "uniqueness" operation using linear search, O(N^2).
>
> This is only for dealing with the case when the same file exists in both compressed and uncompressed form, and we are currently hitting the second one.  In that case, we should skip it.  Yes, this is a uniqueness test and yes, it is O^2 in the number of file names, but I doubt that this can explain a serious slowdown.

As mentioned in a previous email, I did recompile with that step removed, and the slowdown was gone.

The whole scan went down to ~20 seconds.
--df510841c8144fd398cc60b60b5dcd22 Content-Type: text/html; charset=utf-8 Content-Transfer-Encoding: quoted-printable
--df510841c8144fd398cc60b60b5dcd22--
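
For reference, the duplicate check in the quoted hunk compares each new file name against every name already on the fdhead list, so N files cost on the order of N^2/2 string comparisons. Below is a minimal sketch of the same "have we seen this name?" test done with a hash set, where each lookup is O(1) on average. It is an illustration under assumptions, not the etags.c code and not a proposed patch: name_set, hash_name, name_set_insert_raw, name_set_grow and name_set_seen are all hypothetical names that do not exist in etags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, NOT part of etags.c: a set of file names stored
   in an open-addressing hash table with linear probing.  */
struct name_set
{
  char **slots;     /* NULL marks an empty slot */
  size_t capacity;  /* zero, or a power of two */
  size_t count;     /* number of names stored */
};

/* FNV-1a hash of a NUL-terminated string.  */
size_t
hash_name (const char *s)
{
  uint64_t h = 14695981039346656037ULL;
  for (; *s != '\0'; s++)
    {
      h ^= (unsigned char) *s;
      h *= 1099511628211ULL;
    }
  return (size_t) h;
}

/* Store NAME, which must not be in SET yet, in the first free slot.  */
void
name_set_insert_raw (struct name_set *set, char *name)
{
  size_t mask = set->capacity - 1;
  size_t i = hash_name (name) & mask;
  while (set->slots[i] != NULL)
    i = (i + 1) & mask;
  set->slots[i] = name;
}

/* Double the table (or create it with 64 slots) and rehash.  */
void
name_set_grow (struct name_set *set)
{
  size_t old_capacity = set->capacity;
  char **old_slots = set->slots;
  set->capacity = old_capacity ? old_capacity * 2 : 64;
  set->slots = calloc (set->capacity, sizeof *set->slots);
  if (set->slots == NULL)
    abort ();
  for (size_t i = 0; i < old_capacity; i++)
    if (old_slots[i] != NULL)
      name_set_insert_raw (set, old_slots[i]);
  free (old_slots);
}

/* Return true if NAME is already in SET.  Otherwise store a copy of
   NAME and return false.  */
bool
name_set_seen (struct name_set *set, const char *name)
{
  assert (name != NULL);
  /* Keep the table at most half full so probing always terminates.  */
  if ((set->count + 1) * 2 > set->capacity)
    name_set_grow (set);
  size_t mask = set->capacity - 1;
  for (size_t i = hash_name (name) & mask; set->slots[i] != NULL;
       i = (i + 1) & mask)
    if (strcmp (set->slots[i], name) == 0)
      return true;
  char *copy = malloc (strlen (name) + 1);
  if (copy == NULL)
    abort ();
  strcpy (copy, name);
  name_set_insert_raw (set, copy);
  set->count++;
  return false;
}
```

With a zero-initialized `struct name_set seen = {0};`, the loop from the hunk would reduce to a single test along the lines of `if (name_set_seen (&seen, uncompressed_name)) goto cleanup;`. That call is again hypothetical, and the fdhead list itself is left untouched by this sketch.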