From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
Date: Mon, 07 Oct 2024 19:05:50 +0300
Message-ID: <86ldyzucdd.fsf@gnu.org>
To: Dmitry Gutov
Cc: pot@gnu.org, 73484@debbugs.gnu.org, spwhitton@spwhitton.name

> Date: Mon, 7 Oct 2024 10:11:08 +0300
> Cc: pot@gnu.org, spwhitton@spwhitton.name, 73484@debbugs.gnu.org
> From: Dmitry Gutov
>
> > Can you please show the etags command line in each of these two cases
> > that you are comparing?
>
> Both commands end with a '-' (scanning the list of files passed from stdin).
>
> >>> And if they don't have extensions, the code you
> >>> removed would have caused etags to scan these files anyway, looking
> >>> for Fortran or C tags. So how come the change slowed down etags so
> >>> much? What am I missing?
> >> I think it would also concern "unknown" extensions, right? Like .txt,
> >> .png and so on.
>
> > I have difficulty reasoning about this without knowing the command
> > lines you used. E.g., I don't understand why in one case it would
> > scan files with unknown extensions that were not scanned in the other.
>
> In one case the list is pre-filtered with etags-regen-file-extensions
> (see 'etags-regen--all-files'), in the other - it is not, and all files
> in project are passed.

So you are comparing the speed of scanning ~60K files with the speed
of scanning ~375K files? I'm not generally surprised that the latter
takes much longer, only that the slowdown is not proportional to the
number of scanned files. But see below.

Btw, did you exclude the .git/* files from the list submitted to
etags?

Here, scanning, with the unmodified etags from Emacs 30, of only those
files with extensions in etags-regen-file-extensions takes 16.7 sec
and produces an 80.5MB tags table, whereas scanning all the files with
the same etags takes almost 16 min and produces a 304MB tags table, of
which more than 200MB are from files whose language is not recognized.

From my testing, it seems like the elapsed time depends non-linearly
on the length of the list of files submitted to etags. For example,
if I break the list of files in two, I get 3 min 20 sec and 1 min 40
sec, together 5 min. But if I submit a single list with all the files
in those two lists, I get 14 min 30 sec. I guess some of the internal
processing in etags scales non-linearly with the number of files it
scans. The various loops in etags that scan all of the known files
and/or the tags it previously found seem to confirm this hypothesis.

So what is the conclusion from this? Are you saying that the long
scan times in this large tree basically make this new no-fallbacks
option not very useful, since we still need to carefully include or
exclude certain files from the scan? Or should I go ahead and install
these changes?
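
For illustration only, the kind of pre-filtering discussed above (keep
files whose extension is in a known list, drop anything under .git/)
could be sketched as below. This is a Python sketch, not the actual
etags-regen code: the names prefilter, as_stdin_payload and
EXAMPLE_EXTENSIONS are hypothetical, and the extension set is a made-up
subset, not the real value of etags-regen-file-extensions.

```python
from pathlib import PurePosixPath

# Hypothetical subset of extensions, chosen for the example only;
# this is NOT the real value of etags-regen-file-extensions.
EXAMPLE_EXTENSIONS = frozenset({"c", "h", "el", "py"})


def prefilter(paths, extensions=EXAMPLE_EXTENSIONS):
    """Return the paths to hand to etags, preserving their order.

    A path is dropped if any of its components is ".git", or if its
    extension (compared case-sensitively, without the dot) is not in
    EXTENSIONS.  Files with no extension are dropped as well.
    """
    kept = []
    for path in paths:
        p = PurePosixPath(path)
        if ".git" in p.parts:
            continue
        ext = p.suffix[1:]  # empty string when there is no extension
        if ext in extensions:
            kept.append(path)
    return kept


def as_stdin_payload(paths):
    """Format PATHS one per line, as read by etags when given '-'."""
    return "".join(f"{path}\n" for path in paths)
```

The string returned by as_stdin_payload is what would be written to
the standard input of an etags command ending in '-', as in the two
commands being compared here.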