From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
Date: Thu, 10 Oct 2024 08:13:11 +0300
Message-ID: <86o73s35i0.fsf@gnu.org>
To: Dmitry Gutov
Cc: pot@gnu.org, 73484@debbugs.gnu.org, spwhitton@spwhitton.name
In-Reply-To: (message from Dmitry Gutov on Thu, 10 Oct 2024 01:22:13 +0300)

> Date: Thu, 10 Oct 2024 01:22:13 +0300
> Cc: pot@gnu.org, spwhitton@spwhitton.name, 73484@debbugs.gnu.org
> From: Dmitry Gutov
>
> On 09/10/2024 22:11, Eli Zaretskii wrote:
>
> >> This is basically a "uniqueness" operation using linear search, O(N^2).
> >
> > Yes, this seems to be a protection against the same file name
> > mentioned more than once on the command line.
>
> Or, maybe more likely, against having symlinks scanned if the symlink
> target is also in the passed list.

Yes, that, but also any other possible ways of specifying the same
file twice, like having a file both compressed and uncompressed, etc.

> >> Is there a hash table we could use?
> >
> > Something like that should do, yes.
>
> Can we use search.h? hcreate/hsearch/etc. IIUC it's not in the C standard,
> and
> https://www.gnu.org/savannah-checkouts/gnu/gnulib/manual/html_node/hcreate.html
> says it's available on certain platforms.

I think we shouldn't: it is not sufficiently portable, and Gnulib
doesn't have an implementation of it for those platforms that don't
have it.

We could perhaps use the standard tsearch (although it will be more
expensive).  Alternatively, we could steal the hash table code from
somewhere, for example, from Gawk.

> >> Or perhaps we would skip the search when the canonicalized name is the
> >> same as the original one.
> >
> > That's not the same as the loop above does, I think.
>
> If we assumed the duplicate check is only necessary for symlinks, and
> there is on average a small number of them, I think we could avoid using
> a hash table. But passing the same exact file 2 times would result in
> duplicate tags.

canonicalize_filename in etags.c does not resolve symlinks, AFAICT, so
the symlink scenario will not be solved by that.  We'd need realpath
or its equivalent, I think?
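
To make the tsearch and realpath ideas concrete, here is a minimal
sketch, not a patch against etags.c.  It assumes a POSIX system that
provides realpath with a NULL buffer; platforms without realpath would
need an equivalent.  The names seen_before, compare_names and
seen_root are hypothetical and do not exist in etags.c.  The sketch
only covers the symlink and same-name cases, not the
compressed/uncompressed one.

```c
#define _XOPEN_SOURCE 700	/* for realpath, strdup, tsearch */
#include <assert.h>
#include <search.h>
#include <stdlib.h>
#include <string.h>

/* Root of the tree of resolved names seen so far (hypothetical).  */
static void *seen_root = NULL;

/* Comparison callback for tsearch: keys are NUL-terminated strings.  */
static int
compare_names (const void *a, const void *b)
{
  return strcmp (a, b);
}

/* Return 1 if FILE resolves to a name that was recorded earlier,
   0 if it is new (and record it), -1 if memory is exhausted.
   Recorded keys are kept for the lifetime of the program.  */
static int
seen_before (const char *file)
{
  assert (file != NULL);

  /* Resolve symlinks, "." and ".." components.  If that fails (for
     example, the file does not exist), fall back to the literal name
     so that exact duplicates are still detected.  */
  char *key = realpath (file, NULL);
  if (key == NULL)
    key = strdup (file);
  if (key == NULL)
    return -1;

  /* tsearch returns a pointer to the tree's slot for the key: either
     an existing equal key or the one just inserted.  */
  void *slot = tsearch (key, &seen_root, compare_names);
  if (slot == NULL)
    {
      free (key);
      return -1;
    }
  if (*(char **) slot != key)
    {
      free (key);		/* an equal key was already in the tree */
      return 1;
    }
  return 0;
}
```

A caller would skip any file for which seen_before returns 1.  Where
the library keeps the tree balanced (glibc does), each lookup costs
O(log N) string comparisons, so checking N names is O(N log N) instead
of the O(N^2) of the linear search; a hash table would be cheaper
still, which is the "more expensive" trade-off mentioned above.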