From: Francesco Potortì
Newsgroups: gmane.emacs.bugs
Subject: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
Date: Thu, 10 Oct 2024 10:27:57 +0200
Organization: The GNU project
Message-ID: <875xq0icqa.fsf@tucano.isti.cnr.it>
In-Reply-To: <86ldyw3467.fsf@gnu.org> (eliz@gnu.org)
To: Eli Zaretskii
Cc: dmitry@gutov.dev, 73484@debbugs.gnu.org, spwhitton@spwhitton.name

>This is basically a "uniqueness" operation using linear search, O(N^2).
>Thus, I believe the intent is to avoid duplicate tags if the same file
>was encountered twice in some way.

Yes.  Sorry, I spoke from memory and I was inaccurate.

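To make the quoted point concrete, here is a minimal sketch of a
uniqueness check done by linear search.  It is Python, not the C of
etags, and `unique_linear` is a hypothetical name used only for this
illustration; it does not correspond to any function in the etags
source.  The counter shows why the cost grows quadratically when most
names are distinct.

```python
def unique_linear(names):
    """Keep the first occurrence of each name, using linear search.

    Returns (unique_names, comparisons) so the cost can be observed.
    Hypothetical helper for illustration; not taken from etags.
    """
    seen = []
    comparisons = 0
    for name in names:
        duplicate = False
        # Compare the new name against every name accepted so far.
        for earlier in seen:
            comparisons += 1
            if earlier == name:
                duplicate = True
                break
        if not duplicate:
            seen.append(name)
    return seen, comparisons


if __name__ == "__main__":
    # 100 distinct names need 100 * 99 / 2 comparisons.
    names = [f"f{i}.c" for i in range(100)]
    unique, count = unique_linear(names)
    print(len(unique), count)  # prints "100 4950"
```

With N distinct names the inner loop runs N(N-1)/2 times in total, which
is where the O(N^2) comes from.
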
>Note that canonicalize_filename in this case doesn't really do what
>its name seems to imply, e.g., relative file names will generally stay
>relative.

It canonicalises, that is, it reduces names to a standard common form.
It retains the difference between relative and absolute names.

>So specifying the same file once as relative and the other
>time as absolute will still process the file more than once.

From memory, I would say so, yes.  I have not checked right now.

>We need
>to use an inode test or equivalent, and probably use realpath or
>equivalent, to make the duplicate test reliable.
>Or maybe having the
>same file processed under different names is okay, since TAGS is for
>helping Emacs find the file, and so using relative names and symlinks
>is okay?

Yes, I think so.  And from memory I think it should be left unchanged.

>> I do not think that it makes sense to build a hash table for file
>> names given on the command line, because the number of comparisons
>> made on those names is generally vastly inferior to the number of
>> comparisons used to search for tags.
>
>That's not what I see in the code.  But it should be easy to count the
>number of loop iterations in the use case we are talking about
>(running etags on the geck-dev tree), so we don't need to argue about
>facts.

Yes.  If finding a bottleneck is the objective, you should maybe
instrument the string comparison functions so that you can count how
many times they are called from different places.

I had a quick look at the whole code, and in fact the only place I can
find where you have O(N^2) behaviour seems to be file name comparison.
It still looks strange to me that this can in fact cause a significant
delay.

I may certainly have missed something, but if that's really the case,
the first thing to do is look for code inefficiencies.  If the problem
is really structural, one should first read all the file names,
canonicalise and uniquify them, and only then create the tags.
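
As a rough sketch of that last suggestion, again in Python rather than
the C of etags: collect all names first, reduce each to a common form,
drop duplicates with a hash-based set, and only then hand the list to
the tagging step.  The names `canonical_name` and `unique_names` are
hypothetical, and `posixpath.normpath` is only an assumed stand-in for
the canonicalisation discussed above; like it, the stand-in keeps
relative names relative and absolute names absolute.

```python
import posixpath


def canonical_name(name):
    # Assumed stand-in for canonicalisation: collapses "./" and "dir/.."
    # textually.  It does not resolve symlinks and does not turn a
    # relative name into an absolute one.
    return posixpath.normpath(name)


def unique_names(names):
    """Return canonical names in first-seen order, without duplicates.

    Membership tests on a set take expected constant time, so the whole
    pass is expected O(N) instead of O(N^2).
    """
    seen = set()
    result = []
    for name in names:
        key = canonical_name(name)
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result


if __name__ == "__main__":
    files = ["./a.c", "a.c", "src/../a.c", "/tmp/a.c"]
    print(unique_names(files))  # prints "['a.c', '/tmp/a.c']"
```

In this sketch the same file given once as relative and once as absolute
is still kept twice, which matches the behaviour described above as
acceptable.  Note also that collapsing "src/.." textually is only safe
when "src" is not a symlink.
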