From: Dmitry Gutov
Newsgroups: gmane.emacs.bugs
Subject: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
Date: Thu, 10 Oct 2024 01:22:13 +0300
To: Eli Zaretskii
Cc: pot@gnu.org, 73484@debbugs.gnu.org, spwhitton@spwhitton.name

On 09/10/2024 22:11, Eli Zaretskii wrote:

>> This is basically a "uniqueness" operation using linear search, O(N^2).
>
> Yes, this seems to be a protection against the same file name
> mentioned more than once on the command line.

Or, maybe more likely, against having symlinks scanned if the symlink
target is also in the passed list.

>> Is there a hash table we could use?
>
> Something like that should do, yes.

Can we use search.h? hcreate/hsearch/etc.
IIUC it's not in the C standard, and
https://www.gnu.org/savannah-checkouts/gnu/gnulib/manual/html_node/hcreate.html
says it's available on certain platforms.

>> Or perhaps we would skip the search when the canonicalized name is the
>> same as the original one.
>
> That's not the same as the loop above does, I think.

If we assumed the duplicate check is only necessary for symlinks, and
there are on average only a few of them, I think we could avoid using a
hash table. But passing the exact same file twice would then result in
duplicate tags.

>> I guess someone might ask for flag "--no-decompress", sometime.
>
> Yes, but it's also easy to exclude them via 'find'.

Or through etags-regen-ignores.

>>> . Some files have their language identified by means other than their
>>> names or extensions: those are the languages that have
>>> "interpreters" defined in etags.c. Shell scripts is one such case,
>>> but not the only one. So when etags-regen.el passes only files
>>> with known extensions to etags, it misses those files from TAGS.
>>> As one example, the file js/src/devtools/rootAnalysis/run_complete
>>> in the gecko-dev tree is a Perl script, but has no .pl extension.
>>
>> This sounds the same as the "hashbang" files that we mentioned
>> previously. It makes sense for the scan to take longer, of course,
>> proportional to the number of the detected files.
>
> My point was that if someone wants all the Python files, say,
> submitting only Python extensions to etags might miss some Python
> scripts.

Yes, that's the problem from the first comments of this report: to have
hashbang files scanned, one can't use a whitelist of extensions. Using a
blacklist should be fine, though.
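
To make the hash-table idea from earlier in this message concrete, here
is a minimal sketch of the "uniqueness" pass done with hcreate/hsearch
instead of a linear search. It is only an illustration under some
assumptions: search.h is present on the platform, the names have
already been canonicalized by the caller, and unique_names is a
hypothetical helper, not code from etags.c.

```c
#include <assert.h>
#include <search.h>   /* hcreate, hsearch, hdestroy: POSIX, not ISO C */
#include <stddef.h>

/* Hypothetical helper (not from etags.c): compact NAMES in place so
   that each string appears once, keeping first occurrences in their
   original order.  Returns the number of unique names, or (size_t) -1
   on failure.  The names are assumed to be canonicalized already.  */
size_t
unique_names (char **names, size_t n)
{
  size_t kept = 0;

  assert (names != NULL || n == 0);

  /* An hsearch table cannot grow, so size it up front; making it
     larger than N keeps the load factor low.  */
  if (!hcreate (2 * n + 1))
    return (size_t) -1;

  for (size_t i = 0; i < n; i++)
    {
      ENTRY item = { names[i], NULL };

      if (hsearch (item, FIND) != NULL)
        continue;               /* Seen before: drop the duplicate.  */
      if (hsearch (item, ENTER) == NULL)
        {
          hdestroy ();
          return (size_t) -1;   /* Table full or out of memory.  */
        }
      names[kept++] = names[i];
    }

  hdestroy ();
  return kept;
}
```

Each lookup is expected constant time, so the whole pass is roughly
O(N) rather than O(N^2). The known downsides of this interface: there
is only one global table per process, its size is fixed at creation,
and some implementations free the keys in hdestroy, in which case the
keys would have to be malloc'ed copies.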