From: sbaugh@catern.com
Newsgroups: gmane.emacs.bugs
Subject: bug#62837: [PATCH] Add a semantic-symref backend which uses xref-matches-in-files
Date: Sat, 15 Apr 2023 21:56:24 +0000 (UTC)
Message-ID: <871qkkn720.fsf@catern.com>
References: <5e6eddd5-4b38-5765-05f3-dd6c1927edd3@yandex.ru>
To: Dmitry Gutov
Cc: Spencer Baugh, 62837@debbugs.gnu.org
In-Reply-To: <5e6eddd5-4b38-5765-05f3-dd6c1927edd3@yandex.ru> (Dmitry Gutov's message of "Sat, 15 Apr 2023 01:38:18 +0300")

Dmitry Gutov writes:

> Hi!
>
> On 14/04/2023 18:37, Spencer Baugh wrote:
>> When project-files is available, this is a much more efficient
>> fallback than the current grep fallback.  Ultimately, this is
>> motivated by making xref-find-references faster by default even in the
>> absence of an index.
>
> It's a clever enough idea, but unfortunately it doesn't look like the
> performance is always improved by this change.
>
> E.g. I have this checkout of gecko-dev (a big project, just for
> testing: https://github.com/mozilla/gecko-dev) which contains
> different types of files: cpp, js, py.
>
> If I do an xref-find-references search with the current code, it
> finishes in around ~0.8s. 'find' is not that slow, actually:
>
>   time find . -type f -name "*.cpp" >/dev/null
>
> reports just 400 ms here.
>
> Whereas with your patch the search, depending on the language (cpp --
> more files, py -- less files) can take 3 seconds and more.
>
> Why?
> First of all, project-files returns all files (which are then all
> searched), whereas semantic-symref-filepattern-alist contains a
> mapping from modes to file globs, limiting both the scan and
> subsequent search to those.
>
> Second -- using project-files means we're forced to round-trip the
> list of file names from the first project's stdout, to buffer, then
> to a list of Lisp strings, and then back to another buffer, to use as
> stdin. I have a couple of things planned in the medium term to improve
> that, but some overhead is probably unavoidable (unless we get some
> new primitive that would allow "piping" between process buffers).

Yes, this is a very good point.

> Perhaps you could describe your case where you *did* see a significant
> improvement from this patch, and we can discuss the best steps to
> address that.

In short: I have a project.el backend for a large monorepo whose
project-files implementation returns only the subset of files relevant
to the work happening in a given clone.  (Generally a user will have
many clones and be doing different work in each one.)  The
relevant-files subset is determined by integration with the build
system.

So the current fallback runs find over the whole tree, gets back a vast
number of files, and then searches all of them, whereas a search over
project-files covers a much smaller number of files.

Regarding your medium-term plans to improve project-files performance -
wildly guessing, but perhaps you have in mind a way to run a subprocess
that outputs the project-files list?  Let's call it
"project-files-process".  Then the output of project-files-process
could be piped straight to grep, for maximum efficiency?

If that was the idea, then my own backend could certainly provide a
project-files-process implementation too.

> BTW, at first I figured you're using MacOS (which historically has
> bundled outdated versions of find and grep, with worse
> performance). But apparently not?

Nope, Linux.
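
As a rough illustration of the "project-files-process" idea above, here
is a shell sketch of the two search shapes being compared.  It is not
code from the patch: project_files_process is a hypothetical stand-in
for whatever backend-specific command would print the relevant files,
and the tiny demo tree only exists to make the script self-contained.

```shell
#!/bin/sh
# Sketch only. "project_files_process" is a hypothetical name, and the
# demo tree below is made up for illustration.
set -eu

demo=$(mktemp -d)
mkdir -p "$demo/relevant" "$demo/unrelated"
printf 'int frob(void);\n' > "$demo/relevant/a.h"
printf 'int x = frob();\n' > "$demo/relevant/a.c"
printf 'int y = frob();\n' > "$demo/unrelated/b.c"

# Hypothetical backend command: prints one relevant file per line.
# Here the "build system integration" is faked as a fixed subtree.
project_files_process() {
    find "$demo/relevant" -type f
}

# Shape of the current fallback: scan the whole tree by glob, then
# search every file that matched.
all_hits=$(find "$demo" -type f -name '*.[ch]' -exec grep -l 'frob' {} + \
               | wc -l | tr -d ' ')

# Shape of the proposal: stream the file list straight into grep,
# without round-tripping the names through Lisp strings.
# (Newline-separated names are assumed not to contain whitespace.)
subset_hits=$(project_files_process | xargs grep -l 'frob' \
                  | wc -l | tr -d ' ')

echo "whole tree: $all_hits files, project subset: $subset_hits files"
rm -rf "$demo"
```

In a real backend the list would more likely be NUL-separated and fed
to "xargs -0", so that unusual file names survive the pipe.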