From: Gregory Heytings
Newsgroups: gmane.emacs.bugs
Subject: bug#50733: 28.0.1; project-find-regexp can block Emacs for a long time
Date: Mon, 27 Sep 2021 09:23:28 +0000
Message-ID: <2391d2ad6f1301eed00b@heytings.org>
In-Reply-To: <83fstqmwgu.fsf@gnu.org>
To: Eli Zaretskii
Cc: 50733@debbugs.gnu.org, dgutov@yandex.ru, mardani29@yahoo.es

>
> To get back to the issue at hand: we are talking (or at least I was
> talking) about scalability of an algorithm, not about some particular
> implementation of the algorithm.
>

Are you now again shifting the discussion to something else, a theoretical comparison between various algorithms?

>
> Ripgrep is a multithreaded program, whereas idutils is single-threaded.
> So for a fair comparison of scalability of these two main ideas:
> file-based search vs DB search, you need at the very least to limit
> ripgrep to a single thread. And then you need to run each program on
> code bases of various sizes, preferably those which differ by orders of
> magnitude or close to that, and see their O(n) behavior. And exclude
> from your comparison command-line options that require IDUtils to access
> the files in addition to the DB. That would be at least an
> approximation to comparing apples to apples.
>

You're asking me to disable everything that makes ripgrep a modern tool, and to disable everything that makes idutils an outdated tool, to make the outdated tool shine in comparison? Interesting viewpoint.

>
> But frankly, I don't understand why this all would be needed at all,
> because it should be absolutely clear that searching the files in the
> filesystem will always scale worse than reading a well-indexed DB.
>

Which is precisely what I don't believe. It is, at least to me, not at all "absolutely clear" once you look at the whole picture, that is, once the comparison includes the need to create a database and keep it up to date, the added complexity of that solution, and the purpose of the tool.

>
> IDUtils is an example of the latter, and it beats many utilities that
> search the files, including ripgrep, as long as it doesn't need to
> access the files themselves. But even if it doesn't always beat them
> (which you didn't yet demonstrate), it just means the ideas of its
> design should be taken further and/or implemented better, that's all.
>

I provided you with many numbers and comparisons, which IMO demonstrate what they were meant to demonstrate. A tool which finds matches for a regexp in an O(100 MB) code base in O(10 ms), and in an O(1 GB) code base in O(100 ms), is clearly good enough in practice. (Note that I made these comparisons on a six- or seven-year-old laptop; these numbers would be even lower on a more recent machine.) I'm still waiting for some numbers from you to demonstrate *your* viewpoint.

>
> I said that such tools are the future, not that IDUtils itself is
> necessarily the future (though it could be, if someone picks up its
> development).
>

Is it not simply because it's not useful/better in practice that nobody is picking up its development (and pretty much nobody is using it)?

>
> Again, this is about looking for the best tools for this job, and I
> still stand by my opinion: focusing only on general-purpose search tools
> is sub-optimal.
>

The message to which you replied, and which started this subthread, did not suggest to "focus only on general-purpose search tools"; it suggested focusing on *one* particular general-purpose search tool, ripgrep, which is currently the best tool for the job, and bundling it with Emacs. It is available under the Unlicense (a public domain dedication) as well as the MIT license, so I guess it should be possible.
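
For anyone who wants to produce comparable numbers, the measurement discussed above (time each tool on code bases of different sizes and see how the time grows with the size) could be sketched roughly as follows. This is only an illustrative harness, not the script used for the numbers in this thread; the helper names `time_command` and `scaling_ratio` are made up for the example, and the command lines to measure are left to the caller.

```python
import statistics
import subprocess
import time


def time_command(argv, runs=5):
    """Return the median wall-clock time, in seconds, of running argv."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the output so that terminal I/O does not dominate the timing.
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def scaling_ratio(size_small, time_small, size_large, time_large):
    """Return (growth in time) / (growth in size).

    A value close to 1.0 means the time grew in proportion to the size;
    a value well below 1.0 means it grew more slowly than the size.
    """
    return (time_large / time_small) / (size_large / size_small)
```

A single-threaded ripgrep run could then be timed with something like `time_command(["rg", "-j", "1", "some_regexp", "/path/to/tree"])`, where the regexp and the path are placeholders, and an index-based tool with its own command line. With the orders of magnitude quoted above (100 MB in 10 ms, 1 GB in 100 ms), `scaling_ratio(100, 0.010, 1000, 0.100)` comes out at about 1.0, i.e. roughly linear growth. Note that this times only the search itself; the time needed to build and refresh an index would have to be measured separately and added for the index-based tool.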