unofficial mirror of emacs-devel@gnu.org 
* project-find-regexp using ripgrep
@ 2020-06-14 21:30 Dmitry Gutov
  2020-06-18  9:49 ` Ergus
  2020-06-20  1:06 ` Dmitry Gutov
  0 siblings, 2 replies; 14+ messages in thread
From: Dmitry Gutov @ 2020-06-14 21:30 UTC (permalink / raw)
  To: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 893 bytes --]

Here's a small patch I've been toying with, inspired by bug#41766.

In my testing, it makes the project search an order of magnitude faster. 
Probably due to smart parallelization.

If people confirm this experience, I'm going to install it (or something 
similar), even though, well, it would be nice to consolidate this search 
tool into something smarter, and done in one package only. But that's for 
the future.

How to try:

- M-x project-find-regexp in your favorite project.
- If you're feeling scientific, evaluate something like

    (benchmark 1 '(project-find-regexp "grep-regexp-alist"))

- Change the argument to something else if you're searching in something 
other than the Emacs project.
- Try it a couple of times.
- Note the reported timings.

- Install ripgrep (e.g. with 'apt install ripgrep').
- Apply the patch.
- [Rebuild], restart Emacs.
- Repeat the first several steps.
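
The "try it a couple of times and note the timings" steps above can be sketched as a small helper. This is only an illustration: `my/collect-timings` is a hypothetical name, not part of the patch, and it relies on the standard `benchmark-run` macro, whose first return value is the elapsed time in seconds.

```elisp
(require 'benchmark)

;; Hypothetical helper, not part of the patch: run FN several times
;; and collect the wall-clock time of each call.
(defun my/collect-timings (fn &optional repetitions)
  "Call FN REPETITIONS times (default 3); return a list of elapsed seconds."
  (let (timings)
    (dotimes (_ (or repetitions 3))
      ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-ELAPSED).
      (push (car (benchmark-run 1 (funcall fn))) timings))
    (nreverse timings)))

;; Usage, evaluated from a buffer inside the project being searched:
;; (my/collect-timings
;;  (lambda () (project-find-regexp "grep-regexp-alist")))
```

Running it once before and once after applying the patch gives two lists of timings to compare; the first call in each list may be slower because of cold file caches.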

[-- Attachment #2: xref-ripgrep.diff --]
[-- Type: text/x-patch, Size: 1895 bytes --]

diff --git a/lisp/progmodes/xref.el b/lisp/progmodes/xref.el
index 5b5fb4bc47..19fc362ddb 100644
--- a/lisp/progmodes/xref.el
+++ b/lisp/progmodes/xref.el
@@ -1246,12 +1246,20 @@ xref-matches-in-directory
 (declare-function tramp-tramp-file-p "tramp")
 (declare-function tramp-file-local-name "tramp")
 
+;; '-s' because 'git ls-files' can output broken symlinks.
+(defvar xref-grep-template
+  "xargs -0 rg <C> -nH --no-messages -g '!*/' -e <R>"
+  ;"xargs -0 grep <C> -snHE -e <R>"
+  )
+
 ;;;###autoload
 (defun xref-matches-in-files (regexp files)
   "Find all matches for REGEXP in FILES.
 Return a list of xref values.
 FILES must be a list of absolute file names."
   (cl-assert (consp files))
+  (require 'grep)
+  (defvar grep-highlight-matches)
   (pcase-let*
       ((output (get-buffer-create " *project grep output*"))
        (`(,grep-re ,file-group ,line-group . ,_) (car grep-regexp-alist))
@@ -1261,13 +1269,12 @@ xref-matches-in-files
        ;; first file is remote, they all are, and on the same host.
        (dir (file-name-directory (car files)))
        (remote-id (file-remote-p dir))
-       ;; 'git ls-files' can output broken symlinks.
-       (command (format "xargs -0 grep %s -snHE -e %s"
-                        (if (and case-fold-search
-                                 (isearch-no-upper-case-p regexp t))
-                            "-i"
-                          "")
-                        (shell-quote-argument (xref--regexp-to-extended regexp)))))
+       ;; The 'auto' default would be fine too, but ripgrep can't handle
+       ;; the options we pass in that case.
+       (grep-highlight-matches)
+       (command (grep-expand-template xref-grep-template
+                                      (xref--regexp-to-extended regexp)
+                                      regexp)))
     (when remote-id
       (require 'tramp)
       (setq files (mapcar


* Re: project-find-regexp using ripgrep
  2020-06-14 21:30 project-find-regexp using ripgrep Dmitry Gutov
@ 2020-06-18  9:49 ` Ergus
  2020-06-18  9:55   ` Dmitry Gutov
  2020-06-20  1:06 ` Dmitry Gutov
  1 sibling, 1 reply; 14+ messages in thread
From: Ergus @ 2020-06-18  9:49 UTC (permalink / raw)
  To: dgutov@yandex.ru, emacs-devel@gnu.org

[-- Attachment #1: Type: text/plain, Size: 715 bytes --]

Hi:

I am trying the latest changes in master and when I try to use any project-* command I always get an error:

```
project-current: Wrong type argument: listp, ~/projects/nanos_cluster/
```

Where ~/projects/nanos_cluster/ is actually the root of my current project

With this bt:

Debugger entered--Lisp error: (wrong-type-argument listp ~/projects/nanos_cluster/)
  project--add-to-project-list-front((vc . "~/projects/nanos_cluster/"))
  project-current(t)
  project-switch-to-buffer()
  funcall-interactively(project-switch-to-buffer)
  call-interactively(project-switch-to-buffer nil nil)
  command-execute(project-switch-to-buffer)

Do I need an extra setting/config or anything?
Best,
Ergus



* Re: project-find-regexp using ripgrep
  2020-06-18  9:49 ` Ergus
@ 2020-06-18  9:55   ` Dmitry Gutov
  2020-06-18 10:20     ` Ergus
  0 siblings, 1 reply; 14+ messages in thread
From: Dmitry Gutov @ 2020-06-18  9:55 UTC (permalink / raw)
  To: Ergus, emacs-devel@gnu.org

Hi!

On 18.06.2020 12:49, Ergus wrote:
> I am trying the latest changes in master and when I try to use any 
> project-* command I always get an error:
> 
> ```
> project-current: Wrong type argument: listp, ~/projects/nanos_cluster/
> ```

Please try deleting ~/.emacs.d/projects.

And either restart or (setq project--list 'unset).
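
The two steps above can be sketched in Emacs Lisp as follows. This is a sketch under assumptions: `my/reset-project-list` is a hypothetical helper name, the saved list is assumed to live in the file "projects" under the user's Emacs directory (as in the path mentioned above), and `project--list` is an internal variable of project.el that may change between versions.

```elisp
;; `project--list' is internal to project.el; declared here only so the
;; sketch can be evaluated before project.el is loaded.
(defvar project--list)

;; Hypothetical helper, not part of project.el.
(defun my/reset-project-list (&optional file)
  "Delete the saved project list FILE and mark the in-memory list as unset.
FILE defaults to the file \"projects\" in the user's Emacs directory."
  (let ((file (or file (locate-user-emacs-file "projects"))))
    (when (file-exists-p file)
      (delete-file file))
    ;; Same effect as restarting: the list is re-read on next use.
    (setq project--list 'unset)))

;; Usage: (my/reset-project-list)
```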




* Re: project-find-regexp using ripgrep
  2020-06-18  9:55   ` Dmitry Gutov
@ 2020-06-18 10:20     ` Ergus
  0 siblings, 0 replies; 14+ messages in thread
From: Ergus @ 2020-06-18 10:20 UTC (permalink / raw)
  To: dgutov@yandex.ru, emacs-devel@gnu.org

[-- Attachment #1: Type: text/plain, Size: 583 bytes --]

Yes, it worked fine, thanks.




* Re: project-find-regexp using ripgrep
  2020-06-14 21:30 project-find-regexp using ripgrep Dmitry Gutov
  2020-06-18  9:49 ` Ergus
@ 2020-06-20  1:06 ` Dmitry Gutov
  2020-06-20  4:09   ` andres.ramirez
  1 sibling, 1 reply; 14+ messages in thread
From: Dmitry Gutov @ 2020-06-20  1:06 UTC (permalink / raw)
  To: emacs-devel

On 15.06.2020 00:30, Dmitry Gutov wrote:
> Here's a small patch I've been toying with, inspired by bug#41766.
> 
> In my testing, it makes the project search an order of magnitude faster. 
> Probably due to smart parallelization.
> 
> If people confirm this experience ...

Anybody?




* Re: project-find-regexp using ripgrep
  2020-06-20  1:06 ` Dmitry Gutov
@ 2020-06-20  4:09   ` andres.ramirez
  2020-06-22  0:01     ` Dmitry Gutov
  0 siblings, 1 reply; 14+ messages in thread
From: andres.ramirez @ 2020-06-20  4:09 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

Hi. Dmitry.

>>>>> "Dmitry" == Dmitry Gutov <dgutov@yandex.ru> writes:

    Dmitry> On 15.06.2020 00:30, Dmitry Gutov wrote:
    >> Here's a small patch I've been toying with, inspired by bug#41766.

[...]


    Dmitry> Anybody?

Before the patch:
--8<---------------cut here---------------start------------->8---
Elapsed time: 6.010101s
Elapsed time: 5.863914s
--8<---------------cut here---------------end--------------->8---

After installing ripgrep and patch:
--8<---------------cut here---------------start------------->8---
Elapsed time: 3.261737s
Elapsed time: 1.742008s
--8<---------------cut here---------------end--------------->8---

Best Regards




* Re: project-find-regexp using ripgrep
  2020-06-20  4:09   ` andres.ramirez
@ 2020-06-22  0:01     ` Dmitry Gutov
  2020-06-22  2:30       ` Eli Zaretskii
  2020-06-22  3:12       ` andrés ramírez
  0 siblings, 2 replies; 14+ messages in thread
From: Dmitry Gutov @ 2020-06-22  0:01 UTC (permalink / raw)
  To: andres.ramirez; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 859 bytes --]

On 20.06.2020 07:09, andres.ramirez wrote:
>      Dmitry> Anybody?
> 
> Before the patch:
> --8<---------------cut here---------------start------------->8---
> Elapsed time: 6.010101s
> Elapsed time: 5.863914s
> --8<---------------cut here---------------end--------------->8---
> 
> After installing ripgrep and patch:
> --8<---------------cut here---------------start------------->8---
> Elapsed time: 3.261737s
> Elapsed time: 1.742008s
> --8<---------------cut here---------------end--------------->8---

Thanks, Andres. Looks promising.

Here's the latest version of the patch, if you'd like to test. I don't 
expect major changes in performance, but it does add a pipe to 'sort', 
which creates some overhead proportional to the number of search results.
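
To make the purpose of the 'sort' pipe concrete, here is a rough Emacs Lisp equivalent of the ordering it is meant to produce: by file name first, then numerically by line number. It is only an illustration, `my/sort-grep-lines` is a hypothetical name, and it assumes file names contain no colons. One caveat worth checking against the sort(1) documentation: a key written as `-k1` with no end field extends to the end of the line, so restricting each key to a single field (for example `-k1,1 -k2,2n`) may be needed to get this ordering from 'sort' itself.

```elisp
;; Hypothetical helper, for illustration only.
(defun my/sort-grep-lines (lines)
  "Sort LINES of the form FILE:LINE:TEXT by FILE, then numerically by LINE.
Assumes that FILE contains no colons."
  (sort (copy-sequence lines)
        (lambda (a b)
          (let* ((fields-a (split-string a ":"))
                 (fields-b (split-string b ":"))
                 (file-a (car fields-a))
                 (file-b (car fields-b)))
            (if (string= file-a file-b)
                ;; Same file: compare line numbers as numbers, not strings.
                (< (string-to-number (cadr fields-a))
                   (string-to-number (cadr fields-b)))
              (string< file-a file-b))))))

;; Usage:
;; (my/sort-grep-lines '("b.el:2:x" "a.el:10:y" "a.el:9:z"))
```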

To enable ripgrep with this, one needs to 'M-x customize-variable 
xref-search-command-template'.

[-- Attachment #2: xref-ripgrep.diff --]
[-- Type: text/x-patch, Size: 3019 bytes --]

diff --git a/lisp/progmodes/xref.el b/lisp/progmodes/xref.el
index 3e3a37f6da..a8283d0d4a 100644
--- a/lisp/progmodes/xref.el
+++ b/lisp/progmodes/xref.el
@@ -1246,12 +1246,45 @@ xref-matches-in-directory
 (declare-function tramp-tramp-file-p "tramp")
 (declare-function tramp-file-local-name "tramp")
 
+;; '-s' because 'git ls-files' can output broken symlinks.
+(defconst xref-grep-search-template
+  "xargs -0 grep <C> -snHE -e <R>"
+  "Use Grep to search a list of files piped from stdin.")
+
+;; See https://github.com/BurntSushi/ripgrep/issues/152 on
+;; the subject of non-deterministic output.
+(defconst xref-ripgrep-search-template
+  "xargs -0 rg <C> -nH --no-messages -g '!*/' -e <R> | sort -t: -k1 -k2n"
+  "Use ripgrep to search a list of files piped from stdin.
+
+The arguments are chosen carefully so that the output format is
+compatible with Grep.  As well as its '-s' argument.
+
+Note: by default, ripgrep's output order is non-deterministic
+because it does the search in parallel.  You can use the template
+without the '| sort ...' part if GNU sort is not available on
+your system and/or stable ordering is not important to you.")
+
+(defcustom xref-search-command-template xref-grep-search-template
+  "Command template to search a list of files piped from stdin.
+
+Allowed fields:
+
+  <C> for extra arguments such as -i and --color
+  <R> for the regexp itself (in Extended format)"
+  :type `(choice
+          (const :tag "Use Grep" ,xref-grep-search-template)
+          (const :tag "Use ripgrep" ,xref-ripgrep-search-template)
+          (string :tag "User defined")))
+
 ;;;###autoload
 (defun xref-matches-in-files (regexp files)
   "Find all matches for REGEXP in FILES.
 Return a list of xref values.
 FILES must be a list of absolute file names."
   (cl-assert (consp files))
+  (require 'grep)
+  (defvar grep-highlight-matches)
   (pcase-let*
       ((output (get-buffer-create " *project grep output*"))
        (`(,grep-re ,file-group ,line-group . ,_) (car grep-regexp-alist))
@@ -1261,13 +1294,12 @@ xref-matches-in-files
        ;; first file is remote, they all are, and on the same host.
        (dir (file-name-directory (car files)))
        (remote-id (file-remote-p dir))
-       ;; 'git ls-files' can output broken symlinks.
-       (command (format "xargs -0 grep %s -snHE -e %s"
-                        (if (and case-fold-search
-                                 (isearch-no-upper-case-p regexp t))
-                            "-i"
-                          "")
-                        (shell-quote-argument (xref--regexp-to-extended regexp)))))
+       ;; The 'auto' default would be fine too, but ripgrep can't handle
+       ;; the options we pass in that case.
+       (grep-highlight-matches)
+       (command (grep-expand-template xref-search-command-template
+                                      (xref--regexp-to-extended regexp)
+                                      regexp)))
     (when remote-id
       (require 'tramp)
       (setq files (mapcar
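
For readers who want to see what the `<C>` and `<R>` fields turn into, the expansion can be tried in isolation. This is a minimal sketch, assuming the `grep-expand-template` function from grep.el that the patch calls; `my/expand-search-template` is a hypothetical wrapper, and the exact quoting of the regexp depends on `shell-quote-argument` on your system.

```elisp
(require 'grep)

;; Hypothetical wrapper, not part of the patch.
(defun my/expand-search-template (template regexp)
  "Expand the <C> and <R> fields of TEMPLATE for the extended REGEXP."
  (let ((grep-highlight-matches nil)  ; keep --color options out of <C>
        (case-fold-search t))         ; an all-lower-case REGEXP then adds -i
    (grep-expand-template template regexp)))

;; Usage: returns the shell command string that would be run.
;; (my/expand-search-template
;;  "xargs -0 rg <C> -nH --no-messages -g '!*/' -e <R>"
;;  "grep-regexp-alist")
```

Binding `grep-highlight-matches` to nil mirrors what the patch does, since per the comment in the diff ripgrep cannot handle the color options that would otherwise be passed.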


* Re: project-find-regexp using ripgrep
  2020-06-22  0:01     ` Dmitry Gutov
@ 2020-06-22  2:30       ` Eli Zaretskii
  2020-06-22 13:10         ` Dmitry Gutov
  2020-06-22  3:12       ` andrés ramírez
  1 sibling, 1 reply; 14+ messages in thread
From: Eli Zaretskii @ 2020-06-22  2:30 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: rrandresf, emacs-devel

> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Mon, 22 Jun 2020 03:01:42 +0300
> Cc: emacs-devel <emacs-devel@gnu.org>
> 
> +(defcustom xref-search-command-template xref-grep-search-template
> +  "Command template to search a list of files piped from stdin.
> +
> +Allowed fields:
> +
> +  <C> for extra arguments such as -i and --color
> +  <R> for the regexp itself (in Extended format)"
> +  :type `(choice
> +          (const :tag "Use Grep" ,xref-grep-search-template)
> +          (const :tag "Use ripgrep" ,xref-ripgrep-search-template)
> +          (string :tag "User defined")))

Please don't forget the :version tag.

Also, I think this new option should be in NEWS.

Thanks.




* Re: project-find-regexp using ripgrep
  2020-06-22  0:01     ` Dmitry Gutov
  2020-06-22  2:30       ` Eli Zaretskii
@ 2020-06-22  3:12       ` andrés ramírez
  2020-12-04  1:56         ` Dmitry Gutov
  1 sibling, 1 reply; 14+ messages in thread
From: andrés ramírez @ 2020-06-22  3:12 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

Hello. Dmitry.

>>>>> "Dmitry" == Dmitry Gutov <dgutov@yandex.ru> writes:

    Dmitry> On 20.06.2020 07:09, andres.ramirez wrote:
    >> Dmitry> Anybody?
    >>
    >> Before the patch:
    >> --8<---------------cut here---------------start------------->8---
    >> Elapsed time: 6.010101s
    >> Elapsed time: 5.863914s
    >> --8<---------------cut here---------------end--------------->8---
    >>
    >> After installing ripgrep and patch:
    >> --8<---------------cut here---------------start------------->8---
    >> Elapsed time: 3.261737s
    >> Elapsed time: 1.742008s
    >> --8<---------------cut here---------------end--------------->8---

    Dmitry> Thanks, Andres. Looks promising.

    Dmitry> Here's the latest version of the patch, 
[...]

--8<---------------cut here---------------start------------->8---
Elapsed time: 3.017855s
Elapsed time: 1.403502s
--8<---------------cut here---------------end--------------->8---

Best Regards




* Re: project-find-regexp using ripgrep
  2020-06-22  2:30       ` Eli Zaretskii
@ 2020-06-22 13:10         ` Dmitry Gutov
  2020-06-22 14:53           ` Eli Zaretskii
  0 siblings, 1 reply; 14+ messages in thread
From: Dmitry Gutov @ 2020-06-22 13:10 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rrandresf, emacs-devel

On 22.06.2020 05:30, Eli Zaretskii wrote:
> Please don't forget the :version tag.
> 
> Also, I think this new option should be in NEWS.

Yes, of course. I'm going to write something unusually verbose in there, 
so you might have to cut it down after.

Have you tried the patch yourself, BTW? I'm curious about the impact on 
slower machines (with fewer cores, among other things) and/or 
unoptimized Emacs builds.




* Re: project-find-regexp using ripgrep
  2020-06-22 13:10         ` Dmitry Gutov
@ 2020-06-22 14:53           ` Eli Zaretskii
  0 siblings, 0 replies; 14+ messages in thread
From: Eli Zaretskii @ 2020-06-22 14:53 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: rrandresf, emacs-devel

> Cc: rrandresf@gmail.com, emacs-devel@gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Mon, 22 Jun 2020 16:10:36 +0300
> 
> On 22.06.2020 05:30, Eli Zaretskii wrote:
> > Please don't forget the :version tag.
> > 
> > Also, I think this new option should be in NEWS.
> 
> Yes, of course. I'm going to write something unusually verbose in there, 
> so you might have to cut it down after.

Thanks.

> Have you tried the patch yourself, BTW?

No, not yet.

> I'm curious about the impact on slower machines (with fewer cores,
> among other things) and/or unoptimized Emacs builds.

I use neither, at least not most of the time.




* Re: project-find-regexp using ripgrep
  2020-06-22  3:12       ` andrés ramírez
@ 2020-12-04  1:56         ` Dmitry Gutov
  2020-12-09  4:14           ` andrés ramírez
  0 siblings, 1 reply; 14+ messages in thread
From: Dmitry Gutov @ 2020-12-04  1:56 UTC (permalink / raw)
  To: andrés ramírez; +Cc: emacs-devel

Hi again!

On 22.06.2020 06:12, andrés ramírez wrote:
> Hello. Dmitry.
> 
>>>>>> "Dmitry" == Dmitry Gutov <dgutov@yandex.ru> writes:
> 
>      Dmitry> On 20.06.2020 07:09, andres.ramirez wrote:
>      >> Dmitry> Anybody?
>      >>
>      >> Before the patch:
>      >> --8<---------------cut here---------------start------------->8---
>      >> Elapsed time: 6.010101s
>      >> Elapsed time: 5.863914s
>      >> --8<---------------cut here---------------end--------------->8---
>      >>
>      >> After installing ripgrep and patch:
>      >> --8<---------------cut here---------------start------------->8---
>      >> Elapsed time: 3.261737s
>      >> Elapsed time: 1.742008s
>      >> --8<---------------cut here---------------end--------------->8---
> 
>      Dmitry> Thanks, Andres. Looks promising.
> 
>      Dmitry> Here's the latest version of the patch,
> [...]
> 
> --8<---------------cut here---------------start------------->8---
> Elapsed time: 3.017855s
> Elapsed time: 1.403502s
> --8<---------------cut here---------------end--------------->8---

Sorry for the considerable wait, and thanks.

I have just pushed the updated patch to master (f2a3d6e). If you track 
that branch, after updating you can enjoy ripgrep support with:

   (setq xref-search-program 'ripgrep)
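
For an init file shared between machines, the setting above could be made conditional on ripgrep being installed. This is a minimal sketch, assuming the executable is named "rg" and that `grep` is the other accepted value of `xref-search-program`.

```elisp
;; Declared so the sketch can be evaluated before xref.el is loaded.
(defvar xref-search-program)

;; Use ripgrep only when an "rg" executable is on the path;
;; otherwise stay with grep.
(setq xref-search-program
      (if (executable-find "rg") 'ripgrep 'grep))
```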




* Re: project-find-regexp using ripgrep
  2020-12-04  1:56         ` Dmitry Gutov
@ 2020-12-09  4:14           ` andrés ramírez
  2020-12-09 21:46             ` Dmitry Gutov
  0 siblings, 1 reply; 14+ messages in thread
From: andrés ramírez @ 2020-12-09  4:14 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

Hi. Dmitry.

>>>>> "Dmitry" == Dmitry Gutov <dgutov@yandex.ru> writes:


[...]


    Dmitry> Sorry for the considerable wait, and thanks.

NP. Thanks for the work, and for the email as well.

    Dmitry> I have just pushed the updated patch to master (f2a3d6e). If you track that branch,
    Dmitry> after updating you can enjoy ripgrep support with:

Since I am on ARM (not on x86), I stay with the bundles provided by the maintainers (pretest and
released versions). But from time to time I try to compile master when time permits. So today I
am replacing Emacs 27.1 with master.

    Dmitry>   (setq xref-search-program 'ripgrep)

Added to my dot emacs.

BTW, I have been following the thread about emacs-28 being slower than emacs-27. With this brand new
version of Emacs, running Emacs inside xterm feels slower. On the virtual console there is no problem with
Emacs. As I prefer the virtual console (Ctrl+Meta+F3), everything is fine on my side. I have just
noticed it, but perhaps I am not the only one.

Best Regards




* Re: project-find-regexp using ripgrep
  2020-12-09  4:14           ` andrés ramírez
@ 2020-12-09 21:46             ` Dmitry Gutov
  0 siblings, 0 replies; 14+ messages in thread
From: Dmitry Gutov @ 2020-12-09 21:46 UTC (permalink / raw)
  To: andrés ramírez; +Cc: emacs-devel

On 09.12.2020 06:14, andrés ramírez wrote:

>      Dmitry> I have just pushed the updated patch to master (f2a3d6e). If you track that branch,
>      Dmitry> after updating you can enjoy ripgrep support with:
> 
> Since I am on ARM (not on x86), I stay with the bundles provided by the maintainers (pretest and
> released versions).

Sounds like ripgrep is really faster than grep on ARM. Interesting.

> But from time to time I try to compile master when time permits. So today I
> am replacing Emacs 27.1 with master.

Good move, it's pretty stable for me.

But should you decide to go back to Emacs 27 for a while, you can 
install the newer version of xref from GNU ELPA. I just bumped its 
package version.

>      Dmitry>   (setq xref-search-program 'ripgrep)
> 
> Added to my dot emacs.
> 
> BTW, I have been following the thread about emacs-28 being slower than emacs-27. With this brand new
> version of Emacs, running Emacs inside xterm feels slower. On the virtual console there is no problem with
> Emacs. As I prefer the virtual console (Ctrl+Meta+F3), everything is fine on my side. I have just
> noticed it, but perhaps I am not the only one.

Can't really comment on that, I only use the virtual console for 
troubleshooting.



