From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dmitry Gutov
Newsgroups: gmane.emacs.devel
Subject: project-find-regexp using ripgrep
Date: Mon, 15 Jun 2020 00:30:08 +0300
Message-ID: <49f66d46-da8d-9658-ec85-ced39a99ad87@yandex.ru>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------5D4644706943B47AC8620C64"
To: emacs-devel

This is a multi-part message in MIME format.
--------------5D4644706943B47AC8620C64
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit

Here's a small patch I've been toying with, inspired by bug#41766.

In my testing, it makes the project search an order of magnitude faster, probably due to smart parallelization.

If people confirm this experience, I'm going to install it (or something similar), even though it would be nice to consolidate this search tool into something smarter that lives in one package only. But that's for the future.

How to try:

- M-x project-find-regexp in your favorite project.
- If you're feeling scientific, evaluate something like

    (benchmark 1 '(project-find-regexp "grep-regexp-alist"))

- Change the argument to something else if you're searching something other than the Emacs project.
- Try it a couple of times.
- Note the reported timings.
- Install ripgrep (e.g. with 'apt install ripgrep').
- Apply the patch.
- [Rebuild], restart Emacs.
- Repeat the first several steps.

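For anyone who wants to compare the two tools outside Emacs first, here is a rough shell sketch of the two command shapes the patch switches between. It is not part of the patch. The flags are copied from the diff; the function names, the temporary directory, and the two demo files are made up for illustration, and feeding the file list with find is an assumption standing in for whatever produces the list in a real project.

```shell
#!/bin/sh
# Sketch only: the two search pipelines, run outside Emacs.
# Both read a NUL-separated file list on stdin, as 'xargs -0' expects.

# Old command shape (flags from the 'format' call the patch removes).
search_grep() { xargs -0 grep -snHE -e "$1"; }

# New command shape (flags from xref-grep-template in the patch).
search_rg() { xargs -0 rg -nH --no-messages -g '!*/' -e "$1"; }

# Throwaway two-file tree (hypothetical names) so the sketch runs anywhere.
demo_dir=$(mktemp -d)
printf 'foo\n(defvar grep-regexp-alist nil)\n' > "$demo_dir/a.el"
printf 'nothing here\n' > "$demo_dir/b.el"

# Each hit is printed as FILE:LINE:TEXT.
grep_hits=$(find "$demo_dir" -type f -print0 | search_grep 'grep-regexp-alist')
printf '%s\n' "$grep_hits"

# Run the ripgrep variant only when rg is installed.
if command -v rg >/dev/null 2>&1; then
    find "$demo_dir" -type f -print0 | search_rg 'grep-regexp-alist'
fi

rm -rf "$demo_dir"
```

In a real checkout, piping 'git ls-files -z' into each function under bash's 'time' keyword should give a first impression of whether the speedup also shows up without Emacs in the loop.
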
--------------5D4644706943B47AC8620C64
Content-Type: text/x-patch; charset=UTF-8; name="xref-ripgrep.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="xref-ripgrep.diff"

diff --git a/lisp/progmodes/xref.el b/lisp/progmodes/xref.el
index 5b5fb4bc47..19fc362ddb 100644
--- a/lisp/progmodes/xref.el
+++ b/lisp/progmodes/xref.el
@@ -1246,12 +1246,20 @@ xref-matches-in-directory
 (declare-function tramp-tramp-file-p "tramp")
 (declare-function tramp-file-local-name "tramp")
 
+;; '-s' because 'git ls-files' can output broken symlinks.
+(defvar xref-grep-template
+  "xargs -0 rg -nH --no-messages -g '!*/' -e "
+  ;"xargs -0 grep -snHE -e "
+  )
+
 ;;;###autoload
 (defun xref-matches-in-files (regexp files)
   "Find all matches for REGEXP in FILES.
 Return a list of xref values.
 FILES must be a list of absolute file names."
   (cl-assert (consp files))
+  (require 'grep)
+  (defvar grep-highlight-matches)
   (pcase-let* ((output (get-buffer-create " *project grep output*"))
                (`(,grep-re ,file-group ,line-group . ,_)
                 (car grep-regexp-alist))
@@ -1261,13 +1269,12 @@ xref-matches-in-files
                ;; first file is remote, they all are, and on the same host.
                (dir (file-name-directory (car files)))
                (remote-id (file-remote-p dir))
-               ;; 'git ls-files' can output broken symlinks.
-               (command (format "xargs -0 grep %s -snHE -e %s"
-                                (if (and case-fold-search
-                                         (isearch-no-upper-case-p regexp t))
-                                    "-i"
-                                  "")
-                                (shell-quote-argument (xref--regexp-to-extended regexp)))))
+               ;; The 'auto' default would be fine too, but ripgrep can't handle
+               ;; the options we pass in that case.
+               (grep-highlight-matches)
+               (command (grep-expand-template xref-grep-template
+                                              (xref--regexp-to-extended regexp)
+                                              regexp)))
     (when remote-id
       (require 'tramp)
       (setq files (mapcar
--------------5D4644706943B47AC8620C64--