From: Juri Linkov
Newsgroups: gmane.emacs.bugs
Subject: bug#44983: Truncate long lines of grep output
Date: Wed, 09 Dec 2020 21:17:28 +0200
Organization: LINKOV.NET
Message-ID: <87lfe6x1uf.fsf@mail.linkov.net>
In-Reply-To: <87v9dlc3ti.fsf_-_@mail.linkov.net>
To: Dmitry Gutov
Cc: 44983@debbugs.gnu.org

>>> Alternatively, xref--collect-matches-1 could apply the limit itself, no
>>> matter whether grep or rg is used. And it could make sure to only do that
>>> after the last match. This might be the slower option, but hard to say in
>>> advance, some comparison benchmark could help here.
>> I think until a long string is inserted to the buffer, truncating the
>> string in the variable in xref--collect-matches-1 should be much faster.
>
> It would surely be faster, but how would that overhead compare to the
> whole operation?
>
> Could be negligible, except in the most extreme cases. After all, the main
> slowdown factor with long strings is the display engine, and it won't be in
> play there.
>
> The upside is we'd be able to support column limiting with Grep too. Which
> is the default configuration. And we'd extract the cutoff column into
> a more visible user option.

This is exactly what we need.  After that this bug report/feature request
can be closed.

BTW, for sorting currently xref-search-program-alist uses:

  "| sort -t: -k1,1 -k2n,2"

but fortunately ripgrep has a special option to do the same with:

  "--sort path"

>>> That aside, could you explain the difference between the regexps? Do grep
>>> and rg use different colors or something like that? Ideally, of course,
>>> that would be just 1 regexp (if that's possible without loss in
>>> performance, or significant loss in clarify).
>> They should be merged into one regexp indeed.  Because after customizing it
>> to the rg regexp, grep output doesn't highlight matches anymore (I use both
>> grep and rg interchangeably by different commands).
>> Currently their separate regexps are:
>> grep:
>> "\033\\[0?1;31m
>> \\(.*?\\)
>> \033\\[[0-9]*m"
>> rg:
>> "\033\\[[0-9]*m
>> \033\\[[0-9]*1m
>> \033\\[[0-9]*1m
>> \\(.*?\\)
>> \033\\[[0-9]*0m"
>> That could be combined into one regexp:
>> "\033\\[[0-9?;]*m
>> \\(?:\033\\[[0-9]*1m\\)\\{0,2\\}
>> \\(.*?\\)
>> \033\\[[0-9]*0?m"
>
> Makes sense. Is the parsing performance the same?

Performance is not a problem.  The problem is that a more lax regexp
causes more false positives.  So the above regexp highlighted even
the separator colons (':') between file names and column numbers.

BTW, it's possible to see all highlighted parts of the output
by changing the argument 'MODE' of 'compilation-start' in 'grep'
from #'grep-mode to t (so it uses comint-mode in grep buffers).

Anyway, I found the shortest change needed to support ripgrep,
and pushed it to master.

> Also, with the increased complexity, I'd rather we added a couple of tests,
> or a comment with output examples. Or maybe both.

Fortunately, we have all possible cases listed in etc/grep.txt,
so it was easy to check if everything is highlighted correctly now.
Also I added ripgrep samples to etc/grep.txt.
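
For anyone who wants to reproduce the false positives outside Emacs,
here is a rough sketch in Python.  The three regexps quoted above are
translated by hand from Emacs syntax to Python's `re` syntax, and the
two sample lines are hand-written approximations of colored grep and rg
output (assumed escape sequences, not captured from a real run), so
treat it as an illustration only:

```python
import re

# Hand translation of the Emacs regexps quoted above into Python syntax:
# \\( \\) becomes ( ), \\{0,2\\} becomes {0,2}, \033 becomes \x1b.
GREP_RE = re.compile(r"\x1b\[0?1;31m(.*?)\x1b\[[0-9]*m")
RG_RE = re.compile(
    r"\x1b\[[0-9]*m\x1b\[[0-9]*1m\x1b\[[0-9]*1m(.*?)\x1b\[[0-9]*0m")
COMBINED_RE = re.compile(
    r"\x1b\[[0-9?;]*m(?:\x1b\[[0-9]*1m){0,2}(.*?)\x1b\[[0-9]*0?m")

# Assumed sample lines: only the match is colored in the grep line,
# while the rg line also colors the file name and the line number.
grep_line = "file.c:12:int \x1b[01;31mfoo\x1b[m(void);"
rg_line = ("\x1b[0m\x1b[35mfile.c\x1b[0m:"
           "\x1b[0m\x1b[32m12\x1b[0m:"
           "int \x1b[0m\x1b[1m\x1b[31mfoo\x1b[0m(void);")

# Each strict regexp finds only the real match in its own format.
print(GREP_RE.findall(grep_line))      # ['foo']
print(RG_RE.findall(rg_line))          # ['foo']
print(GREP_RE.findall(rg_line))        # []

# The combined regexp still works on the grep line...
print(COMBINED_RE.findall(grep_line))  # ['foo']

# ...but on the rg line it also captures an empty string, the
# separator colon and the line number, i.e. the false positives.
print(COMBINED_RE.findall(rg_line))    # ['', ':', '12', 'foo']
```

With these sample lines the combined regexp pairs any reset sequence
with the next color sequence, which is why the ':' between the file
name and the number ends up in the captured group.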