unofficial mirror of bug-gnu-emacs@gnu.org 
From: Dmitry Gutov <dgutov@yandex.ru>
To: Juri Linkov <juri@linkov.net>
Cc: 44983@debbugs.gnu.org
Subject: bug#44983: Truncate long lines of grep output
Date: Wed, 9 Dec 2020 22:06:01 +0200
Message-ID: <c7ee54eb-ee1a-3c7b-7c92-325a05c049c5@yandex.ru>
In-Reply-To: <87lfe6x1uf.fsf@mail.linkov.net>

On 09.12.2020 21:17, Juri Linkov wrote:
>>> I think until a long string is inserted to the buffer, truncating the
>>> string in the variable in xref--collect-matches-1 should be much faster.
>>
>> It would surely be faster, but how would that overhead compare to the
>> whole operation?
>>
>> Could be negligible, except in the most extreme cases. After all, the main
>> slowdown factor with long strings is the display engine, and it won't be in
>> play there.
>>
>> The upside is we'd be able to support column limiting with Grep too. Which
>> is the default configuration. And we'd extract the cutoff column into
>> a more visible user option.
> 
> This is exactly what we need.  After that this bug report/feature request
> can be closed.

Perhaps you would like to come up with the name for the new user option? 
The changes to xref--collect-matches-1 should be straightforward (it 
will include a choice, though: whether to cut off matches when they 
don't fit). Since you're the one who has experienced poor performance 
because of this, though, you can do the benchmarking. Basically, what we 
need to know is whether the new option indeed makes performance acceptable.
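
To make the idea concrete, here is a minimal sketch of cutting the string
off before it ever reaches a buffer. The option name and both function
names are hypothetical (no name has been chosen in this thread), and this
is not the actual code of xref--collect-matches-1.

```elisp
;; Hypothetical option name; the thread has not settled on one.
(defvar my-xref-truncation-width 400
  "Hypothetical cutoff column for lines collected from search output.
When nil, lines are kept whole.")

(defun my-xref-truncate-line (line)
  "Return LINE cut off at `my-xref-truncation-width' characters.
The cut is applied to the string itself, before any buffer insertion."
  (if (and my-xref-truncation-width
           (> (length line) my-xref-truncation-width))
      (substring line 0 my-xref-truncation-width)
    line))

(defun my-xref-match-fits-p (match-end)
  "Return non-nil if a match ending at column MATCH-END survives the cutoff.
This is the choice mentioned above: a caller may keep or drop the
matches that fall beyond the cutoff column."
  (or (null my-xref-truncation-width)
      (<= match-end my-xref-truncation-width)))

;; Usage: with a cutoff of 10, only the first 10 characters remain.
(let ((my-xref-truncation-width 10))
  (my-xref-truncate-line "0123456789abcdef")) ; => "0123456789"
```

Whether matches past the cutoff are dropped or kept is left to the
caller here, since that decision is still open.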

> BTW, for sorting currently xref-search-program-alist uses:
> 
>      "| sort -t: -k1,1 -k2n,2"
> 
> but fortunately ripgrep has a special option to do the same with:
> 
>      "--sort path"

Somehow, that option turned out to be consistently slower in my 
benchmarking, even when the results are only a few lines (that's 
actually when the difference should be most apparent, because with many 
lines Elisp takes up most of the CPU time). You can try it yourself:

(benchmark 10 '(project-find-regexp ":package-version '(xref"))

   0.86 with '| sort'
   1.33 with '--sort path'

$ rg --version
ripgrep 12.1.1 (rev 7cb211378a)
-SIMD -AVX (compiled)
+SIMD +AVX (runtime)

We can still document it in the docstring, though, for those who don't 
have 'sort' installed.
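
For reference, the ordering that "sort -t: -k1,1 -k2n,2" asks for (by
file name first, then numerically by line number) can be sketched in
Elisp as below. This only illustrates the sort keys and is not something
xref does. The function name is hypothetical, file names are assumed to
contain no colons, and `string<' compares by character code, whereas
sort(1) follows the locale's collation.

```elisp
(defun my-sort-grep-lines (lines)
  "Return a sorted copy of LINES, a list of \"FILE:LINE:TEXT\" strings.
Entries are ordered by FILE, then numerically by LINE."
  (sort (copy-sequence lines)
        (lambda (a b)
          (let* ((fields-a (split-string a ":"))
                 (fields-b (split-string b ":"))
                 (file-a (car fields-a))
                 (file-b (car fields-b)))
            (if (string= file-a file-b)
                ;; Same file: compare line numbers as numbers, so that
                ;; line 9 comes before line 10.
                (< (string-to-number (or (cadr fields-a) "0"))
                   (string-to-number (or (cadr fields-b) "0")))
              (string< file-a file-b))))))

;; Usage:
(my-sort-grep-lines '("b.el:2:x" "a.el:10:y" "a.el:9:z"))
;; => ("a.el:9:z" "a.el:10:y" "b.el:2:x")
```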

>>> They should be merged into one regexp indeed.  Because after customizing
>>> it
>>> to the rg regexp, grep output doesn't highlight matches anymore (I use both
>>> grep and rg interchangeably by different commands).
>>> Currently their separate regexps are:
>>> grep:
>>> "\033\\[0?1;31m
>>>    \\(.*?\\)
>>>    \033\\[[0-9]*m"
>>> rg:
>>> "\033\\[[0-9]*m
>>>    \033\\[[0-9]*1m
>>>    \033\\[[0-9]*1m
>>>    \\(.*?\\)
>>>    \033\\[[0-9]*0m"
>>> That could be combined into one regexp:
>>> "\033\\[[0-9?;]*m
>>>    \\(?:\033\\[[0-9]*1m\\)\\{0,2\\}
>>>    \\(.*?\\)
>>>    \033\\[[0-9]*0?m"
>>
>> Makes sense. Is the parsing performance the same?
> 
> Performance is not a problem.  The problem is that more lax regexp
> causes more false positives.  So the above regexp highlighted even
> the separator colons (':') between file names and column numbers.
> 
> BTW, it's possible to see all highlighted parts of the output
> by changing the argument 'MODE' of 'compilation-start' in 'grep'
> from #'grep-mode to t (so it uses comint-mode in grep buffers).

Because ansi-color-process-output is in comint-output-filter-functions?

> Anyway, I found the shortest change needed to support ripgrep,
> and pushed to master.

Excellent.
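
As a small illustration of what these regexps do, here is the grep
variant quoted above, joined onto one line, applied to a hand-written
sample string. The constant and function names are hypothetical, and the
sample line is made up for the example rather than copied from real grep
output or from etc/grep.txt.

```elisp
(defconst my-grep-match-regexp
  "\033\\[0?1;31m\\(.*?\\)\033\\[[0-9]*m"
  "The grep regexp quoted above, joined onto one line.")

(defun my-highlighted-matches (line regexp)
  "Return the substrings of LINE captured by group 1 of REGEXP.
REGEXP is assumed never to match the empty string."
  (let ((start 0)
        (matches '()))
    (while (string-match regexp line start)
      (push (match-string 1 line) matches)
      ;; Continue searching right after the closing escape sequence.
      (setq start (match-end 0)))
    (nreverse matches)))

;; Usage: the text between the color escapes is extracted.
(my-highlighted-matches
 "lisp/foo.el:12:(defun \033[01;31mfoo\033[m ()"
 my-grep-match-regexp)
;; => ("foo")
```

A laxer regexp run through the same function would also return any
other colored parts of the line, which is the false-positive problem
described above.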

>> Also, with the increased complexity, I'd rather we added a couple of tests,
>> or a comment with output examples. Or maybe both.
> 
> Fortunately, we have all possible cases listed in etc/grep.txt,
> so it was easy to check if everything is highlighted correctly now.
> Also I added ripgrep samples to etc/grep.txt.

Thanks!




