* bug#44983: Truncate long lines of grep output
@ 2020-12-01  8:45 Juri Linkov
  2020-12-01 15:02 ` Dmitry Gutov
  0 siblings, 1 reply; 57+ messages in thread
From: Juri Linkov @ 2020-12-01  8:45 UTC
To: 44983

[New bug report from emacs-devel]
>>>> For grep output a bigger problem is that grep on binary data
>>>> might output too long lines before the terminating newline.
>>>
>>> (*) We already have this kind of problem with "normal" files which contain
>>> minified assets (JS or CSS). The file contents are usually normal ASCII,
>>> but it's just one line which can reach several MBs in length.
>>>
>>> The usual way to deal with that is with project-ignores and
>>> grep-find-ignored-files. That works for both cases.
>> This is a big problem - often grep output lines are so long
>> that Emacs freezes, so need to kill the process. Updating
>> manually ignored-files every time a new file causes freeze
>> is a very unreliable and time-consuming workaround.
>
> And a non-obvious one (for an average user).
>
> Is the same problem exhibited by commands using the Xref UI? I don't
> remember seeing it, but of course our projects can be very different.

No difference from grep, Xref output has the same problem.

>> I tried to fix this problem, and fortunately the fix is simple
>> with the 1-liner patch.
>> It does exactly the same thing that we recently did to hide
>> overly long grep command lines with 'grep-find-abbreviate'.
>> The patch even uses the same 'grep-find-abbreviate-properties'
>> to allow clicking the hidden part to expand it.
>> diff --git a/lisp/progmodes/grep.el b/lisp/progmodes/grep.el
>> index dafba22f77..e0df2402ee 100644
>> --- a/lisp/progmodes/grep.el
>> +++ b/lisp/progmodes/grep.el
>> @@ -492,6 +492,9 @@ grep-mode-font-lock-keywords
>>        (0 grep-context-face)
>>        (1 (if (eq (char-after (match-beginning 1)) ?\0)
>>               `(face nil display ,(match-string 2)))))
>> +     ;; Hide excessive parts of grep output lines
>> +     ("^.+?:.\\{,64\\}\\(.*\\).\\{10\\}$"
>> +      1 grep-find-abbreviate-properties)
>>       ;; Hide excessive part of rgrep command
>>       ("^find \\(\\. -type d .*\\\\)\\)"
>>        (1 (if grep-find-abbreviate grep-find-abbreviate-properties
>>
>> More customizability could be added later to define the
>> length of the hidden part, etc.
>
> Maybe we'll want it to be dynamically determined by fill-column.
>
> Or just be a big enough value (e.g. 256) that the only lines where this
> rule is hit are obviously too long.

Or maybe determined by the frame width.

This will avoid the need of using such workarounds as in bug#44941:

  grep -a "$@" | cut -c -200

* bug#44983: Truncate long lines of grep output
  2020-12-01  8:45 bug#44983: Truncate long lines of grep output Juri Linkov
@ 2020-12-01 15:02 ` Dmitry Gutov
  2020-12-01 16:09   ` Eli Zaretskii
  2020-12-01 20:34   ` Juri Linkov
  0 siblings, 2 replies; 57+ messages in thread
From: Dmitry Gutov @ 2020-12-01 15:02 UTC
To: Juri Linkov, 44983

On 01.12.2020 10:45, Juri Linkov wrote:
> [New bug report from emacs-devel]
>>>>> For grep output a bigger problem is that grep on binary data
>>>>> might output too long lines before the terminating newline.
>>>>
>>>> (*) We already have this kind of problem with "normal" files which contain
>>>> minified assets (JS or CSS). The file contents are usually normal ASCII,
>>>> but it's just one line which can reach several MBs in length.
>>>>
>>>> The usual way to deal with that is with project-ignores and
>>>> grep-find-ignored-files. That works for both cases.
>>> This is a big problem - often grep output lines are so long
>>> that Emacs freezes, so need to kill the process. Updating
>>> manually ignored-files every time a new file causes freeze
>>> is a very unreliable and time-consuming workaround.
>>
>> And a non-obvious one (for an average user).
>>
>> Is the same problem exhibited by commands using the Xref UI? I don't
>> remember seeing it, but of course our projects can be very different.
>
> No difference from grep, Xref output has the same problem.

Perhaps (setq truncate-lines t) could help in that case?

Then the lines would be cut at the window width, as you suggest below.

> This will avoid the need of using such workarounds as in bug#44941:
>
>    grep -a "$@" | cut -c -200

That might cut filenames unnecessarily. Even when those are long, we need
them in their entirety.

The Grep results parsing code could be changed to only take the first XY
characters of each line, though.

* bug#44983: Truncate long lines of grep output
  2020-12-01 15:02 ` Dmitry Gutov
@ 2020-12-01 16:09   ` Eli Zaretskii
  2020-12-01 16:46     ` Andreas Schwab
  2020-12-01 20:35     ` Juri Linkov
  2020-12-01 20:34   ` Juri Linkov
  1 sibling, 2 replies; 57+ messages in thread
From: Eli Zaretskii @ 2020-12-01 16:09 UTC
To: Dmitry Gutov; +Cc: 44983, juri

> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Tue, 1 Dec 2020 17:02:09 +0200
>
> >>> This is a big problem - often grep output lines are so long
> >>> that Emacs freezes, so need to kill the process. Updating
> >>> manually ignored-files every time a new file causes freeze
> >>> is a very unreliable and time-consuming workaround.
> >>
> >> And a non-obvious one (for an average user).
> >>
> >> Is the same problem exhibited by commands using the Xref UI? I don't
> >> remember seeing it, but of course our projects can be very different.
> >
> > No difference from grep, Xref output has the same problem.
>
> Perhaps (setq truncate-lines t) could help in that case?

Not necessarily, because the truncated parts are still in the buffer,
and the display code which is slow in that case basically moves
through the buffer one character at a time in many cases. Only some
specific scenarios (read: a small number of commands) can jump to the
next physical line disregarding the truncated parts.

* bug#44983: Truncate long lines of grep output
  2020-12-01 16:09 ` Eli Zaretskii
@ 2020-12-01 16:46   ` Andreas Schwab
  2020-12-01 18:26     ` Eli Zaretskii
  2020-12-01 20:35   ` Juri Linkov
  1 sibling, 1 reply; 57+ messages in thread
From: Andreas Schwab @ 2020-12-01 16:46 UTC
To: Eli Zaretskii; +Cc: Dmitry Gutov, 44983, juri

On Dez 01 2020, Eli Zaretskii wrote:

>> From: Dmitry Gutov <dgutov@yandex.ru>
>> Date: Tue, 1 Dec 2020 17:02:09 +0200
>>
>> >>> This is a big problem - often grep output lines are so long
>> >>> that Emacs freezes, so need to kill the process. Updating
>> >>> manually ignored-files every time a new file causes freeze
>> >>> is a very unreliable and time-consuming workaround.
>> >>
>> >> And a non-obvious one (for an average user).
>> >>
>> >> Is the same problem exhibited by commands using the Xref UI? I don't
>> >> remember seeing it, but of course our projects can be very different.
>> >
>> > No difference from grep, Xref output has the same problem.
>>
>> Perhaps (setq truncate-lines t) could help in that case?
>
> Not necessarily, because the truncated parts are still in the buffer,
> and the display code which is slow in that case basically moves
> through the buffer one character at a time in many cases. Only some
> specific scenarios (read: a small number of commands) can jump to the
> next physical line disregarding the truncated parts.

But moving through the buffer is much faster than rendering it.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

* bug#44983: Truncate long lines of grep output
  2020-12-01 16:46 ` Andreas Schwab
@ 2020-12-01 18:26   ` Eli Zaretskii
  0 siblings, 0 replies; 57+ messages in thread
From: Eli Zaretskii @ 2020-12-01 18:26 UTC
To: Andreas Schwab; +Cc: dgutov, 44983, juri

> From: Andreas Schwab <schwab@linux-m68k.org>
> Cc: Dmitry Gutov <dgutov@yandex.ru>, 44983@debbugs.gnu.org, juri@linkov.net
> Date: Tue, 01 Dec 2020 17:46:33 +0100
>
> > Not necessarily, because the truncated parts are still in the buffer,
> > and the display code which is slow in that case basically moves
> > through the buffer one character at a time in many cases. Only some
> > specific scenarios (read: a small number of commands) can jump to the
> > next physical line disregarding the truncated parts.
>
> But moving through the buffer is much faster than rendering it.

I meant moving in the likes of move_it_to. These simulate display, so
they are almost as slow as rendering itself.

* bug#44983: Truncate long lines of grep output
  2020-12-01 16:09 ` Eli Zaretskii
  2020-12-01 16:46   ` Andreas Schwab
@ 2020-12-01 20:35   ` Juri Linkov
  2020-12-02  3:21     ` Eli Zaretskii
  1 sibling, 1 reply; 57+ messages in thread
From: Juri Linkov @ 2020-12-01 20:35 UTC
To: Eli Zaretskii; +Cc: 44983, Dmitry Gutov

>> Perhaps (setq truncate-lines t) could help in that case?
>
> Not necessarily, because the truncated parts are still in the buffer,
> and the display code which is slow in that case basically moves
> through the buffer one character at a time in many cases. Only some
> specific scenarios (read: a small number of commands) can jump to the
> next physical line disregarding the truncated parts.

It's very strange that after adding the text property 'display "[…]"
on a very long line, motion commands are still very slow in that buffer.

Could you help to understand why hiding long regions
doesn't help to improve performance?

* bug#44983: Truncate long lines of grep output
  2020-12-01 20:35 ` Juri Linkov
@ 2020-12-02  3:21   ` Eli Zaretskii
  2020-12-02  9:35     ` Juri Linkov
  0 siblings, 1 reply; 57+ messages in thread
From: Eli Zaretskii @ 2020-12-02  3:21 UTC
To: Juri Linkov; +Cc: 44983, dgutov

> From: Juri Linkov <juri@linkov.net>
> Cc: Dmitry Gutov <dgutov@yandex.ru>, 44983@debbugs.gnu.org
> Date: Tue, 01 Dec 2020 22:35:55 +0200
>
> >> Perhaps (setq truncate-lines t) could help in that case?
> >
> > Not necessarily, because the truncated parts are still in the buffer,
> > and the display code which is slow in that case basically moves
> > through the buffer one character at a time in many cases. Only some
> > specific scenarios (read: a small number of commands) can jump to the
> > next physical line disregarding the truncated parts.
>
> It's very strange that after adding the text property 'display "[…]"
> on a very long line, motion commands are still very slow in that buffer.
>
> Could you help to understand why hiding long regions
> doesn't help to improve performance?

I can try, but please tell which commands are slow. Is it C-f/C-b,
C-n/C-p, C-v/M-v, something else?

* bug#44983: Truncate long lines of grep output
  2020-12-02  3:21 ` Eli Zaretskii
@ 2020-12-02  9:35   ` Juri Linkov
  2020-12-02 10:28     ` Eli Zaretskii
  2022-04-29 11:39     ` Lars Ingebrigtsen
  0 siblings, 2 replies; 57+ messages in thread
From: Juri Linkov @ 2020-12-02  9:35 UTC
To: Eli Zaretskii; +Cc: 44983, dgutov

>> It's very strange that after adding the text property 'display "[…]"
>> on a very long line, motion commands are still very slow in that buffer.
>>
>> Could you help to understand why hiding long regions
>> doesn't help to improve performance?
>
> I can try, but please tell which commands are slow. Is it C-f/C-b,
> C-n/C-p, C-v/M-v, something else?

Hmm, something strange is going on. After inserting million-char lines:

  (dotimes (_ 10)
    (insert (propertize (make-string 1000000 ?a)
                        'display "[…]" 'invisible t)
            "\n"))

No problem, everything is still fast, C-f/C-b, C-n/C-p, C-v/M-v move fast.

After saving to a file, grep on this file is fast with the previous patch
that hides long lines.

However, when grepping on minified web assets files where all styles and
scripts are on one long line, then output becomes slower and slower
as the line inserted by the grep process filter grows longer.

It works this way: compilation-filter/grep-filter inserts the next chunk
of the long line, then font-lock applies the rule from the previous patch
that hides the inserted substring starting from the fixed position from
the beginning of the line until the end of the line, and repeats the same
for every new inserted chunk of the long line.

Maybe instead of using font-lock to hide long parts of grep lines,
it would be better to do the same directly in compilation-filter/grep-filter?

Or maybe the problem is caused by special characters
used in minified web assets that contain many '{' chars.
And indeed, after inserting 100 thousands of '{'

  (insert (propertize (make-string 100000 ?{)
                      'display "[…]" 'invisible t) "\n")

and saving to a file, later visiting such file
Emacs becomes unresponsive for indefinite time.
But visiting the file with 100 thousands '{'
with find-file-literally causes no such problem.

* bug#44983: Truncate long lines of grep output
  2020-12-02  9:35 ` Juri Linkov
@ 2020-12-02 10:28   ` Eli Zaretskii
  2020-12-02 20:53     ` Juri Linkov
  2022-04-29 11:39   ` Lars Ingebrigtsen
  1 sibling, 1 reply; 57+ messages in thread
From: Eli Zaretskii @ 2020-12-02 10:28 UTC
To: Juri Linkov; +Cc: 44983, dgutov

On December 2, 2020 11:35:38 AM GMT+02:00, Juri Linkov <juri@linkov.net> wrote:
>
> Or maybe the problem is caused by special characters
> used in minified web assets that contain many '{' chars.
> And indeed, after inserting 100 thousands of '{'
>
> (insert (propertize (make-string 100000 ?{)
>                     'display "[…]" 'invisible t) "\n")
>
> and saving to a file, later visiting such file
> Emacs becomes unresponsive for indefinite time.
> But visiting the file with 100 thousands '{'
> with find-file-literally causes no such problem.

Does it help to set bidi-inhibit-bpa non-nil?

* bug#44983: Truncate long lines of grep output
  2020-12-02 10:28 ` Eli Zaretskii
@ 2020-12-02 20:53   ` Juri Linkov
  2020-12-03 14:47     ` Eli Zaretskii
  0 siblings, 1 reply; 57+ messages in thread
From: Juri Linkov @ 2020-12-02 20:53 UTC
To: Eli Zaretskii; +Cc: 44983, dgutov

>> Or maybe the problem is caused by special characters
>> used in minified web assets that contain many '{' chars.
>> And indeed, after inserting 100 thousands of '{'
>>
>> (insert (propertize (make-string 100000 ?{)
>>                     'display "[…]" 'invisible t) "\n")
>>
>> and saving to a file, later visiting such file
>> Emacs becomes unresponsive for indefinite time.
>> But visiting the file with 100 thousands '{'
>> with find-file-literally causes no such problem.
>
> Does it help to set bidi-inhibit-bpa non-nil?

This helped to open the file with a lot of '{'.
But on minified files grep.el is still very slow.

Then instead of hiding long lines using font-lock,
I tried to do the same using the process filter:

  (defun grep-filter ()
    (save-excursion
      (let ((end (point-marker)))
        (goto-char compilation-filter-start)
        (forward-line 0)
        (while (< (point) end)
          (let ((eol (line-end-position)))
            (when (> (- eol (point)) 64)
              (put-text-property (+ 64 (point)) (line-end-position)
                                 'display "[…]"))
            (forward-line 1))))))

Still very slow.

Then tried to delete the excessive parts of long lines:

  (defun grep-filter-try ()
    (save-excursion
      (let ((end (point-marker)))
        (goto-char compilation-filter-start)
        (forward-line 0)
        (while (< (point) end)
          (let ((eol (line-end-position)))
            (when (> (- eol (point)) 64)
              (delete-region (min (+ 64 (point)) (point-max)) (line-end-position)))
            (forward-line 1))))))

Now Emacs becomes more responsive. But still output processing
takes too much time.

Finally, the last thing to try was the same solution that Richard
showed in bug#44941:

  grep -a "$@" | cut -c -200

that gives the best possible result.

I doubt that it would be possible to invent something better.

So the question is should this be customizable for adding
`cut -c` automatically to the end of the grep command line,
possibly with `stdbuf -oL` suggested by Andreas.
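
The `grep | cut` pipeline above can be tried outside Emacs with a small wrapper. This is only a sketch under stated assumptions: the function name `grep_cut` and the `GREP_CUT_WIDTH` variable are hypothetical names made up for illustration and are not part of grep.el or of any patch in this thread. One thing worth keeping in mind is that the exit status of a shell pipeline is that of its last command, so with `cut` at the end the caller no longer sees grep's own exit code.

```shell
# Hypothetical helper (not part of grep.el): run grep and keep only the
# first GREP_CUT_WIDTH positions of every output line, 200 by default.
grep_cut() {
    # -a treats binary files as text, as in the bug#44941 workaround.
    grep -a "$@" | cut -c -"${GREP_CUT_WIDTH:-200}"
}

# Usage: a file holding a single 1000+ character line stands in for a
# minified asset; awk prints the length of each truncated output line.
tmp=$(mktemp)
printf 'needle%01000d\n' 0 > "$tmp"
grep_cut -nH needle "$tmp" | awk '{ print length($0) }'
rm -f "$tmp"
```

Because the cut is applied to the whole output line, the file name and line number prefix count toward the limit, which is the concern about long file names raised earlier in the thread.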

* bug#44983: Truncate long lines of grep output
  2020-12-02 20:53 ` Juri Linkov
@ 2020-12-03 14:47   ` Eli Zaretskii
  2020-12-03 16:30     ` Rudolf Schlatte
  2020-12-03 21:17     ` Juri Linkov
  0 siblings, 2 replies; 57+ messages in thread
From: Eli Zaretskii @ 2020-12-03 14:47 UTC
To: Juri Linkov; +Cc: 44983, dgutov

> From: Juri Linkov <juri@linkov.net>
> Cc: dgutov@yandex.ru, 44983@debbugs.gnu.org
> Date: Wed, 02 Dec 2020 22:53:18 +0200
>
> > Does it help to set bidi-inhibit-bpa non-nil?
>
> This helped to open the file with a lot of '{'.
> But on minified files grep.el is still very slow.

What are "minified files"?

And when you say "slow" do you mean slow in receiving Grep output,
slow in displaying the received output, or slow in moving through the
*grep* buffer after everything was displayed?

> Then instead of hiding long lines using font-lock,
> I tried to do the same using the process filter:
>
> (defun grep-filter ()
>   (save-excursion
>     (let ((end (point-marker)))
>       (goto-char compilation-filter-start)
>       (forward-line 0)
>       (while (< (point) end)
>         (let ((eol (line-end-position)))
>           (when (> (- eol (point)) 64)
>             (put-text-property (+ 64 (point)) (line-end-position)
>                                'display "[…]"))
>           (forward-line 1))))))
>
> Still very slow.

Same question as above.

> Then tried to delete the excessive parts of long lines:
>
> (defun grep-filter-try ()
>   (save-excursion
>     (let ((end (point-marker)))
>       (goto-char compilation-filter-start)
>       (forward-line 0)
>       (while (< (point) end)
>         (let ((eol (line-end-position)))
>           (when (> (- eol (point)) 64)
>             (delete-region (min (+ 64 (point)) (point-max)) (line-end-position)))
>           (forward-line 1))))))
>
> Now Emacs becomes more responsive. But still output processing
> takes too much time.

What is "output processing", and how did you measure the time it
takes?

> Finally, the last thing to try was the same solution that Richard
> showed in bug#44941:
>
>   grep -a "$@" | cut -c -200
>
> that gives the best possible result.
>
> I doubt that it would be possible to invent something better.
>
> So the question is should this be customizable for adding
> `cut -c` automatically to the end of the grep command line,
> possibly with `stdbuf -oL` suggested by Andreas.

I suggested to request the equivalent of "cut -c" to be a feature
added to Grep.

Failing that, I don't think Emacs should do something like that,
especially since 'cut' is not guaranteed to be available. Users who
have such problems can, of course, modify the Grep command to do that.

* bug#44983: Truncate long lines of grep output
  2020-12-03 14:47 ` Eli Zaretskii
@ 2020-12-03 16:30   ` Rudolf Schlatte
  2020-12-03 21:17   ` Juri Linkov
  1 sibling, 0 replies; 57+ messages in thread
From: Rudolf Schlatte @ 2020-12-03 16:30 UTC
To: 44983

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Juri Linkov <juri@linkov.net>
>> Cc: dgutov@yandex.ru, 44983@debbugs.gnu.org
>> Date: Wed, 02 Dec 2020 22:53:18 +0200
>>
>> > Does it help to set bidi-inhibit-bpa non-nil?
>>
>> This helped to open the file with a lot of '{'.
>> But on minified files grep.el is still very slow.
>
> What are "minified files"?

Javascript libraries are often “minified” for deployment by shortening
identifiers and eliminating whitespace, including linebreaks. So a
300kb library might be compressed into a 200kb one-line file. Trying to
open such files makes Emacs unresponsive.

Rudi

* bug#44983: Truncate long lines of grep output
  2020-12-03 14:47 ` Eli Zaretskii
  2020-12-03 16:30   ` Rudolf Schlatte
@ 2020-12-03 21:17   ` Juri Linkov
  2020-12-05 19:47     ` Juri Linkov
  1 sibling, 1 reply; 57+ messages in thread
From: Juri Linkov @ 2020-12-03 21:17 UTC
To: Eli Zaretskii; +Cc: 44983, dgutov

[-- Attachment #1: Type: text/plain, Size: 2664 bytes --]

> And when you say "slow" do you mean slow in receiving Grep output,
> slow in displaying the received output, or slow in moving through the
> *grep* buffer after everything was displayed?

Slow in receiving, slow in displaying, but not slow in moving through
the hidden parts of long lines.

>> Then instead of hiding long lines using font-lock,
>> I tried to do the same using the process filter:
>>
>> (defun grep-filter ()
>>   (save-excursion
>>     (let ((end (point-marker)))
>>       (goto-char compilation-filter-start)
>>       (forward-line 0)
>>       (while (< (point) end)
>>         (let ((eol (line-end-position)))
>>           (when (> (- eol (point)) 64)
>>             (put-text-property (+ 64 (point)) (line-end-position)
>>                                'display "[…]"))
>>           (forward-line 1))))))
>>
>> Still very slow.
>
> Same question as above.

Slow in receiving and slow in displaying.

>> Then tried to delete the excessive parts of long lines:
>>
>> (defun grep-filter-try ()
>>   (save-excursion
>>     (let ((end (point-marker)))
>>       (goto-char compilation-filter-start)
>>       (forward-line 0)
>>       (while (< (point) end)
>>         (let ((eol (line-end-position)))
>>           (when (> (- eol (point)) 64)
>>             (delete-region (min (+ 64 (point)) (point-max)) (line-end-position)))
>>           (forward-line 1))))))
>>
>> Now Emacs becomes more responsive. But still output processing
>> takes too much time.
>
> What is "output processing", and how did you measure the time it
> takes?

Measuring visually, it takes too much time to output the long lines.

>> Finally, the last thing to try was the same solution that Richard
>> showed in bug#44941:
>>
>>   grep -a "$@" | cut -c -200
>>
>> that gives the best possible result.
>>
>> I doubt that it would be possible to invent something better.
>>
>> So the question is should this be customizable for adding
>> `cut -c` automatically to the end of the grep command line,
>> possibly with `stdbuf -oL` suggested by Andreas.
>
> I suggested to request the equivalent of "cut -c" to be a feature
> added to Grep.
>
> Failing that, I don't think Emacs should do something like that,
> especially since 'cut' is not guaranteed to be available. Users who
> have such problems can, of course, modify the Grep command to do that.

Finally I solved the long-standing problem by customizing
grep-find-template to

"find <D> <X> -type f <F> -print0 | sort -z | xargs -0 -e grep <C> --color=always -inH -e <R> | cut -c -200"

I'm not sure if something like this could be added to grep,
but here is an example how such a new option could look:

[-- Attachment #2: gnu-sort-cut.patch --]
[-- Type: text/x-diff, Size: 2051 bytes --]

diff --git a/lisp/progmodes/grep.el b/lisp/progmodes/grep.el
index dafba22f77..a5a2142a9e 100644
--- a/lisp/progmodes/grep.el
+++ b/lisp/progmodes/grep.el
@@ -534,6 +534,7 @@ grep-find-use-xargs
                  (const :tag "find -exec {} +" exec-plus)
                  (const :tag "find -print0 | xargs -0" gnu)
                  (const :tag "find -print0 | sort -z | xargs -0'" gnu-sort)
+                 (const :tag "find -print0 | sort -z | xargs -0' ... | cut -c -200" gnu-sort-cut)
                  string
                  (const :tag "Not Set" nil))
   :set #'grep-apply-setting
@@ -722,7 +723,8 @@ grep-compute-defaults
                         (goto-char (point-min))
                         (search-forward "--color" nil t))
                     ;; Windows and DOS pipes fail `isatty' detection in Grep.
-                    (if (memq system-type '(windows-nt ms-dos))
+                    (if (or (eq grep-find-use-xargs 'gnu-sort-cut)
+                            (memq system-type '(windows-nt ms-dos)))
                         'always 'auto)))))
     (unless (and grep-command grep-find-command
@@ -775,6 +777,9 @@ grep-compute-defaults
                  ((eq grep-find-use-xargs 'gnu-sort)
                   (format "%s . -type f -print0 | sort -z | \"%s\" -0 %s"
                           find-program xargs-program grep-command))
+                 ((eq grep-find-use-xargs 'gnu-sort-cut)
+                  (format "%s . -type f -print0 | sort -z | \"%s\" -0 %s | cut -c -200"
+                          find-program xargs-program grep-command))
                  ((memq grep-find-use-xargs '(exec exec-plus))
                   (let ((cmd0 (format "%s . -type f -exec %s"
                                       find-program grep-command))
@@ -803,6 +808,9 @@ grep-compute-defaults
                  ((eq grep-find-use-xargs 'gnu-sort)
                   (format "%s <D> <X> -type f <F> -print0 | sort -z | \"%s\" -0 %s"
                           find-program xargs-program gcmd))
+                 ((eq grep-find-use-xargs 'gnu-sort-cut)
+                  (format "%s <D> <X> -type f <F> -print0 | sort -z | \"%s\" -0 %s | cut -c -200"
+                          find-program xargs-program gcmd))
                  ((eq grep-find-use-xargs 'exec)
                   (format "%s <D> <X> -type f <F> -exec %s %s %s%s"
                           find-program gcmd quot-braces null quot-scolon))

* bug#44983: Truncate long lines of grep output
  2020-12-03 21:17 ` Juri Linkov
@ 2020-12-05 19:47   ` Juri Linkov
  2020-12-06 20:39     ` Juri Linkov
  0 siblings, 1 reply; 57+ messages in thread
From: Juri Linkov @ 2020-12-05 19:47 UTC
To: Eli Zaretskii; +Cc: 44983, dgutov

>> I suggested to request the equivalent of "cut -c" to be a feature
>> added to Grep.
>>
>> Failing that, I don't think Emacs should do something like that,
>> especially since 'cut' is not guaranteed to be available. Users who
>> have such problems can, of course, modify the Grep command to do that.
>
> Finally I solved the long-standing problem by customizing
> grep-find-template to
>
> "find <D> <X> -type f <F> -print0 | sort -z | xargs -0 -e grep <C> --color=always -inH -e <R> | cut -c -200"

I noticed the problems caused by "cut -c": it counts bytes,
not multi-byte characters. Even though its documentation says
that -b selects bytes, and -c selects characters, still
when used with "cut -c -200" it selects bytes, not UTF characters.

Often it cuts in the middle of a multi-byte UTF-8 character,
so octal codes are displayed at the end of grep lines.

This is like the character limit for an SMS message: it is
160 characters, but that actually means bytes, not characters,
because for UTF text the SMS limit is only 70 characters.
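
The byte-versus-character effect described above can be reproduced with a single two-byte UTF-8 character. The following is a minimal sketch, not a test of any particular `cut` build: it uses `cut -b` so that the byte semantics are explicit, on the assumption (taken from the message above) that `cut -c` behaved the same way in the reported setup. The sample string and the `sample` helper are made up for illustration.

```shell
# "é" is two bytes in UTF-8 (octal 303 251); the sample text is hypothetical.
sample() { printf 'a\303\251b\n'; }

# The whole line: 4 bytes of text plus the newline.
sample | wc -c

# Keeping the first 2 bytes splits "é" in half: what remains is "a"
# followed by a lone 0xC3 byte, i.e. an incomplete UTF-8 sequence.
sample | cut -b -2 | od -An -c
```

A stray byte like this at the end of a truncated line is the kind of input that shows up as an octal code in the *grep* buffer.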

* bug#44983: Truncate long lines of grep output
  2020-12-05 19:47 ` Juri Linkov
@ 2020-12-06 20:39   ` Juri Linkov
  2020-12-06 21:37     ` Dmitry Gutov
  0 siblings, 1 reply; 57+ messages in thread
From: Juri Linkov @ 2020-12-06 20:39 UTC
To: Eli Zaretskii; +Cc: 44983, dgutov

> I noticed the problems caused by "cut -c": it counts bytes,
> not multi-byte characters. Even though its documentation says
> that -b selects bytes, and -c selects characters, still
> when used with "cut -c -200" it selects bytes, not UTF characters.
>
> Often it cuts in the middle of a multi-byte UTF-8 character,
> so octal codes are displayed at the end of grep lines.

OTOH, ripgrep has the suitable options:

  -M, --max-columns NUM
      Don’t print lines longer than this limit in bytes. Longer lines are omitted,
      and only the number of matches in that line is printed.

  --max-columns-preview
      When the --max-columns flag is used, ripgrep will by default completely
      replace any line that is too long with a message indicating that a matching
      line was removed. When this flag is combined with --max-columns, a preview
      of the line (corresponding to the limit size) is shown instead, where the
      part of the line exceeding the limit is not shown.

Wouldn't it be unthinkable to add support of ripgrep to grep.el?
This will allow switching to ripgrep when there is a need to
search in files with long lines.

* bug#44983: Truncate long lines of grep output
  2020-12-06 20:39 ` Juri Linkov
@ 2020-12-06 21:37   ` Dmitry Gutov
  2020-12-06 21:54     ` Juri Linkov
  2020-12-08  5:35     ` Richard Stallman
  0 siblings, 2 replies; 57+ messages in thread
From: Dmitry Gutov @ 2020-12-06 21:37 UTC
To: Juri Linkov, Eli Zaretskii; +Cc: 44983

On 06.12.2020 22:39, Juri Linkov wrote:
>> I noticed the problems caused by "cut -c": it counts bytes,
>> not multi-byte characters. Even though its documentation says
>> that -b selects bytes, and -c selects characters, still
>> when used with "cut -c -200" it selects bytes, not UTF characters.
>>
>> Often it cuts in the middle of a multi-byte UTF-8 character,
>> so octal codes are displayed at the end of grep lines.
>
> OTOH, ripgrep has the suitable options:
>
>   -M, --max-columns NUM
>       Don’t print lines longer than this limit in bytes. Longer lines are omitted,
>       and only the number of matches in that line is printed.
>
>   --max-columns-preview
>       When the --max-columns flag is used, ripgrep will by default completely
>       replace any line that is too long with a message indicating that a matching
>       line was removed. When this flag is combined with --max-columns, a preview
>       of the line (corresponding to the limit size) is shown instead, where the
>       part of the line exceeding the limit is not shown.

You can experiment with these Right Now(tm) by customizing
xref-search-program-alist (as well as xref-search-program). They'll only
affect commands that use xref-matches-in-files, though.

> Wouldn't it be unthinkable to add support of ripgrep to grep.el?
> This will allow switching to ripgrep when there is a need to
> search in files with long lines.

I'm fairly sure nothing in terms of politics is stopping us here, but if
we wanted to update grep.el's abstractions to use different search
programs, it looks like a bigger job to me.

Though maybe you can get away with customizing a select number of
variables? Like grep-template, grep-find-template, etc.

* bug#44983: Truncate long lines of grep output
  2020-12-06 21:37 ` Dmitry Gutov
@ 2020-12-06 21:54   ` Juri Linkov
  2020-12-07  2:41     ` Dmitry Gutov
  2020-12-08  5:35   ` Richard Stallman
  1 sibling, 1 reply; 57+ messages in thread
From: Juri Linkov @ 2020-12-06 21:54 UTC
To: Dmitry Gutov; +Cc: 44983

>> OTOH, ripgrep has the suitable options:
>>   -M, --max-columns NUM
>>       Don’t print lines longer than this limit in bytes. Longer lines are omitted,
>>       and only the number of matches in that line is printed.
>>   --max-columns-preview
>>       When the --max-columns flag is used, ripgrep will by default completely
>>       replace any line that is too long with a message indicating that a matching
>>       line was removed. When this flag is combined with --max-columns, a preview
>>       of the line (corresponding to the limit size) is shown instead, where the
>>       part of the line exceeding the limit is not shown.
>
> You can experiment with these Right Now(tm) by customizing
> xref-search-program-alist (as well as xref-search-program). They'll only
> affect commands that use xref-matches-in-files, though.

You mean adding "-M 200 --max-columns-preview" to xref-search-program-alist?
It works nice, thanks. Should this be added by default?

>> Wouldn't it be unthinkable to add support of ripgrep to grep.el?
>> This will allow switching to ripgrep when there is a need to
>> search in files with long lines.
>
> I'm fairly sure nothing in terms of politics is stopping us here, but if we
> wanted to update grep.el's abstractions to use different search programs,
> it looks like a bigger job to me.
>
> Though maybe you can get away with customizing a select number of
> variables? Like grep-template, grep-find-template, etc.

I customized grep-find-template to

"find <D> <X> -type f <F> -print0 | sort -z | xargs -0 -e rg -inH --color always --no-heading -M 200 --max-columns-preview -e <R>"

But this also requires customizing grep-match-regexp to the value

"\033\\[[0-9]*m\033\\[[0-9]*1m\033\\[[0-9]*1m\\(.*?\\)\033\\[[0-9]*0m"

provided by Simon in bug#41766.

And also required a small fix in grep.el:

diff --git a/lisp/progmodes/grep.el b/lisp/progmodes/grep.el
index dafba22f77..0a5fd6bf5d 100644
--- a/lisp/progmodes/grep.el
+++ b/lisp/progmodes/grep.el
@@ -412,7 +412,7 @@ grep-regexp-alist
                     (- mend beg))))))
      nil nil
      (3 '(face nil display ":")))
-    ("^Binary file \\(.+\\) matches$" 1 nil nil 0 1))
+    ("^Binary file \\(.+\\) matches" 1 nil nil 0 1))
   "Regexp used to match grep hits.
 See `compilation-error-regexp-alist' for format details.")
* bug#44983: Truncate long lines of grep output 2020-12-06 21:54 ` Juri Linkov @ 2020-12-07 2:41 ` Dmitry Gutov 2020-12-08 19:41 ` Juri Linkov 0 siblings, 1 reply; 57+ messages in thread From: Dmitry Gutov @ 2020-12-07 2:41 UTC (permalink / raw) To: Juri Linkov; +Cc: 44983 On 06.12.2020 23:54, Juri Linkov wrote: >>> OTOH, ripgrep has the suitable options: >>> -M, --max-columns NUM >>> Don’t print lines longer than this limit in bytes. Longer lines are omitted, >>> and only the number of matches in that line is printed. >>> --max-columns-preview >>> When the --max-columns flag is used, ripgrep will by default completely >>> replace any line that is too long with a message indicating that a matching >>> line was removed. When this flag is combined with --max-columns, a preview >>> of the line (corresponding to the limit size) is shown instead, where the >>> part of the line exceeding the limit is not shown. >> >> You can experiment with these Right Now(tm) by customizing >> xref-search-program-alist (as well as xref-search-program). They'll only >> affect commands that use xref-matches-in-files, though. > > You mean adding "-M 200 --max-columns-preview" to xref-search-program-alist? Yup. > It works nice, thanks. Should this be added by default? Maybe someday? Currently, it has a certain side-effect: whenever there are matches that don't fit the specified width, they will be omitted from the resulting xref buffer. Depending on the user's intent, it can be a problem. Perhaps they did, after all, intend to search that minified JS file as well? This should be fixable (in xref--collect-matches-1, probably), but we'd have to consider carefully on what to do in situations like that. E.g., if we put some placeholder there, that would mean that "search and replace" won't work. Alternatively, xref--collect-matches-1 could apply the limit itself, no matter whether grep or rg is used. And it could make sure to only do that after the last match. 
This might be the slower option, but it's hard to say in advance; some
comparison benchmark could help here.

>>> Wouldn't it be unthinkable to add support of ripgrep to grep.el?
>>> This will allow switching to ripgrep when there is a need to
>>> search in files with long lines.
>>
>> I'm fairly sure nothing in terms of politics is stopping us here, but if we
>> wanted to update grep.el's abstractions to use different search programs,
>> it looks like a bigger job to me.
>>
>> Though maybe you can get away with customizing a select number of
>> variables? Like grep-template, grep-find-template, etc.
>
> I customized grep-find-template to "find <D> <X> -type f <F> -print0 | sort -z |
> xargs -0 -e rg -inH --color always --no-heading -M 200 --max-columns-preview -e <R>"
>
> But this also requires customizing grep-match-regexp to the value
> "\033\\[[0-9]*m\033\\[[0-9]*1m\033\\[[0-9]*1m\\(.*?\\)\033\\[[0-9]*0m"
> provided by Simon in bug#41766.

It's odd your last suggestion in that bug was not applied (adding :type
'(choice) to grep-match-regexp). Perhaps do that now?

Although, personally, I've found a symbolic value to work better for a var
like that (example: xref-search-program). This way we can ultimately
consolidate info about a particular program in one place (some alist).

That aside, could you explain the difference between the regexps? Do grep
and rg use different colors or something like that? Ideally, of course,
that would be just 1 regexp (if that's possible without loss in
performance, or significant loss in clarity).

> And also required a small fix in grep.el:
>
> diff --git a/lisp/progmodes/grep.el b/lisp/progmodes/grep.el
> index dafba22f77..0a5fd6bf5d 100644
> --- a/lisp/progmodes/grep.el
> +++ b/lisp/progmodes/grep.el
> @@ -412,7 +412,7 @@ grep-regexp-alist
> (- mend beg))))))
> nil nil
> (3 '(face nil display ":")))
> - ("^Binary file \\(.+\\) matches$" 1 nil nil 0 1))
> + ("^Binary file \\(.+\\) matches" 1 nil nil 0 1))
> "Regexp used to match grep hits.
> See `compilation-error-regexp-alist' for format details.") Nice. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output
  2020-12-07 2:41 ` Dmitry Gutov
@ 2020-12-08 19:41 ` Juri Linkov
  2020-12-09 3:00 ` Dmitry Gutov
  0 siblings, 1 reply; 57+ messages in thread
From: Juri Linkov @ 2020-12-08 19:41 UTC (permalink / raw)
To: Dmitry Gutov; +Cc: 44983

> Alternatively, xref--collect-matches-1 could apply the limit itself, no
> matter whether grep or rg is used. And it could make sure to only do that
> after the last match. This might be the slower option, but hard to say in
> advance, some comparison benchmark could help here.

I think until a long string is inserted into the buffer, truncating the
string in the variable in xref--collect-matches-1 should be much faster.

>> But this also requires customizing grep-match-regexp to the value
>> "\033\\[[0-9]*m\033\\[[0-9]*1m\033\\[[0-9]*1m\\(.*?\\)\033\\[[0-9]*0m"
>> provided by Simon in bug#41766.
>
> It's odd your last suggestion in that bug was not applied (adding :type
> '(choice) to grep-match-regexp). Perhaps do that now?
>
> Although, personally, I've found a symbolic value to work better for a var
> like that (example: xref-search-program). This way we can ultimately
> consolidate info about a particular program in one place (some alist).
>
> That aside, could you explain the difference between the regexps? Do grep
> and rg use different colors or something like that? Ideally, of course,
> that would be just 1 regexp (if that's possible without loss in
> performance, or significant loss in clarity).

They should be merged into one regexp indeed. Because after customizing it
to the rg regexp, grep output doesn't highlight matches anymore (I use both
grep and rg interchangeably by different commands).
Currently their separate regexps are:

grep:
"\033\\[0?1;31m
\\(.*?\\)
\033\\[[0-9]*m"

rg:
"\033\\[[0-9]*m
\033\\[[0-9]*1m
\033\\[[0-9]*1m
\\(.*?\\)
\033\\[[0-9]*0m"

That could be combined into one regexp:

"\033\\[[0-9?;]*m
\\(?:\033\\[[0-9]*1m\\)\\{0,2\\}
\\(.*?\\)
\033\\[[0-9]*0?m"

^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2020-12-08 19:41 ` Juri Linkov @ 2020-12-09 3:00 ` Dmitry Gutov 2020-12-09 19:17 ` Juri Linkov 0 siblings, 1 reply; 57+ messages in thread From: Dmitry Gutov @ 2020-12-09 3:00 UTC (permalink / raw) To: Juri Linkov; +Cc: 44983 On 08.12.2020 21:41, Juri Linkov wrote: >> Alternatively, xref--collect-matches-1 could apply the limit itself, no >> matter whether grep or rg is used. And it could make sure to only do that >> after the last match. This might be the slower option, but hard to say in >> advance, some comparison benchmark could help here. > > I think until a long string is inserted to the buffer, truncating the > string in the variable in xref--collect-matches-1 should be much faster. It would surely be faster, but how would that overhead compare to the whole operation? Could be negligible, except in the most extreme cases. After all, the main slowdown factor with long strings is the display engine, and it won't be in play there. The upside is we'd be able to support column limiting with Grep too. Which is the default configuration. And we'd extract the cutoff column into a more visible user option. >> That aside, could you explain the difference between the regexps? Do grep >> and rg use different colors or something like that? Ideally, of course, >> that would be just 1 regexp (if that's possible without loss in >> performance, or significant loss in clarify). > > They should be merged into one regexp indeed. Because after customizing it > to the rg regexp, grep output doesn't highlight matches anymore (I use both > grep and rg interchangeably by different commands). > > Currently their separate regexps are: > > grep: > "\033\\[0?1;31m > \\(.*?\\) > \033\\[[0-9]*m" > > rg: > "\033\\[[0-9]*m > \033\\[[0-9]*1m > \033\\[[0-9]*1m > \\(.*?\\) > \033\\[[0-9]*0m" > > That could be combined into one regexp: > > "\033\\[[0-9?;]*m > \\(?:\033\\[[0-9]*1m\\)\\{0,2\\} > \\(.*?\\) > \033\\[[0-9]*0?m" Makes sense. 
Is the parsing performance the same? Also, with the increased complexity, I'd rather we added a couple of tests, or a comment with output examples. Or maybe both. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2020-12-09 3:00 ` Dmitry Gutov @ 2020-12-09 19:17 ` Juri Linkov 2020-12-09 20:06 ` Dmitry Gutov ` (2 more replies) 0 siblings, 3 replies; 57+ messages in thread From: Juri Linkov @ 2020-12-09 19:17 UTC (permalink / raw) To: Dmitry Gutov; +Cc: 44983 >>> Alternatively, xref--collect-matches-1 could apply the limit itself, no >>> matter whether grep or rg is used. And it could make sure to only do that >>> after the last match. This might be the slower option, but hard to say in >>> advance, some comparison benchmark could help here. >> I think until a long string is inserted to the buffer, truncating the >> string in the variable in xref--collect-matches-1 should be much faster. > > It would surely be faster, but how would that overhead compare to the > whole operation? > > Could be negligible, except in the most extreme cases. After all, the main > slowdown factor with long strings is the display engine, and it won't be in > play there. > > The upside is we'd be able to support column limiting with Grep too. Which > is the default configuration. And we'd extract the cutoff column into > a more visible user option. This is exactly what we need. After that this bug report/feature request can be closed. BTW, for sorting currently xref-search-program-alist uses: "| sort -t: -k1,1 -k2n,2" but fortunately ripgrep has a special option to do the same with: "--sort path" >>> That aside, could you explain the difference between the regexps? Do grep >>> and rg use different colors or something like that? Ideally, of course, >>> that would be just 1 regexp (if that's possible without loss in >>> performance, or significant loss in clarify). >> They should be merged into one regexp indeed. Because after customizing >> it >> to the rg regexp, grep output doesn't highlight matches anymore (I use both >> grep and rg interchangeably by different commands). 
>> Currently their separate regexps are: >> grep: >> "\033\\[0?1;31m >> \\(.*?\\) >> \033\\[[0-9]*m" >> rg: >> "\033\\[[0-9]*m >> \033\\[[0-9]*1m >> \033\\[[0-9]*1m >> \\(.*?\\) >> \033\\[[0-9]*0m" >> That could be combined into one regexp: >> "\033\\[[0-9?;]*m >> \\(?:\033\\[[0-9]*1m\\)\\{0,2\\} >> \\(.*?\\) >> \033\\[[0-9]*0?m" > > Makes sense. Is the parsing performance the same? Performance is not a problem. The problem is that more lax regexp causes more false positives. So the above regexp highlighted even the separator colons (':') between file names and column numbers. BTW, it's possible to see all highlighted parts of the output by changing the argument 'MODE' of 'compilation-start' in 'grep' from #'grep-mode to t (so it uses comint-mode in grep buffers). Anyway, I found the shortest change needed to support ripgrep, and pushed to master. > Also, with the increased complexity, I'd rather we added a couple of tests, > or a comment with output examples. Or maybe both. Fortunately, we have all possible cases listed in etc/grep.txt, so it was easy to check if everything is highlighted correctly now. Also I added ripgrep samples to etc/grep.txt. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2020-12-09 19:17 ` Juri Linkov @ 2020-12-09 20:06 ` Dmitry Gutov 2020-12-10 8:18 ` Juri Linkov 2020-12-09 21:43 ` Jean Louis 2020-12-24 20:33 ` Juri Linkov 2 siblings, 1 reply; 57+ messages in thread From: Dmitry Gutov @ 2020-12-09 20:06 UTC (permalink / raw) To: Juri Linkov; +Cc: 44983 On 09.12.2020 21:17, Juri Linkov wrote: >>> I think until a long string is inserted to the buffer, truncating the >>> string in the variable in xref--collect-matches-1 should be much faster. >> >> It would surely be faster, but how would that overhead compare to the >> whole operation? >> >> Could be negligible, except in the most extreme cases. After all, the main >> slowdown factor with long strings is the display engine, and it won't be in >> play there. >> >> The upside is we'd be able to support column limiting with Grep too. Which >> is the default configuration. And we'd extract the cutoff column into >> a more visible user option. > > This is exactly what we need. After that this bug report/feature request > can be closed. Perhaps you would like to come up with the name for the new user option? The changes to xref--collect-matches-1 should be straightforward (it will include a choice, though: whether to cut off matches when they don't fit). Since you're the one who has experienced poor performance because of this, though, you can do the benchmarking. Basically, what we need to know is whether the new option indeed makes performance acceptable. > BTW, for sorting currently xref-search-program-alist uses: > > "| sort -t: -k1,1 -k2n,2" > > but fortunately ripgrep has a special option to do the same with: > > "--sort path" Somehow, that option came out to be consistently slower in my benchmarking. Even when the results are only a few lines (that's actually when the difference should be most apparent, because with many lines Elisp takes up the most of CPU time). 
You can try it yourself: (benchmark 10 '(project-find-regexp ":package-version '(xref")) 0.86 with '| sort' 1.33 with '--sort path' $ rg --version ripgrep 12.1.1 (rev 7cb211378a) -SIMD -AVX (compiled) +SIMD +AVX (runtime) We can also document it in the docstring, though. For those who don't have 'sort' installed. >>> They should be merged into one regexp indeed. Because after customizing >>> it >>> to the rg regexp, grep output doesn't highlight matches anymore (I use both >>> grep and rg interchangeably by different commands). >>> Currently their separate regexps are: >>> grep: >>> "\033\\[0?1;31m >>> \\(.*?\\) >>> \033\\[[0-9]*m" >>> rg: >>> "\033\\[[0-9]*m >>> \033\\[[0-9]*1m >>> \033\\[[0-9]*1m >>> \\(.*?\\) >>> \033\\[[0-9]*0m" >>> That could be combined into one regexp: >>> "\033\\[[0-9?;]*m >>> \\(?:\033\\[[0-9]*1m\\)\\{0,2\\} >>> \\(.*?\\) >>> \033\\[[0-9]*0?m" >> >> Makes sense. Is the parsing performance the same? > > Performance is not a problem. The problem is that more lax regexp > causes more false positives. So the above regexp highlighted even > the separator colons (':') between file names and column numbers. > > BTW, it's possible to see all highlighted parts of the output > by changing the argument 'MODE' of 'compilation-start' in 'grep' > from #'grep-mode to t (so it uses comint-mode in grep buffers). Because ansi-color-process-output is in comint-output-filter-functions? > Anyway, I found the shortest change needed to support ripgrep, > and pushed to master. Excellent. >> Also, with the increased complexity, I'd rather we added a couple of tests, >> or a comment with output examples. Or maybe both. > > Fortunately, we have all possible cases listed in etc/grep.txt, > so it was easy to check if everything is highlighted correctly now. > Also I added ripgrep samples to etc/grep.txt. Thanks! ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2020-12-09 20:06 ` Dmitry Gutov @ 2020-12-10 8:18 ` Juri Linkov 2020-12-10 20:48 ` Dmitry Gutov 0 siblings, 1 reply; 57+ messages in thread From: Juri Linkov @ 2020-12-10 8:18 UTC (permalink / raw) To: Dmitry Gutov; +Cc: 44983 > Perhaps you would like to come up with the name for the new user option? Maybe something like 'xref-search-truncate' with a number of columns, nil by default. >> BTW, for sorting currently xref-search-program-alist uses: >> "| sort -t: -k1,1 -k2n,2" >> but fortunately ripgrep has a special option to do the same with: >> "--sort path" > > Somehow, that option came out to be consistently slower in my > benchmarking. Even when the results are only a few lines (that's actually > when the difference should be most apparent, because with many lines Elisp > takes up the most of CPU time). You can try it yourself: > > (benchmark 10 '(project-find-regexp ":package-version '(xref")) > > 0.86 with '| sort' > 1.33 with '--sort path' I confirm that in my tests '--sort path' is 2 times slower than '| sort'. >> BTW, it's possible to see all highlighted parts of the output >> by changing the argument 'MODE' of 'compilation-start' in 'grep' >> from #'grep-mode to t (so it uses comint-mode in grep buffers). > > Because ansi-color-process-output is in comint-output-filter-functions? Exactly. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output
  2020-12-10 8:18 ` Juri Linkov
@ 2020-12-10 20:48 ` Dmitry Gutov
  0 siblings, 0 replies; 57+ messages in thread
From: Dmitry Gutov @ 2020-12-10 20:48 UTC (permalink / raw)
To: Juri Linkov; +Cc: 44983

On 10.12.2020 10:18, Juri Linkov wrote:
>>> BTW, for sorting currently xref-search-program-alist uses:
>>> "| sort -t: -k1,1 -k2n,2"
>>> but fortunately ripgrep has a special option to do the same with:
>>> "--sort path"
>> Somehow, that option came out to be consistently slower in my
>> benchmarking. Even when the results are only a few lines (that's actually
>> when the difference should be most apparent, because with many lines Elisp
>> takes up the most of CPU time). You can try it yourself:
>>
>> (benchmark 10 '(project-find-regexp ":package-version '(xref"))
>>
>> 0.86 with '| sort'
>> 1.33 with '--sort path'
> I confirm that in my tests '--sort path' is 2 times slower than '| sort'.

And that's because '--sort path' forces single-threaded mode:
https://github.com/BurntSushi/ripgrep/issues/152

^ permalink raw reply [flat|nested] 57+ messages in thread
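For reference, the key specification in the "| sort -t: -k1,1 -k2n,2" pipeline quoted above can be tried in isolation. The following is only a small sketch: the three FILE:LINE:TEXT records are made up for illustration, and LC_ALL=C is an added assumption to keep the file-name ordering independent of the locale.

```shell
#!/bin/sh
# Hypothetical grep-style output (FILE:LINE:TEXT), deliberately out of order.
sample='b.el:10:foo
a.el:10:baz
a.el:2:bar'

# -t:      use ':' as the field separator
# -k1,1    sort by the file name (field 1 only)
# -k2n,2   then by the line number (field 2), compared numerically
# LC_ALL=C is an assumption added here so the result does not depend on locale.
sorted=$(printf '%s\n' "$sample" | LC_ALL=C sort -t: -k1,1 -k2n,2)

printf '%s\n' "$sorted"
```

The numeric flag on the second key is what puts line 2 before line 10 within the same file; a plain string comparison would order them the other way round.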
* bug#44983: Truncate long lines of grep output 2020-12-09 19:17 ` Juri Linkov 2020-12-09 20:06 ` Dmitry Gutov @ 2020-12-09 21:43 ` Jean Louis 2020-12-10 8:06 ` Juri Linkov 2020-12-24 20:33 ` Juri Linkov 2 siblings, 1 reply; 57+ messages in thread From: Jean Louis @ 2020-12-09 21:43 UTC (permalink / raw) To: Juri Linkov; +Cc: 44983, Dmitry Gutov Also see this: https://www.topbug.net/blog/2016/08/18/truncate-long-matching-lines-of-grep-a-solution-that-preserves-color/ ,---- | For the example above, the following command should print only 20 | characters before and after the searching keyword (This requires GNU | grep. If you are on Mac OS X and using the BSD grep, please consider | following this article to install GNU grep): | | grep -oE '.{0,20}jQuery.{0,20}' bootstrap.min.js `---- where I get this: grep -o --color -nH --null -E ".{0,20}setting.{0,20}" tmp-2020-11-26-01:3* tmp-2020-11-26-01:32:17986egO\03: supported, but its setting does not have prior Grep finished with 1 match found at Thu Dec 10 00:42:21 from this line long made-up line: ‘--color[=WHEN]’ ‘--colour[=WHEN]’ Surround the matched (non-empty) strings, matching lines, context lines, file names, line numbers, byte offsets, and separators (for fields and groups of context lines) with escape sequences to display them in color on the terminal. The colors are defined by the environment variable ‘GREP_COLORS’ and default to ‘ms=01;31:mc=01;31:sl=:cx=:fn=35:ln=32:bn=32:se=36’ for bold red matched text, magenta file names, green line numbers, green byte offsets, cyan separators, and default terminal colors otherwise. The deprecated environment variable ‘GREP_COLOR’ is still supported, but its setting does not have priority; it defaults to ‘01;31’ (bold red) which only covers the color for matched text. WHEN is ‘never’, ‘always’, or ‘auto’. ‘-L’ ‘--files-without-match’ Suppress normal output; instead print the name of each input file from which no output would normally have been printed. 
The scanning of each file stops on the first match. ‘-l’ ‘--files-with-matches’ Suppress normal output; instead print the name of each input file from which output would normally have been printed. The scanning of each file stops on the first match. (‘-l’ is specified by POSIX.) and that solves the problem of truncating long lines. ^ permalink raw reply [flat|nested] 57+ messages in thread
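The 'grep -oE' approach from the linked article can be checked on synthetic data. This is a minimal sketch, not part of the original message: the 2000-character line, the keyword "needle", and the 5-character context width are all made-up values chosen to keep the output short.

```shell
#!/bin/sh
# Build one long line: 1000 'x', the keyword, then 1000 'y' (made-up data).
prefix=$(printf '%1000s' '' | tr ' ' 'x')
suffix=$(printf '%1000s' '' | tr ' ' 'y')
long_line="${prefix}needle${suffix}"

# -o prints only the matched part; -E enables the {m,n} interval syntax.
# At most 5 characters of context are kept on each side of the keyword.
snippet=$(printf '%s\n' "$long_line" | grep -oE '.{0,5}needle.{0,5}')

printf '%s\n' "$snippet"
```

Because only the matched span is printed, the length of the output is bounded by the context widths plus the keyword, no matter how long the input line is.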
* bug#44983: Truncate long lines of grep output
  2020-12-09 21:43 ` Jean Louis
@ 2020-12-10 8:06 ` Juri Linkov
  2020-12-10 10:08 ` Jean Louis
  0 siblings, 1 reply; 57+ messages in thread
From: Juri Linkov @ 2020-12-10 8:06 UTC (permalink / raw)
To: Jean Louis; +Cc: 44983, Dmitry Gutov

> Also see this:
> ,----
> | grep -oE '.{0,20}jQuery.{0,20}' bootstrap.min.js
> `----

But what if the user enters a regexp such as "abc|xyz"?
Then it will be composed into a command like this:

  grep -oE '.{0,20}abc|xyz.{0,20}'

which matches either "abc" with up to 20 characters before it, or "xyz"
with up to 20 characters after it. So parentheses need to be added:

  grep -oE '.{0,20}(abc|xyz).{0,20}'

What is worse is that the whole match is highlighted, including
the 20 characters before and after the real match. So it seems
this solution is not perfect.

^ permalink raw reply [flat|nested] 57+ messages in thread
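The precedence problem described above is easy to reproduce. The sketch below uses a made-up input line and a context width of 3 instead of 20 so that the results stay readable; the variable names are illustrative only.

```shell
#!/bin/sh
# Made-up input containing both alternatives, each surrounded by digits.
line='1234abc5678 1234xyz5678'

# Naive composition: '|' binds more loosely than concatenation, so this
# is read as (.{0,3}abc) OR (xyz.{0,3}). Each hit keeps context on one
# side only.
naive=$(printf '%s\n' "$line" | grep -oE '.{0,3}abc|xyz.{0,3}')

# Grouping the user's regexp restores the context on both sides.
grouped=$(printf '%s\n' "$line" | grep -oE '.{0,3}(abc|xyz).{0,3}')

printf 'naive:\n%s\n' "$naive"
printf 'grouped:\n%s\n' "$grouped"
```

With the naive form the two hits come out as "234abc" and "xyz567"; with the grouped form they are "234abc567" and "234xyz567". The grouping does nothing about the highlighting concern, since the context is still part of the match.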
* bug#44983: Truncate long lines of grep output 2020-12-10 8:06 ` Juri Linkov @ 2020-12-10 10:08 ` Jean Louis 2020-12-12 20:42 ` Juri Linkov 0 siblings, 1 reply; 57+ messages in thread From: Jean Louis @ 2020-12-10 10:08 UTC (permalink / raw) To: Juri Linkov; +Cc: 44983, Dmitry Gutov * Juri Linkov <juri@linkov.net> [2020-12-10 11:34]: > > Also see this: > > ,---- > > | grep -oE '.{0,20}jQuery.{0,20}' bootstrap.min.js > > `---- > > But what if the user enters such a regexp as "abc|xyz", > then it will be composed into such command: > > grep -oE '.{0,20}abc|xyz.{0,20}' > > that matches either 20 characters before "abc", or 20 characters > after "xyz". Then needs to add parentheses: > > grep -oE '.{0,20}(abc|xyz).{0,20}' I do not find it problematic. Grep is anyway kind of advanced tool. I think that Emacs "Search for files (grep)" menu option is anyway not user friendly. It is made for those who know what is GNU/Linux, UNIX, BSD. When user is faced with that option most probably will give up soon in using it. Because the prompt asks user to enter something like: grep --color -nH --null -e but does not tell the user what it means, neither that one has to put joker or file names after the term. Usability is degraded as the function is only for advanced users there. Majority of GNU/Linux users use GUI for any work. In that sense advanced users should know how to use grep to at least get results they need and want. You put good intentions to beautify the grep output. But it is probably not necessary. They will not mind of highlighting. They can do: grep -nH --null -e And there will be no highlighting. It gives the result. What would be more user friendly would be a form or wizard that would specify if all files are to be searched or recursively, and what would be the search term. That would degrade power of grep but it would be more user friendly to many people. 
In my opinion I believe that majority of users who ever clicked "Search Files (grep)" gave up after few attempts. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2020-12-10 10:08 ` Jean Louis @ 2020-12-12 20:42 ` Juri Linkov 2020-12-13 10:57 ` Jean Louis 2020-12-13 15:11 ` Eli Zaretskii 0 siblings, 2 replies; 57+ messages in thread From: Juri Linkov @ 2020-12-12 20:42 UTC (permalink / raw) To: Jean Louis; +Cc: 44983, Dmitry Gutov > I do not find it problematic. Grep is anyway kind of advanced tool. I > think that Emacs "Search for files (grep)" menu option is anyway not > user friendly. > ... > What would be more user friendly would be a form or wizard that would > specify if all files are to be searched or recursively, and what would > be the search term. That would degrade power of grep but it would be > more user friendly to many people. > > In my opinion I believe that majority of users who ever clicked > "Search Files (grep)" gave up after few attempts. Indeed, "Search for files (grep)" menu option is not user friendly. This is why we added a wizard command "Recursive Grep..." under it in the same menu. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2020-12-12 20:42 ` Juri Linkov @ 2020-12-13 10:57 ` Jean Louis 2020-12-13 15:11 ` Eli Zaretskii 1 sibling, 0 replies; 57+ messages in thread From: Jean Louis @ 2020-12-13 10:57 UTC (permalink / raw) To: Juri Linkov; +Cc: 44983, Dmitry Gutov * Juri Linkov <juri@linkov.net> [2020-12-13 00:09]: > > I do not find it problematic. Grep is anyway kind of advanced tool. I > > think that Emacs "Search for files (grep)" menu option is anyway not > > user friendly. > > ... > > What would be more user friendly would be a form or wizard that would > > specify if all files are to be searched or recursively, and what would > > be the search term. That would degrade power of grep but it would be > > more user friendly to many people. > > > > In my opinion I believe that majority of users who ever clicked > > "Search Files (grep)" gave up after few attempts. > > Indeed, "Search for files (grep)" menu option is not user friendly. > This is why we added a wizard command "Recursive Grep..." under it > in the same menu. Good for programmers, good for you and good for me. Emacs is for advanced users from that view point. From that view point everything fits into place. From view point of users coming to Emacs "Recursive Grep" will not have its meaning. Or any meaning at all. It would be good to have a popularity-contest package similar to Debian, where one could gather statistics what is actually used by some users and submit that statistics. Other good test could be to put 5 people together who used computers for last 10 years regardless of their operating system and tell them to open up Emacs and find files containing the term "Emacs" and watch how they are doing. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2020-12-12 20:42 ` Juri Linkov 2020-12-13 10:57 ` Jean Louis @ 2020-12-13 15:11 ` Eli Zaretskii 2020-12-13 15:37 ` Jean Louis 2020-12-13 20:17 ` Juri Linkov 1 sibling, 2 replies; 57+ messages in thread From: Eli Zaretskii @ 2020-12-13 15:11 UTC (permalink / raw) To: Juri Linkov; +Cc: 44983, bugs, dgutov > From: Juri Linkov <juri@linkov.net> > Date: Sat, 12 Dec 2020 22:42:13 +0200 > Cc: 44983@debbugs.gnu.org, Dmitry Gutov <dgutov@yandex.ru> > > > I do not find it problematic. Grep is anyway kind of advanced tool. I > > think that Emacs "Search for files (grep)" menu option is anyway not > > user friendly. > > ... > > What would be more user friendly would be a form or wizard that would > > specify if all files are to be searched or recursively, and what would > > be the search term. That would degrade power of grep but it would be > > more user friendly to many people. > > > > In my opinion I believe that majority of users who ever clicked > > "Search Files (grep)" gave up after few attempts. > > Indeed, "Search for files (grep)" menu option is not user friendly. In what way is it not user-friendly? It just invokes "M-x grep". ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2020-12-13 15:11 ` Eli Zaretskii @ 2020-12-13 15:37 ` Jean Louis 2020-12-13 20:17 ` Juri Linkov 1 sibling, 0 replies; 57+ messages in thread From: Jean Louis @ 2020-12-13 15:37 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Juri Linkov, 44983, dgutov * Eli Zaretskii <eliz@gnu.org> [2020-12-13 18:12]: > > From: Juri Linkov <juri@linkov.net> > > Date: Sat, 12 Dec 2020 22:42:13 +0200 > > Cc: 44983@debbugs.gnu.org, Dmitry Gutov <dgutov@yandex.ru> > > > > > I do not find it problematic. Grep is anyway kind of advanced tool. I > > > think that Emacs "Search for files (grep)" menu option is anyway not > > > user friendly. > > > ... > > > What would be more user friendly would be a form or wizard that would > > > specify if all files are to be searched or recursively, and what would > > > be the search term. That would degrade power of grep but it would be > > > more user friendly to many people. > > > > > > In my opinion I believe that majority of users who ever clicked > > > "Search Files (grep)" gave up after few attempts. > > > > Indeed, "Search for files (grep)" menu option is not user friendly. > > In what way is it not user-friendly? It just invokes "M-x grep". User of Emacs are many, just Debian GNU/Linux reports 16000 users known from the popularity contest package. It is probably small percentage of overall number of users. Recently there was Emacs survey and they interviewed 7000 users. Emacs has many bugs but we do not get enough bugs reported. The ratio is reported bugs does not nearly correspond to number of users. From our view point it is user friendly. For me is user friendly if we place Emacs functions in the menu without their descriptions. From view point of many thousands of users it is not user friendly and means nothing. What does Recursive grep means? You have to know command line to know what it means. Majority of GNU/Linux users do not even use command line or terminals. 
We use it, but we are not representative number of users. "Search files recursively" would be better useful meaning "Recursive grep" is reserved for power users. It is user friendly for subset of users, not for majority of users. Message from my staff member who was using Emacs and went thoroughly through Tutorial: [18:34] Happiness > > I have one analysis question, without expectation: > Would you know how to search files by using Emacs? > Do you know what means "grep"? > Do you know what is "recursive grep"? > No need to look up, just tell me I have learned it but it might need me to repeat again as in the tutorial I was practising, not yet well captured these terms on memory But tutorial is not related to those terms. She cannot know what I mean possibly. She can write reports but would not, without special explanation, understand what means "Recursive grep". ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output
  2020-12-13 15:11 ` Eli Zaretskii
  2020-12-13 15:37 ` Jean Louis
@ 2020-12-13 20:17 ` Juri Linkov
  2020-12-14 16:15 ` Eli Zaretskii
  1 sibling, 1 reply; 57+ messages in thread
From: Juri Linkov @ 2020-12-13 20:17 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 44983, bugs, dgutov

>> > In my opinion I believe that majority of users who ever clicked
>> > "Search Files (grep)" gave up after few attempts.
>>
>> Indeed, "Search for files (grep)" menu option is not user friendly.
>
> In what way is it not user-friendly? It just invokes "M-x grep".

It's not friendly for users who don't know the syntax of the grep
command line.

OTOH, "Recursive grep" (rgrep) is easier to use, but its menu item text
is not clear to users who don't know what grep is. Maybe a better title
for 'rgrep' would be "Search text in files"?

^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2020-12-13 20:17 ` Juri Linkov @ 2020-12-14 16:15 ` Eli Zaretskii 2020-12-14 20:09 ` Dmitry Gutov 0 siblings, 1 reply; 57+ messages in thread From: Eli Zaretskii @ 2020-12-14 16:15 UTC (permalink / raw) To: Juri Linkov; +Cc: 44983, bugs, dgutov > From: Juri Linkov <juri@linkov.net> > Cc: bugs@gnu.support, 44983@debbugs.gnu.org, dgutov@yandex.ru > Date: Sun, 13 Dec 2020 22:17:23 +0200 > > >> > In my opinion I believe that majority of users who ever clicked > >> > "Search Files (grep)" gave up after few attempts. > >> > >> Indeed, "Search for files (grep)" menu option is not user friendly. > > > > In what way is it not user-friendly? It just invokes "M-x grep". > > It's not friendly for users who don't know syntax of grep command line. If someone wants to add a more user-friendly dialog for searching text (or perhaps reuse a dialog provided by the GUI toolkits), I think it will be welcome. It is not a simple job, though, because the dialog should allow access to most of the advanced features of Grep. > OTOH, "Recursive grep" (rgrep) is easier to use, but its menu item text > is not clear to users who don't know what is grep. Maybe a better title > for 'rgrep' would be "Search text in files"? FWIW, I don't think rgrep is significantly more user-friendly, so IMO it is not the model on which to base a better UI. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2020-12-14 16:15 ` Eli Zaretskii @ 2020-12-14 20:09 ` Dmitry Gutov 0 siblings, 0 replies; 57+ messages in thread From: Dmitry Gutov @ 2020-12-14 20:09 UTC (permalink / raw) To: Eli Zaretskii, Juri Linkov; +Cc: 44983, bugs On 14.12.2020 18:15, Eli Zaretskii wrote: >>>>> In my opinion I believe that majority of users who ever clicked >>>>> "Search Files (grep)" gave up after few attempts. >>>> Indeed, "Search for files (grep)" menu option is not user friendly. >>> In what way is it not user-friendly? It just invokes "M-x grep". >> It's not friendly for users who don't know syntax of grep command line. > If someone wants to add a more user-friendly dialog for searching text > (or perhaps reuse a dialog provided by the GUI toolkits), I think it > will be welcome. It is not a simple job, though, because the dialog > should allow access to most of the advanced features of Grep. Perhaps a better option would be to take advantage of the 'transient' package (currently in GNU ELPA, but unreleased). Here's an example of its UI (bottom window): https://camo.githubusercontent.com/f87497aec74dd0efee4ef78ba2b33b24d5535446b5d5cbef768653f4b945c38c/687474703a2f2f726561646d652e656d6163736169722e6d652f7472616e7369656e742e706e67 ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2020-12-09 19:17 ` Juri Linkov 2020-12-09 20:06 ` Dmitry Gutov 2020-12-09 21:43 ` Jean Louis @ 2020-12-24 20:33 ` Juri Linkov 2020-12-24 23:38 ` Dmitry Gutov 2 siblings, 1 reply; 57+ messages in thread From: Juri Linkov @ 2020-12-24 20:33 UTC (permalink / raw) To: Dmitry Gutov; +Cc: 44983

[-- Attachment #1: Type: text/plain, Size: 283 bytes --]

> Anyway, I found the shortest change needed to support ripgrep,
> and pushed to master.

Here is another patch needed to support rg because currently rg fails
when --color is used without a value. OTOH, in grep --color is the same as
--color=auto, so this is a win-win situation:

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: grep-color-auto.patch --]
[-- Type: text/x-diff, Size: 1938 bytes --]

diff --git a/lisp/progmodes/grep.el b/lisp/progmodes/grep.el
index 5dc99cc7e9..ef73dac4c0 100644
--- a/lisp/progmodes/grep.el
+++ b/lisp/progmodes/grep.el
@@ -79,7 +79,7 @@ grep-highlight-matches
 markers for highlighting and adds the --color option in front of
 any explicit grep options before starting the grep.
 
-When this option is `auto', grep uses `--color' to highlight
+When this option is `auto', grep uses `--color=auto' to highlight
 matches only when it outputs to a terminal (when `grep' is the last
 command in the pipe), thus avoiding the use of any potentially-harmful
 escape sequences when standard output goes to a file or pipe.
@@ -95,7 +95,7 @@ grep-highlight-matches
   :type '(choice (const :tag "Do not highlight matches with grep markers" nil)
                  (const :tag "Highlight matches with grep markers" t)
                  (const :tag "Use --color=always" always)
-                 (const :tag "Use --color" auto)
+                 (const :tag "Use --color=auto" auto)
                  (other :tag "Not Set" auto-detect))
   :set #'grep-apply-setting
   :version "22.1")
@@ -743,7 +743,7 @@ grep-compute-defaults
                                `(nil nil nil "--color" "x" ,(null-device))
                                nil 1)
                               (if (eq grep-highlight-matches 'always)
-                                  "--color=always" "--color"))
+                                  "--color=always" "--color=auto"))
                          "")
                      grep-options)))
     (unless grep-template
@@ -1000,7 +1000,7 @@ grep-expand-template
                              ((eq grep-highlight-matches 'always)
                               (push "--color=always" opts))
                              ((eq grep-highlight-matches 'auto)
-                              (push "--color" opts)))
+                              (push "--color=auto" opts)))
                            opts))
                    (excl . ,excl)
                    (dir . ,dir)

^ permalink raw reply related [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2020-12-24 20:33 ` Juri Linkov @ 2020-12-24 23:38 ` Dmitry Gutov 0 siblings, 0 replies; 57+ messages in thread From: Dmitry Gutov @ 2020-12-24 23:38 UTC (permalink / raw) To: Juri Linkov; +Cc: 44983 On 24.12.2020 22:33, Juri Linkov wrote: > Here is another patch needed to support rg because currently rg fails > when --color is used without a value. OTOH, in grep --color is the same as > --color=auto, so this is a win-win situation: Makes sense. ^ permalink raw reply [flat|nested] 57+ messages in thread
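The difference between the two color modes discussed above can be checked from a shell. This is only a minimal sketch, assuming a grep that accepts `--color=auto` and `--color=always` (GNU grep does); the sample text is made up for illustration. Because the output is captured through a pipe rather than written to a terminal, `--color=auto` should produce plain text, while `--color=always` should add escape sequences.

```shell
# Hypothetical sample input; the command substitutions below capture
# grep's output through a pipe, so stdout is not a terminal.
sample='needle in a haystack'

# --color=auto: matches are highlighted only when writing to a terminal.
auto_out=$(printf '%s\n' "$sample" | grep --color=auto needle)

# --color=always: escape sequences are emitted even into a pipe.
always_out=$(printf '%s\n' "$sample" | grep --color=always needle)

if [ "$auto_out" = "$sample" ]; then
  echo "auto: plain text"
fi
if [ "$always_out" != "$sample" ]; then
  echo "always: output contains escape sequences"
fi
```

This matches the wording of the docstring in the patch: with `auto`, escape sequences are avoided whenever standard output goes to a file or pipe.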
* bug#44983: Truncate long lines of grep output 2020-12-06 21:37 ` Dmitry Gutov 2020-12-06 21:54 ` Juri Linkov @ 2020-12-08 5:35 ` Richard Stallman 2020-12-08 19:15 ` Dmitry Gutov 1 sibling, 1 reply; 57+ messages in thread From: Richard Stallman @ 2020-12-08 5:35 UTC (permalink / raw) To: Dmitry Gutov; +Cc: 44983, juri [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] What is xref-search? Is this something I could use instead of cut, to truncate long lines of grep output? -- Dr Richard Stallman Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2020-12-08 5:35 ` Richard Stallman @ 2020-12-08 19:15 ` Dmitry Gutov 0 siblings, 0 replies; 57+ messages in thread From: Dmitry Gutov @ 2020-12-08 19:15 UTC (permalink / raw) To: rms; +Cc: 44983, juri On 08.12.2020 07:35, Richard Stallman wrote: > What is xref-search? We don't actually employ such a notion, but if I was asked to define it, it would be the act of using a command based on xref-matches-in-files (which see). The main thing that separates that from 'M-x grep', though, is the implementation approach. > Is this something I could use instead of cut, to truncate > long lines of grep output? You can use the commands based on it. And we could change the implementation of the aforementioned function that it would "cut" such long lines. In that case, the cutting could be performed using Emacs Lisp. 'cut' could still be used instead, though. Or 'ripgrep' could be instructed to do that. ^ permalink raw reply [flat|nested] 57+ messages in thread
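As a rough illustration of the 'cut' approach mentioned above, here is a sketch assuming POSIX `cut` and `awk`. The file name and the line contents are invented for the example; the 200-character limit is the value used in the workaround from bug#44941. Note that `cut` counts from the start of the line, so the "file:line:" prefix counts toward the limit.

```shell
# Build one fake grep hit: a "file:line:" prefix (hypothetical file name)
# followed by 5000 characters of text, like a minified asset on one line.
long_hit=$(awk 'BEGIN {
  printf "assets/app.min.js:1:"
  for (i = 0; i < 5000; i++) printf "x"
  print ""
}')

# Truncate every output line to at most 200 characters.
short_hit=$(printf '%s\n' "$long_hit" | cut -c -200)

echo "before: ${#long_hit} characters, after: ${#short_hit} characters"
```

For the ripgrep route, ripgrep has a `--max-columns` option that limits the length of the lines it prints; whether that is what would be used here is not settled in this thread.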
* bug#44983: Truncate long lines of grep output 2020-12-02 9:35 ` Juri Linkov 2020-12-02 10:28 ` Eli Zaretskii @ 2022-04-29 11:39 ` Lars Ingebrigtsen 2022-04-29 12:22 ` Eli Zaretskii ` (2 more replies) 1 sibling, 3 replies; 57+ messages in thread From: Lars Ingebrigtsen @ 2022-04-29 11:39 UTC (permalink / raw) To: Juri Linkov; +Cc: 44983, dgutov Juri Linkov <juri@linkov.net> writes: > Maybe instead of using font-lock to hide long parts > of grep lines, it would be better to do the same > directly in compilation-filter/grep-filter? I now have a rough patch that does this, but the problem is that even if I splat a "..." display over the text, font-lock seems to insist on going over the data anyway, so the display is still dog slow. I thought I remembered there was a way to say to font-lock "ignore this bit of the buffer", but I can't find it now. Do I misremember? -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2022-04-29 11:39 ` Lars Ingebrigtsen @ 2022-04-29 12:22 ` Eli Zaretskii 2022-04-29 12:41 ` Lars Ingebrigtsen 2022-04-29 16:02 ` Dmitry Gutov 2022-04-29 17:15 ` Juri Linkov 2 siblings, 1 reply; 57+ messages in thread From: Eli Zaretskii @ 2022-04-29 12:22 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: juri, 44983, dgutov > From: Lars Ingebrigtsen <larsi@gnus.org> > Cc: Eli Zaretskii <eliz@gnu.org>, 44983@debbugs.gnu.org, dgutov@yandex.ru > Date: Fri, 29 Apr 2022 13:39:41 +0200 > > I thought I remembered there was a way to say to font-lock "ignore this > bit of the buffer", but I can't find it now. Do I misremember? Make the text invisible? If that doesn't help either, I suggest to profile the code, because it could be the slow display is due to something else. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2022-04-29 12:22 ` Eli Zaretskii @ 2022-04-29 12:41 ` Lars Ingebrigtsen 2022-04-29 13:08 ` Eli Zaretskii 0 siblings, 1 reply; 57+ messages in thread From: Lars Ingebrigtsen @ 2022-04-29 12:41 UTC (permalink / raw) To: Eli Zaretskii; +Cc: juri, 44983, dgutov Eli Zaretskii <eliz@gnu.org> writes: >> I thought I remembered there was a way to say to font-lock "ignore this >> bit of the buffer", but I can't find it now. Do I misremember? > > Make the text invisible? The text is covered by a display property, which should be much the same thing. > If that doesn't help either, I suggest to profile the code, because it > could be the slow display is due to something else. Hm, yes... even if I disable font-lock-mode, it's still slow. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2022-04-29 12:41 ` Lars Ingebrigtsen @ 2022-04-29 13:08 ` Eli Zaretskii 2022-04-30 9:24 ` Lars Ingebrigtsen 0 siblings, 1 reply; 57+ messages in thread From: Eli Zaretskii @ 2022-04-29 13:08 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: juri, 44983, dgutov > From: Lars Ingebrigtsen <larsi@gnus.org> > Cc: juri@linkov.net, 44983@debbugs.gnu.org, dgutov@yandex.ru > Date: Fri, 29 Apr 2022 14:41:49 +0200 > > Eli Zaretskii <eliz@gnu.org> writes: > > >> I thought I remembered there was a way to say to font-lock "ignore this > >> bit of the buffer", but I can't find it now. Do I misremember? > > > > Make the text invisible? > > The text is covered by a display property, which should be much the same > thing. Not really, it isn't. The effect on the glass is the same, but the effect on the display code is different. > > If that doesn't help either, I suggest to profile the code, because it > > could be the slow display is due to something else. > > Hm, yes... even if I disable font-lock-mode, it's still slow. Then I think a profile should tell something interesting. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2022-04-29 13:08 ` Eli Zaretskii @ 2022-04-30 9:24 ` Lars Ingebrigtsen 2022-04-30 9:36 ` Lars Ingebrigtsen 0 siblings, 1 reply; 57+ messages in thread From: Lars Ingebrigtsen @ 2022-04-30 9:24 UTC (permalink / raw) To: Eli Zaretskii; +Cc: juri, 44983, dgutov

Eli Zaretskii <eliz@gnu.org> writes:

>> > If that doesn't help either, I suggest to profile the code, because it
>> > could be the slow display is due to something else.
>>
>> Hm, yes... even if I disable font-lock-mode, it's still slow.
>
> Then I think a profile should tell something interesting.

Turns out to be font lock anyway:

        9152  88% - redisplay_internal (C function)
        9148  88%  - jit-lock-function
        9148  88%   - jit-lock-fontify-now
        9148  88%    - jit-lock--run-functions
        9144  87%     - run-hook-wrapped
        9144  87%      - #<compiled -0x1568eefe49e247c3>
        9144  87%       - font-lock-fontify-region
        9144  87%        - font-lock-default-fontify-region
        9144  87%           font-lock-fontify-keywords-region

Apparently disabling font-lock-mode in the *grep* buffer wasn't
sufficient to make it go away for some reason or other. Disabling
global-font-lock-mode makes the problem go away.

And using invisible text instead of a display property makes no
difference -- font-lock seems to really want to do font locking on
ever-growing lines that are inserted into the buffer by the process.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2022-04-30 9:24 ` Lars Ingebrigtsen @ 2022-04-30 9:36 ` Lars Ingebrigtsen 2022-04-30 10:15 ` Eli Zaretskii 0 siblings, 1 reply; 57+ messages in thread From: Lars Ingebrigtsen @ 2022-04-30 9:36 UTC (permalink / raw) To: Eli Zaretskii; +Cc: juri, 44983, dgutov

I've instrumented some functions to try to see what's going on. I've
set things up so that grep lines that are longer than 200 chars are
invisible starting at the 200th character.

While the grep is running, `jit-lock-fontify-now' is called repeatedly
and takes longer time each time, but with the same region:

Fontifying *grep* 392-1892
Fontifying *grep* 392-1892
Fontifying *grep* 392-1892

392 is the start of the line, and 1892 is in the invisible portion of
the line. That's 1500 characters, so it should be fast -- but perhaps
it's extending it to the end of the line anyway?

But before I start trying to debug that, I'm wondering: Why is
`jit-lock-fontify-now' called at all here? There have been no display
changes -- the text was inserted, but as invisible text, so no font
locking should be necessary.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2022-04-30 9:36 ` Lars Ingebrigtsen @ 2022-04-30 10:15 ` Eli Zaretskii 2022-04-30 11:04 ` Lars Ingebrigtsen 0 siblings, 1 reply; 57+ messages in thread From: Eli Zaretskii @ 2022-04-30 10:15 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: juri, 44983, dgutov > From: Lars Ingebrigtsen <larsi@gnus.org> > Cc: juri@linkov.net, 44983@debbugs.gnu.org, dgutov@yandex.ru > Date: Sat, 30 Apr 2022 11:36:37 +0200 > > But before I start trying to debug that, I'm wondering: Why is > `jit-lock-fontify-now' called at all here? There have been no display > changes -- the text was inserted, but as invisible text, so no font > locking should be necessary. Are you saying that buffer position 392 was in invisible text? If so, jit-lock-fontify-now should not have been called. But if position 392 is visible, then what you see is expected: the buffer text has changed, and therefore redisplay will arrange to redisplay the buffer. Part of redisplaying the buffer is making sure the text that might wind up on display is fontified. Which part will actually be on display can only be known _after_ the text is fontified (because fontification can change faces, and thus affect what's visible in the window). So we always fontify the 500-character chunk, per jit-lock.el's defaults. Did I answer your question? ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2022-04-30 10:15 ` Eli Zaretskii @ 2022-04-30 11:04 ` Lars Ingebrigtsen 0 siblings, 0 replies; 57+ messages in thread From: Lars Ingebrigtsen @ 2022-04-30 11:04 UTC (permalink / raw) To: Eli Zaretskii; +Cc: juri, 44983, dgutov Eli Zaretskii <eliz@gnu.org> writes: > Part of redisplaying the buffer is making sure the text that might > wind up on display is fontified. Which part will actually be on > display can only be known _after_ the text is fontified (because > fontification can change faces, and thus affect what's visible in the > window). Yeah, that's true -- font-lock might end up making the text visible, even, I guess? But then we're being slightly inconsistent -- if the entire region is invisible, then we don't let font-lock do anything, you said. But it probably doesn't really matter much. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2022-04-29 11:39 ` Lars Ingebrigtsen 2022-04-29 12:22 ` Eli Zaretskii @ 2022-04-29 16:02 ` Dmitry Gutov 2022-04-30 9:40 ` Lars Ingebrigtsen 2022-04-29 17:15 ` Juri Linkov 2 siblings, 1 reply; 57+ messages in thread From: Dmitry Gutov @ 2022-04-29 16:02 UTC (permalink / raw) To: Lars Ingebrigtsen, Juri Linkov; +Cc: 44983 On 29.04.2022 14:39, Lars Ingebrigtsen wrote: > Juri Linkov<juri@linkov.net> writes: > >> Maybe instead of using font-lock to hide long parts >> of grep lines, it would be better to do the same >> directly in compilation-filter/grep-filter? > I now have a rough patch that does this, but the problem is that even if > I splat a "..." display over the text, font-lock seems to insist on > going over the data anyway, so the display is still dog slow. > > I thought I remembered there was a way to say to font-lock "ignore this > bit of the buffer", but I can't find it now. Do I misremember? FWIW, this is more or less solved for Xref output buffers these days. And the solution is based on the 'invisible' property. See bug#46859. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2022-04-29 16:02 ` Dmitry Gutov @ 2022-04-30 9:40 ` Lars Ingebrigtsen 2022-04-30 9:56 ` Lars Ingebrigtsen 0 siblings, 1 reply; 57+ messages in thread From: Lars Ingebrigtsen @ 2022-04-30 9:40 UTC (permalink / raw) To: Dmitry Gutov; +Cc: 44983, Juri Linkov Dmitry Gutov <dgutov@yandex.ru> writes: > FWIW, this is more or less solved for Xref output buffers these > days. And the solution is based on the 'invisible' property. Skimming the code there, it seems like xref just gets a list that it inserts into the buffer, and then applies the invisibility spec to the long lines? That's a bit different from what compilation-mode/grep is doing, where a process inserts text. I.e., invisible text, in general, works fine, but there's some bad interaction between processes/invisible/font-lock somewhere. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2022-04-30 9:40 ` Lars Ingebrigtsen @ 2022-04-30 9:56 ` Lars Ingebrigtsen 2022-04-30 10:09 ` Eli Zaretskii 0 siblings, 1 reply; 57+ messages in thread From: Lars Ingebrigtsen @ 2022-04-30 9:56 UTC (permalink / raw) To: Dmitry Gutov; +Cc: 44983, Juri Linkov

This is the cause of the problem:

(defvar grep-mode-font-lock-keywords
   '(;; Command output lines.
     (": \\(.+\\): \\(?:Permission denied\\|No such \\(?:file or directory\\|device or address\\)\\)$"
      1 grep-error-face)

With that removed, everything's nice and fast. Limiting that .+ to 200
characters also makes things fast:

diff --git a/lisp/progmodes/grep.el b/lisp/progmodes/grep.el
index 17905dec2e..7620536b4b 100644
--- a/lisp/progmodes/grep.el
+++ b/lisp/progmodes/grep.el
@@ -456,7 +456,7 @@ grep-find-abbreviate-properties
 
 (defvar grep-mode-font-lock-keywords
    '(;; Command output lines.
-     (": \\(.+\\): \\(?:Permission denied\\|No such \\(?:file or directory\\|device or address\\)\\)$"
+     (": \\(.\\{,200\\}\\): \\(?:Permission denied\\|No such \\(?:file or directory\\|device or address\\)\\)$"
       1 grep-error-face)
      ;; remove match from grep-regexp-alist before fontifying
      ("^Grep[/a-zA-Z]* started.*"

But I guess the real question here is still why we're font-locking
invisible text.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

^ permalink raw reply related [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2022-04-30 9:56 ` Lars Ingebrigtsen @ 2022-04-30 10:09 ` Eli Zaretskii 2022-04-30 10:59 ` Lars Ingebrigtsen 2022-04-30 11:02 ` Lars Ingebrigtsen 0 siblings, 2 replies; 57+ messages in thread From: Eli Zaretskii @ 2022-04-30 10:09 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: dgutov, 44983, juri > From: Lars Ingebrigtsen <larsi@gnus.org> > Date: Sat, 30 Apr 2022 11:56:11 +0200 > Cc: 44983@debbugs.gnu.org, Juri Linkov <juri@linkov.net> > > But I guess the real question here is still why we're font-locking > invisible text. We are not. The display engine will never call jit-lock on a region that starts in invisible text. But a region that starts in visible text can end in invisible text, and font-lock doesn't pay attention to invisibility spec, AFAIR, it just looks at the buffer text disregarding everything else. For me, the more important question is: why the problem didn't disappear when you turned off font-lock-mode in the offending buffer. And I think I know why: you need to turn off jit-lock-mode as well. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2022-04-30 10:09 ` Eli Zaretskii @ 2022-04-30 10:59 ` Lars Ingebrigtsen 2022-04-30 11:02 ` Lars Ingebrigtsen 1 sibling, 0 replies; 57+ messages in thread From: Lars Ingebrigtsen @ 2022-04-30 10:59 UTC (permalink / raw) To: Eli Zaretskii; +Cc: dgutov, 44983, juri I've now implemented the line-hiding in Emacs 29. Grepping for "Grenadine" in the Emacs tree now takes approx two seconds, while it takes about a minute in Emacs 28 (and Emacs is unusable while it's running). -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2022-04-30 10:09 ` Eli Zaretskii 2022-04-30 10:59 ` Lars Ingebrigtsen @ 2022-04-30 11:02 ` Lars Ingebrigtsen 2022-04-30 11:12 ` Eli Zaretskii 1 sibling, 1 reply; 57+ messages in thread From: Lars Ingebrigtsen @ 2022-04-30 11:02 UTC (permalink / raw) To: Eli Zaretskii; +Cc: dgutov, 44983, juri Eli Zaretskii <eliz@gnu.org> writes: > We are not. The display engine will never call jit-lock on a region > that starts in invisible text. But a region that starts in visible > text can end in invisible text, and font-lock doesn't pay attention to > invisibility spec, AFAIR, it just looks at the buffer text > disregarding everything else. Yes, that's correct, I think. But shouldn't it be smarter here? That is, the display engine does know that all the text it inserted was invisible, so calling jit-lock again (with the same parameters as previous time) is futile. However, this is probably not something many modes do, so putting more effort into optimising this is probably not worth it. > For me, the more important question is: why the problem didn't > disappear when you turned off font-lock-mode in the offending buffer. > And I think I know why: you need to turn off jit-lock-mode as well. Probably. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2022-04-30 11:02 ` Lars Ingebrigtsen @ 2022-04-30 11:12 ` Eli Zaretskii 0 siblings, 0 replies; 57+ messages in thread From: Eli Zaretskii @ 2022-04-30 11:12 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: dgutov, 44983, juri > From: Lars Ingebrigtsen <larsi@gnus.org> > Cc: dgutov@yandex.ru, 44983@debbugs.gnu.org, juri@linkov.net > Date: Sat, 30 Apr 2022 13:02:59 +0200 > > Eli Zaretskii <eliz@gnu.org> writes: > > > We are not. The display engine will never call jit-lock on a region > > that starts in invisible text. But a region that starts in visible > > text can end in invisible text, and font-lock doesn't pay attention to > > invisibility spec, AFAIR, it just looks at the buffer text > > disregarding everything else. > > Yes, that's correct, I think. But shouldn't it be smarter here? That > is, the display engine does know that all the text it inserted was > invisible No, it doesn't know that. The display engine handles the 'fontified' property first, and the invisible property only after that. Even more importantly, the display engine handles these properties only when it gets to a character with that property, so it's enough that we have a single character with no invisible property that needs to be fontified, to have the display engine invoke jit-lock on a chunk of text starting with that visible character. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2022-04-29 11:39 ` Lars Ingebrigtsen 2022-04-29 12:22 ` Eli Zaretskii 2022-04-29 16:02 ` Dmitry Gutov @ 2022-04-29 17:15 ` Juri Linkov 2022-04-30 0:27 ` Dmitry Gutov 2 siblings, 1 reply; 57+ messages in thread From: Juri Linkov @ 2022-04-29 17:15 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: 44983, dgutov

>> Maybe instead of using font-lock to hide long parts
>> of grep lines, it would be better to do the same
>> directly in compilation-filter/grep-filter?
>
> I now have a rough patch that does this, but the problem is that even if
> I splat a "..." display over the text, font-lock seems to insist on
> going over the data anyway, so the display is still dog slow.
>
> I thought I remembered there was a way to say to font-lock "ignore this
> bit of the buffer", but I can't find it now. Do I misremember?

I don't remember such font-lock text property, but now I have no problems
when long lines are hidden initially with:

```
(add-hook 'xref-after-update-hook
          (lambda ()
            (setq-local outline-regexp (if (eq xref-file-name-display 'abs)
                                           "/" "[^ 0-9]")
                        outline-default-state 1
                        outline-default-rules '(subtree-has-long-lines)
                        outline-default-long-line 1000)
            (outline-minor-mode +1)))
```

^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2022-04-29 17:15 ` Juri Linkov @ 2022-04-30 0:27 ` Dmitry Gutov 2022-05-01 17:14 ` Juri Linkov 0 siblings, 1 reply; 57+ messages in thread From: Dmitry Gutov @ 2022-04-30 0:27 UTC (permalink / raw) To: Juri Linkov, Lars Ingebrigtsen; +Cc: 44983 On 29.04.2022 20:15, Juri Linkov wrote: > I don't remember such font-lock text property, but now I have no problems > when long lines are hidden initially with: When you apply this, do you disable the existing mechanism for dealing with long lines? By setting 'xref-truncation-width' to nil. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2022-04-30 0:27 ` Dmitry Gutov @ 2022-05-01 17:14 ` Juri Linkov 0 siblings, 0 replies; 57+ messages in thread From: Juri Linkov @ 2022-05-01 17:14 UTC (permalink / raw) To: Dmitry Gutov; +Cc: Lars Ingebrigtsen, 44983 >> I don't remember such font-lock text property, but now I have no problems >> when long lines are hidden initially with: > > When you apply this, do you disable the existing mechanism for dealing with > long lines? By setting 'xref-truncation-width' to nil. Oops, I forgot about xref-truncation-width. Maybe it's actually xref-truncation-width that fixed the problem. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#44983: Truncate long lines of grep output 2020-12-01 15:02 ` Dmitry Gutov 2020-12-01 16:09 ` Eli Zaretskii @ 2020-12-01 20:34 ` Juri Linkov 1 sibling, 0 replies; 57+ messages in thread From: Juri Linkov @ 2020-12-01 20:34 UTC (permalink / raw) To: Dmitry Gutov; +Cc: 44983 >>> Is the same problem exhibited by commands using the Xref UI? I don't >>> remember seeing it, but of course our projects can be very different. >> No difference from grep, Xref output has the same problem. > > Perhaps (setq truncate-lines t) could help in that case? I customized truncate-lines to t long ago, and still this doesn't help to improve performance on long lines in grep output. > Then the lines would be cut at the window width, as you suggest below. > >> This will avoid the need of using such workarounds as in bug#44941: >> grep -a "$@" | cut -c -200 > > That might cut filenames unnecessary. Even when those a long, we need them > in their entirety. > > The Grep results parsing code could be changed to only take the first XY > characters of each line, though. The proposed patch doesn't cut filenames, it hides only endings of long lines. But still performance is not much better on very long lines. ^ permalink raw reply [flat|nested] 57+ messages in thread
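The concern about cutting file names can be illustrated with a small filter that limits only the text after the "file:line:" prefix. This is a sketch only, and it is not what the proposed patch does (the patch hides text inside Emacs with font-lock). The function name `truncate_hits`, the default limit and the sample path are hypothetical, and the filter assumes `grep -n` style output with file names that contain no colon.

```shell
# truncate_hits LIMIT: read "file:line:text" lines on stdin and keep at most
# LIMIT characters of the text part, leaving the "file:line:" prefix intact.
truncate_hits() {
  awk -v limit="${1:-200}" '{
    # Locate the end of the "file:line:" prefix (assumes no ":" in the file name).
    if (match($0, /^[^:]+:[0-9]+:/)) {
      prefix = substr($0, 1, RLENGTH)
      text = substr($0, RLENGTH + 1)
      print prefix substr(text, 1, limit)
    } else {
      print    # not a hit line, pass it through unchanged
    }
  }'
}

# Hypothetical hit with a long path and a long match line.
hit='a/very/long/path/to/some/minified/bundle.min.js:12:aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'
printf '%s\n' "$hit" | truncate_hits 10
```

Compared with a plain `cut -c -200`, the limit here applies to the matched text alone, so a long path is never shortened.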