bug#45320: 27.1; diff-refine performance regression

unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed

* bug#45320: 27.1; diff-refine performance regression
@ 2020-12-19  8:12 Achim Gratz
  2022-06-07 11:44 ` Lars Ingebrigtsen
  0 siblings, 1 reply; 4+ messages in thread
From: Achim Gratz @ 2020-12-19  8:12 UTC (permalink / raw)
  To: 45320

In GNU Emacs 27.1 (build 1, x86_64-suse-linux-gnu, GTK+ Version 3.24.22, cairo version 1.16.0)

The following change is highly problematic:

--8<---------------cut here---------------start------------->8---
** Diff mode

*** Hunks are now automatically refined by font-lock.
To disable refinement, set the new user option 'diff-refine' to nil.
To get back the old behavior where hunks are refined as you navigate
through a diff, set 'diff-refine' to the symbol 'navigate'.
--8<---------------cut here---------------end--------------->8---

The refinement is run synchronously and can't be interrupted.

The used algorithm clearly has superlinear complexity with the size of
the diff hunk.  I frequently use diff-mode for comparison of log files
from compilations, which routinely creates large hunks and sometimes
very large ones.  In mild cases that stalls Emacs for a few seconds
upwards to a minute, but I'm encountering larger diff hunks regularly
that do not complete refinement after more than an hour of burning
through 100% CPU (and not a slow one) when I try.  Due to the way this
is implemented, the refinement can not be stopped from within Emacs and
stopping it from the outside has a high propensity of killing Emacs and
taking all unsaved work with it.

Auto-refinement of diff hunks should

1. be stopped

a) after a customizable time threshold (personally I'd be OK with
   something like 1s, but other folks may have less patience),

b) when the user tries to move point (even small delays are annoying
   when you really just want to scroll through the file),

c) when C-g or the corresponding signal is issued.

2. not be attempted at all

a) when the hunk size exceeds a customizable threshold,

b) when the diff in question has run into one of the performance
   thresholds multiple times already.

Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

SD adaptation for Waldorf Blofeld V1.15B11:
http://Synth.Stromeko.net/Downloads.html#WaldorfSDada

^ permalink raw reply	[flat|nested] 4+ messages in thread

* bug#45320: 27.1; diff-refine performance regression
  2020-12-19  8:12 bug#45320: 27.1; diff-refine performance regression Achim Gratz
@ 2022-06-07 11:44 ` Lars Ingebrigtsen
  2022-06-08  6:50   ` Juri Linkov
  2022-06-13 15:11   ` Achim Gratz
  0 siblings, 2 replies; 4+ messages in thread
From: Lars Ingebrigtsen @ 2022-06-07 11:44 UTC (permalink / raw)
  To: Achim Gratz; +Cc: 45320

Achim Gratz <Stromeko@nexgo.de> writes:

> *** Hunks are now automatically refined by font-lock.
> To disable refinement, set the new user option 'diff-refine' to nil.
> To get back the old behavior where hunks are refined as you navigate
> through a diff, set 'diff-refine' to the symbol 'navigate'.
>
> The refinement is run synchronously and can't be interrupted.
>
> The used algorithm clearly has superlinear complexity with the size of
> the diff hunk.  I frequently use diff-mode for comparison of log files
> from compilations, which routinely creates large hunks and sometimes
> very large ones.

(I'm going through old bug reports that unfortunately weren't resolved
at the time.)

This isn't the common use case for showing diffs, so it sounds like you
should just alter the 'diff-refine' user option as described above.

> Auto-refinement of diff hunks should
>
> 1. be stopped
>
> a) after a customizable time threshold (personally I'd be OK with
>    something like 1s, but other folks may have less patience),
>
> b) when the user tries to move point (even small delays are annoying
>    when you really just want to scroll through the file),
>
> c) when C-g or the corresponding signal is issued.

The problem with  of these is that font-locking is done from
redisplay, and if you `C-g' something from those functions, it'll just
try to restart the fontification.  But...  I guess we could slap
something around `diff--font-lock-refined' to change the value
(buffer-locally) of diff-refine if the user hits `C-g' while it's
running?

> 2. not be attempted at all
>
> a) when the hunk size exceeds a customizable threshold,

That should be possible...

> b) when the diff in question has run into one of the performance
>    thresholds multiple times already.

I'm not sure that's practical.

Anybody have any other ideas here?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





^ permalink raw reply	[flat|nested] 4+ messages in thread

* bug#45320: 27.1; diff-refine performance regression
  2022-06-07 11:44 ` Lars Ingebrigtsen
@ 2022-06-08  6:50   ` Juri Linkov
  2022-06-13 15:11   ` Achim Gratz
  1 sibling, 0 replies; 4+ messages in thread
From: Juri Linkov @ 2022-06-08  6:50 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 45320, Achim Gratz

>> 2. not be attempted at all
>>
>> a) when the hunk size exceeds a customizable threshold,
>
> That should be possible...

This looks like the recently added outline-default-rules
(that also can be used for diff-mode outlines, but not for diff-refine)
with such possible values:

  - `subtree-has-long-lines' to only show the heading branches when
    long lines are detected in its subtree (see
    `outline-default-long-line' for the definition of long lines).

  - `subtree-is-long' to only show the heading branches when its
    subtree contains more than `outline-default-line-count' lines.





^ permalink raw reply	[flat|nested] 4+ messages in thread

* bug#45320: 27.1; diff-refine performance regression
  2022-06-07 11:44 ` Lars Ingebrigtsen
  2022-06-08  6:50   ` Juri Linkov
@ 2022-06-13 15:11   ` Achim Gratz
  1 sibling, 0 replies; 4+ messages in thread
From: Achim Gratz @ 2022-06-13 15:11 UTC (permalink / raw)
  To: 45320

Lars Ingebrigtsen writes:
> This isn't the common use case for showing diffs, so it sounds like you
> should just alter the 'diff-refine' user option as described above.

The problem is caused by a change in defaults and whether or not the use
case seems common to you is besides the point.  The default used to be
no refinement and the change in default was done on the assumption that
the performance hit would generally not be bothersome.  As long as (GNU)
grep does have this clearly superlinear complexity characteristic that
simply isn't true (just for entertainment I've let Emacs hang while grep
finishes the refinement several times and it is not uncommon that I see
grep taking several minutes.  I think the longest I've kept it running
was way over one hour.

>> Auto-refinement of diff hunks should
>>
>> 1. be stopped
>>
>> a) after a customizable time threshold (personally I'd be OK with
>>    something like 1s, but other folks may have less patience),
>>
>> b) when the user tries to move point (even small delays are annoying
>>    when you really just want to scroll through the file),
>>
>> c) when C-g or the corresponding signal is issued.
>
> The problem with  of these is that font-locking is done from
> redisplay, and if you `C-g' something from those functions, it'll just
> try to restart the fontification.  But...  I guess we could slap
> something around `diff--font-lock-refined' to change the value
> (buffer-locally) of diff-refine if the user hits `C-g' while it's
> running?

My go-to solution is to send USR2 to the emacs process, but that's
really not something I should ever need to do with an Emacs in standard
configuration.

>> 2. not be attempted at all
>>
>> a) when the hunk size exceeds a customizable threshold,
>
> That should be possible...

The hunk size only factors into this due to the superlinear complexity in
the refinement.  It might be tricky to figure out a size that works
across all inputs since the structure of the hunk is also important.

>> b) when the diff in question has run into one of the performance
>>    thresholds multiple times already.
>
> I'm not sure that's practical.
>
> Anybody have any other ideas here?

Ideally all such external processes should run asynchronously (as a
"future") with a timeout.  If there is a timeout event, the user presses
C-g or point is moved to a different hunk before the result gets
delivered the process simply gets canceled, otherwise there is another
round of redisplay that is using the result.  I guess that caching previous
results would be quite useful at least in this particular case, so the
details of the implementation could become quite involved.

Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

SD adaptation for Waldorf rackAttack V1.04R1:
http://Synth.Stromeko.net/Downloads.html#WaldorfSDada

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2022-06-13 15:11 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-19  8:12 bug#45320: 27.1; diff-refine performance regression Achim Gratz
2022-06-07 11:44 ` Lars Ingebrigtsen
2022-06-08  6:50   ` Juri Linkov
2022-06-13 15:11   ` Achim Gratz

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).