unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#56682: Fix the long lines font locking related slowdowns
@ 2022-07-21 18:00 Gregory Heytings
  2022-07-21 18:04 ` Eli Zaretskii
  2022-08-01 16:34 ` Eli Zaretskii
  0 siblings, 2 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-07-21 18:00 UTC (permalink / raw)
  To: 56682


New bug number and thread to discuss the next chapter of the long lines 
slowdowns.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-21 18:00 bug#56682: Fix the long lines font locking related slowdowns Gregory Heytings
@ 2022-07-21 18:04 ` Eli Zaretskii
  2022-07-22 10:16   ` Gregory Heytings
  2022-07-22 23:25   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-01 16:34 ` Eli Zaretskii
  1 sibling, 2 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-21 18:04 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682

> Date: Thu, 21 Jul 2022 18:00:37 +0000
> From: Gregory Heytings <gregory@heytings.org>
> 
> New bug number and thread to discuss the next chapter of the long lines 
> slowdowns.

Thanks.

FTR, I repeat here the recipe from
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=56393#425:

  emacs -Q
  C-x C-f long-line.xml RET

Now, do NOT disable font-lock, and wait for Emacs to say "Valid" in
the mode line (to get nXML mode out of the way).  Then:

  M-x toggle-truncate-lines RET

Now simple cursor motion commands that use redisplay optimizations are
fast, but commands that cause more thorough redisplay are as slow as
on master.  As a simple example, try just "M-x" and wait until the
"M-x" prompt appears in the minibuffer -- here it takes much longer,
basically as long as the version on master.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-21 18:04 ` Eli Zaretskii
@ 2022-07-22 10:16   ` Gregory Heytings
  2022-07-22 14:11     ` Eli Zaretskii
                       ` (2 more replies)
  2022-07-22 23:25   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 3 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-07-22 10:16 UTC (permalink / raw)
  To: 56682; +Cc: Eli Zaretskii


The first step of the font locking related improvements was easier than I 
thought.

The following two commands:

1. C-x C-f dictionary.json RET y

2. C-e

on my laptop, with Emacs from master from a week ago, take respectively 
150 seconds and 40 seconds.  With the improvements on master, 1 is 
instantaneous but 2 still takes about 5 seconds.  Now, with the changes in 
the feature/long-lines-and-font-locking branch, both are instantaneous. 
The price of that speedup is that some portions of the buffer will be 
mis-highlighted, which is unavoidable.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-22 10:16   ` Gregory Heytings
@ 2022-07-22 14:11     ` Eli Zaretskii
  2022-07-22 14:44       ` Lars Ingebrigtsen
  2022-07-22 14:51     ` Eli Zaretskii
  2022-07-23  6:10     ` Eli Zaretskii
  2 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-22 14:11 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682

> Date: Fri, 22 Jul 2022 10:16:52 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: Eli Zaretskii <eliz@gnu.org>
> 
> 
> The first step of the font locking related improvements was easier than I 
> thought.
> 
> The following two commands:
> 
> 1. C-x C-f dictionary.json RET y
> 
> 2. C-e
> 
> on my laptop, with Emacs from master from a week ago, take respectively 
> 150 seconds and 40 seconds.  With the improvements on master, 1 is 
> instantaneous but 2 still takes about 5 seconds.  Now, with the changes in 
> the feature/long-lines-and-font-locking branch, both are instantaneous. 
> The price of that speedup is that some portions of the buffer will be 
> mis-highlighted, which is unavoidable.

Thanks.  This is indeed the immediate idea for that aspect, but I
wonder what developers of some major modes which widen the buffer
would say about this.  Some of them were very much against such
restrictions in the past.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-22 14:11     ` Eli Zaretskii
@ 2022-07-22 14:44       ` Lars Ingebrigtsen
  2022-07-25 20:59         ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Lars Ingebrigtsen @ 2022-07-22 14:44 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, Gregory Heytings, Stefan Monnier

Eli Zaretskii <eliz@gnu.org> writes:

> Thanks.  This is indeed the immediate idea for that aspect, but I
> wonder what developers of some major modes which widen the buffer
> would say about this.  Some of them were very much against such
> restrictions in the past.

This reminds me of something I meant to mention -- Stefan M. once
proposed that there should be two kinds of narrowing (I think?).  The
first is the one that the user sets with `C-x n n', which says "the user
is only interested in this bit of the buffer", but programs are
"allowed" to remove that restriction when doing stuff (like font
locking).  The second type should be a strict one, where modes are not
allowed to widen the region.

Looking briefly at Gregory's new branch, it seems like that (sort of)
introduces this idea, but in a non-explicit way (i.e., by having an
inhibit-widen variable).

And, yes, some major mode authors were against this idea, but I think it
sounds like a sound idea.  And perhaps it'd make sense to implement it
like Stefan suggested, instead of `inhibit-widen'.
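
For illustration, the first kind of narrowing described above can be
sketched in Emacs Lisp as follows.  This is a minimal sketch, not code
from the branch: `my/count-lines-whole-buffer' is a hypothetical name,
and only the standard `save-restriction' and `widen' are used.  It
shows a program temporarily lifting a restriction set by the user
(e.g. with `C-x n n') and restoring it afterwards:

```elisp
;; Hypothetical helper, for illustration only.
(defun my/count-lines-whole-buffer ()
  "Count the lines of the whole buffer, ignoring any user narrowing.
The narrowing in effect on entry is restored on exit."
  (save-restriction
    ;; Remove the restriction for the duration of this form only.
    (widen)
    (count-lines (point-min) (point-max))))
```

Under the second, strict kind of narrowing, the `widen' call in such a
function would presumably not be allowed to take effect.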

I've added Stefan to the CCs; perhaps he has some comments.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-22 10:16   ` Gregory Heytings
  2022-07-22 14:11     ` Eli Zaretskii
@ 2022-07-22 14:51     ` Eli Zaretskii
  2022-07-22 15:06       ` Eli Zaretskii
  2022-07-23  6:10     ` Eli Zaretskii
  2 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-22 14:51 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682

> Date: Fri, 22 Jul 2022 10:16:52 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: Eli Zaretskii <eliz@gnu.org>
> 
> 
> on my laptop, with Emacs from master from a week ago, take respectively 
> 150 seconds and 40 seconds.  With the improvements on master, 1 is 
> instantaneous but 2 still takes about 5 seconds.  Now, with the changes in 
> the feature/long-lines-and-font-locking branch, both are instantaneous. 
> The price of that speedup is that some portions of the buffer will be 
> mis-highlighted, which is unavoidable.

The assertion below can now be violated:

      if (it->narrowed_begv)
	{
	  record_unwind_protect (unwind_narrowed_begv, Fpoint_min ());
	  record_unwind_protect (unwind_narrowed_zv, Fpoint_max ());
	  SET_BUF_BEGV (current_buffer, it->narrowed_begv);
	  SET_BUF_ZV (current_buffer, it->narrowed_zv);
	  specbind (Qinhibit_widen, Qt);
	}

      val = Vfontification_functions;
      specbind (Qfontification_functions, Qnil);

      eassert (it->end_charpos == ZV);  <<<<<<<<<<<<<<<<<

because of the "narrowing".  (I actually saw this assertion violation
once on the branch, but I cannot reproduce it.)






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-22 14:51     ` Eli Zaretskii
@ 2022-07-22 15:06       ` Eli Zaretskii
  2022-07-22 19:25         ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-22 15:06 UTC (permalink / raw)
  To: gregory; +Cc: 56682

> Cc: 56682@debbugs.gnu.org
> Date: Fri, 22 Jul 2022 17:51:33 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> 
> > Date: Fri, 22 Jul 2022 10:16:52 +0000
> > From: Gregory Heytings <gregory@heytings.org>
> > cc: Eli Zaretskii <eliz@gnu.org>
> > 
> > 
> > on my laptop, with Emacs from master from a week ago, take respectively 
> > 150 seconds and 40 seconds.  With the improvements on master, 1 is 
> > instantaneous but 2 still takes about 5 seconds.  Now, with the changes in 
> > the feature/long-lines-and-font-locking branch, both are instantaneous. 
> > The price of that speedup is that some portions of the buffer will be 
> > mis-highlighted, which is unavoidable.
> 
> The assertion below can now be violated:
> 
>       if (it->narrowed_begv)
> 	{
> 	  record_unwind_protect (unwind_narrowed_begv, Fpoint_min ());
> 	  record_unwind_protect (unwind_narrowed_zv, Fpoint_max ());
> 	  SET_BUF_BEGV (current_buffer, it->narrowed_begv);
> 	  SET_BUF_ZV (current_buffer, it->narrowed_zv);
> 	  specbind (Qinhibit_widen, Qt);
> 	}
> 
>       val = Vfontification_functions;
>       specbind (Qfontification_functions, Qnil);
> 
>       eassert (it->end_charpos == ZV);  <<<<<<<<<<<<<<<<<
> 
> because of the "narrowing".  (I actually saw this assertion violation
> once on the branch, but I cannot reproduce it.)

Sorry, it's very easy to reproduce: visit dictionary.json, and then
type C-v several times until it happens.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-22 15:06       ` Eli Zaretskii
@ 2022-07-22 19:25         ` Eli Zaretskii
  0 siblings, 0 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-22 19:25 UTC (permalink / raw)
  To: gregory; +Cc: 56682

> Cc: 56682@debbugs.gnu.org
> Date: Fri, 22 Jul 2022 18:06:44 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> 
> > The assertion below can now be violated:
> > 
> >       if (it->narrowed_begv)
> > 	{
> > 	  record_unwind_protect (unwind_narrowed_begv, Fpoint_min ());
> > 	  record_unwind_protect (unwind_narrowed_zv, Fpoint_max ());
> > 	  SET_BUF_BEGV (current_buffer, it->narrowed_begv);
> > 	  SET_BUF_ZV (current_buffer, it->narrowed_zv);
> > 	  specbind (Qinhibit_widen, Qt);
> > 	}
> > 
> >       val = Vfontification_functions;
> >       specbind (Qfontification_functions, Qnil);
> > 
> >       eassert (it->end_charpos == ZV);  <<<<<<<<<<<<<<<<<
> > 
> > because of the "narrowing".  (I actually saw this assertion violation
> > once on the branch, but I cannot reproduce it.)
> 
> Sorry, it's very easy to reproduce: visit dictionary.json, and then
> type C-v several times until it happens.

I installed a fix.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-21 18:04 ` Eli Zaretskii
  2022-07-22 10:16   ` Gregory Heytings
@ 2022-07-22 23:25   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-07-23  6:41     ` Eli Zaretskii
  2022-07-25 21:23     ` Gregory Heytings
  1 sibling, 2 replies; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-07-22 23:25 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, Gregory Heytings

Before using a blunt tool like the forced-narrowing now in
`feature/long-lines-and-font-locking`, I think we should try and figure
out *why* the recipe below is so slow.

For example, the behavior surprises me because I can't see why `M-x`
should cause any font-locking at all in the `long-line.xml` file.

[ Other side note: IIRC `nxml-mode` uses font-lock and syntax-propertize
  in somewhat unusual ways (it started doing all of that work "by hand"
  in its own way, and only later was it coerced to try and play along
  with that "standard" infrastructure), so the problem may be specific to
  nxml-mode.  ]


        Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-22 10:16   ` Gregory Heytings
  2022-07-22 14:11     ` Eli Zaretskii
  2022-07-22 14:51     ` Eli Zaretskii
@ 2022-07-23  6:10     ` Eli Zaretskii
  2022-07-23  7:07       ` Gerd Möllmann
  2 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-23  6:10 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Stefan Monnier

> Date: Fri, 22 Jul 2022 10:16:52 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: Eli Zaretskii <eliz@gnu.org>
> 
> on my laptop, with Emacs from master from a week ago, take respectively 
> 150 seconds and 40 seconds.  With the improvements on master, 1 is 
> instantaneous but 2 still takes about 5 seconds.  Now, with the changes in 
> the feature/long-lines-and-font-locking branch, both are instantaneous. 
> The price of that speedup is that some portions of the buffer will be 
> mis-highlighted, which is unavoidable.

The branch is much faster, indeed, but still font-lock imposes a
significant penalty on commands that need to redisplay.  For example,
running the following simple benchmark:

  (defun scroll-up-benchmark ()
    (interactive)
    (let ((oldgc gcs-done)
	  (oldtime (float-time)))
      (condition-case nil (while t (scroll-up) (redisplay))
	(error (message "GCs: %d Elapsed time: %f seconds"
			(- gcs-done oldgc) (- (float-time) oldtime))))))

on long-line.xml produces a 15-fold slowdown with font-lock turned on
as compared to its being turned off (203 sec vs 13 sec).

This is an unoptimized build, so you will probably see times that are
4 times faster, but I'd be interested in the relative times on your
system.  Any explanations of the slowdown are also welcome.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-22 23:25   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-07-23  6:41     ` Eli Zaretskii
  2022-07-23 14:07       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-07-25 21:23     ` Gregory Heytings
  1 sibling, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-23  6:41 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, gregory

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Gregory Heytings <gregory@heytings.org>,  56682@debbugs.gnu.org
> Date: Fri, 22 Jul 2022 19:25:47 -0400
> 
> For example, the behavior surprises me because I can't see why `M-x`
> should cause any font-locking at all in the `long-line.xml` file.

Any command that enters the minibuffer causes a thorough redisplay of
the windows on that frame (because more than one window has to be
updated).  What that has to do with font-lock is a separate
question (but you are the best person to answer it).






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23  6:10     ` Eli Zaretskii
@ 2022-07-23  7:07       ` Gerd Möllmann
  2022-07-23  7:12         ` Eli Zaretskii
  2022-07-23  7:18         ` Gerd Möllmann
  0 siblings, 2 replies; 416+ messages in thread
From: Gerd Möllmann @ 2022-07-23  7:07 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, Gregory Heytings, Stefan Monnier

Eli Zaretskii <eliz@gnu.org> writes:

> on long-line.xml produces a 15-fold slowdown with font-lock turned on
> as compared to its being turned off (203 sec vs 13 sec).
>
> This is an unoptimized build, so you will probably see times that are
> 4 times faster, but I'd be interested in the relative times on your
> system.  Any explanations of the slowdown are also welcome.

MacOS 12.5, M1 chip
Head: 792734a6e2cd5558debc8d9fe95d34cb3e809fa4 Improve efficiency of DND
tooltip movement
./configure --with-native-compilation

The long-lines.xml is 313295 bytes.  Hope that's the right one.

Font-lock       Output
------------------------------------------------------------
on              GCs: 14 Elapsed time: 7.880788 seconds
off             GCs: 2 Elapsed time: 0.885791 seconds






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23  7:07       ` Gerd Möllmann
@ 2022-07-23  7:12         ` Eli Zaretskii
  2022-07-23  7:30           ` Gerd Möllmann
  2022-07-23  7:18         ` Gerd Möllmann
  1 sibling, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-23  7:12 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: 56682, gregory, monnier

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: Gregory Heytings <gregory@heytings.org>,  56682@debbugs.gnu.org,  Stefan
>  Monnier <monnier@iro.umontreal.ca>
> Date: Sat, 23 Jul 2022 09:07:53 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > on long-line.xml produces a 15-fold slowdown with font-lock turned on
> > as compared to its being turned off (203 sec vs 13 sec).
> >
> > This is an unoptimized build, so you will probably see times that are
> > 4 times faster, but I'd be interested in the relative times on your
> > system.  Any explanations of the slowdown are also welcome.
> 
> MacOS 12.5, M1 chip
> Head: 792734a6e2cd5558debc8d9fe95d34cb3e809fa4 Improve efficiency of DND
> tooltip movement
> ./configure --with-native-compilation
> 
> The long-lines.xml is 313295 bytes.  Hope that's the right one.
> 
> Font-lock       Output
> ------------------------------------------------------------
> on              GCs: 14 Elapsed time: 7.880788 seconds
> off             GCs: 2 Elapsed time: 0.885791 seconds

Thanks.  This is still an order-of-magnitude slowdown, so the question
about the reasons is still relevant.

One thing we do under font-lock is merging faces, but that's supposed
to be very fast nowadays, given that faces are kept in a hash-table.
And what jit-lock does when the text is already fontified should be
negligible, right?






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23  7:07       ` Gerd Möllmann
  2022-07-23  7:12         ` Eli Zaretskii
@ 2022-07-23  7:18         ` Gerd Möllmann
  2022-07-23  8:00           ` Gerd Möllmann
  1 sibling, 1 reply; 416+ messages in thread
From: Gerd Möllmann @ 2022-07-23  7:18 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, Gregory Heytings, Stefan Monnier

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Font-lock       Output
> ------------------------------------------------------------
> on              GCs: 14 Elapsed time: 7.880788 seconds
  on, and large gc-cons-threshold
                  GCs: 0 Elapsed time: 7.720531 seconds
> off             GCs: 2 Elapsed time: 0.885791 seconds








* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23  7:12         ` Eli Zaretskii
@ 2022-07-23  7:30           ` Gerd Möllmann
  0 siblings, 0 replies; 416+ messages in thread
From: Gerd Möllmann @ 2022-07-23  7:30 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, monnier

Eli Zaretskii <eliz@gnu.org> writes:

> And what jit-lock does when the text is already fontified should be
> negligible, right?

The iterator doesn't even stop and call jit-lock on regions with
'fontified' property, i.e. after font-lock.  I can't imagine ATM why
someone would have changed that.
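
To check this on a given buffer, one can look for text whose
`fontified' property is still nil.  A minimal sketch using the
standard `text-property-any'; the function name is hypothetical and
not something used in the thread:

```elisp
;; Hypothetical helper, for illustration only.
(defun my/first-unfontified-position (&optional beg end)
  "Return the first position in BEG..END whose `fontified' property is nil.
BEG and END default to the accessible portion of the buffer.
Return nil if all of that text has a non-nil `fontified' property."
  (text-property-any (or beg (point-min))
                     (or end (point-max))
                     'fontified nil))
```

If this returns nil for the text being displayed, there should be no
position left where the iterator would need to stop for jit-lock.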






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23  7:18         ` Gerd Möllmann
@ 2022-07-23  8:00           ` Gerd Möllmann
  2022-07-23  8:04             ` Gerd Möllmann
  0 siblings, 1 reply; 416+ messages in thread
From: Gerd Möllmann @ 2022-07-23  8:00 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, Gregory Heytings, Stefan Monnier

[-- Attachment #1: Type: text/plain, Size: 636 bytes --]

> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>
>> Font-lock       Output
>> ------------------------------------------------------------
>> on              GCs: 14 Elapsed time: 7.880788 seconds
>   on, and large gc-cons-threshold
>                   GCs: 0 Elapsed time: 7.720531 seconds
>> off             GCs: 2 Elapsed time: 0.885791 seconds

I've profiled this.  The result looks sensible, but I have to add that I
had mixed results with profiling optimized builds, so maybe...  Just
saying.

Sorry for posting an image.  I haven't found a way to get that info from
Instruments as text (pointers welcome).


[-- Attachment #2: Instruments profile --]
[-- Type: image/png, Size: 209471 bytes --]


* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23  8:00           ` Gerd Möllmann
@ 2022-07-23  8:04             ` Gerd Möllmann
  2022-07-23  8:11               ` Gerd Möllmann
  0 siblings, 1 reply; 416+ messages in thread
From: Gerd Möllmann @ 2022-07-23  8:04 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, Gregory Heytings, Stefan Monnier

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

>> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>>
>>> Font-lock       Output
>>> ------------------------------------------------------------
>>> on              GCs: 14 Elapsed time: 7.880788 seconds
>>   on, and large gc-cons-threshold
>>                  GCs: 0 Elapsed time: 7.720531 seconds
    on, without scroll-bar
                    GCs: 15 Elapsed time: 5.574946 seconds
>>> off             GCs: 2 Elapsed time: 0.885791 seconds







* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23  8:04             ` Gerd Möllmann
@ 2022-07-23  8:11               ` Gerd Möllmann
  2022-07-23 13:42                 ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Gerd Möllmann @ 2022-07-23  8:11 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, Gregory Heytings, Stefan Monnier

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>
>>> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
>>>
>>>> Font-lock       Output
>>>> ------------------------------------------------------------
>>>> on              GCs: 14 Elapsed time: 7.880788 seconds
>>>   on, and large gc-cons-threshold
>>>                  GCs: 0 Elapsed time: 7.720531 seconds
>     on, without scroll-bar
>                    GCs: 15 Elapsed time: 5.574946 seconds
      on without scroll-bar, auto-composition-mode off
                     GCs: 15 Elapsed time: 4.166101 seconds
>>>> off             GCs: 2 Elapsed time: 0.885791 seconds

Hm.
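
The variants in the table above can be approximated along the
following lines.  This is a sketch under assumptions:
`my/bench-variant' is a hypothetical helper, and the exact settings
used for the measurements are not given in the thread.

```elisp
;; Hypothetical helper, for illustration only.
(defun my/bench-variant (thunk &optional big-gc)
  "Call THUNK and return a list (GCS ELAPSED-SECONDS).
With BIG-GC non-nil, bind `gc-cons-threshold' to a huge value so that
garbage collection is effectively suppressed during the call."
  (let ((gc-cons-threshold (if big-gc
                               most-positive-fixnum
                             gc-cons-threshold))
        (oldgc gcs-done)
        (oldtime (float-time)))
    (funcall thunk)
    (list (- gcs-done oldgc) (- (float-time) oldtime))))

;; Possible setup for the other rows (both are standard minor modes):
;;   (scroll-bar-mode -1)         ; "without scroll-bar"
;;   (auto-composition-mode -1)   ; "auto-composition-mode off", buffer-local
```

For example, (my/bench-variant #'scroll-up-benchmark t) would run the
benchmark posted earlier with a large gc-cons-threshold; that function
also reports its own GC count and elapsed time via `message'.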






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23  8:11               ` Gerd Möllmann
@ 2022-07-23 13:42                 ` Eli Zaretskii
  2022-07-23 14:25                   ` Gerd Möllmann
                                     ` (3 more replies)
  0 siblings, 4 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-23 13:42 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: 56682, gregory, monnier

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: 56682@debbugs.gnu.org,  Gregory Heytings <gregory@heytings.org>,  Stefan
>  Monnier <monnier@iro.umontreal.ca>
> Date: Sat, 23 Jul 2022 10:11:04 +0200
> 
> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
> 
> > Gerd Möllmann <gerd.moellmann@gmail.com> writes:
> >
> >>> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
> >>>
> >>>> Font-lock       Output
> >>>> ------------------------------------------------------------
> >>>> on              GCs: 14 Elapsed time: 7.880788 seconds
> >>>   on, and large gc-cons-threshold
> >>>                  GCs: 0 Elapsed time: 7.720531 seconds
> >     on, without scroll-bar
> >                    GCs: 15 Elapsed time: 5.574946 seconds
>       on without scroll-bar, auto-composition-mode off
>                      GCs: 15 Elapsed time: 4.166101 seconds
> >>>> off             GCs: 2 Elapsed time: 0.885791 seconds
> 
> Hm.

scroll-bar could be an issue only if we enter the new code that
doesn't use window_end_pos if it cannot be trusted.  Perhaps when
long-line optimizations are turned on for a buffer, we shouldn't try
to be holier than the Pope, and should use window_end_pos even if not
reliable?

And I've found a serious sink of CPU cycles under truncate-lines, and
installed a fix on the feature branch.  Gerd, if you have time to
eyeball the fix and comment on it, I'd appreciate it.  It's commit
350e97d on the branch.  (I can post a more detailed explanation of
what I did and why, if that would help, because the code and the
functions it calls are somewhat tricky.)

After these changes, display of very long lines is quite reasonable,
when truncate-lines or truncate-partial-width-windows is in effect,
even without turning off font-lock and even in an unoptimized build.
Amusingly enough, show-paren-mode is now a serious performance killer
in these cases, because it puts overlays on buffer text, and that
disables a shortcut in the code that finds the next visible line
start, which is called a lot when lines are truncated.  Maybe we need
a smarter optimization there, one that doesn't immediately give up as
soon as it sees an overlay.  Hmm...
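
Whether overlays are in the way can be observed from Lisp.  A minimal
sketch, assuming that the Lisp function `next-overlay-change' reflects
what the display code checks (it returns `point-max' when no overlay
boundary follows the given position); the helper name is hypothetical:

```elisp
;; Hypothetical helper, for illustration only.
(defun my/overlay-free-from-p (pos)
  "Return non-nil if no overlay starts or ends between POS and `point-max'."
  (= (next-overlay-change pos) (point-max)))
```

With show-paren-mode highlighting a pair of parentheses after POS,
this would be expected to return nil, which corresponds to the case
where the shortcut is not taken.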






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23  6:41     ` Eli Zaretskii
@ 2022-07-23 14:07       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-07-23 15:29         ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-07-23 14:07 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory

>> From: Stefan Monnier <monnier@iro.umontreal.ca>
>> Cc: Gregory Heytings <gregory@heytings.org>,  56682@debbugs.gnu.org
>> Date: Fri, 22 Jul 2022 19:25:47 -0400
>> 
>> For example, the behavior surprises me because I can't see why `M-x`
>> should cause any font-locking at all in the `long-line.xml` file.
>
> Any command that enters the minibuffer causes a thorough redisplay of
> the windows on that frame (because more than one window has to be
> updated).  What that has to do with font-lock is a separate
> question (but you are the best person to answer it).

AFAIK if the buffer has not been modified (including things like
changing `window-start` or `point`), then a redisplay will just not
run jit-lock (and hence font-lock) at all, no matter how thorough.
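
One way to test this is to count how often jit-lock is invoked around
a command.  A minimal sketch; the variable and function names are
hypothetical, while `jit-lock-function' is the function that jit-lock
adds to `fontification-functions':

```elisp
;; Hypothetical names, for illustration only.
(defvar my/jit-lock-calls 0
  "Number of calls to `jit-lock-function' since the last reset.")

(defun my/count-jit-lock-call (&rest _args)
  "Increment `my/jit-lock-calls'; meant to be used as :before advice."
  (setq my/jit-lock-calls (1+ my/jit-lock-calls)))

(advice-add 'jit-lock-function :before #'my/count-jit-lock-call)

;; To stop counting:
;;   (advice-remove 'jit-lock-function #'my/count-jit-lock-call)
```

Setting `my/jit-lock-calls' to 0, typing `M-x' followed by `C-g', and
then inspecting the counter should show whether entering the
minibuffer triggered any fontification requests.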


        Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23 13:42                 ` Eli Zaretskii
@ 2022-07-23 14:25                   ` Gerd Möllmann
  2022-07-23 14:33                     ` Gerd Möllmann
                                       ` (2 more replies)
  2022-07-23 14:47                   ` Eli Zaretskii
                                     ` (2 subsequent siblings)
  3 siblings, 3 replies; 416+ messages in thread
From: Gerd Möllmann @ 2022-07-23 14:25 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, monnier

Eli Zaretskii <eliz@gnu.org> writes:

> And I've found a serious sink of CPU cycles under truncate-lines, and
> installed a fix on the feature branch.  Gerd, if you have time to
> eyeball the fix and comment on it, I'd appreciate it.  It's commit
> 350e97d on the branch.  (I can post a more detailed explanation of
> what I did and why, if that would help, because the code and the
> functions it calls are somewhat tricky.)

I'll look at it and come back.

BTW, does feature/long-lines-and-font-locking build for you?  I'm getting

In toplevel form:
cedet/semantic/symref/list.el:35:2: Error: Wrong type argument: number-or-marker-p, nil
make[2]: *** [cedet/semantic/symref/list.elc] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [compile-main] Error 2
make: *** [lisp] Error 2

Configured --with-native-compilation.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23 14:25                   ` Gerd Möllmann
@ 2022-07-23 14:33                     ` Gerd Möllmann
  2022-07-23 15:43                       ` Eli Zaretskii
  2022-07-23 14:35                     ` Visuwesh
  2022-07-23 14:35                     ` Gerd Möllmann
  2 siblings, 1 reply; 416+ messages in thread
From: Gerd Möllmann @ 2022-07-23 14:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, monnier

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Eli Zaretskii <eliz@gnu.org> writes:
>
>> And I've found a serious sink of CPU cycles under truncate-lines, and
>> installed a fix on the feature branch.  Gerd, if you have time to
>> eyeball the fix and comment on it, I'd appreciate it.  It's commit
>> 350e97d on the branch.  (I can post a more detailed explanation of
>> what I did and why, if that would help, because the code and the
>> functions it calls are somewhat tricky.)
>
> I'll look at it and come back.

modified   src/xdisp.c
@@ -7153,10 +7153,10 @@ forward_to_next_line_start (struct it *it, bool *skipped_p,
 	  || ((pos = Fnext_single_property_change (make_fixnum (start),
 						   Qdisplay, Qnil,
 						   make_fixnum (limit)),
-	       NILP (pos))
+	       (NILP (pos) || XFIXNAT (pos) == limit))
 	      && next_overlay_change (start) == ZV))
 	{
-	  if (!it->bidi_p)
+	  if (!it->bidi_p || !bidi_it_prev)
 	    {
 	      IT_CHARPOS (*it) = limit;
 	      IT_BYTEPOS (*it) = bytepos;

I understand the first diff, which makes a lot of sense, but I'm afraid
I don't know enough about bidi to be of any help.
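
The first hunk matches behavior that is visible from Lisp: when a
LIMIT is given and the property does not change before it,
`next-single-property-change' returns LIMIT rather than nil, so a test
for nil alone misses that case.  A minimal sketch (the function name
is hypothetical):

```elisp
;; Hypothetical helper, for illustration only.
(defun my/display-prop-unchanged-p (start limit)
  "Return non-nil if the `display' property does not change in START..LIMIT.
Mirrors the patched test: `next-single-property-change' returns either
nil or LIMIT itself when nothing changes before LIMIT."
  (let ((pos (next-single-property-change start 'display nil limit)))
    (or (null pos) (= pos limit))))
```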






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23 14:25                   ` Gerd Möllmann
  2022-07-23 14:33                     ` Gerd Möllmann
@ 2022-07-23 14:35                     ` Visuwesh
  2022-07-23 14:46                       ` Gerd Möllmann
  2022-07-23 14:35                     ` Gerd Möllmann
  2 siblings, 1 reply; 416+ messages in thread
From: Visuwesh @ 2022-07-23 14:35 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: 56682, Eli Zaretskii, gregory, monnier

[சனி ஜூலை 23, 2022] Gerd Möllmann wrote:

> Eli Zaretskii <eliz@gnu.org> writes:
>
>> And I've found a serious sink of CPU cycles under truncate-lines, and
>> installed a fix on the feature branch.  Gerd, if you have time to
>> eyeball the fix and comment on it, I'd appreciate it.  It's commit
>> 350e97d on the branch.  (I can post a more detailed explanation of
>> what I did and why, if that would help, because the code and the
>> functions it calls are somewhat tricky.)
>
> I'll look at it and come back.
>
> BTW, does feature/long-lines-and-font-locking build for you?  I'm getting
>
> In toplevel form:
> cedet/semantic/symref/list.el:35:2: Error: Wrong type argument: number-or-marker-p, nil
> make[2]: *** [cedet/semantic/symref/list.elc] Error 1
> make[2]: *** Waiting for unfinished jobs....
> make[1]: *** [compile-main] Error 2
> make: *** [lisp] Error 2
>
> Configured --with-native-compilation.

I think Eli fixed it in the master branch in commit 4a4fcf628e1e4c8db47cd62fa5617b662fa8b5d6.

https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=4a4fcf628e1e4c8db47cd62fa5617b662fa8b5d6






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23 14:25                   ` Gerd Möllmann
  2022-07-23 14:33                     ` Gerd Möllmann
  2022-07-23 14:35                     ` Visuwesh
@ 2022-07-23 14:35                     ` Gerd Möllmann
  2 siblings, 0 replies; 416+ messages in thread
From: Gerd Möllmann @ 2022-07-23 14:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, monnier

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> BTW, does feature/long-lines-and-font-locking build for you?  I'm getting
>
> In toplevel form:
> cedet/semantic/symref/list.el:35:2: Error: Wrong type argument: number-or-marker-p, nil
> make[2]: *** [cedet/semantic/symref/list.elc] Error 1
> make[2]: *** Waiting for unfinished jobs....
> make[1]: *** [compile-main] Error 2
> make: *** [lisp] Error 2
>
> Configured --with-native-compilation.

Same without native compilation.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23 14:35                     ` Visuwesh
@ 2022-07-23 14:46                       ` Gerd Möllmann
  2022-07-23 15:01                         ` Gerd Möllmann
  0 siblings, 1 reply; 416+ messages in thread
From: Gerd Möllmann @ 2022-07-23 14:46 UTC (permalink / raw)
  To: Visuwesh; +Cc: 56682, Eli Zaretskii, gregory, monnier

Visuwesh <visuweshm@gmail.com> writes:

> I think Eli fixed it in the master branch in commit 4a4fcf628e1e4c8db47cd62fa5617b662fa8b5d6.

Yup that worked.  Thanks!






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23 13:42                 ` Eli Zaretskii
  2022-07-23 14:25                   ` Gerd Möllmann
@ 2022-07-23 14:47                   ` Eli Zaretskii
  2022-07-23 15:04                   ` Gerd Möllmann
  2022-07-25 21:47                   ` Gregory Heytings
  3 siblings, 0 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-23 14:47 UTC (permalink / raw)
  To: gregory; +Cc: gerd.moellmann, 56682, monnier

Another observation: any command that calls current_column, directly
or indirectly, will be very slow with long lines.  This includes
"C-x =" and column-number-mode, for example.

I installed a change to avoid calling current_column where possible,
but the problem is we have a bytecode which calls it, and it is
impossible to avoid the call because it could break some callers.
Perhaps we should expose the long-line-optimization flag to Lisp, so
that commands like what-cursor-position could refrain from calling
current-column.
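
To illustrate why computing the column is expensive on a very long line,
here is a minimal sketch in C.  It is not the Emacs implementation: the
names toy_buffer, toy_current_column, toy_column_for_display and the
long_line_optimizations field are hypothetical, and multibyte or wide
characters are ignored.  The point is only that the scan is linear in the
distance from the line start, and that a caller could skip it when a
per-buffer flag says the buffer has long lines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer; not an Emacs structure.  */
struct toy_buffer
{
  const char *text;
  size_t size;
  bool long_line_optimizations;	/* hypothetical flag name */
};

/* Return the column of POS, expanding tabs to TAB_WIDTH.
   The cost is linear in the distance from the line start to POS.  */
static size_t
toy_current_column (const struct toy_buffer *b, size_t pos, size_t tab_width)
{
  assert (pos <= b->size && tab_width > 0);

  /* Scan backward to the beginning of the line.  */
  size_t bol = pos;
  while (bol > 0 && b->text[bol - 1] != '\n')
    bol--;

  /* Scan forward again, accumulating the column.  */
  size_t col = 0;
  for (size_t i = bol; i < pos; i++)
    {
      if (b->text[i] == '\t')
	col += tab_width - col % tab_width;
      else
	col++;
    }
  return col;
}

/* Store the column in *COL and return true only when computing it is
   assumed to be cheap; return false so the caller can omit the column.  */
static bool
toy_column_for_display (const struct toy_buffer *b, size_t pos, size_t *col)
{
  if (b->long_line_optimizations)
    return false;
  *col = toy_current_column (b, pos, 8);
  return true;
}
```

A command in the style of what-cursor-position could call the guarded
variant and simply leave the column out of its message when it returns
false.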






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23 14:46                       ` Gerd Möllmann
@ 2022-07-23 15:01                         ` Gerd Möllmann
  2022-07-23 16:02                           ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Gerd Möllmann @ 2022-07-23 15:01 UTC (permalink / raw)
  To: Visuwesh; +Cc: 56682, Eli Zaretskii, gregory, monnier

And the new lottery numbers on the feature branch are:

Font-lock Scroll-bar Composition Output
------------------------------------------------------------
on	  on	     on		 GCs: 14 Elapsed time: 7.276626 seconds
on	  off	     on		 GCs: 1 Elapsed time: 5.363002 seconds
on	  off	     off	 GCs: 1 Elapsed time: 3.967520 seconds






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23 13:42                 ` Eli Zaretskii
  2022-07-23 14:25                   ` Gerd Möllmann
  2022-07-23 14:47                   ` Eli Zaretskii
@ 2022-07-23 15:04                   ` Gerd Möllmann
  2022-07-23 16:03                     ` Eli Zaretskii
  2022-07-25 21:47                   ` Gregory Heytings
  3 siblings, 1 reply; 416+ messages in thread
From: Gerd Möllmann @ 2022-07-23 15:04 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, monnier

Eli Zaretskii <eliz@gnu.org> writes:

> Perhaps when
> long-line optimizations are turned on for a buffer, we shouldn't try
> to be holier than the Pope, and should use window_end_pos even if not
> reliable?

I don't think anyone would complain.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23 14:07       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-07-23 15:29         ` Eli Zaretskii
  2022-07-23 15:46           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-23 15:29 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, gregory

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: gregory@heytings.org,  56682@debbugs.gnu.org
> Date: Sat, 23 Jul 2022 10:07:29 -0400
> 
> >> From: Stefan Monnier <monnier@iro.umontreal.ca>
> >> Cc: Gregory Heytings <gregory@heytings.org>,  56682@debbugs.gnu.org
> >> Date: Fri, 22 Jul 2022 19:25:47 -0400
> >> 
> >> For example, the behavior surprises me because I can't see why `M-x`
> >> should cause any font-locking at all in the `long-line.xml` file.
> >
> > Any command that enters the minibuffer causes a thorough redisplay of
> > the windows on that frame (because more than one window has to be
> > updated).  What that has to do with font-lock is a separate
> > question (but you are the best person to answer it).
> 
> AFAIK if the buffer has not been modified (including things like
> changing `window-start` or `point`), then a redisplay will just not
> run jit-lock (and hence font-lock) at all, no matter how thorough.

But the fact is without font-lock the response is faster by a large
factor.  So something, somewhere, still depends on font-lock.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23 14:33                     ` Gerd Möllmann
@ 2022-07-23 15:43                       ` Eli Zaretskii
  0 siblings, 0 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-23 15:43 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: 56682, gregory, monnier

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: 56682@debbugs.gnu.org,  gregory@heytings.org,  monnier@iro.umontreal.ca
> Date: Sat, 23 Jul 2022 16:33:28 +0200
> 
> Gerd Möllmann <gerd.moellmann@gmail.com> writes:
> 
> > Eli Zaretskii <eliz@gnu.org> writes:
> >
> >> And I've found a serious sink of CPU cycles under truncate-lines, and
> >> installed a fix on the feature branch.  Gerd, if you have time to
> >> eyeball the fix and comment on it, I'd appreciate it.  It's commit
> >> 350e97d on the branch.  (I can post a more detailed explanation of
> >> what I did and why, if that would help, because the code and the
> >> functions it calls are somewhat tricky.)
> >
> > I'll look at it and come back.
> 
> modified   src/xdisp.c
> @@ -7153,10 +7153,10 @@ forward_to_next_line_start (struct it *it, bool *skipped_p,
>  	  || ((pos = Fnext_single_property_change (make_fixnum (start),
>  						   Qdisplay, Qnil,
>  						   make_fixnum (limit)),
> -	       NILP (pos))
> +	       (NILP (pos) || XFIXNAT (pos) == limit))
>  	      && next_overlay_change (start) == ZV))
>  	{
> -	  if (!it->bidi_p)
> +	  if (!it->bidi_p || !bidi_it_prev)
>  	    {
>  	      IT_CHARPOS (*it) = limit;
>  	      IT_BYTEPOS (*it) = bytepos;
> 
> I understand the first diff, which makes a lot of sense

Thanks, this was the main part.  From what I see, this code never
really worked correctly, since Emacs 21, because
Fnext_single_property_change can never return nil when called like
that.  IOW, we never took the shortcut there, and probably didn't
notice it because 500-character lines are rare.
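
The effect of the first hunk can be shown with a small toy model in C.
This is only a sketch under stated assumptions, not the xdisp.c code:
toy_next_change, toy_can_skip_old, toy_can_skip_new and TOY_NIL are
hypothetical names, and the property is modeled as a plain array of
values.  What it mirrors is the documented behavior that a
property-change search given a limit returns that limit when nothing
changes before it, so a test for "nil" alone never succeeds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NIL (-1L)		/* hypothetical stand-in for nil */

/* Return the first position after START and below LIMIT where the
   property value differs from the one at START, or LIMIT if there is
   no such position.  With a limit given, TOY_NIL is never returned.  */
static long
toy_next_change (const int *prop, long start, long limit)
{
  assert (start >= 0 && start < limit);
  for (long pos = start + 1; pos < limit; pos++)
    if (prop[pos] != prop[start])
      return pos;
  return limit;
}

/* Old test: only "nil" counts as "no change", so the shortcut is
   never taken when the search was called with a limit.  */
static bool
toy_can_skip_old (long pos)
{
  return pos == TOY_NIL;
}

/* New test: reaching the limit also means "no change before LIMIT".  */
static bool
toy_can_skip_new (long pos, long limit)
{
  return pos == TOY_NIL || pos == limit;
}
```

With a property array that never changes, the search returns the limit,
the old test refuses the shortcut and the new one accepts it.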

> but I'm afraid I don't know enough about bidi to be of any help.

The idea is that since the value of BIDI_IT_PREV is never used by the
caller when ON_NEWLINE_P is zero, we don't need to compute it when
ON_NEWLINE_P is zero, something that would have precluded us from
taking the shortcut.  Not taking the shortcut is very expensive when
we have very long lines, of course.

Btw, one quirk of the code in forward_to_next_line_start (which took
me some time to take in and convince myself it's okay) is that it
returns non-zero when it didn't find any more newlines till ZV.  This
is not very clean, but I guess we don't care in that case, since the
iterator will be moved to ZV, and from there there's not much we need
to do...






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23 15:29         ` Eli Zaretskii
@ 2022-07-23 15:46           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-07-23 16:15             ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-07-23 15:46 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory

>> AFAIK if the buffer has not been modified (including things like
>> changing `window-start` or `point`), then a redisplay will just not
>> run jit-lock (and hence font-lock) at all, no matter how thorough.
> But the fact is without font-lock the response is faster by a large
> factor.  So something, somewhere, still depends on font-lock.

Yes, that's the part that we need to explore.
Maybe font-lock *is* run somehow?
Or maybe it's just the mere presence of text properties?  (Or overlays?)


        Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23 15:01                         ` Gerd Möllmann
@ 2022-07-23 16:02                           ` Eli Zaretskii
  2022-07-23 17:23                             ` Gerd Möllmann
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-23 16:02 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: 56682, gregory, monnier, visuweshm

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: 56682@debbugs.gnu.org,  Eli Zaretskii <eliz@gnu.org>,
>   gregory@heytings.org,  monnier@iro.umontreal.ca
> Date: Sat, 23 Jul 2022 17:01:02 +0200
> 
> And the new lottery numbers on the feature branch are:
> 
> Font-lock Scroll-bar Composition Output
> ------------------------------------------------------------
> on	  on	     on		 GCs: 14 Elapsed time: 7.276626 seconds
> on	  off	     on		 GCs: 1 Elapsed time: 5.363002 seconds
> on	  off	     off	 GCs: 1 Elapsed time: 3.967520 seconds

Thanks.  How about now, after I installed the last change?






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23 15:04                   ` Gerd Möllmann
@ 2022-07-23 16:03                     ` Eli Zaretskii
  0 siblings, 0 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-23 16:03 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: 56682, gregory, monnier

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: 56682@debbugs.gnu.org,  gregory@heytings.org,  monnier@iro.umontreal.ca
> Date: Sat, 23 Jul 2022 17:04:39 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > Perhaps when
> > long-line optimizations are turned on for a buffer, we shouldn't try
> > to be holier than the Pope, and should use window_end_pos even if not
> > reliable?
> 
> I don't think anyone would complain.

I agree, and so I've now done so on the branch.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23 15:46           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-07-23 16:15             ` Eli Zaretskii
  2022-07-23 16:19               ` Eli Zaretskii
  2022-07-23 19:05               ` Gregory Heytings
  0 siblings, 2 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-23 16:15 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, gregory

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: gregory@heytings.org,  56682@debbugs.gnu.org
> Date: Sat, 23 Jul 2022 11:46:37 -0400
> 
> >> AFAIK if the buffer has not been modified (including things like
> >> changing `window-start` or `point`), then a redisplay will just not
> >> run jit-lock (and hence font-lock) at all, no matter how thorough.
> > But the fact is without font-lock the response is faster by a large
> > factor.  So something, somewhere, still depends on font-lock.
> 
> Yes, that's the part that we need to explore.
> Maybe font-lock *is* run somehow?
> Or maybe it's just the mere presence of text properties?  (Or overlays?)

My bet is indeed on the mere presence of text properties, plus the
fact that we need to merge faces.  But I could well be wrong.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23 16:15             ` Eli Zaretskii
@ 2022-07-23 16:19               ` Eli Zaretskii
  2022-07-24  5:50                 ` Gerd Möllmann
  2022-07-23 19:05               ` Gregory Heytings
  1 sibling, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-23 16:19 UTC (permalink / raw)
  To: monnier; +Cc: 56682, gregory

> Cc: 56682@debbugs.gnu.org, gregory@heytings.org
> Date: Sat, 23 Jul 2022 19:15:26 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> 
> > From: Stefan Monnier <monnier@iro.umontreal.ca>
> > Cc: gregory@heytings.org,  56682@debbugs.gnu.org
> > Date: Sat, 23 Jul 2022 11:46:37 -0400
> > 
> > >> AFAIK if the buffer has not been modified (including things like
> > >> changing `window-start` or `point`), then a redisplay will just not
> > >> run jit-lock (and hence font-lock) at all, no matter how thorough.
> > > But the fact is without font-lock the response is faster by a large
> > > factor.  So something, somewhere, still depends on font-lock.
> > 
> > Yes, that's the part that we need to explore.
> > Maybe font-lock *is* run somehow?
> > Or maybe it's just the mere presence of text properties?  (Or overlays?)
> 
> My bet is indeed on the mere presence of text properties, plus the
> fact that we need to merge faces.  But I could well be wrong.

Btw, I think the best tool for determining this is run-time profiling,
such as with perf on GNU/Linux.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23 16:02                           ` Eli Zaretskii
@ 2022-07-23 17:23                             ` Gerd Möllmann
  2022-07-23 17:44                               ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Gerd Möllmann @ 2022-07-23 17:23 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, monnier, visuweshm

Eli Zaretskii <eliz@gnu.org> writes:

>> Font-lock Scroll-bar Composition Output
>> ------------------------------------------------------------
>> on	  on	     on		 GCs: 14 Elapsed time: 7.276626 seconds
>> on	  off	     on		 GCs: 1 Elapsed time: 5.363002 seconds
>> on	  off	     off	 GCs: 1 Elapsed time: 3.967520 seconds
>
> Thanks.  How about now, after I installed the last change?

Font-lock Scroll-bar Composition Output
------------------------------------------------------------
on	  on	     on		 GCs: 16 Elapsed time: 5.496764 seconds
on	  off	     on		 GCs: 1 Elapsed time: 5.362916 seconds
on	  off	     off	 GCs: 1 Elapsed time: 3.947306 seconds

That's with
280b8c96cc origin/feature/long-lines-and-font-locking Improve display of columns on mode-lin






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23 17:23                             ` Gerd Möllmann
@ 2022-07-23 17:44                               ` Eli Zaretskii
  2022-07-23 17:49                                 ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-23 17:44 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: 56682, gregory, monnier, visuweshm

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: 56682@debbugs.gnu.org,  gregory@heytings.org,  monnier@iro.umontreal.ca,
>   visuweshm@gmail.com
> Date: Sat, 23 Jul 2022 19:23:40 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> Font-lock Scroll-bar Composition Output
> >> ------------------------------------------------------------
> >> on	  on	     on		 GCs: 14 Elapsed time: 7.276626 seconds
> >> on	  off	     on		 GCs: 1 Elapsed time: 5.363002 seconds
> >> on	  off	     off	 GCs: 1 Elapsed time: 3.967520 seconds
> >
> > Thanks.  How about now, after I installed the last change?
> 
> Font-lock Scroll-bar Composition Output
> ------------------------------------------------------------
> on	  on	     on		 GCs: 16 Elapsed time: 5.496764 seconds
> on	  off	     on		 GCs: 1 Elapsed time: 5.362916 seconds
> on	  off	     off	 GCs: 1 Elapsed time: 3.947306 seconds
> 
> That's with
> 280b8c96cc origin/feature/long-lines-and-font-locking Improve display of columns on mode-lin

Thanks.  So the scroll-bar effect is largely gone, and font-lock is
now just 25% slower than no-font-lock.  Which I think is reasonable,
given that there's a face change every 10 to 20 characters?






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23 17:44                               ` Eli Zaretskii
@ 2022-07-23 17:49                                 ` Eli Zaretskii
  2022-07-23 17:59                                   ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-23 17:49 UTC (permalink / raw)
  To: gerd.moellmann; +Cc: 56682, gregory, monnier, visuweshm

> Cc: 56682@debbugs.gnu.org, gregory@heytings.org, monnier@iro.umontreal.ca,
>  visuweshm@gmail.com
> Date: Sat, 23 Jul 2022 20:44:43 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> 
> > Font-lock Scroll-bar Composition Output
> > ------------------------------------------------------------
> > on	  on	     on		 GCs: 16 Elapsed time: 5.496764 seconds
> > on	  off	     on		 GCs: 1 Elapsed time: 5.362916 seconds
> > on	  off	     off	 GCs: 1 Elapsed time: 3.947306 seconds
> > 
> > That's with
> > 280b8c96cc origin/feature/long-lines-and-font-locking Improve display of columns on mode-lin
> 
> Thanks.  So the scroll-bar effect is largely gone, and font-lock is
> now just 25% slower than no-font-lock.  Which I think is reasonable,
> given that there's a face change every 10 to 20 characters?

Oops, I was confused: font-lock OFF is not in the table, so with it
we're still 4 to 5 times slower than without it.  It's
auto-composition mode that costs us 25% slowdown.
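
For reference, the size of the auto-composition effect depends on which
run is taken as the baseline.  The sketch below uses hypothetical helper
names (toy_share_of_total, toy_relative_slowdown); fed with the two
scroll-bar-off timings from the table above (5.362916 s with composition,
3.947306 s without), the first yields roughly a quarter of the total run
time and the second roughly a third on top of the faster run.

```c
#include <assert.h>

/* Fraction of the slower run's time that is attributable to the
   feature: (with - without) / with.  */
static double
toy_share_of_total (double with_feature, double without_feature)
{
  assert (with_feature > 0.0);
  return (with_feature - without_feature) / with_feature;
}

/* Slowdown relative to the faster run: (with - without) / without.  */
static double
toy_relative_slowdown (double with_feature, double without_feature)
{
  assert (without_feature > 0.0);
  return (with_feature - without_feature) / without_feature;
}
```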






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23 17:49                                 ` Eli Zaretskii
@ 2022-07-23 17:59                                   ` Eli Zaretskii
  2022-07-24  6:16                                     ` Gerd Möllmann
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-23 17:59 UTC (permalink / raw)
  To: gerd.moellmann, 56682; +Cc: gregory, monnier, visuweshm

> Cc: 56682@debbugs.gnu.org, gregory@heytings.org, monnier@iro.umontreal.ca,
>  visuweshm@gmail.com
> Date: Sat, 23 Jul 2022 20:49:39 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> 
> > > Font-lock Scroll-bar Composition Output
> > > ------------------------------------------------------------
> > > on	  on	     on		 GCs: 16 Elapsed time: 5.496764 seconds
> > > on	  off	     on		 GCs: 1 Elapsed time: 5.362916 seconds
> > > on	  off	     off	 GCs: 1 Elapsed time: 3.947306 seconds
> > > 
> > > That's with
> > > 280b8c96cc origin/feature/long-lines-and-font-locking Improve display of columns on mode-lin
> > 
> > Thanks.  So the scroll-bar effect is largely gone, and font-lock is
> > now just 25% slower than no-font-lock.  Which I think is reasonable,
> > given that there's a face change every 10 to 20 characters?
> 
> Oops, I was confused: font-lock OFF is not in the table, so with it
> we're still 4 to 5 times slower than without it.  It's
> auto-composition mode that costs us 25% slowdown.

Which seems to be similar to the slowdown due to font-lock in other cases?
For example, scrolling with the same benchmark through xdisp.c takes
190 sec for the first time and 40 for the second time (when everything
is already fontified); whereas without font-lock it takes 20.  So it
sounds like font-lock generally slows down redisplay by such small
factors?






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23 16:15             ` Eli Zaretskii
  2022-07-23 16:19               ` Eli Zaretskii
@ 2022-07-23 19:05               ` Gregory Heytings
  2022-07-23 19:12                 ` Eli Zaretskii
  1 sibling, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-23 19:05 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, monnier


>
> My bet is indeed on the mere presence of text properties, plus the fact 
> that we need to merge faces.
>

That's precisely what I've been investigating during the last day.  I 
probably need two or three more days to reach a conclusion, so please tell 
me if you're doing that in parallel.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23 19:05               ` Gregory Heytings
@ 2022-07-23 19:12                 ` Eli Zaretskii
  2022-07-23 19:21                   ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-23 19:12 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, monnier

> Date: Sat, 23 Jul 2022 19:05:54 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: Stefan Monnier <monnier@iro.umontreal.ca>, bug-gnu-emacs@gnu.org
> 
> > My bet is indeed on the mere presence of text properties, plus the fact 
> > that we need to merge faces.
> 
> That's precisely what I've been investigating during the last day.  I 
> probably need two or three more days to reach a conclusion, so please tell 
> me if you're doing that in parallel.

I'm not.  I did the measurements, and so did Gerd, but I'm not looking
into the font-lock influence any deeper than that.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23 19:12                 ` Eli Zaretskii
@ 2022-07-23 19:21                   ` Gregory Heytings
  0 siblings, 0 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-07-23 19:21 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, monnier



>> That's precisely what I've been investigating during the last day.  I 
>> probably need two or three more days to reach a conclusion, so please 
>> tell me if you're doing that in parallel.
>
> I'm not.  I did the measurements, and so did Gerd, but I'm not looking 
> into the font-lock influence any deeper than that.
>

Okay, thanks; I can continue without fearing that I might be wasting my 
time 😉


* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23 16:19               ` Eli Zaretskii
@ 2022-07-24  5:50                 ` Gerd Möllmann
  2022-07-24 14:35                   ` Dmitry Gutov
  0 siblings, 1 reply; 416+ messages in thread
From: Gerd Möllmann @ 2022-07-24  5:50 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, monnier

Eli Zaretskii <eliz@gnu.org> writes:

>> My bet is indeed on the mere presence of text properties, plus the
>> fact that we need to merge faces.  But I could well be wrong.

Can't say anything about face merging, but "frequent" changes of faces
certainly have an effect on iterator performance.  It stops, looks up
properties again to determine the next stop pos, does what has to be
done for current properties...

> Btw, I think the best tool for determining this is run-time profiling,
> such as with perf on GNU/Linux.

Yes, I don't think there is anything comparable on macOS.  Or I simply
can't find it.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23 17:59                                   ` Eli Zaretskii
@ 2022-07-24  6:16                                     ` Gerd Möllmann
  2022-07-24  6:52                                       ` Eli Zaretskii
  2022-07-24 14:34                                       ` bug#56682: Interval tree balance (was: Fix the long lines font locking related slowdowns) Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 2 replies; 416+ messages in thread
From: Gerd Möllmann @ 2022-07-24  6:16 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, monnier, visuweshm

Eli Zaretskii <eliz@gnu.org> writes:

> Which seems to be similar to the slowdown due to font-lock in other cases?
> For example, scrolling with the same benchmark through xdisp.c takes
> 190 sec for the first time and 40 for the second time (when everything
> is already fontified); whereas without font-lock it takes 20.  So it
> sounds like font-lock generally slows down redisplay by such small
> factors?

Alas, I don't remember much about the exact figures I got when I
profiled this pre-21.  But a doubling of run time--after font-lock has
finished--doesn't appear to me to be entirely implausible.

BTW, a memory that re-emerged right now: Interestingly, the number of
different text properties that the iterator checks, i.e. the number of
text property names like face, fontified, display, invisible, etc. that
the iterator checks, played an astonishing role back then.  And the
relation to performance wasn't linear either.  Which is why there's one
display property now, subsuming different subtypes.  Originally, the
subtypes were distinct properties, and display didn't exist.  Way before
Emacs 21.

Don't know if this is relevant for anything in this case.  I thought I'd
just mention that the interval tree might also have a potential for
improvement, if you will.  And another BTW: I was never 100% certain if
the interval tree is really always balanced because it didn't use an
algorithm that I knew and could recognize.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-24  6:16                                     ` Gerd Möllmann
@ 2022-07-24  6:52                                       ` Eli Zaretskii
  2022-07-24 14:36                                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-07-24 14:34                                       ` bug#56682: Interval tree balance (was: Fix the long lines font locking related slowdowns) Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-24  6:52 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: 56682, gregory, monnier, visuweshm

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: 56682@debbugs.gnu.org,  gregory@heytings.org,  monnier@iro.umontreal.ca,
>   visuweshm@gmail.com
> Date: Sun, 24 Jul 2022 08:16:33 +0200
> 
> And another BTW: I was never 100% certain if the interval tree is
> really always balanced because it didn't use an algorithm that I
> knew and could recognize.

AFAIU, we balance on-the-fly (see find_interval, which calls
balance_possible_root_interval).

> "frequent" changes of faces certainly have an effect on iterator
> performance.  It stops, looks up properties again to determine the
> next stop pos, does what has to be done for current properties...

To be more accurate: changes in text properties that are not relevant
to display basically affect the iterator performance in only one way:
they make the search for the next change in _relevant_ properties more
expensive.  See the last part of compute_stop_pos for the gory
details.  In a nutshell, we check one by one the intervals following
the interval of the current iterator position, until we find an
interval whose values for one or more of the properties of interest to
redisplay are different.  When we find one such interval, it is
guaranteed to have changes only in the text properties that are of
interest to redisplay, but the search could take more time if there
are many text properties that are not interesting, because there are
more intervals to check.
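
The search described above can be sketched as a toy model in C.  It is
only an illustration under stated assumptions, not the compute_stop_pos
code: the interval list is a flat array rather than a tree, the struct
and function names are hypothetical, and "face" and "invisible" stand in
for the properties of interest to redisplay while "other" stands in for
any property that is not.  The LIMIT argument is a cap on how far ahead
to look.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flat representation of consecutive intervals.  */
struct toy_interval
{
  long end;		/* one past the last position covered */
  int face;		/* of interest to redisplay */
  int invisible;	/* of interest to redisplay */
  int other;		/* not of interest to redisplay */
};

/* Nonzero if A and B agree on every display-relevant property.  */
static int
toy_same_display_props (const struct toy_interval *a,
			const struct toy_interval *b)
{
  return a->face == b->face && a->invisible == b->invisible;
}

/* Return the start of the first interval after IV[CUR] whose
   display-relevant properties differ from those of IV[CUR], capped at
   LIMIT.  Store in *EXAMINED how many intervals had to be checked.  */
static long
toy_compute_stop_pos (const struct toy_interval *iv, size_t n, size_t cur,
		      long limit, size_t *examined)
{
  assert (cur < n);
  *examined = 0;
  for (size_t i = cur + 1; i < n; i++)
    {
      long start = iv[i - 1].end;
      if (start >= limit)
	break;
      ++*examined;
      if (!toy_same_display_props (&iv[cur], &iv[i]))
	return start;
    }
  long end = iv[n - 1].end;
  return end < limit ? end : limit;
}
```

When an irrelevant property splits a run of identical faces into three
intervals, the sketch examines three intervals to find the same stop
position that it finds after examining one when the run is a single
interval.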

Btw, it might be interesting to measure the effect of enlarging
TEXT_PROP_DISTANCE_LIMIT, currently 100 character positions, on the
performance.  Looking at the code, it is not clear to me whether it
could affect the performance in any significant ways, but maybe I'm
wrong.






* bug#56682: Interval tree balance (was: Fix the long lines font locking related slowdowns)
  2022-07-24  6:16                                     ` Gerd Möllmann
  2022-07-24  6:52                                       ` Eli Zaretskii
@ 2022-07-24 14:34                                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-07-24 15:47                                         ` bug#56682: Interval tree balance Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-07-25  6:23                                         ` bug#56682: Fix the long lines font locking related slowdowns Gerd Möllmann
  1 sibling, 2 replies; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-07-24 14:34 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: 56682, Eli Zaretskii, gregory, visuweshm

> Don't know if this is relevant for anything in this case.  I thought I'd
> just mention that the interval tree might also have a potential for
> improvement, if you will.  And another BTW: I was never 100% certain if
> the interval tree is really always balanced because it didn't use an
> algorithm that I knew and could recognize.

I can answer this one: no, it's not always balanced (though in practice
it is most of the time).  We could make it use a more standard
algorithm, but I have not been able to measure any impact on performance
(I also played with a splay-tree alternative, under the assumption that
we mostly consult the tree "locally" (within the visible part of the
buffer, basically), so a splay-tree could turn the O(log N) into an O(log n)
where `N` is the buffer-size and `n` is the distance between
window-start and window-end).


        Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-24  5:50                 ` Gerd Möllmann
@ 2022-07-24 14:35                   ` Dmitry Gutov
  2022-07-24 15:05                     ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Dmitry Gutov @ 2022-07-24 14:35 UTC (permalink / raw)
  To: Gerd Möllmann, Eli Zaretskii; +Cc: 56682, gregory, monnier

On 24.07.2022 08:50, Gerd Möllmann wrote:
> Eli Zaretskii<eliz@gnu.org>  writes:
> 
>>> My bet is indeed on the mere presence of text properties, plus the
>>> fact that we need to merge faces.  But I could well be wrong.
> Can't say anything about face merging, but "frequent" changes of faces
> certainly have an effect on iterator performance.  It stops, looks up
> properties again to determine the next stop pos, does what has to be
> done for current properties...

But the problem is contingent on having long lines, isn't it?

There must be some interplay between those circumstances. Not just 
having to look up faces (relatively) a lot.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-24  6:52                                       ` Eli Zaretskii
@ 2022-07-24 14:36                                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-07-24 15:07                                           ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-07-24 14:36 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Gerd Möllmann, 56682, gregory, visuweshm

> redisplay are different.  When we find one such interval, it is
> guaranteed to have changes only in the text properties that are of
> interest to redisplay, but the search could take more time if there
> are many text properties that are not interesting, because there are
> more intervals to check.

Not only more intervals, but also more elements in the `plist`s of
each interval.


        Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-24 14:35                   ` Dmitry Gutov
@ 2022-07-24 15:05                     ` Eli Zaretskii
  2022-07-25 23:23                       ` Dmitry Gutov
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-24 15:05 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: gerd.moellmann, 56682, gregory, monnier

> Date: Sun, 24 Jul 2022 17:35:19 +0300
> Cc: 56682@debbugs.gnu.org, gregory@heytings.org, monnier@iro.umontreal.ca
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> On 24.07.2022 08:50, Gerd Möllmann wrote:
> > Eli Zaretskii<eliz@gnu.org>  writes:
> > 
> >>> My bet is indeed on the mere presence of text properties, plus the
> >>> fact that we need to merge faces.  But I could well be wrong.
> > Can't say something about face merging, but "frequent" changes of faces
> > certainly have an effect on iterator performance.  It stops, looks up
> > properties again to determine the next stop pos, does what has to be
> > done for current properties...
> 
> But the problem is contingent on having long lines, isn't it?

Not necessarily, see the times I measured scrolling through xdisp.c,
which I posted earlier.  It could be that with long lines font-lock
just makes it slower still, to the point where it becomes unbearable.

> There must be some interplay between those circumstances. Not just 
> having to look up faces (relatively) a lot.

What else did you have in mind?






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-24 14:36                                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-07-24 15:07                                           ` Eli Zaretskii
  2022-07-24 15:48                                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-24 15:07 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: gerd.moellmann, 56682, gregory, visuweshm

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Gerd Möllmann <gerd.moellmann@gmail.com>,
>   56682@debbugs.gnu.org,
>   gregory@heytings.org,  visuweshm@gmail.com
> Date: Sun, 24 Jul 2022 10:36:41 -0400
> 
> > redisplay are different.  When we find one such interval, it is
> > guaranteed to have changes only in the text properties that are of
> > interest to redisplay, but the search could take more time if there
> > are many text properties that are not interesting, because there are
> > more intervals to check.
> 
> Not only more intervals, but also more elements in the `plist`s of
> each interval.

Could be, if there are many different properties.  But since we are
talking about font-lock, there's only one we are talking about:
'face'.






* bug#56682: Interval tree balance
  2022-07-24 14:34                                       ` bug#56682: Interval tree balance (was: Fix the long lines font locking related slowdowns) Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-07-24 15:47                                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-07-25  6:23                                         ` bug#56682: Fix the long lines font locking related slowdowns Gerd Möllmann
  1 sibling, 0 replies; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-07-24 15:47 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: 56682, Eli Zaretskii, gregory, visuweshm

> buffer, basically), so a splay-tree could turn the O(log N) into an O(n)
                                                                        ^
                                                                      log n

> where `N` is the buffer-size and `n` is the distance between
> window-start and window-end).


        Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-24 15:07                                           ` Eli Zaretskii
@ 2022-07-24 15:48                                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-07-24 16:18                                               ` Eli Zaretskii
  2022-07-24 16:26                                               ` Lars Ingebrigtsen
  0 siblings, 2 replies; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-07-24 15:48 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, gregory, visuweshm

>> Not only more intervals, but also more elements in the `plist`s of
>> each interval.
> Could be, if there are many different properties.  But since we are
> talking about font-lock, there's only one we are talking about: 'face'.

`face` may be the one we care about, but there will be others
(especially in the case you were talking about, i.e. the impact of
non-display-related faces)


        Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-24 15:48                                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-07-24 16:18                                               ` Eli Zaretskii
  2022-07-24 16:26                                               ` Lars Ingebrigtsen
  1 sibling, 0 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-24 16:18 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: gerd.moellmann, 56682, gregory, visuweshm

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: gerd.moellmann@gmail.com,  56682@debbugs.gnu.org,  gregory@heytings.org,
>   visuweshm@gmail.com
> Date: Sun, 24 Jul 2022 11:48:35 -0400
> 
> >> Not only more intervals, but also more elements in the `plist`s of
> >> each interval.
> > Could be, if there are many different properties.  But since we are
> > talking about font-lock, there's only one we are talking about: 'face'.
> 
> `face` may be the one we care about, but there will be others
> (especially in the case you were talking about, i.e. the impact of
> non-display-related faces)

In general, sure.  But this particular discussion is about the
difference in performance between font-lock being on and off, and in
that case the only property that counts is 'face'.

If you are thinking about non-display-related properties, you are in
the wrong thread ;-)  That one was raised by João in another
discussion.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-24 15:48                                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-07-24 16:18                                               ` Eli Zaretskii
@ 2022-07-24 16:26                                               ` Lars Ingebrigtsen
  2022-07-24 16:33                                                 ` Eli Zaretskii
  1 sibling, 1 reply; 416+ messages in thread
From: Lars Ingebrigtsen @ 2022-07-24 16:26 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, gregory, Stefan Monnier, visuweshm

Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of
text editors" <bug-gnu-emacs@gnu.org> writes:

> `face` may be the one we care about, but there will be others
> (especially in the case you were talking about, i.e. the impact of
> non-display-related faces)

Random thought: Would it make sense to make `put-text-property' (and
friends) try to ensure that `face' (and `font-lock-face'?) are towards
the start of the plists?

That wouldn't help with intervals that don't have a `face' at all, but
should speed up the ones that have, and would be relatively low impact.

I don't think we guarantee anything about the order, so it might not
break anything.
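
A minimal sketch of the reordering idea, written in Emacs Lisp rather than
in the C code where interval plists actually live.  The function name
`example-plist-promote' is hypothetical and not part of Emacs; it only
illustrates moving one property to the front of a plist so that a linear
lookup reaches it first.

```elisp
;; Hypothetical helper, for illustration only.
(defun example-plist-promote (plist prop)
  "Return a copy of PLIST with PROP and its value moved to the front.
If PROP is absent, return an unchanged copy of PLIST."
  (let ((tail (plist-member plist prop)))
    (if (null tail)
        (copy-sequence plist)
      (let ((value (cadr tail))
            (rest nil)
            (p plist))
        ;; Copy every other key/value pair in order, skipping PROP.
        (while p
          (unless (eq (car p) prop)
            (push (car p) rest)
            (push (cadr p) rest))
          (setq p (cddr p)))
        (append (list prop value) (nreverse rest))))))
```

For example, (example-plist-promote '(fontified t face bold) 'face)
returns (face bold fontified t).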






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-24 16:26                                               ` Lars Ingebrigtsen
@ 2022-07-24 16:33                                                 ` Eli Zaretskii
  0 siblings, 0 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-24 16:33 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: gerd.moellmann, 56682, gregory, monnier, visuweshm

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>,  gerd.moellmann@gmail.com,
>   56682@debbugs.gnu.org,  gregory@heytings.org,  visuweshm@gmail.com
> Date: Sun, 24 Jul 2022 18:26:38 +0200
> 
> Random thought: Would it make sense to make `put-text-property' (and
> friends) try to ensure that `face' (and `font-lock-face'?) are towards
> the start of the plists?

The display code doesn't care only about 'face', it also cares for
'invisible' and 'display' properties.  And only one of them can be
"the first" ;-)






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-24 14:34                                       ` bug#56682: Interval tree balance (was: Fix the long lines font locking related slowdowns) Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-07-24 15:47                                         ` bug#56682: Interval tree balance Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-07-25  6:23                                         ` Gerd Möllmann
  2022-07-25 20:49                                           ` Gregory Heytings
  1 sibling, 1 reply; 416+ messages in thread
From: Gerd Möllmann @ 2022-07-25  6:23 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Eli Zaretskii, gregory, visuweshm

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> Don't know if this is relevant for anything in this case.  I thought I
>> just mention that the interval tree might also have a potential for
>> improvement, if you will.  And another BTW: I was never 100% certain if
>> the interval tree is really always balanced because it didn't use an
>> algorithm that I knew and could recognize.
>
> I can answer this one: no it's not always balanced (tho in practice
> it is most of the time).  We could make it use a more standard
> algorithm, but I have not been able to measure any impact on performance
> (I also played with a splay-tree alternative, under the assumption that
> we mostly consult the tree "locally" (within the visible part of the
> buffer, basically), so a splay-tree could turn the O(log N) into an O(n)
> where `N` is the buffer-size and `n` is the distance between
> window-start and window-end).

Thanks.  And too bad.

But I'd really prefer a standard algorithm with formally proven
properties anyway.  Call me German.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-25  6:23                                         ` bug#56682: Fix the long lines font locking related slowdowns Gerd Möllmann
@ 2022-07-25 20:49                                           ` Gregory Heytings
  2022-07-26  6:32                                             ` Gerd Möllmann
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-25 20:49 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Gerd Möllmann, 56682, Stefan Monnier, visuweshm


I pushed another improvement to the branch.  Now motion in a font-locked 
buffer is almost as fast as in a non font-locked one; I don't think it's 
possible to make further significant improvements.

It turns out that the motion slowdowns in font-locked buffers were caused 
only indirectly by font locking.  What happens is that in a font-locked 
buffer, get_next_display_element calls next_element_from_buffer, which uses 
handle_stop (and therefore compute_stop_pos) to position the iterator on 
the next element.  In a file such as long-line.xml, there are many more 
stop positions when the buffer is font-locked, roughly 100 times more. 
The solution is to make back_to_previous_visible_line_start do something 
closer to what it promises, namely "Set IT's current position to the 
previous visible line start."  Previously it moved IT to the beginning of 
the (temporarily narrowed) buffer, which was already much better than 
moving it to the beginning of the line, but not enough to make motion 
commands fast.
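
To get a rough feel for the number of stop positions, one can count the
places where the `face' property changes.  This is only a sketch under
assumptions: the function name `example-count-face-changes' is
hypothetical, and the count is an approximation, since the display
iterator also stops for other properties and for overlays.  Because
fontification is lazy, call font-lock-ensure first if the whole buffer
should be counted.

```elisp
;; Hypothetical helper, for illustration only.
(defun example-count-face-changes (&optional beg end)
  "Count positions between BEG and END where the `face' property changes.
BEG and END default to the accessible portion of the current buffer."
  (let ((pos (or beg (point-min)))
        (end (or end (point-max)))
        (count 0))
    ;; With a LIMIT argument, `next-single-property-change' returns
    ;; LIMIT when no further change is found, which ends the loop.
    (while (and (setq pos (next-single-property-change pos 'face nil end))
                (< pos end))
      (setq count (1+ count)))
    count))
```

Comparing the result with font-lock-mode on and off in the same buffer
gives an idea of how many extra stops font locking introduces.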






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-22 14:44       ` Lars Ingebrigtsen
@ 2022-07-25 20:59         ` Gregory Heytings
  0 siblings, 0 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-07-25 20:59 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 56682, Eli Zaretskii, Stefan Monnier


>
> This reminds me of something I meant to mention -- Stefan M. once 
> proposed that there should be two kinds of narrowing (I think?).  The 
> first is the one that the user sets with `C-x n n', which says "the user 
> is only interested in this bit of the buffer", but programs are 
> "allowed" to remove that restriction when doing stuff (like font 
> locking).  The second type should be a strict one, where modes are not 
> allowed to widen the region.
>
> Looking briefly at Gregory's new branch, it seems like that (sort of) 
> introduces this idea, but in a non-explicit way (i.e., by having an 
> inhibit-widen variable).
>

Thanks for the idea!  I think this could be useful in other contexts, so 
I'll try to implement that.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-22 23:25   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-07-23  6:41     ` Eli Zaretskii
@ 2022-07-25 21:23     ` Gregory Heytings
  2022-07-26 21:17       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-25 21:23 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Eli Zaretskii


>
> Before using a blunt tool like the forced-narrowing now in 
> `feature/long-lines-and-font-locking`, I think we should try and figure 
> out *why* the recipe below is so slow.
>

It's not a blunt tool, it's an appropriate tool to help make sure that 
Emacs remains responsive when large files are visited.  Think of it as 
POSIX's ulimits.  Allowing fontification-functions to search for 
arbitrarily complex regexps in an arbitrarily large buffer, each and 
every time they are asked to highlight a small chunk of said buffer, is 
a recipe for disaster.  If for some reason modes really need to go 
through the whole buffer to decide which highlighting to use, they 
should do so outside of fontification-functions, and ideally once, for 
example, when the file is loaded.

(Note that at the moment that tool is enabled only when files with long 
lines are visited.)






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-23 13:42                 ` Eli Zaretskii
                                     ` (2 preceding siblings ...)
  2022-07-23 15:04                   ` Gerd Möllmann
@ 2022-07-25 21:47                   ` Gregory Heytings
  2022-07-26  6:51                     ` Gerd Möllmann
  2022-07-26 11:37                     ` Eli Zaretskii
  3 siblings, 2 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-07-25 21:47 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Gerd Möllmann, 56682, monnier


>
> And I've found a serious sink of CPU cycles under truncate-lines, and 
> installed a fix on the feature branch.  Gerd, if you have time to 
> eyeball the fix and comment on it, I'd appreciate.  It's commit 350e97d 
> on the branch.  (I can post a more detailed explanation of what I did 
> and why, if that would help, because the code and the functions it calls 
> are somewhat tricky.)
>

Hmmm...  After 350e97d78e, Isearch locks Emacs with toggle-truncate-lines. 
Recipe:

C-x C-f long-line.xml
C-x x t
C-s </

You have to kill Emacs, C-g does not work.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-24 15:05                     ` Eli Zaretskii
@ 2022-07-25 23:23                       ` Dmitry Gutov
  2022-07-26  6:52                         ` Gregory Heytings
  2022-07-26 11:45                         ` Eli Zaretskii
  0 siblings, 2 replies; 416+ messages in thread
From: Dmitry Gutov @ 2022-07-25 23:23 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, gregory, monnier

On 24.07.2022 18:05, Eli Zaretskii wrote:
>> Date: Sun, 24 Jul 2022 17:35:19 +0300
>> Cc: 56682@debbugs.gnu.org, gregory@heytings.org, monnier@iro.umontreal.ca
>> From: Dmitry Gutov <dgutov@yandex.ru>
>>
>> On 24.07.2022 08:50, Gerd Möllmann wrote:
>>> Eli Zaretskii<eliz@gnu.org>  writes:
>>>
>>>>> My bet is indeed on the mere presence of text properties, plus the
>>>>> fact that we need to merge faces.  But I could well be wrong.
>>> Can't say something about face merging, but "frequent" changes of faces
>>> certainly have an effect on iterator performance.  It stops, looks up
>>> properties again to determine the next stop pos, does what has to be
>>> done for current properties...
>>
>> But the problem is contingent on having long lines, isn't it?
> 
> Not necessarily, see the times I measured scrolling through xdisp.c,
> which I posted earlier.  It could be that with long lines font-lock
> just makes it slower still, to the point where it becomes unbearable.

Why with long lines, though?

>> There must be some interplay between those circumstances. Not just
>> having to look up faces (relatively) a lot.
> 
> What else did you have in mind?

Some operation dependent on the length of the current line?

With font-lock, it seems to get progressively slower the farther you get 
along the current (long) line.

E.g. you can have a long line spanning several screenfuls without line 
breaks. When the window is scrolled to the beginning, redisplay is 
relatively fast (I can press up/down arrows, and they seem responsive).

But if I scroll the window to the end of said long line, up/down 
commands become much less responsive.

Tested that with today's master and js-mode visiting a minified JS file.

Perhaps it's due to font-lock logic in that it has to match from the 
beginning of a line (not sure we'd want to abandon that promise, 
though). Or maybe something else.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-25 20:49                                           ` Gregory Heytings
@ 2022-07-26  6:32                                             ` Gerd Möllmann
  2022-07-26  6:53                                               ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Gerd Möllmann @ 2022-07-26  6:32 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, Stefan Monnier, visuweshm

Gregory Heytings <gregory@heytings.org> writes:

> I pushed another improvement to the branch.  Now motion in a
> font-locked buffer is almost as fast as in a non font-locked one; I
> don't think it's possible to make further significant improvements.

Font-lock  Scroll-bar  Composition  Output
------------------------------------------------------------------------
on         on          on           GCs: 13 Elapsed time: 1.632215 seconds
on         off         on           GCs: 1  Elapsed time: 1.433515 seconds
on         off         off          GCs: 2  Elapsed time: 1.209662 seconds
off        off         off          GCs: 2  Elapsed time: 0.524913 seconds






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-25 21:47                   ` Gregory Heytings
@ 2022-07-26  6:51                     ` Gerd Möllmann
  2022-07-26  7:08                       ` Gerd Möllmann
  2022-07-26 12:12                       ` Eli Zaretskii
  2022-07-26 11:37                     ` Eli Zaretskii
  1 sibling, 2 replies; 416+ messages in thread
From: Gerd Möllmann @ 2022-07-26  6:51 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, monnier

Gregory Heytings <gregory@heytings.org> writes:

> Hmmm...  After 350e97d78e, Isearch locks Emacs with
> toggle-truncate-lines. Recipe:
>
> C-x C-f long-line.xml
> C-x x t
> C-s </
>
> You have to kill Emacs, C-g does not work.

That's kind of funny :-):

(setq isearch-lazy-highlight nil)






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-25 23:23                       ` Dmitry Gutov
@ 2022-07-26  6:52                         ` Gregory Heytings
       [not found]                           ` <addcac7f-cb95-c433-58e5-e2d525582613@yandex.ru>
  2022-07-26 11:45                         ` Eli Zaretskii
  1 sibling, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-26  6:52 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: gerd.moellmann, 56682, Eli Zaretskii, monnier


>
> With font-lock, it seems to get progressively slower the farther you get 
> along the current (long) line.
>
> E.g. you can have a long line spanning several screenfuls without line 
> breaks. When the window is scrolled to the beginning, redisplay is 
> relatively fast (I can press up/down arrows, and they seem responsive).
>
> But if I scroll the window to the end of said long line, up/down 
> commands become much less responsive.
>
> Tested that with today's master and js-mode visiting a minified JS file.
>

You should have tried the feature/long-lines-and-font-locking branch 
instead, where that problem has been fixed a few hours ago.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-26  6:32                                             ` Gerd Möllmann
@ 2022-07-26  6:53                                               ` Gregory Heytings
  0 siblings, 0 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-07-26  6:53 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: 56682, Eli Zaretskii, Stefan Monnier, visuweshm


>> I pushed another improvement to the branch.  Now motion in a 
>> font-locked buffer is almost as fast as in a non font-locked one; I 
>> don't think it's possible to make further significant improvements.
>
> Font-lock  Scroll-bar  Composition  Output
> ------------------------------------------------------------------------
> on         on          on           GCs: 13 Elapsed time: 1.632215 seconds
> on         off         on           GCs: 1  Elapsed time: 1.433515 seconds
> on         off         off          GCs: 2  Elapsed time: 1.209662 seconds
> off        off         off          GCs: 2  Elapsed time: 0.524913 seconds
>

Thanks, this confirms my "almost as fast" feeling and measurements.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-26  6:51                     ` Gerd Möllmann
@ 2022-07-26  7:08                       ` Gerd Möllmann
  2022-07-26 12:12                       ` Eli Zaretskii
  1 sibling, 0 replies; 416+ messages in thread
From: Gerd Möllmann @ 2022-07-26  7:08 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, monnier

Gerd Möllmann <gerd.moellmann@gmail.com> writes:

> Gregory Heytings <gregory@heytings.org> writes:
>
>> Hmmm...  After 350e97d78e, Isearch locks Emacs with
>> toggle-truncate-lines. Recipe:
>>
>> C-x C-f long-line.xml
>> C-x x t
>> C-s </
>>
>> You have to kill Emacs, C-g does not work.
>
> That's kind of funny :-):
>
> (setq isearch-lazy-highlight nil)

It eventually finishes, sort of.  The many '<' in the XML lead to a stop
every few characters for drawing highlights.  Like a train that stops at
every trash can along the tracks.  That's more stops than with
font-lock.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-25 21:47                   ` Gregory Heytings
  2022-07-26  6:51                     ` Gerd Möllmann
@ 2022-07-26 11:37                     ` Eli Zaretskii
  2022-07-26 11:53                       ` Gregory Heytings
  1 sibling, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-26 11:37 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, monnier

> Date: Mon, 25 Jul 2022 21:47:55 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: Gerd Möllmann <gerd.moellmann@gmail.com>, 
>     56682@debbugs.gnu.org, monnier@iro.umontreal.ca
> 
> > And I've found a serious sink of CPU cycles under truncate-lines, and 
> > installed a fix on the feature branch.  Gerd, if you have time to 
> > eyeball the fix and comment on it, I'd appreciate.  It's commit 350e97d 
> > on the branch.  (I can post a more detailed explanation of what I did 
> > and why, if that would help, because the code and the functions it calls 
> > are somewhat tricky.)
> 
> Hmmm...  After 350e97d78e, Isearch locks Emacs with toggle-truncate-lines. 
> Recipe:
> 
> C-x C-f long-line.xml
> C-x x t
> C-s </
> 
> You have to kill Emacs, C-g does not work.

I see the same on master, where that change is not yet installed.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-25 23:23                       ` Dmitry Gutov
  2022-07-26  6:52                         ` Gregory Heytings
@ 2022-07-26 11:45                         ` Eli Zaretskii
  2022-07-26 20:52                           ` Dmitry Gutov
  1 sibling, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-26 11:45 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: gerd.moellmann, 56682, gregory, monnier

> Date: Tue, 26 Jul 2022 02:23:42 +0300
> Cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, gregory@heytings.org,
>  monnier@iro.umontreal.ca
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> >> But the problem is contingent on having long lines, isn't it?
> > 
> > Not necessarily, see the times I measured scrolling through xdisp.c,
> > which I posted earlier.  It could be that with long lines font-lock
> > just makes it slower still, to the point where it becomes unbearable.
> 
> Why with long lines, though?

Because they are long.  So there are more faces to encounter when going
from one line to another.

> >> There must be some interplay between those circumstances. Not just
> >> having to look up faces (relatively) a lot.
> > 
> > What else did you have in mind?
> 
> Some operation dependent on the length of the current line?
> 
> With font-lock, it seems to get progressively slower the farther you get 
> along the current (long) line.
> 
> E.g. you can have a long line spanning several screenfuls without line 
> breaks. When the window is scrolled to the beginning, redisplay is 
> relatively fast (I can press up/down arrows, and they seem responsive).
> 
> But if I scroll the window to the end of said long line, up/down 
> commands become much less responsive.
> 
> Tested that with today's master and js-mode visiting a minified JS file.
> 
> Perhaps it's due to font-lock logic in that it has to match from the 
> beginning of a line (not sure we'd want to abandon that promise, 
> though). Or maybe something else.

It isn't font-lock, at least not in all major modes.  It's the display
engine itself that sometimes needs to go to the beginning of the line.
When it does, going back gets slower with font-lock than without.
This is why you see slower redisplay when you go deeper into a long
line.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-26 11:37                     ` Eli Zaretskii
@ 2022-07-26 11:53                       ` Gregory Heytings
  2022-07-26 12:09                         ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-26 11:53 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, monnier


>> Hmmm...  After 350e97d78e, Isearch locks Emacs with 
>> toggle-truncate-lines. Recipe:
>>
>> C-x C-f long-line.xml
>> C-x x t
>> C-s </
>>
>> You have to kill Emacs, C-g does not work.
>
> I see the same on master, where that change is not yet installed.
>

Are you sure?  I don't see the same at 304e2a3a05.  C-s is slow, but it 
does not hang Emacs, and the effect of C-s </ is visible within a couple 
of seconds.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-26 11:53                       ` Gregory Heytings
@ 2022-07-26 12:09                         ` Eli Zaretskii
  2022-07-26 12:34                           ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-26 12:09 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, monnier

> Date: Tue, 26 Jul 2022 11:53:28 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, monnier@iro.umontreal.ca
> 
> 
> >> Hmmm...  After 350e97d78e, Isearch locks Emacs with 
> >> toggle-truncate-lines. Recipe:
> >>
> >> C-x C-f long-line.xml
> >> C-x x t
> >> C-s </
> >>
> >> You have to kill Emacs, C-g does not work.
> >
> > I see the same on master, where that change is not yet installed.
> 
> Are you sure?  I don't see the same at 304e2a3a05.  C-s is slow, but it 
> does not hang Emacs, and the effect of C-s </ is visible within a couple 
> of seconds.

I don't know the difference between "slow" and "hang".  Gerd says it
eventually finishes, so it isn't a "hang" at least on his machine.
And my build is unoptimized, so what is "slow" for you probably
qualifies as "hang" for me.

If you disable lazy-highlighting, does the problem go away?

And if you customize lazy-highlight-interval and
lazy-highlight-max-at-a-time to reasonable values, doesn't the "hang"
become much shorter?






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-26  6:51                     ` Gerd Möllmann
  2022-07-26  7:08                       ` Gerd Möllmann
@ 2022-07-26 12:12                       ` Eli Zaretskii
  2022-07-26 12:22                         ` Gerd Möllmann
  1 sibling, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-26 12:12 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: 56682, gregory, monnier

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: Eli Zaretskii <eliz@gnu.org>,  56682@debbugs.gnu.org,
>   monnier@iro.umontreal.ca
> Date: Tue, 26 Jul 2022 08:51:54 +0200
> 
> Gregory Heytings <gregory@heytings.org> writes:
> 
> > Hmmm...  After 350e97d78e, Isearch locks Emacs with
> > toggle-truncate-lines. Recipe:
> >
> > C-x C-f long-line.xml
> > C-x x t
> > C-s </
> >
> > You have to kill Emacs, C-g does not work.
> 
> That's kind of funny :-):
> 
> (setq isearch-lazy-highlight nil)

Did you try this on master or on the feature branch?

And I found that customizing lazy-highlight-max-at-a-time and
lazy-highlight-interval can alleviate the problem to some extent.

The reason seems to be that lazy-highlighting tries to highlight every
match "in the window", but its interpretation of "in the window" seems
to be "buffer position before window-end position".  Which of course
includes a lot of stuff in a window with truncate-lines.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-26 12:12                       ` Eli Zaretskii
@ 2022-07-26 12:22                         ` Gerd Möllmann
  0 siblings, 0 replies; 416+ messages in thread
From: Gerd Möllmann @ 2022-07-26 12:22 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, Gregory Heytings, Stefan Monnier


This is on Gregory's branch.

On Tue, 26 Jul 2022, 14:12 Eli Zaretskii, <eliz@gnu.org> wrote:

> > From: Gerd Möllmann <gerd.moellmann@gmail.com>
> > Cc: Eli Zaretskii <eliz@gnu.org>,  56682@debbugs.gnu.org,
> >   monnier@iro.umontreal.ca
> > Date: Tue, 26 Jul 2022 08:51:54 +0200
> >
> > Gregory Heytings <gregory@heytings.org> writes:
> >
> > > Hmmm...  After 350e97d78e, Isearch locks Emacs with
> > > toggle-truncate-lines. Recipe:
> > >
> > > C-x C-f long-line.xml
> > > C-x x t
> > > C-s </
> > >
> > > You have to kill Emacs, C-g does not work.
> >
> > That's kind of funny :-):
> >
> > (setq isearch-lazy-highlight nil)
>
> Did you try this on master or on the feature branch?
>
> And I found that customizing lazy-highlight-max-at-a-time and
> lazy-highlight-interval can alleviate the problem to some extent.
>
> The reason seems to be that lazy-highlighting tries to highlight every
> match "in the window", but its interpretation of "in the window" seems
> to be "buffer position before window-end position".  Which of course
> includes a lot of stuff in a window with truncate-lines.
>



* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-26 12:09                         ` Eli Zaretskii
@ 2022-07-26 12:34                           ` Gregory Heytings
  2022-07-26 12:41                             ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-26 12:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, monnier


>
> I don't know the difference between "slow" and "hang".  Gerd says it 
> eventually finishes, so it isn't a "hang" at least on his machine. And 
> my build is unoptimized, so what is "slow" for you probably qualifies as 
> "hang" for me.
>

I just tried again, and on my machine with an optimized build C-s </ RET 
takes 220 seconds to finish (and cannot be interrupted).  That's what I 
call "hang".  With 350e97d78e, C-s </ RET takes about 1 second.  That's 
what I call "slow" (without truncate-lines C-s </ RET is instantaneous).

>
> If you disable lazy-highlighting, does the problem go away?
>

Yes, as I said to Gerd, with (setq isearch-lazy-highlight nil) the problem 
goes away (and C-s </ RET is instantaneous).

>
> And if you customize lazy-highlight-interval and 
> lazy-highlight-max-at-a-time to reasonable values, doesn't the "hang" 
> become much shorter?
>

With (setq lazy-highlight-interval 5) the problem does not go away.  With 
(setq lazy-highlight-max-at-a-time 10) the problem goes away, C-s </ RET 
takes about 1 second.
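
For reference, the settings tried above could be collected in an init-file fragment. This is only a sketch: the values are the ones reported in this message, not recommendations, and the three forms are alternatives rather than a combination that was tested together.

```elisp
;; Alternative 1: turn lazy highlighting off entirely.
;; In the test above this made the problem go away.
(setq isearch-lazy-highlight nil)

;; Alternative 2: keep lazy highlighting, but cap the number of
;; matches highlighted per pass.  With 10, C-s </ RET took about
;; 1 second in the test above.
(setq lazy-highlight-max-at-a-time 10)

;; Changing only the interval between passes did not help in the
;; test above.
(setq lazy-highlight-interval 5)
```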






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-26 12:34                           ` Gregory Heytings
@ 2022-07-26 12:41                             ` Eli Zaretskii
  2022-07-26 13:08                               ` Gerd Möllmann
  2022-07-26 17:46                               ` Eli Zaretskii
  0 siblings, 2 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-26 12:41 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, monnier

> Date: Tue, 26 Jul 2022 12:34:30 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, monnier@iro.umontreal.ca
> 
> > I don't know the difference between "slow" and "hang".  Gerd says it 
> > eventually finishes, so it isn't a "hang" at least on his machine. And 
> > my build is unoptimized, so what is "slow" for you probably qualifies as 
> > "hang" for me.
> 
> I just tried again, and on my machine with an optimized build C-s </ RET 
> takes 220 seconds to finish (and cannot be interrupted).  That's what I 
> call "hang".  With 350e97d78e, C-s </ RET takes about 1 second.  That's 
> what I call "slow" (without truncate-lines C-s </ RET is instantaneous).

OK, I will take a look soon.  That change makes us use a branch of
code that AFAIU was never used since it was written, so some kind of
trouble can be expected.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-26 12:41                             ` Eli Zaretskii
@ 2022-07-26 13:08                               ` Gerd Möllmann
  2022-07-26 17:29                                 ` Eli Zaretskii
  2022-07-26 17:46                               ` Eli Zaretskii
  1 sibling, 1 reply; 416+ messages in thread
From: Gerd Möllmann @ 2022-07-26 13:08 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, Gregory Heytings, monnier

What about the following idea, in general.  Sorry if that is in the
wrong sub-thread, I'm beginning to lose orientation a bit.

While iterating,

- remember where we started, say it's 'start_pos'.

- when reaching start_pos + X, without reaching a line end, refuse to do
  any complex stuff, no faces, invisible text and so on.

I could think of numerous variations of that theme.
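
In Lisp terms, the check in the second step could be sketched roughly as below. This is only an illustration: the real iteration lives in the C display code, and both the function name and the LIMIT parameter (the X above) are hypothetical.

```elisp
(defun my/no-newline-within-p (start limit)
  "Return non-nil if no newline occurs in the LIMIT characters after START.
A caller could use this to decide to skip faces, invisible text and
other expensive processing for the rest of the line."
  (let ((bound (+ start limit)))
    ;; If the buffer ends before START + LIMIT, the line is not
    ;; considered long, so return nil.
    (and (<= bound (point-max))
         (save-excursion
           (goto-char start)
           (not (search-forward "\n" bound t))))))
```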

I just loaded long-lines.xml in VSCode, and it seems to do something
like that.  It says that "tokenization has been disabled" in the long
line.  The long line doesn't have the highlighting that the shorter ones
have, and VSCode also wraps the long line instead of truncating it.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-26 13:08                               ` Gerd Möllmann
@ 2022-07-26 17:29                                 ` Eli Zaretskii
  0 siblings, 0 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-26 17:29 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: 56682, gregory, monnier

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: Gregory Heytings <gregory@heytings.org>,  56682@debbugs.gnu.org,
>   monnier@iro.umontreal.ca
> Date: Tue, 26 Jul 2022 15:08:18 +0200
> 
> What about the following idea, in general.  Sorry if that is in the
> wrong sub-thread, I'm beginning to lose orientation a bit.
> 
> While iterating,
> 
> - remember where we started, say it's 'start_pos'.
> 
> - when reaching start_pos + X, without reaching a line end, refuse to do
>   any complex stuff, no faces, invisible text and so on.
> 
> I could think of numerous variations of that theme.
> 
> I just loaded long-lines.xml in VSCode, and it seems to do something
> like that.  It says that "tokenization has been disabled" in the long
> line.  The long line doesn't have the highlighting that the shorter ones
> have, and VSCode also wraps the long line instead of truncating it.

We keep this possibility in mind all the time, you can see it in, for
example, the discussion whether to turn off line truncation.  But, as
I explained a minute ago, I think we haven't yet exhausted less
drastic measures, and there could still be opportunities for speedup
in the basic display code or thereabouts.  Our display is different
from that of VSCode, so the trade-offs are different as well.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-26 12:41                             ` Eli Zaretskii
  2022-07-26 13:08                               ` Gerd Möllmann
@ 2022-07-26 17:46                               ` Eli Zaretskii
  2022-07-26 20:55                                 ` Gregory Heytings
  2022-07-27  2:33                                 ` Eli Zaretskii
  1 sibling, 2 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-26 17:46 UTC (permalink / raw)
  To: gregory, gerd.moellmann; +Cc: 56682, monnier

> Cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, monnier@iro.umontreal.ca
> Date: Tue, 26 Jul 2022 15:41:49 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> 
> > I just tried again, and on my machine with an optimized build C-s </ RET 
> > takes 220 seconds to finish (and cannot be interrupted).  That's what I 
> > call "hang".  With 350e97d78e, C-s </ RET takes about 1 second.  That's 
> > what I call "slow" (without truncate-lines C-s </ RET is instantaneous).
> 
> OK, I will take a look soon.  That change makes us use a branch of
> code that AFAIU was never used since it was written, so some kind of
> trouble can be expected.

Heh, it's a case of "better is worse".  There's nothing wrong with the
change in commit 350e97d78e: it does indeed make redisplay of
long-and-truncated lines significantly faster.  And that speedup is
what's causing the problem: because the initial redisplay of the
buffer, before the results of C-s are displayed, is much faster, the
lazy-highlighting starts very soon, and it inserts thousands of
overlays into the buffer (because it by default highlights all the
matches till window-end, which is EOB in this case).  And once it
inserts enough overlays, we are dead in the water: the shortcut in
forward_to_next_line_start can no longer be taken, and we instead
iterate the slow way, one character at a time, all the way to EOB, and
on top of that need on the way to examine every one of the overlays
inserted by lazy-highlighting.

If I revert commit 350e97d78e, redisplay of long-and-truncated lines
becomes much slower, but because of that, lazy-highlighting doesn't
have time to highlight anything before you hit C-s for repeated
search, and thus there are no overlays in the buffer, and on the
average the responses are faster.

I tried to limit the lazy-highlighting by long-line-threshold, but the
speedup due to that is not significant enough (because even a single
overlay in the long line disables the shortcut in
forward_to_next_line_start).  Only limiting
lazy-highlight-initial-delay to, say, 1 sec and
lazy-highlight-max-at-a-time to a small number (like 2 or 5) makes the
problem go away because lazy-highlighting doesn't have enough time to
highlight too much.  And even then once you go deep enough into the
file, the problem comes back.

At this point, I think the only way to produce reasonable performance
from C-s in this case is to turn off lazy-highlighting when lines are
truncated and the buffer has long lines.  Which means we need to
expose the value of long_line_optimizations_p to Lisp, via an accessor
function.  I already have one other use for this: "C-x =", which
attempts to report the column of the character, something that is very
slow in a long-and-truncated line.  And I think we will see more cases
where Lisp code needs to know about this in order to adapt itself to
long lines.
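
A minimal sketch of what such Lisp-side adaptation might look like for Isearch. It assumes the accessor ends up being called long-line-optimizations-p (the name is an assumption here, so the code guards it with fboundp), it only checks the truncate-lines variable rather than every way a window can end up truncating lines, and the my/ function is hypothetical.

```elisp
(defun my/isearch-maybe-disable-lazy-highlight ()
  "Disable lazy highlighting in buffers with long, truncated lines."
  (when (and truncate-lines
             ;; Assumed accessor for long_line_optimizations_p.
             (fboundp 'long-line-optimizations-p)
             (long-line-optimizations-p))
    ;; Buffer-local, so other buffers keep their lazy highlighting.
    (setq-local isearch-lazy-highlight nil)))

(add-hook 'isearch-mode-hook #'my/isearch-maybe-disable-lazy-highlight)
```

Note that the buffer-local setting stays in effect after the search ends; a fuller version would restore it.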

Comments?






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-26 11:45                         ` Eli Zaretskii
@ 2022-07-26 20:52                           ` Dmitry Gutov
  0 siblings, 0 replies; 416+ messages in thread
From: Dmitry Gutov @ 2022-07-26 20:52 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, gregory, monnier

On 26.07.2022 14:45, Eli Zaretskii wrote:
>> Perhaps it's due to font-lock logic in that it has to match from the
>> beginning of a line (not sure we'd want to abandon that promise,
>> though). Or maybe something else.
> It isn't font-lock, at least not in all major modes.  It's the display
> engine itself that sometimes needs to go to the beginning of the line.
> When it does, going back gets slower with font-lock than without.
> This is why you see slower redisplay when you go deeper into a long
> line.

Makes sense: the text only has to be fontified once, but all redisplays 
are slowed down (on master), not just the first one.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-26 17:46                               ` Eli Zaretskii
@ 2022-07-26 20:55                                 ` Gregory Heytings
  2022-07-27  2:41                                   ` Eli Zaretskii
  2022-07-27  2:33                                 ` Eli Zaretskii
  1 sibling, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-26 20:55 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, monnier


>
> Heh, it's a case of "better is worse".  There's nothing wrong with the 
> change in commit 350e97d78e: it does indeed make redisplay of 
> long-and-truncated lines significantly faster.  And that speedup is 
> what's causing the problem: because the initial redisplay of the buffer, 
> before the results of C-s are displayed, is much faster, the 
> lazy-highlighting starts very soon, and it inserts thousands of overlays 
> into the buffer (because it by default highlights all the matches till 
> window-end, which is EOB in this case).  And once it inserts enough 
> overlays, we are dead in the water: the shortcut in 
> forward_to_next_line_start can no longer be taken, and we instead 
> iterate the slow way, one character at a time, all the way to EOB, and 
> on top of that need on the way to examine every one of the overlays 
> inserted by lazy-highlighting.
>

Thanks; as far as I can see your analysis is correct.  At 304e2a3a05, if I 
press C-s </ and wait long enough (a few seconds are enough), 
lazy-highlighting has enough time to put enough overlays in the buffer to 
hang Emacs.

>
> At this point, I think the only way to produce reasonable performance 
> from C-s in this case is to turn off lazy-highlighting when lines are 
> truncated and the buffer has long lines.  Which means we need to expose 
> the value of long_line_optimizations_p to Lisp, via an accessor 
> function.  I already have one other use for this: "C-x =", which 
> attempts to report the column of the character, something that is very 
> slow in a long-and-truncated line.  And I think we will see more cases 
> where Lisp code needs to know about this in order to adapt itself to 
> long lines.
>

My conclusion is different: we will see more such cases, but only with 
truncate-lines enabled, and that means that truncate-lines should be 
disabled in such buffers.

The fundamental problem is that with truncate-lines we cannot really 
narrow the buffer to a smaller (contiguous) portion without creating 
problems.  What would be necessary would be some kind of non-contiguous 
narrowing.  If we are on line 1, on column 10000, of a buffer with 80 
lines each of which is 20000 characters wide, what we see on screen are 
characters 9960-10040, 29960-30040, 49960-50040, 69960-70040, and so 
forth.  Without such a non-contiguous narrowing, all kinds of problems 
like the C-s and C-x = ones you identified will appear.  These problems can 
indeed, in principle, be solved by adding local fixes depending on 
(long-line-optimizations-p) whenever we encounter such a bug, but doing 
that would be most regrettable, as the simplicity of these optimizations 
would be largely lost.  Doing that while knowing that, as I said, we'll 
hit another ceiling very soon is clearly (to me at least) not TRT.
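
The arithmetic of the example above can be reproduced with a small sketch. The function name is hypothetical, and like the example it ignores newline characters and uses 0-based character offsets.

```elisp
(defun my/visible-ranges (column window-width line-width lines)
  "Return the character ranges visible with truncated lines.
COLUMN is the horizontal position of the window's center,
WINDOW-WIDTH the number of columns shown, LINE-WIDTH the length of
each line, and LINES the number of lines to compute."
  (let ((half (/ window-width 2))
        ranges)
    (dotimes (i lines)
      ;; Offset of the beginning of line I, ignoring newlines.
      (let ((bol (* i line-width)))
        (push (cons (+ bol (- column half)) (+ bol column half))
              ranges)))
    (nreverse ranges)))

(my/visible-ranges 10000 80 20000 4)
;; => ((9960 . 10040) (29960 . 30040) (49960 . 50040) (69960 . 70040))
```

Each of these ranges is separated from the next by almost 20000 characters that are not displayed, which is why a single contiguous narrowing cannot cover only the visible text.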






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-25 21:23     ` Gregory Heytings
@ 2022-07-26 21:17       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-07-27  6:44         ` Gregory Heytings
  2022-07-27 11:18         ` Eli Zaretskii
  0 siblings, 2 replies; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-07-26 21:17 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii

Gregory Heytings [2022-07-25 21:23:55] wrote:
>> Before using a blunt tool like the forced-narrowing now in
>> `feature/long-lines-and-font-locking`, I think we should try and figure
>> out *why* the recipe below is so slow.
> It's not a blunt tool, it's an appropriate tool to help making sure that
> Emacs remains responsive when large files are visited.

I'm not opposed to reducing the size of the text that's considered, but
doing it via narrowing is a blunt tool.

> Think of it as POSIX's ulimits.

That's also a blunt tool.

> Allowing fontification-functions to search for arbitrarily complex
> regexps in an arbitrarily large buffer, each and every time they are
> asked to highlight a small chunk of the said buffer, is a recipe
> for disaster.

`font-lock.el` could enforce a smaller scope in a more discerning way
than narrowing can.

> If for some reason modes really need to go through the whole
> buffer to decide which highlighting to use, they should do so outside of
> fontification-functions, and ideally once, for example, when the file
> is loaded.

With the current narrowing they can't even know why the buffer is
narrowed and hence can't make an informed decision whether they should
maybe widen to look elsewhere.


        Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-26 17:46                               ` Eli Zaretskii
  2022-07-26 20:55                                 ` Gregory Heytings
@ 2022-07-27  2:33                                 ` Eli Zaretskii
  2022-07-27  6:24                                   ` Gerd Möllmann
  1 sibling, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-27  2:33 UTC (permalink / raw)
  To: gerd.moellmann; +Cc: 56682, gregory, monnier

> Cc: 56682@debbugs.gnu.org, monnier@iro.umontreal.ca
> Date: Tue, 26 Jul 2022 20:46:24 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> 
> At this point, I think the only way to produce reasonable performance
> from C-s in this case is to turn off lazy-highlighting when lines are
> truncated and the buffer has long lines.  Which means we need to
> expose the value of long_line_optimizations_p to Lisp, via an accessor
> function.  I already have one other use for this: "C-x =", which
> attempts to report the column of the character, something that is very
> slow in a long-and-truncated line.  And I think we will see more cases
> where Lisp code needs to know about this in order to adapt itself to
> long lines.

Actually, I might have one more idea that could perhaps resolve this
more nicely.

Gerd, the search for display properties and overlays in
forward_to_next_line_start, which disables the 'reseat' shortcut, is
because these could have strings as values, and those strings could
have embedded newlines, right?  Or are there some other reasons?





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-26 20:55                                 ` Gregory Heytings
@ 2022-07-27  2:41                                   ` Eli Zaretskii
  2022-07-27  7:08                                     ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-27  2:41 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, monnier

> Date: Tue, 26 Jul 2022 20:55:05 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, monnier@iro.umontreal.ca
> 
> My conclusion is different: we will see more such cases, but only with 
> truncate-lines enabled, and that means that truncate-lines should be 
> disabled in such buffers.

I think these two issues are almost orthogonal.  Even if we decide to
disable truncate-lines in such buffers, we won't prevent users from
enabling it.  And if and when they enable it, we'd still like to give
them the best performance we can.  So speeding up the performance when
truncate-lines is enabled is still a worthy goal, although if that is
disabled in long-line buffers, that goal becomes less important.

> The fundamental problem is that with truncate-lines we cannot really 
> narrow the buffer to a smaller (contiguous) portion without creating 
> problems.

Yes, we need to find speedup opportunities that aren't solved by
narrowing.  But what I see for now is that the main bottleneck when
lines are truncated is reseat_at_next_visible_line_start, because we
call it each time we reach the right edge of a window, and need to
decide where to display the next screen line.  I've sped that up with
recent commits on the branch, but it is still very slow in some
situations, such as the one with isearch-lazy-highlight.  Another
situation which slows it down tremendously is when show-paren-mode
(which is ON by default nowadays) has a highlighted parenthesis
somewhere in the portion of the buffer outside of the viewport, the
part that reseat_at_next_visible_line_start needs to traverse to get
to the next newline.

I'm trying to speed up these situations as much as possible, and I
think it will serve us well even if we decide to turn off
truncate-lines in long-line buffers.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-27  2:33                                 ` Eli Zaretskii
@ 2022-07-27  6:24                                   ` Gerd Möllmann
  2022-07-27 17:26                                     ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Gerd Möllmann @ 2022-07-27  6:24 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, monnier

Eli Zaretskii <eliz@gnu.org> writes:

> Gerd, the search for display properties and overlays in
> forward_to_next_line_start, which disables the 'reseat' shortcut, is
> because these could have strings as values, and those strings could
> have embedded newlines, right?  Or are there some other reasons?

Correct, and I don't think there are other reasons.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-26 21:17       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-07-27  6:44         ` Gregory Heytings
  2022-07-30  7:16           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-07-27 11:18         ` Eli Zaretskii
  1 sibling, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-27  6:44 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Eli Zaretskii


>>> Before using a blunt tool like the forced-narrowing now in 
>>> `feature/long-lines-and-font-locking`, I think we should try and 
>>> figure out *why* the recipe below is so slow.
>>
>> It's not a blunt tool, it's an appropriate tool to help making sure 
>> that Emacs remains responsive when large files are visited.
>
> I'm not opposed to reducing the size of the text that's considered, but 
> doing it via narrowing is a blunt tool.
>

It isn't.  The only way to make sure that the size of the text is small 
enough is a forced narrowing from which fontification-functions cannot 
escape.

But let's try to be constructive.  You tell me that you're not opposed to 
reducing the size of the text, and that font-lock.el could enforce a 
smaller scope.  So could you please design an (Elisp) function (in 
font-lock.el) which, given a (beg end) with beg <= end in a buffer, 
returns a (beg' end') with beg <= beg' <= end' <= end that are better 
start and end points for the forced narrowing?  That function would be 
run in handle_fontified_prop, before fontification-functions, and would 
have access to the whole buffer.

>> Think of it as POSIX's ulimits.
>
> That's also a blunt tool.
>

It isn't either.  It's a practical way to limit what a single process can 
do to make sure that what it does doesn't impact other running processes. 
At $job, each time I've seen a ulimit reached (usually the limit on open 
file descriptors), it was because of a bug.  I think I've seen a single 
exception, with a program that sometimes really needed to open more 
(temporary) files.  And even in that case the solution was not to remove 
the limit, but to raise it (from 1024 to 4096 IIRC).

>
> With the current narrowing they can't even know why the buffer is 
> narrowed and hence can't make an informed decision whether they should 
> maybe widen to look elsewhere.
>

There is no point to make such a decision, because they can't escape the 
narrowing.






* bug#56682: Fix the long lines font locking related slowdowns
       [not found]                           ` <addcac7f-cb95-c433-58e5-e2d525582613@yandex.ru>
@ 2022-07-27  6:55                             ` Gregory Heytings
  2022-07-27 21:38                               ` Dmitry Gutov
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-27  6:55 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: gerd.moellmann, 56682, Eli Zaretskii, monnier


>
> But I'm seeing incorrect fontification. Is this one expected?
>

Yes, occasional mis-fontification is expected.  It's a compromise between 
"no fontification" and "slow fontification".

>
> Perhaps something to do with the number 40000?
>

There is no magical number 40000 in the implementation, the buffer limits 
to which fontification-functions are constrained are determined 
dynamically, depending on the width and height of the window.

My guess in this specific case is that the first instance of 
"Downloadify.Container" was fontified by the previous call to 
fontification-functions, and that the next chunk of text in which the two 
other instances of "Downloadify.Container" are contained was fontified by 
the next call to fontification-functions, which did not have access 
anymore to the place where Downloadify.Container is defined.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-27  2:41                                   ` Eli Zaretskii
@ 2022-07-27  7:08                                     ` Gregory Heytings
  0 siblings, 0 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-07-27  7:08 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, monnier


>
> Even if we decide to disable truncate-lines in such buffers, we won't 
> prevent users from enabling it.
>

That's correct, but if they consciously decide to shoot themselves in the 
foot, that's not Emacs' responsibility anymore.

>
> And if and when they enable it, we'd still like to give them the best 
> performance we can.  So speeding up the performance when truncate-lines 
> is enabled is still a worthy goal, although if that is disabled in 
> long-line buffers, that goal becomes less important.
>

I fully agree with that.

>
> I'm trying to speed up these situations as much as possible, and I think 
> it will serve us well even if we decide to turn off truncate-lines in 
> long-line buffers.
>

Thanks!






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-26 21:17       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-07-27  6:44         ` Gregory Heytings
@ 2022-07-27 11:18         ` Eli Zaretskii
  1 sibling, 0 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-27 11:18 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, gregory

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Eli Zaretskii <eliz@gnu.org>,  56682@debbugs.gnu.org
> Date: Tue, 26 Jul 2022 17:17:29 -0400
> 
> > Allowing fontification-functions to search for arbitrarily complex
> > regexps in an arbitrarily large buffer, each and every time they are
> > asked to highlight a small chunk of the said buffer, is a recipe
> > for disaster.
> 
> `font-lock.el` could enforce a smaller scope in a more discerning way
> than narrowing can.
> 
> > If for some reason modes really need to go through the whole
> > buffer to decide which highlighting to use, they should do so outside of
> > fontification-functions, and ideally once, for example, when the file
> > is loaded.
> 
> With the current narrowing they can't even know why the buffer is
> narrowed and hence can't make an informed decision whether they should
> maybe widen to look elsewhere.

Feel free to suggest better ways of handling these issues, or even
ways to solve this entirely inside font-lock.  If and when such
suggestions materialize, I'm sure we will be glad to use them instead
of less elegant/more direct solutions.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-27  6:24                                   ` Gerd Möllmann
@ 2022-07-27 17:26                                     ` Eli Zaretskii
  2022-07-28 16:29                                       ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-27 17:26 UTC (permalink / raw)
  To: Gerd Möllmann; +Cc: 56682, gregory, monnier

> From: Gerd Möllmann <gerd.moellmann@gmail.com>
> Cc: gregory@heytings.org,  56682@debbugs.gnu.org,  monnier@iro.umontreal.ca
> Date: Wed, 27 Jul 2022 08:24:45 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > Gerd, the search for display properties and overlays in
> > forward_to_next_line_start, which disables the 'reseat' shortcut, is
> > because these could have strings as values, and those strings could
> > have embedded newlines, right?  Or are there some other reasons?
> 
> Correct, and I don't think there are other reasons.

Thanks, I installed an optimization, which makes Isearch faster in
such cases.

Turning off isearch-lazy-highlight in these cases is still a good
idea, though.

The next problem with long-and-truncated lines is that in many places
as part of redisplay we begin from the line's beginning and then go to
the position of point or the first visible X coordinate or somesuch.
This becomes slow when point is very far to the right, like near the
end of the line.  A typical place in the code where this happens is at
beginning of display_line, where we call move_it_in_display_line_to to
get to it->first_visible_x.  This could take seconds if there are many
overlays on the line, which is what happens, for example, if we are in
the middle of Isearch and isearch-lazy-highlight is turned ON.
(Searching for "</" in long-line.xml produces 9K overlays!)

I'll try to think how to speed up these move_it_* calls in those
cases.  Ideas welcome.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-27  6:55                             ` Gregory Heytings
@ 2022-07-27 21:38                               ` Dmitry Gutov
  2022-07-28  6:21                                 ` Eli Zaretskii
  2022-07-28  7:49                                 ` Gregory Heytings
  0 siblings, 2 replies; 416+ messages in thread
From: Dmitry Gutov @ 2022-07-27 21:38 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, Eli Zaretskii, monnier

On 27.07.2022 09:55, Gregory Heytings wrote:
> 
>>
>> But I'm seeing incorrect fontification. Is this one expected?
>>
> 
> Yes, occasional mis-fontification is expected.  It's a compromise 
> between "no fontification" and "slow fontification".

I wonder now if the majority of the slowdown was caused by the 
redisplay, whereas font-lock (which only has to run once per screenful) 
was actually "fast enough".

>> Perhaps something to do with the number 40000?
>>
> 
> There is no magical number 40000 in the implementation, the buffer 
> limits to which fontification-functions are constrained are determined 
> dynamically, depending on the width and height of the window.
> 
> My guess in this specific case is that the first instance of 
> "Downloadify.Container" was fontified by the previous call to 
> fontification-functions, and that the next chunk of text in which the 
> two other instances of "Downloadify.Container" are contained was 
> fontified by the next call to fontification-functions, which did not 
> have access anymore to the place where Downloadify.Container is defined.

Could you clarify what you mean by "access ... to the place where ... is 
defined"? "new Downloadify.Container" is highlighted by a regular regexp 
matcher, not some custom elisp code which has to visit the position 
where the identifier is defined.

Same goes for the tokens like "null", "function" and "return", but those 
do get fontified after position 40000 in this example.

And the way those rules get applied doesn't seem particularly different, 
it's just that the keyword matcher goes before the class instantiation 
matcher inside js--font-lock-keywords-3.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-27 21:38                               ` Dmitry Gutov
@ 2022-07-28  6:21                                 ` Eli Zaretskii
  2022-07-28  7:49                                 ` Gregory Heytings
  1 sibling, 0 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-28  6:21 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: gerd.moellmann, 56682, gregory, monnier

> Date: Thu, 28 Jul 2022 00:38:09 +0300
> Cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org,
>  Eli Zaretskii <eliz@gnu.org>, monnier@iro.umontreal.ca
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> > Yes, occasional mis-fontification is expected.  It's a compromise 
> > between "no fontification" and "slow fontification".
> 
> I wonder now if the majority of the slowdown was caused by the 
> redisplay, whereas font-lock (which only has to run once per screenful) 
> was actually "fast enough".

To establish that, you can run font-lock-fontify-buffer, wait until it
finishes, and then try moving around and comparing that with a buffer
in which font-lock was turned off.  (Do this on master, of course,
where the changes we are discussing are not yet installed.)
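As a rough illustration of the measurement described above, a minimal
sketch follows.  It is an assumption on my part that font-lock-ensure
(the non-interactive way to force fontification of the whole buffer)
is an acceptable stand-in for the font-lock-fontify-buffer command
here; evaluate it with M-: in the buffer visiting the file.

```elisp
;; Sketch only: time one full fontification of the current buffer.
;; `benchmark-run' returns a list (ELAPSED-SECONDS GC-COUNT GC-SECONDS).
(require 'benchmark)
(let ((result (benchmark-run 1 (font-lock-ensure))))
  (message "Full fontification took %.2f s (%d GCs)"
           (nth 0 result) (nth 1 result)))
```

Once this has finished, cursor motion can be compared against the same
file visited with font-lock turned off.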

In general, even after a full fontification, you should see at least
some slowdown due to the faces.  How much slowdown, quantitatively, I
don't know, but perhaps Gregory measured that as part of his work.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-27 21:38                               ` Dmitry Gutov
  2022-07-28  6:21                                 ` Eli Zaretskii
@ 2022-07-28  7:49                                 ` Gregory Heytings
  2022-08-04  0:49                                   ` Dmitry Gutov
  1 sibling, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-28  7:49 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: gerd.moellmann, 56682, Eli Zaretskii, monnier



>> Yes, occasional mis-fontification is expected.  It's a compromise 
>> between "no fontification" and "slow fontification".
>
> I wonder now if the majority of the slowdown was caused by the 
> redisplay, whereas font-lock (which only has to run once per screenful) 
> was actually "fast enough".
>

Those two statements are not mutually exclusive.  The majority of the 
slowdown was indeed caused by redisplay, but font-lock was not fast 
enough.  Try to open a sufficiently large file (e.g. the dictionary.json 
one) with the code on master, and type M->.  You'll see that Emacs needs 
about five seconds (on my laptop) to display the end of the buffer.  Now 
compare that with the feature branch, with which the end of the buffer is 
displayed instantaneously.  That five-second delay is caused by 
fontification-functions.

>
> Could you clarify what you mean by "access ... to the place where ... is 
> defined"? "new Downloadify.Container" is highlighted by a regular regexp 
> matcher, not some custom elisp code which has to visit the position 
> where the identifier is defined.
>

Sorry, I cannot be more precise, I don't have the "downloadify.js" file 
here.  It was just a guess, based on what I saw on the screenshot, that 
one function called by fontification-functions collects all class 
definitions and highlights their identifiers elsewhere in the buffer with 
a specific face.  When the buffer is narrowed, that function may not see 
the Downloadify.Container definition (which is, I guess, placed near the 
beginning of the file) anymore.


* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-27 17:26                                     ` Eli Zaretskii
@ 2022-07-28 16:29                                       ` Gregory Heytings
  2022-07-28 16:42                                         ` Gregory Heytings
  2022-07-28 16:48                                         ` Eli Zaretskii
  0 siblings, 2 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-07-28 16:29 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Gerd Möllmann, 56682, monnier


>
> Thanks, I installed an optimization, which makes Isearch faster in such 
> cases.
>

Thanks.  Alas, as far as I can see, we're still quite far from a usable 
Emacs with long-and-truncated lines.  A recipe:

emacs -Q
C-x C-f long-line.xml RET
M-g g 12
C-u 12 M-x duplicate-line
C-x C-s multiple-long-lines.xml RET
C-x C-f multiple-long-lines.xml RET y
C-x x t
C-s </ RET

This last step is not yet finished after 10 minutes (600 seconds), and you 
cannot abort it.

>
> Turning off isearch-lazy-highlight in these cases is still a good idea, 
> though.
>

Indeed, (setq isearch-lazy-highlight nil) "fixes" the above recipe (but 
see below).
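
For illustration, a minimal sketch of applying that workaround only in
buffers where lines are truncated.  The function name is hypothetical,
and the sketch assumes that isearch-mode-hook runs before lazy
highlighting starts.

```elisp
;; Hypothetical helper; the name is an assumption, not part of Emacs.
(defun my/isearch-no-lazy-highlight-when-truncated ()
  "Turn off lazy highlighting in this buffer if its lines are truncated.
The setting is buffer-local and stays in effect after Isearch ends."
  (when truncate-lines
    (setq-local isearch-lazy-highlight nil)))

(add-hook 'isearch-mode-hook
          #'my/isearch-no-lazy-highlight-when-truncated)
```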

>
> The next problem with long-and-truncated lines is that in many places as 
> part of redisplay we begin from the line's beginning and then go to the 
> position of point or the first visible X coordinate or somesuch. This 
> becomes slow when point is very far to the right, like near the end of 
> the line.  A typical place in the code where this happens is at the
> beginning of display_line, where we call move_it_in_display_line_to to 
> get to it->first_visible_x.  This could take seconds if there are many 
> overlays on the line, which is what happens, for example, if we are in 
> the middle of Isearch and isearch-lazy-highlight is turned ON.
>

Indeed, Emacs isn't really usable.  For example, with 
multiple-long-lines.xml, after M-g g 20 C-e (which already takes several 
seconds to complete) each motion, search or insertion command still takes 
several seconds (even with find-file-literally).  For example (without 
isearch-lazy-highlight), C-s </releases> RET takes about 10 seconds. 
After that, C-s C-s takes another 10 seconds, an additional C-s also takes 
10 seconds, and C-g to abort isearch takes about 90 seconds.  And, even 
when you don't do anything, Emacs uses 100% of the CPU.  All this is, in 
the same file, instantaneous without truncate-lines.

>
> (Searching for "</" in long-line.xml produces 9K overlays!)
>

That's (another sign of) the fundamental problem I mentioned earlier. 
Without truncate-lines the visible portion of the buffer remains small, 
and it is thus possible to reduce the portion of the buffer that display 
routines will see.  With truncate-lines it can easily become huge.  For 
example, with multiple-long-lines.xml, (- (window-end) (window-start)) is 
7744464 with truncate-lines and 2454 without truncate-lines.  And 
multiple-long-lines.xml is still a relatively "small" file, it's only 15 
MB.
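
A sketch of how such figures can be obtained, to be evaluated with M-:
in the window showing the file, once with and once without
truncate-lines.  The function name is hypothetical, and passing a
non-nil UPDATE argument to window-end is my choice here, to avoid
reading a stale value.

```elisp
;; Hypothetical helper; the name is an assumption, not part of Emacs.
(defun my/visible-span ()
  "Return the number of buffer positions shown in the selected window."
  ;; A non-nil second argument asks `window-end' for an up-to-date value.
  (- (window-end nil t) (window-start)))

(message "Visible span: %d" (my/visible-span))
```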

>
> I'll try to think how to speed up these move_it_* calls in those cases. 
> Ideas welcome.
>

I cannot claim I understand the display code enough, but I see no way to 
make the portion of the buffer that is considered by display routines 
smaller without introducing the kind of rectangular narrowing I mentioned 
earlier.  As far as I understand, everything else is just a band-aid, that 
will make Emacs behave a bit faster only in some cases, and with which 
Emacs will remain unresponsive in too many cases.

Perfect is the enemy of good.  I strongly suggest we just admit that Emacs 
can't cope with long lines and truncate-lines together, which is true 
anyway given the DISP_INFINITY limit.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-28 16:29                                       ` Gregory Heytings
@ 2022-07-28 16:42                                         ` Gregory Heytings
  2022-07-28 16:48                                         ` Eli Zaretskii
  1 sibling, 0 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-07-28 16:42 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Gerd Möllmann, 56682, monnier


>
> emacs -Q
> C-x C-f long-line.xml RET
> M-g g 12
> C-u 12 M-x duplicate-line
> C-x C-s multiple-long-lines.xml RET
> C-x C-f multiple-long-lines.xml RET y
> C-x x t
> C-s </ RET
>

Sorry, there's a typo in that recipe, it should be:

C-u 50 M-x duplicate-line






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-28 16:29                                       ` Gregory Heytings
  2022-07-28 16:42                                         ` Gregory Heytings
@ 2022-07-28 16:48                                         ` Eli Zaretskii
  2022-07-28 17:16                                           ` Gregory Heytings
  1 sibling, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-28 16:48 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, monnier

> Date: Thu, 28 Jul 2022 16:29:44 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: Gerd Möllmann <gerd.moellmann@gmail.com>, 
>     56682@debbugs.gnu.org, monnier@iro.umontreal.ca
> 
> Indeed, Emacs isn't really usable.  For example, with 
> multiple-long-lines.xml, after M-g g 20 C-e (which already takes several 
> seconds to complete) each motion, search or insertion command still takes 
> several seconds (even with find-file-literally).  For example (without 
> isearch-lazy-highlight), C-s </releases> RET takes about 10 seconds. 
> After that, C-s C-s takes another 10 seconds, an additional C-s also takes 
> 10 seconds, and C-g to abort isearch takes about 90 seconds.  And, even 
> when you don't do anything, Emacs uses 100% of the CPU.  All this is, in 
> the same file, instantaneous without truncate-lines.

I think such long lines happen in files that have very little except
the long line.  So duplicating it makes the use case much less
important.  And in any case, we shouldn't try dealing with such files
before we have a good solution for single-line files.

> > (Searching for "</" in long-line.xml produces 9K overlays!)
> 
> That's (another sign of) the fundamental problem I mentioned earlier. 
> Without truncate-lines the visible portion of the buffer remains small, 
> and it is thus possible to reduce the portion of the buffer that display 
> routines will see.  With truncate-lines it can easily become huge.  For 
> example, with multiple-long-lines.xml, (- (window-end) (window-start)) is 
> 7744464 with truncate-lines and 2454 without truncate-lines.  And 
> multiple-long-lines.xml is still a relatively "small" file, it's only 15 
> MB.

This is actually a bug in isearch.el: using window-end in a buffer
under truncate-lines is simply wrong.  I'm going to file a bug about
that soon.

But that's not a fundamental problem, or at least I don't yet see why
it would be fundamental.  It's something specific to isearch.el and to
how isearch-lazy-highlight is implemented.  The fix I installed
prevented us from walking all that long line till the end of the
buffer, so in effect the display code doesn't see most of the buffer
text, it just jumps over it.

> > I'll try to think how to speed up these move_it_* calls in those cases. 
> > Ideas welcome.
> 
> I cannot claim I understand the display code enough, but I see no way to 
> make the portion of the buffer that is considered by display routines 
> smaller without introducing the kind of rectangular narrowing I mentioned 
> earlier.

I still have one or two ideas to try, and they don't involve anything
as complex as some new kind of narrowing.

> Perfect is the enemy of good.  I strongly suggest we just admit that Emacs 
> can't cope with long lines and truncate-lines together, which is true 
> anyway given the DISP_INFINITY limit.

Feel free to give up on it and stop trying to make that case faster,
but I'm not ready to give up yet.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-28 16:48                                         ` Eli Zaretskii
@ 2022-07-28 17:16                                           ` Gregory Heytings
  2022-07-28 17:44                                             ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-28 17:16 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, monnier


>
> I think such long lines happen in files that have very little except the 
> long line.  So duplicating it makes the use case much less important.
>

That's not correct.  Some files do indeed have a single long line, but 
others have more than one long line.  Duplicating the line in that file 
was just a quick way to generate a file with more long lines.  For 
example, a database dump will typically contain one long line for each 
table.  I have a database dump here with ~100 lines that are each about 1 
MB long, each of which is a single "insert into" statement.
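
For anyone without such a dump at hand, a file of a similar shape can
be generated with something like the sketch below.  The file name,
table names and filler content are all made up for illustration; only
the shape (about 100 lines of about 1 MB each) follows the description
above.

```elisp
;; Sketch only: write a synthetic "dump" with many very long lines.
;; File name, table names and filler character are hypothetical.
(let ((line-count 100)
      (line-length 1000000))
  (with-temp-file "synthetic-dump.sql"
    (dotimes (i line-count)
      (insert (format "insert into table%d values ('" i))
      (insert (make-string line-length ?x))
      (insert "');\n"))))
```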

>
> This is actually a bug in isearch.el: using window-end in a buffer under 
> truncate-lines is simply wrong.
>

The measurements I gave were with isearch, but as I said you see the same 
kind of slowdowns with motion and insertion commands.

>
> But that's not a fundamental problem, or at least I don't yet see why it 
> would be fundamental.  It's something specific to isearch.el and to how 
> isearch-lazy-highlight is implemented.
>

No, the measurements I gave were with isearch-lazy-highlight turned OFF. 
And it is the fundamental problem, because it means that all display 
routines have to deal with a very large amount of data.

>
> I still have one or two ideas to try, and they don't involve anything as 
> complex as some new kind of narrowing.
>

Okay, so I'll wait a bit more.  I'd like to reach a conclusion as to 
whether truncate-lines should be turned off when long_line_optimizations_p 
is on before merging the branch into master.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-28 17:16                                           ` Gregory Heytings
@ 2022-07-28 17:44                                             ` Eli Zaretskii
  2022-07-28 18:40                                               ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-28 17:44 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, monnier

> Date: Thu, 28 Jul 2022 17:16:32 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, monnier@iro.umontreal.ca
> 
> > This is actually a bug in isearch.el: using window-end in a buffer under 
> > truncate-lines is simply wrong.
> 
> The measurements I gave were with isearch, but as I said you see the same 
> kind of slowdowns with motion and insertion commands.

No, that's not what I see.  Without the thousands of Isearch overlays
Emacs works much faster, almost 2 orders of magnitude faster.

> No, the measurements I gave were with isearch-lazy-highlight turned OFF. 
> And it is the fundamental problem, because it means that all display 
> routines have to deal with a very large amount of data.

By multiplying a very long and truncated line enough times you can
always make Emacs useless.  The speedups I have in mind scale linearly
with the number of such lines, so eventually, with enough such lines,
Emacs will always become very slow at some point, especially if the
window is hscrolled very far to the right.  That doesn't mean we
shouldn't try speeding up the code: someone just told me that the
perfect is the enemy of the good.

> > I still have one or two ideas to try, and they don't involve anything as 
> > complex as some new kind of narrowing.
> 
> Okay, so I'll wait a bit more.  I'd like to reach a conclusion as to 
> whether truncate-lines should be turned off when long_line_optimizations_p 
> is on before merging the branch into master.

That's unrelated.  The branch was created for your work on font-lock,
and if you are done with that, feel free to land it on master.  I can
continue working on master, and/or will create a feature branch if I
feel it's justified.

Thanks.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-28 17:44                                             ` Eli Zaretskii
@ 2022-07-28 18:40                                               ` Gregory Heytings
  2022-07-28 18:57                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-28 18:40 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, monnier


>> And it is the fundamental problem, because it means that all display 
>> routines have to deal with a very large amount of data.
>
> By multiplying a very long and truncated line enough times you can 
> always make Emacs useless.  The speedups I have in mind scale linearly 
> with the number of such lines, so eventually, with enough such lines, 
> Emacs will always become very slow at some point, especially if the 
> window is hscrolled very far to the right.
>

That's exactly my point: with long and truncated lines Emacs can become 
unusable with a 20 MB file, with long lines Emacs does not become unusable 
even with a 1 GB file.  And what I (and everybody I guess) want is a 
"full and complete" solution.

>> Okay, so I'll wait a bit more.  I'd like to reach a conclusion as to 
>> whether truncate-lines should be turned off when 
>> long_line_optimizations_p is on before merging the branch into master.
>
> That's unrelated.  The branch was created for your work on font-lock, 
> and if you are done with that, feel free to land it on master.  I can 
> continue working on master, and/or will create a feature branch if I 
> feel it's justified.
>

It's not unrelated, at least not in my mind.  The branch was initially 
created to fix the remaining font-lock related issues, but this thread 
discusses, and the branch contains, fixes to the other remaining issues. 
I don't want to close this bug without a "full and complete" solution, and 
currently someone who has (setq truncate-lines t) in their init file or 
who presses C-x x t will see Emacs become unusable with a file with long 
lines.

What about the following course of action: I add a "disable truncate-lines 
in buffers with long lines" feature, and you remove it/disable it/make it 
optional later if/when you consider that your fixes make Emacs fast enough 
with long-and-truncated lines.  And we can/should open a new bug report 
and branch to discuss these long-and-truncated lines issues and solution 
attempts.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-28 18:40                                               ` Gregory Heytings
@ 2022-07-28 18:57                                                 ` Eli Zaretskii
  2022-07-28 21:31                                                   ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-28 18:57 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, monnier

> Date: Thu, 28 Jul 2022 18:40:26 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, monnier@iro.umontreal.ca
> 
> > By multiplying a very long and truncated line enough times you can 
> > always make Emacs useless.  The speedups I have in mind scale linearly 
> > with the number of such lines, so eventually, with enough such lines, 
> > Emacs will always become very slow at some point, especially if the 
> > window is hscrolled very far to the right.
> 
> That's exactly my point: with long and truncated lines Emacs can become 
> unusable with a 20 MB file, with long lines Emacs does not become unusable 
> even with a 1 GB file.  And what I (and everybody I guess) want is a 
> "full and complete" solution.

Feel free to want it and feel free to wait for it, or work on it.  But
that doesn't mean I or others cannot meanwhile install less "full and
complete" solutions that improve some use cases.  And any user can
always turn off truncate-lines in such buffers, if some particular use
case makes Emacs unusable for them, it's not like this possibility is
taken from them.

> > That's unrelated.  The branch was created for your work on font-lock, 
> > and if you are done with that, feel free to land it on master.  I can 
> > continue working on master, and/or will create a feature branch if I 
> > feel it's justified.
> 
> It's not unrelated, at least not in my mind.  The branch was initially 
> created to fix the remaining font-lock related issues, but this thread 
> discusses, and the branch contains, fixes to the other remaining issues. 

I worked on the truncated-lines case on the branch because without the
improvements you installed it was impossible to do anything with that:
Emacs was too slow to allow reasonable debugging and measurements.
That is the only relation between the two.

I don't see any reason to delay landing the important improvements we
have on the branch.  It will allow more people to use those
improvements, and thus will contribute both to their stability and to
further progress in this direction.

> I don't want to close this bug without a "full and complete" solution, and 
> currently someone who has (setq truncate-lines t) in their init file or 
> who presses C-x x t will see Emacs become unusable with a file with long 
> lines.

The bug can be kept open, if you don't want to close it.  I don't
mind.

> What about the following course of action: I add a "disable truncate-lines 
> in buffers with long lines" feature, and you remove it/disable it/make it 
> optional later if/when you consider that your fixes make Emacs fast enough 
> with long-and-truncated lines.

No.  Please stop pressuring me into making a decision I'm not yet
ready to make.  There's absolutely no rush.

> And we can/should open a new bug report and branch to discuss these
> long-and-truncated lines issues and solution attempts.

Opening a new bug about that is fine by me.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-28 18:57                                                 ` Eli Zaretskii
@ 2022-07-28 21:31                                                   ` Gregory Heytings
  2022-07-29  7:12                                                     ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-28 21:31 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, Lars Ingebrigtsen, monnier


>
> I don't see any reason to delay landing the important improvements we 
> have on the branch.  It will allow more people to use those 
> improvements, and thus will contribute both to their stability and to 
> further progress in this direction.
>

Okay, I pushed two final (?) improvements: a new optional argument "lock" 
to narrow-to-region that Lars (added in Cc) suggested and which is now 
used to lock the restriction around fontification-functions, and two 
updates to the documentation.

Could you please tell me if it's ready to merge?

>> And we can/should open a new bug report and branch to discuss these 
>> long-and-truncated lines issues and solution attempts.
>
> Opening a new bug about that is fine by me.
>

Actually there's no need to open a new bug report, as bug#56683 already 
exists.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-28 21:31                                                   ` Gregory Heytings
@ 2022-07-29  7:12                                                     ` Eli Zaretskii
  2022-07-29  8:33                                                       ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-29  7:12 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, larsi, monnier

> Date: Thu, 28 Jul 2022 21:31:30 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: Lars Ingebrigtsen <larsi@gnus.org>, gerd.moellmann@gmail.com, 
>     56682@debbugs.gnu.org, monnier@iro.umontreal.ca
> 
> Okay, I pushed two final (?) improvements: a new optional argument "lock" 
> to narrow-to-region that Lars (added in Cc) suggested and which is now 
> used to lock the restriction around fontification-functions, and two 
> updates to the documentation.

Thanks, I made a few minor documentation changes there.

> Could you please tell me if it's ready to merge?

Yes, I think it's ready, thanks for all your hard work on this.

> >> And we can/should open a new bug report and branch to discuss these 
> >> long-and-truncated lines issues and solution attempts.
> >
> > Opening a new bug about that is fine by me.
> >
> 
> Actually there's no need to open a new bug report, as bug#56683 already 
> exists.

Right.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-29  7:12                                                     ` Eli Zaretskii
@ 2022-07-29  8:33                                                       ` Gregory Heytings
  2022-07-29 10:29                                                         ` Eli Zaretskii
  2022-07-29 13:27                                                         ` Eli Zaretskii
  0 siblings, 2 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-07-29  8:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682-done, larsi, monnier


>> Could you please tell me if it's ready to merge?
>
> Yes, I think it's ready, thanks for all your hard work on this.
>

Now done, and closing this bug.  I made a few further documentation 
changes, in particular "arbitrarily long" is important to me, factually 
correct, and correct English (see e.g. 
https://en.wikipedia.org/wiki/Arbitrarily_large ).






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-29  8:33                                                       ` Gregory Heytings
@ 2022-07-29 10:29                                                         ` Eli Zaretskii
  2022-07-29 10:44                                                           ` Gregory Heytings
       [not found]                                                           ` <19e5f0b3-c259-79f5-c31-469e8dfaf193@heytings.org>
  2022-07-29 13:27                                                         ` Eli Zaretskii
  1 sibling, 2 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-29 10:29 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, larsi, monnier

> Date: Fri, 29 Jul 2022 08:33:38 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682-done@debbugs.gnu.org, larsi@gnus.org, 
>     monnier@iro.umontreal.ca
> >> Could you please tell me if it's ready to merge?
> >
> > Yes, I think it's ready, thanks for all your hard work on this.
> >
> 
> Now done, and closing this bug.

Thanks.

> I made a few further documentation changes, in particular
> "arbitrarily long" is important to me, factually correct, and
> correct English (see e.g.
> https://en.wikipedia.org/wiki/Arbitrarily_large ).

Please always assume that I gave these aspects and your perspective
due consideration before making my changes, and please never revert
them without discussing first.

In this case, "arbitrarily large" contradicts the text that follows,
which describes the circumstances where that might not be true.  Other
minor changes are due to simple humility: we shouldn't boast of
achievements that have yet to see serious real-world testing.

In any case, the final responsibility for what the documentation says
and how is Lars's and mine, as the current project maintainers.

Thanks again for all the efforts invested in these important
improvements.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-29 10:29                                                         ` Eli Zaretskii
@ 2022-07-29 10:44                                                           ` Gregory Heytings
  2022-07-29 10:53                                                             ` Eli Zaretskii
       [not found]                                                           ` <19e5f0b3-c259-79f5-c31-469e8dfaf193@heytings.org>
  1 sibling, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-29 10:44 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, larsi, monnier


>
> Please always assume that I gave these aspects and your perspective due 
> consideration before making my changes, and please never revert them 
> without discussing first.
>

Okay, I'll keep that in mind in the future.

>
> In this case, "arbitrarily large" contradicts the text that follows, 
> which describes the circumstances where that might not be true.
>

I've carefully chosen the words of the title, and it doesn't contradict 
what follows, as far as I understand.  It says "Emacs is now capable of 
editing files with arbitrarily long lines", in which "capable" means that 
it can do it, but will not always do it.  The circumstances that are 
described in the text that follows tell the reader that the remaining 
cases in which Emacs would choke on such files are outside of Emacs' 
responsibility, they are the responsibility of major and minor mode 
writers.






* bug#56682: Fix the long lines font locking related slowdowns
       [not found]                                                           ` <19e5f0b3-c259-79f5-c31-469e8dfaf193@heytings.org>
@ 2022-07-29 10:50                                                             ` Gregory Heytings
  2022-07-29 11:16                                                               ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-29 10:50 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, larsi, monnier


>> In this case, "arbitrarily large" contradicts the text that follows, 
>> which describes the circumstances where that might not be true.
>
> I've carefully chosen the words of the title, and it doesn't contradict 
> what follows, as far as I understand.  It says "Emacs is now capable of 
> editing files with arbitrarily long lines", in which "capable" means 
> that it can do it, but will not always do it.  The circumstances that 
> are described in the text that follows tell the reader that the 
> remaining cases in which Emacs would choke on such files are outside of 
> Emacs' responsibility, they are the responsibility of major and minor 
> mode writers.
>

I see that you've already reverted.  Sigh.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-29 10:44                                                           ` Gregory Heytings
@ 2022-07-29 10:53                                                             ` Eli Zaretskii
  2022-07-29 11:03                                                               ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-29 10:53 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, larsi, monnier

> Date: Fri, 29 Jul 2022 10:44:27 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, larsi@gnus.org, 
>     monnier@iro.umontreal.ca
> 
> > Please always assume that I gave these aspects and your perspective due 
> > consideration before making my changes, and please never revert them 
> > without discussing first.
> 
> Okay, I'll keep that in mind in the future.

Thank you.

> > In this case, "arbitrarily large" contradicts the text that follows, 
> > which describes the circumstances where that might not be true.
> 
> I've carefully chosen the words of the title, and it doesn't contradict 
> what follows, as far as I understand.  It says "Emacs is now capable of 
> editing files with arbitrarily long lines", in which "capable" means that 
> it can do it, but will not always do it.  The circumstances that are 
> described in the text that follows tell the reader that the remaining 
> cases in which Emacs would choke on such files are outside of Emacs' 
> responsibility, they are the responsibility of major and minor mode 
> writers.

The usual interpretation of "capable" in Emacs is that we do it unless
the user tells us not to.  Otherwise users will ask why not do it
whenever possible, definitely for a feature like this one.  So that is
the contradiction which I had in mind.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-29 10:53                                                             ` Eli Zaretskii
@ 2022-07-29 11:03                                                               ` Gregory Heytings
  0 siblings, 0 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-07-29 11:03 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, larsi, monnier


>
> The usual interpretation of "capable" in Emacs is that we do it unless 
> the user tells us not to.  Otherwise users will ask why not do it 
> whenever possible, definitely for a feature like this one.  So that is 
> the contradiction which I had in mind.
>

I understand.  Nonetheless, the OED tells me "capable: adj. 1 (capable of 
doing something) having the ability or quality necessary to do something."






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-29 10:50                                                             ` Gregory Heytings
@ 2022-07-29 11:16                                                               ` Eli Zaretskii
  2022-07-29 12:05                                                                 ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-29 11:16 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, larsi, monnier

> Date: Fri, 29 Jul 2022 10:50:02 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, larsi@gnus.org, 
>     monnier@iro.umontreal.ca
> 
> 
> >> In this case, "arbitrarily large" contradicts the text that follows, 
> >> which describes the circumstances where that might not be true.
> >
> > I've carefully chosen the words of the title, and it doesn't contradict 
> > what follows, as far as I understand.  It says "Emacs is now capable of 
> > editing files with arbitrarily long lines", in which "capable" means 
> > that it can do it, but will not always do it.  The circumstances that 
> > are described in the text that follows tell the reader that the 
> > remaining cases in which Emacs would choke on such files are outside of 
> > Emacs' responsibility, they are the responsibility of major and minor 
> > mode writers.
> 
> I see that you've already reverted.  Sigh.

I didn't revert.  I've changed back only some parts of the changes you
made, and left others as you changed them.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-29 11:16                                                               ` Eli Zaretskii
@ 2022-07-29 12:05                                                                 ` Gregory Heytings
  2022-07-29 12:36                                                                   ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-29 12:05 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, larsi, monnier


>>>> In this case, "arbitrarily large" contradicts the text that follows, 
>>>> which describes the circumstances where that might not be true.
>>>
>>> I've carefully chosen the words of the title, and it doesn't 
>>> contradict what follows, as far as I understand.  It says "Emacs is 
>>> now capable of editing files with arbitrarily long lines", in which 
>>> "capable" means that it can do it, but will not always do it.  The 
>>> circumstances that are described in the text that follows tell the 
>>> reader that the remaining cases in which Emacs would choke on such 
>>> files are outside of Emacs' responsibility, they are the 
>>> responsibility of major and minor mode writers.
>>
>> I see that you've already reverted.  Sigh.
>
> I didn't revert.  I've changed back only some parts of the changes you 
> made, and left others as you changed them.
>

You did revert the only change that I said was important to me, and 
without discussing it: the NEWS title "Emacs is now capable of editing 
files with arbitrarily long lines."  Which isn't boasting about an 
achievement, but an accurate statement.  Adding "unlike all other editors 
out there" would have been boasting about an achievement (but an accurate 
statement, too).






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-29 12:05                                                                 ` Gregory Heytings
@ 2022-07-29 12:36                                                                   ` Eli Zaretskii
  0 siblings, 0 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-29 12:36 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, larsi, monnier

> Date: Fri, 29 Jul 2022 12:05:16 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, larsi@gnus.org, 
>     monnier@iro.umontreal.ca
> 
> > I didn't revert.  I've changed back only some parts of the changes you 
> > made, and left others as you changed them.
> 
> You did revert the only change of which I said it was important to me

I understand that it's important to you, and understood that the first
time, but I cannot let our documentation say things that I'm unable to
defend in good faith.  When the dust settles and all is said and done,
it is only us the maintainers who are left to defend the decisions we
made as a project.  Please try to see this from our POV as well.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-29  8:33                                                       ` Gregory Heytings
  2022-07-29 10:29                                                         ` Eli Zaretskii
@ 2022-07-29 13:27                                                         ` Eli Zaretskii
  2022-07-29 13:58                                                           ` Eli Zaretskii
  2022-07-29 15:19                                                           ` Gregory Heytings
  1 sibling, 2 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-29 13:27 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682-done, larsi, monnier

> Date: Fri, 29 Jul 2022 08:33:38 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682-done@debbugs.gnu.org, larsi@gnus.org, 
>     monnier@iro.umontreal.ca
> 
> Now done, and closing this bug.

Hmm...  I'm bothered by this code in handle_fontified_prop:

      if (it->narrowed_begv)
	Fnarrow_to_region (make_fixnum (it->narrowed_begv),
			   make_fixnum (it->narrowed_zv), Qt);

This narrows the buffer around window's point position (since this is
how narrowed_begv and narrowed_zv are computed), but the display
iterator can be called for position outside this range.  This is
unlikely to happen when the function is called as part of actual
redisplay of a window, but it can easily happen when the display code
is used by other primitives, for example vertical-motion or
pos-visible-in-window-p.  What happens then is that
fontification-functions are called with the argument POS that is
outside of the restriction, and that can cause errors.  (jit-lock
simply does nothing in that case, AFAICT.)

Is this intended?






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-29 13:27                                                         ` Eli Zaretskii
@ 2022-07-29 13:58                                                           ` Eli Zaretskii
  2022-07-29 15:35                                                             ` Gregory Heytings
  2022-07-29 15:19                                                           ` Gregory Heytings
  1 sibling, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-29 13:58 UTC (permalink / raw)
  To: gregory, gerd.moellmann; +Cc: 56682, larsi, monnier

> Cc: gerd.moellmann@gmail.com, 56682-done@debbugs.gnu.org, larsi@gnus.org,
>  monnier@iro.umontreal.ca
> Date: Fri, 29 Jul 2022 16:27:50 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> 
> Hmm...  I'm bothered by this code in handle_fontified_prop:
> 
>       if (it->narrowed_begv)
> 	Fnarrow_to_region (make_fixnum (it->narrowed_begv),
> 			   make_fixnum (it->narrowed_zv), Qt);
> 

And another thing: the condition on it->narrowed_begv being non-zero
means that as long as we are close enough to the beginning of a
buffer, we don't restrict fontification-functions from going as far as
they want into the buffer.

So I think the condition should be the long_line_optimizations_p flag
of the buffer, and we should narrow the buffer even when we are at
BOB, to prevent fontification-functions from going too far.
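
The difference between the two conditions can be illustrated with a small
standalone C sketch.  The struct and function names below are hypothetical
stand-ins, not the actual xdisp.c definitions; the only thing taken from
the discussion is that narrowed_begv is zero when the iterator is close to
the beginning of the buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two iterator fields quoted above.  */
struct narrowing
{
  ptrdiff_t narrowed_begv;	/* 0 when close to the beginning of the buffer */
  ptrdiff_t narrowed_zv;
};

/* The quoted condition: narrow only when narrowed_begv is non-zero.
   Near BOB this returns false, so no restriction is applied.  */
static bool
narrow_by_begv (const struct narrowing *n)
{
  assert (n != NULL);
  return n->narrowed_begv != 0;
}

/* The proposed condition: narrow whenever the buffer has long-line
   optimizations enabled, regardless of the position in the buffer.  */
static bool
narrow_by_flag (bool long_line_optimizations_p)
{
  return long_line_optimizations_p;
}
```

With narrowed_begv equal to 0 and narrowed_zv equal to, say, 30000, the
first predicate declines to narrow while the second still does, which is
the case described above.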






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-29 13:27                                                         ` Eli Zaretskii
  2022-07-29 13:58                                                           ` Eli Zaretskii
@ 2022-07-29 15:19                                                           ` Gregory Heytings
  2022-07-29 15:35                                                             ` Eli Zaretskii
  1 sibling, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-29 15:19 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682-done, larsi, monnier


>
> Hmm...  I'm bothered by this code in handle_fontified_prop:
>
> if (it->narrowed_begv)
>   Fnarrow_to_region (make_fixnum (it->narrowed_begv),
>                      make_fixnum (it->narrowed_zv), Qt);
>

Hmmm... that code has been on the branch for a week, and I didn't see 
particular bugs.  Which doesn't mean there are none, of course.

>
> This narrows the buffer around window's point position (since this is 
> how narrowed_begv and narrowed_zv are computed), but the display 
> iterator can be called for position outside this range. This is unlikely 
> to happen when the function is called as part of actual redisplay of a 
> window, but it can easily happen when the display code is used by other 
> primitives, for example vertical-motion or pos-visible-in-window-p. 
> What happens then is that fontification-functions are called with the 
> argument POS that is outside of the restriction, and that can cause 
> errors.  (jit-lock simply does nothing in that case, AFAICT.)
>
> Is this intended?
>

I'm not sure I understand how this could happen.  Can a non-visible part 
of the buffer be fontified by fontification-functions when for example 
pos-visible-in-window-p is called and eventually returns nil?  At least if 
I do (pos-visible-in-window-p (point-max)), they are not: 
handle_fontified_prop is not even called with it at point-max.  Even with 
(pos-visible-in-window-p (1+ (window-end))) fontification-functions are 
not called.

Should we perhaps be extra careful and not apply the narrowing when 
IT_CHARPOS is not between narrowed_begv and narrowed_zv?






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-29 15:19                                                           ` Gregory Heytings
@ 2022-07-29 15:35                                                             ` Eli Zaretskii
  2022-07-29 16:37                                                               ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-29 15:35 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, larsi, monnier

> Date: Fri, 29 Jul 2022 15:19:49 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682-done@debbugs.gnu.org, larsi@gnus.org, 
>     monnier@iro.umontreal.ca
> 
> > Hmm...  I'm bothered by this code in handle_fontified_prop:
> >
> > if (it->narrowed_begv)
> >   Fnarrow_to_region (make_fixnum (it->narrowed_begv),
> >                      make_fixnum (it->narrowed_zv), Qt);
> >
> 
> Hmmm... that code has been on the branch for a week, and I didn't see 
> particular bugs.  Which doesn't mean there are none, of course.

It depends on what kind of bugs we expect.  I've seen in a debugger
that fontification-functions are called with POS outside the narrowed
region.  But since jit-lock simply does nothing in that case, this is
silently ignored.

> > This narrows the buffer around window's point position (since this is 
> > how narrowed_begv and narrowed_zv are computed), but the display 
> > iterator can be called for position outside this range. This is unlikely 
> > to happen when the function is called as part of actual redisplay of a 
> > window, but it can easily happen when the display code is used by other 
> > primitives, for example vertical-motion or pos-visible-in-window-p. 
> > What happens then is that fontification-functions are called with the 
> > argument POS that is outside of the restriction, and that can cause 
> > errors.  (jit-lock simply does nothing in that case, AFAICT.)
> >
> > Is this intended?
> 
> I'm not sure I understand how this could happen.  Can a non-visible part 
> of the buffer be fontified by fontification-functions when for example 
> pos-visible-in-window-p is called and eventually returns nil?

Yes.  Whenever the display code is called, it fontifies the portions
of the buffer that it moves through.  And that is in general
justified: if we didn't fontify, we could produce wrong results from
pos-visible-in-window-p, due to faces that change the font height.

> At least if I do (pos-visible-in-window-p (point-max)), they are
> not: handle_fontified_prop is not even called with it at point-max.
> Even with (pos-visible-in-window-p (1+ (window-end)))
> fontification-functions are not called.

Try with vertical-motion.  Visit long-line.xml, go to position 20000,
and then do "C-u 200 C-n" or "M-: (vertical-motion 200) RET".  Sooner
or later you will see that it->current in handle_fontified_prop will
be outside of the narrowing.

> Should we perhaps be extra careful and not apply the narrowing when 
> IT_CHARPOS is not between narrowed_begv and narrowed_zv?

I'd rather narrow around IT_CHARPOS in that case.  That would be also
consistent with what the doc string of fontification-functions now
says.

Perhaps we should also change what init_iterator does: if the start
position with which it's called is outside of the restriction,
recompute the restriction using the start point instead of the
window's point position.  WDYT?
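
The idea of narrowing around IT_CHARPOS when it falls outside the current
restriction can be sketched in standalone C as follows.  This is only an
illustration under stated assumptions: the type and function names are
hypothetical, and the chunk-based formula in region_around is an
assumption made for the example, not a statement about how
get_narrowed_begv and get_narrowed_zv are implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pair of narrowing bounds.  */
struct region
{
  ptrdiff_t begv;
  ptrdiff_t zv;
};

/* Assumed rule: the region spans the chunk of WIDTH characters that
   contains POS plus the chunk before it, clamped to the buffer.  */
static struct region
region_around (ptrdiff_t pos, ptrdiff_t width,
	       ptrdiff_t buf_beg, ptrdiff_t buf_end)
{
  assert (width > 0 && buf_beg <= pos && pos <= buf_end);
  ptrdiff_t chunk = pos / width;
  struct region r;
  r.begv = (chunk - 1) * width;
  if (r.begv < buf_beg)
    r.begv = buf_beg;
  r.zv = (chunk + 1) * width;
  if (r.zv > buf_end)
    r.zv = buf_end;
  return r;
}

static bool
region_contains (struct region r, ptrdiff_t pos)
{
  return r.begv <= pos && pos <= r.zv;
}

/* Keep CURRENT if CHARPOS is inside it; otherwise recompute the
   region around CHARPOS instead of around the window's point.  */
static struct region
region_for_fontification (struct region current, ptrdiff_t charpos,
			  ptrdiff_t width,
			  ptrdiff_t buf_beg, ptrdiff_t buf_end)
{
  if (region_contains (current, charpos))
    return current;
  return region_around (charpos, width, buf_beg, buf_end);
}
```

Under these assumptions, with a width of 10000 and window point at 20000
the region is 10000..30000; an iterator position of 45000 is outside it,
and the recomputed region becomes 30000..50000.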






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-29 13:58                                                           ` Eli Zaretskii
@ 2022-07-29 15:35                                                             ` Gregory Heytings
  0 siblings, 0 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-07-29 15:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, larsi, monnier


>
> And another thing: the condition on it->narrowed_begv being non-zero 
> means that as long as we are close enough to the beginning of a buffer, 
> we don't restrict fontification-functions from going as far as they want 
> into the buffer.
>
> So I think the condition should be the long_line_optimizations_p flag of 
> the buffer, and we should narrow the buffer even when we are at BOB, to 
> prevent fontification-functions from going too far.
>

That's correct, indeed, thanks.  Now fixed on master.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-29 15:35                                                             ` Eli Zaretskii
@ 2022-07-29 16:37                                                               ` Gregory Heytings
  2022-07-29 18:09                                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-29 16:37 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, larsi, monnier


>> At least if I do (pos-visible-in-window-p (point-max)), they are not: 
>> handle_fontified_prop is not even called with it at point-max. Even 
>> with (pos-visible-in-window-p (1+ (window-end))) 
>> fontification-functions are not called.
>
> Try with vertical-motion.  Visit long-line.xml, go to position 20000, 
> and then do "C-u 200 C-n" or "M-: (vertical-motion 200) RET".  Sooner or 
> later you will see that it->current in handle_fontified_prop will be 
> outside of the narrowing.
>

Thanks, I was able to reproduce the bug with that recipe.

>> Should we perhaps be extra careful and not apply the narrowing when 
>> IT_CHARPOS is not between narrowed_begv and narrowed_zv?
>
> I'd rather narrow around IT_CHARPOS in that case.  That would be also 
> consistent with what the doc string of fontification-functions now says.
>
> Perhaps we should also change what init_iterator does: if the start 
> position with which it's called is outside of the restriction, recompute 
> the restriction using the start point instead of the window's point 
> position.  WDYT?
>

Doing it in init_iterator is too early, alas: with the above recipe at 
least, init_iterator is called with charpos inside the narrowing bounds, 
after which the iterator moves outside the narrowing bounds.  So I fixed 
the bug in handle_fontified_prop.

I don't know yet if it's necessary to add another similar recomputation 
inside init_iterator.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-29 16:37                                                               ` Gregory Heytings
@ 2022-07-29 18:09                                                                 ` Eli Zaretskii
  2022-07-29 18:27                                                                   ` Gregory Heytings
  2022-07-29 20:02                                                                   ` Gregory Heytings
  0 siblings, 2 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-29 18:09 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, larsi, monnier

> Date: Fri, 29 Jul 2022 16:37:14 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, larsi@gnus.org, 
>     monnier@iro.umontreal.ca
> 
> 
> >> At least if I do (pos-visible-in-window-p (point-max)), they are not: 
> >> handle_fontified_prop is not even called with it at point-max. Even 
> >> with (pos-visible-in-window-p (1+ (window-end))) 
> >> fontification-functions are not called.
> >
> > Try with vertical-motion.  Visit long-line.xml, go to position 20000, 
> > and then do "C-u 200 C-n" or "M-: (vertical-motion 200) RET".  Sooner or 
> > later you will see that it->current in handle_fontified_prop will be 
> > outside of the narrowing.
> >
> 
> Thanks, I was able to reproduce the bug with that recipe.
> 
> >> Should we perhaps be extra careful and not apply the narrowing when 
> >> IT_CHARPOS is not between narrowed_begv and narrowed_zv?
> >
> > I'd rather narrow around IT_CHARPOS in that case.  That would be also 
> > consistent with what the doc string of fontification-functions now says.
> >
> > Perhaps we should also change what init_iterator does: if the start 
> > position with which it's called is outside of the restriction, recompute 
> > the restriction using the start point instead of the window's point 
> > position.  WDYT?
> >
> 
> Doing it in init_iterator is too early alas, with the above recipe at 
> least init_iterator is called with charpos inside the narrowing bounds, 
> after which the iterator moves outside the narrowing bounds.  So I fixed 
> the bug in handle_fontified_prop.

Thanks.

> I don't know yet if it's necessary to add another similar recomputation 
> inside init_iterator.

I'll play with other callers of init_iterator and start_display, and
see if they can do similar things.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-29 18:09                                                                 ` Eli Zaretskii
@ 2022-07-29 18:27                                                                   ` Gregory Heytings
  2022-07-29 20:48                                                                     ` Gregory Heytings
  2022-07-29 20:02                                                                   ` Gregory Heytings
  1 sibling, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-29 18:27 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, larsi, monnier


>> I don't know yet if it's necessary to add another similar recomputation 
>> inside init_iterator.
>
> I'll play with other callers of init_iterator and start_display, and see 
> if they can do similar things.
>

In case this helps, it is easy to enter the following conditional, for 
example with C-s, or with your previous M-g c 20000 and M-: 
(vertical-motion 200) RET recipe (in both steps).  It's not clear to me if 
updating the narrowing bounds there has an actual impact.  At least 
applying that change does not seem to have negative effects.

diff --git a/src/xdisp.c b/src/xdisp.c
index b1ee7889d4..e415320a52 100644
--- a/src/xdisp.c
+++ b/src/xdisp.c
@@ -3429,6 +3429,12 @@ init_iterator (struct it *it, struct window *w,
      {
        it->narrowed_begv = get_narrowed_begv (w, window_point (w));
        it->narrowed_zv = get_narrowed_zv (w, window_point (w));
+      if (charpos >= 0
+         && (charpos < it->narrowed_begv || charpos > it->narrowed_zv))
+       {
+         it->narrowed_begv = get_narrowed_begv (w, charpos);
+         it->narrowed_zv = get_narrowed_zv (w, charpos);
+       }
      }

    /* If a buffer position was specified, set the iterator there,






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-29 18:09                                                                 ` Eli Zaretskii
  2022-07-29 18:27                                                                   ` Gregory Heytings
@ 2022-07-29 20:02                                                                   ` Gregory Heytings
  2022-07-30  9:05                                                                     ` Eli Zaretskii
  1 sibling, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-29 20:02 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, larsi, monnier


>
> Doing it in init_iterator is too early alas, with the above recipe at 
> least init_iterator is called with charpos inside the narrowing bounds, 
> after which the iterator moves outside the narrowing bounds.  So I fixed 
> the bug in handle_fontified_prop.
>

Actually that doesn't work correctly.  A recipe:

emacs -Q
M-: (progn (set-frame-width nil 119) (set-frame-height nil 38)) RET
C-x C-f dictionary.json RET y
C-s aan SPC

Now you'll see that the last line at the bottom of the window, which does 
not contain "aan ", is highlighted.  Is the following okay from your point 
of view?

diff --git a/src/xdisp.c b/src/xdisp.c
index b1ee7889d4..8c62f088b8 100644
--- a/src/xdisp.c
+++ b/src/xdisp.c
@@ -4412,13 +4412,8 @@ handle_fontified_prop (struct it *it)
           ptrdiff_t begv = it->narrowed_begv ? it->narrowed_begv : BEGV;
           ptrdiff_t zv = it->narrowed_zv;
           ptrdiff_t charpos = IT_CHARPOS (*it);
-         if (charpos < begv || charpos > zv)
-           {
-             begv = get_narrowed_begv (it->w, charpos);
-             if (!begv) begv = BEGV;
-             zv = get_narrowed_zv (it->w, charpos);
-           }
-         Fnarrow_to_region (make_fixnum (begv), make_fixnum (zv), Qt);
+         if (begv <= charpos && charpos <= zv)
+           Fnarrow_to_region (make_fixnum (begv), make_fixnum (zv), Qt);
         }

        /* Don't allow Lisp that runs from 'fontification-functions'






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-29 18:27                                                                   ` Gregory Heytings
@ 2022-07-29 20:48                                                                     ` Gregory Heytings
  0 siblings, 0 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-07-29 20:48 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, larsi, monnier


>
> At least applying that change does not seem to have negative effects.
>

Actually it does, with the same recipe:

emacs -Q
M-: (progn (set-frame-width nil 119) (set-frame-height nil 38)) RET
C-x C-f dictionary.json RET y
C-s aan SPC

So you can forget this patch.

>
> diff --git a/src/xdisp.c b/src/xdisp.c
> index b1ee7889d4..e415320a52 100644
> --- a/src/xdisp.c
> +++ b/src/xdisp.c
> @@ -3429,6 +3429,12 @@ init_iterator (struct it *it, struct window *w,
>     {
>       it->narrowed_begv = get_narrowed_begv (w, window_point (w));
>       it->narrowed_zv = get_narrowed_zv (w, window_point (w));
> +      if (charpos >= 0
> +         && (charpos < it->narrowed_begv || charpos > it->narrowed_zv))
> +       {
> +         it->narrowed_begv = get_narrowed_begv (w, charpos);
> +         it->narrowed_zv = get_narrowed_zv (w, charpos);
> +       }
>     }
>
>   /* If a buffer position was specified, set the iterator there,
>






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-27  6:44         ` Gregory Heytings
@ 2022-07-30  7:16           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-07-30  8:12             ` Eli Zaretskii
  2022-07-30 13:17             ` Gregory Heytings
  0 siblings, 2 replies; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-07-30  7:16 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii

>> I'm not opposed to reducing the size of the text that's considered, but
>> doing it via narrowing is a blunt tool.
> It isn't.

There's no point arguing about it.  I find it to be and you don't and
that's that.  The reason why I find it to be is because it removes all
possibilities of making different choices for different elements
depending on the cost of those elements and the amount of
available information.

> The only way to make sure that the size of the text is small
> enough is a forced narrowing from which fontification-functions
> cannot escape.

It's clearly not the only way, nor is it an absolute guarantee.
I agree that it's a convenient way, tho.

> But let's try to be constructive.  You tell me that you're not opposed to
>  reducing the size of the text, and that font-lock.el could enforce
>  a smaller scope.  So could you please design a (Elisp) function (in
>  font-lock.el) which, given a (beg end) with beg <= end in a buffer, returns
>  a (beg' end') with beg <= beg' <= end' <= end that are better starting and
>  end points for the forced narrowing?  That function would be run in
>  handle_fontified_prop, before fontification-functions, and would have
> access to the whole buffer.

That can't be done.  What I think would be a better option is to
(somehow) pass the `beg..end` "limit" to jit-lock which can then pass it
on to its own clients (e.g. font-lock) so they each can make their
own choices.

E.g. the syntax-ppss part of the job performed by font-lock is heavily
cached, does not depend on lines, is theoretically always computed from
BOB but with a cache which makes it fast even when working near EOB (tho
it can still be somewhat slow when jumping from BOB to EOB, but that
depends on the size of the buffer, not the size of lines).  This part
*should* ignore your limits, which will make sure comments and strings are
recognized correctly at least in simple cases (i.e. cases which don't
depend on `syntax-propertize-function`).

>>> Think of it as POSIX's ulimits.
>> That's also a blunt tool.
> It isn't either.

Maybe we don't use the same meaning for "blunt".  What I mean is that
it's a tool whose effect cannot be fine-tuned for specific cases.
E.g. limiting the amount of memory used to store images, or the amount
of time spent in a particular operation, rather than applying those
limits to the whole process (even all its children as well).

> It's a practical way to limit what a single process can do

Yup, I use it too, but it's one-size-fits-all.

Eli wrote:
> Feel free to suggest better ways of handling these issues, or even
> ways to solve this entirely inside font-lock.  If and when such
> suggestions materialize, I'm sure we will be glad to use them instead
> of less elegant/more direct solutions.

I'd suggest to keep things mostly as they are but move the decision to
ELisp: i.e. pass the beg..end limits to jit-lock and let jit-lock do
the narrowing.  This way it's easy to later refine the mechanism.


        Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-30  7:16           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-07-30  8:12             ` Eli Zaretskii
  2022-07-30 10:52               ` Gregory Heytings
  2022-07-30 13:17             ` Gregory Heytings
  1 sibling, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-30  8:12 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, gregory

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Eli Zaretskii <eliz@gnu.org>,  56682@debbugs.gnu.org
> Date: Sat, 30 Jul 2022 03:16:29 -0400
> 
> > Feel free to suggest better ways of handling these issues, or even
> > ways to solve this entirely inside font-lock.  If and when such
> > suggestions materialize, I'm sure we will be glad to use them instead
> > of less elegant/more direct solutions.
> 
> I'd suggest to keep things mostly as they are but move the decision to
> ELisp: i.e. pass the beg..end limits to jit-lock and let jit-lock do
> the narrowing.  This way it's easy to later refine the mechanism.

That's already happening: code called via fontification-functions can
access the restriction via point-min and point-max.  If you or someone
else can come up with efficient methods of using that information so
as not to go too far forward and back, we could consider removing the
lock from the narrowing.  But we'd need to see the code first and
assess the resulting performance with long lines.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-29 20:02                                                                   ` Gregory Heytings
@ 2022-07-30  9:05                                                                     ` Eli Zaretskii
  2022-07-30 11:34                                                                       ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-30  9:05 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, larsi, monnier

> Date: Fri, 29 Jul 2022 20:02:47 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, larsi@gnus.org, 
>     monnier@iro.umontreal.ca
> 
> > Doing it in init_iterator is too early alas, with the above recipe at 
> > least init_iterator is called with charpos inside the narrowing bounds, 
> > after which the iterator moves outside the narrowing bounds.  So I fixed 
> > the bug in handle_fontified_prop.
> 
> Actually that doesn't work correctly.  A recipe:
> 
> emacs -Q
> M-: (progn (set-frame-width nil 119) (set-frame-height nil 38)) RET
> C-x C-f dictionary.json RET y
> C-s aan SPC
> 
> Now you'll see that the last line at the bottom of the window, which does 
> not contain "aan ", is highlighted.  Is the following okay from your point 
> of view?

I see the problem, but I don't understand how it could be related to
handle_fontified_prop.  The highlight in this case is the isearch
highlighting of matches, and those are shown via overlays, not via
face text properties, so AFAIK handle_fontified_prop doesn't handle
them.

Do you understand the relation of this to handle_fontified_prop?

The immediate reason for the wrong highlighting seems to be an overlay
whose end position is the match for "aan ", and whose start position
is much earlier in the buffer (about 600K characters earlier).  I
don't yet understand why and how this overlay comes into existence.
Hmm...






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-30  8:12             ` Eli Zaretskii
@ 2022-07-30 10:52               ` Gregory Heytings
  2022-07-30 10:59                 ` Eli Zaretskii
                                   ` (2 more replies)
  0 siblings, 3 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-07-30 10:52 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, Stefan Monnier


>>> Feel free to suggest better ways of handling these issues, or even 
>>> ways to solve this entirely inside font-lock.  If and when such 
>>> suggestions materialize, I'm sure we will be glad to use them instead 
>>> of less elegant/more direct solutions.
>>
>> I'd suggest to keep things mostly as they are but move the decision to 
>> ELisp: i.e. pass the beg..end limits to jit-lock and let jit-lock do 
>> the narrowing.  This way it's easy to later refine the mechanism.
>
> That's already happening: code called via fontification-functions can 
> access the restriction via point-min and point-max.  If you or someone 
> else can come up with efficient methods of using that information so as 
> not to go too far forward and back, we could consider removing the lock 
> from the narrowing.  But we'd need to see the code first and assess the 
> resulting performance with long lines.
>

IIUC, what Stefan suggests is the following, which seems (almost) fine to 
me.  The only problem I see is that jit-lock-function is not the only user 
of fontification-functions.  It has at least two other users: in ELPA 
multi-mode.el sets fontification-functions to multi-fontify, and in MELPA 
poly-lock.el sets fontification-functions to poly-lock-function.

diff --git a/lisp/jit-lock.el b/lisp/jit-lock.el
index be26ca55f0..aa26b990bc 100644
--- a/lisp/jit-lock.el
+++ b/lisp/jit-lock.el
@@ -370,27 +370,36 @@ jit-lock-refontify

  ;;; On demand fontification.

+(defun jit-lock-function--internal (start)
+  "Internal function called by `jit-lock-function'."
+  (if (not (and jit-lock-defer-timer
+                (or (not (eq jit-lock-defer-time 0))
+                    (input-pending-p))))
+      ;; No deferral.
+      (jit-lock-fontify-now start (+ start jit-lock-chunk-size))
+    ;; Record the buffer for later fontification.
+    (unless (memq (current-buffer) jit-lock-defer-buffers)
+      (push (current-buffer) jit-lock-defer-buffers))
+    ;; Mark the area as defer-fontified so that the redisplay engine
+    ;; is happy and so that the idle timer can find the places to fontify.
+    (with-buffer-prepared-for-jit-lock
+     (put-text-property start
+			(next-single-property-change
+			 start 'fontified nil
+			 (min (point-max) (+ start jit-lock-chunk-size)))
+			'fontified 'defer))))
+
  (defun jit-lock-function (start)
    "Fontify current buffer starting at position START.
  This function is added to `fontification-functions' when `jit-lock-mode'
  is active."
    (when (and jit-lock-mode (not memory-full))
-    (if (not (and jit-lock-defer-timer
-                  (or (not (eq jit-lock-defer-time 0))
-                      (input-pending-p))))
-	;; No deferral.
-	(jit-lock-fontify-now start (+ start jit-lock-chunk-size))
-      ;; Record the buffer for later fontification.
-      (unless (memq (current-buffer) jit-lock-defer-buffers)
-	(push (current-buffer) jit-lock-defer-buffers))
-      ;; Mark the area as defer-fontified so that the redisplay engine
-      ;; is happy and so that the idle timer can find the places to fontify.
-      (with-buffer-prepared-for-jit-lock
-       (put-text-property start
-			  (next-single-property-change
-			   start 'fontified nil
-			   (min (point-max) (+ start jit-lock-chunk-size)))
-			  'fontified 'defer)))))
+    (if (not fontification-functions-restriction)
+        (jit-lock-function--internal start)
+      (narrow-to-region (car fontification-functions-restriction)
+                        (cdr fontification-functions-restriction)
+                        t)
+      (jit-lock-function--internal start))))

  (defun jit-lock--run-functions (beg end)
    (let ((tight-beg nil) (tight-end nil)
diff --git a/src/xdisp.c b/src/xdisp.c
index 0fdb1922e5..726e77b8eb 100644
--- a/src/xdisp.c
+++ b/src/xdisp.c
@@ -4412,9 +4412,9 @@ handle_fontified_prop (struct it *it)
  	  ptrdiff_t begv = it->narrowed_begv ? it->narrowed_begv : BEGV;
  	  ptrdiff_t zv = it->narrowed_zv;
  	  ptrdiff_t charpos = IT_CHARPOS (*it);
  	  if (begv <= charpos && charpos <= zv)
-	    Fnarrow_to_region (make_fixnum (begv), make_fixnum (zv), Qt);
+	    specbind (Qfontification_functions_restriction,
+		      Fcons (make_fixnum (begv), make_fixnum (zv)));
  	}

        /* Don't allow Lisp that runs from 'fontification-functions'
@@ -36673,6 +36673,13 @@ syms_of_xdisp (void)
    Vfontification_functions = Qnil;
    Fmake_variable_buffer_local (Qfontification_functions);

+  DEFSYM (Qfontification_functions_restriction,
+	  "fontification-functions-restriction");
+  DEFVAR_LISP ("fontification-functions-restriction",
+	       Vfontification_functions_restriction,
+	       doc: /* TODO  */);
+  Vfontification_functions_restriction = Qnil;
+
    DEFVAR_BOOL ("unibyte-display-via-language-environment",
                 unibyte_display_via_language_environment,
      doc: /* Non-nil means display unibyte text according to language environment.

By the way, while trying the above, it became clear that I forgot to 
properly handle the new optional argument to narrow-to-region in 
byte-compiled code.  But I don't know how to do that:

diff --git a/lisp/emacs-lisp/bytecomp.el b/lisp/emacs-lisp/bytecomp.el
index b4954eee9f..1ecd77f751 100644
--- a/lisp/emacs-lisp/bytecomp.el
+++ b/lisp/emacs-lisp/bytecomp.el
@@ -767,7 +767,7 @@ 121
  (byte-defop 122  0 byte-char-syntax)
  (byte-defop 123 -1 byte-buffer-substring)
  (byte-defop 124 -1 byte-delete-region)
-(byte-defop 125 -1 byte-narrow-to-region)
+(byte-defop 125 -2 byte-narrow-to-region)
  (byte-defop 126  1 byte-widen)
  (byte-defop 127  0 byte-end-of-line)

@@ -3833,7 +3833,7 @@ setcar
  (byte-defop-compiler setcdr            2)
  (byte-defop-compiler buffer-substring  2)
  (byte-defop-compiler delete-region     2)
-(byte-defop-compiler narrow-to-region  2)
+(byte-defop-compiler narrow-to-region  2-3)
  (byte-defop-compiler (% byte-rem)      2)
  (byte-defop-compiler aset              3)

is apparently not enough, because "2-3" seems to install an 
integer-or-marker-p check on the third argument, which raises a 
(wrong-type-argument integer-or-marker-p nil) or (wrong-type-argument 
integer-or-marker-p t) error when narrow-to-region is called from 
byte-compiled code.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-30 10:52               ` Gregory Heytings
@ 2022-07-30 10:59                 ` Eli Zaretskii
  2022-07-30 11:07                   ` Gregory Heytings
  2022-07-30 11:32                 ` Eli Zaretskii
  2022-07-31  7:25                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-30 10:59 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, monnier

> Date: Sat, 30 Jul 2022 10:52:48 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: Stefan Monnier <monnier@iro.umontreal.ca>, 56682@debbugs.gnu.org
> 
> >> I'd suggest to keep things mostly as they are but move the decision to 
> >> ELisp: i.e. pass the beg..end limits to jit-lock and let jit-lock do 
> >> the narrowing.  This way it's easy to later refine the mechanism.
> >
> > That's already happening: code called via fontification-functions can 
> > access the restriction via point-min and point-max.  If you or someone 
> > else can come up with efficient methods of using that information so as 
> > not to go too far forward and back, we could consider removing the lock 
> > from the narrowing.  But we'd need to see the code first and assess the 
> > resulting performance with long lines.
> >
> 
> IIUC, what Stefan suggests is the following, which seems (almost) fine to 
> me.

I fail to see the difference, sorry.  Instead of doing the narrowing
from C we leave it to jit-lock, but tell it the limits to narrow to?
Or am I missing something?






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-30 10:59                 ` Eli Zaretskii
@ 2022-07-30 11:07                   ` Gregory Heytings
  0 siblings, 0 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-07-30 11:07 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, monnier


>> IIUC, what Stefan suggests is the following, which seems (almost) fine 
>> to me.
>
> I fail to see the difference, sorry.  Instead of doing the narrowing 
> from C we leave it to jit-lock, but tell it the limits to narrow to? Or 
> am I missing something?
>

TBH, I don't see a real difference either.  Stefan says that "this way 
it's easy to later refine the mechanism", I guess because it is possible 
to "refine the mechanism" without recompiling Emacs.  But as I said the 
problem is that there are other users of fontification-functions besides 
jit-lock-function.

Could you (or Stefan) please have a look at the bytecomp.el problem I 
mentioned at the end of my previous post?






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-30 10:52               ` Gregory Heytings
  2022-07-30 10:59                 ` Eli Zaretskii
@ 2022-07-30 11:32                 ` Eli Zaretskii
  2022-07-30 11:36                   ` Gregory Heytings
  2022-07-31  7:25                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-30 11:32 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, monnier

> Date: Sat, 30 Jul 2022 10:52:48 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: Stefan Monnier <monnier@iro.umontreal.ca>, 56682@debbugs.gnu.org
> 
> By the way, while trying the above, it became clear that I forgot to 
> properly handle the new optional argument to narrow-to-region in 
> byte-compiled code.  But I don't know how to do that:
> 
> diff --git a/lisp/emacs-lisp/bytecomp.el b/lisp/emacs-lisp/bytecomp.el
> index b4954eee9f..1ecd77f751 100644
> --- a/lisp/emacs-lisp/bytecomp.el
> +++ b/lisp/emacs-lisp/bytecomp.el
> @@ -767,7 +767,7 @@ 121
>   (byte-defop 122  0 byte-char-syntax)
>   (byte-defop 123 -1 byte-buffer-substring)
>   (byte-defop 124 -1 byte-delete-region)
> -(byte-defop 125 -1 byte-narrow-to-region)
> +(byte-defop 125 -2 byte-narrow-to-region)
>   (byte-defop 126  1 byte-widen)
>   (byte-defop 127  0 byte-end-of-line)
> 
> @@ -3833,7 +3833,7 @@ setcar
>   (byte-defop-compiler setcdr            2)
>   (byte-defop-compiler buffer-substring  2)
>   (byte-defop-compiler delete-region     2)
> -(byte-defop-compiler narrow-to-region  2)
> +(byte-defop-compiler narrow-to-region  2-3)
>   (byte-defop-compiler (% byte-rem)      2)
>   (byte-defop-compiler aset              3)
> 
> is apparently not enough, because "2-3" seems to install an 
> integer-or-marker-p check on the third argument, which raises a 
> (wrong-type-argument integer-or-marker-p nil) or (wrong-type-argument 
> integer-or-marker-p t) error when narrow-to-region is called from 
> byte-compiled code.

Where's the integer-or-marker-p test installed and/or called from?






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-30  9:05                                                                     ` Eli Zaretskii
@ 2022-07-30 11:34                                                                       ` Gregory Heytings
  2022-07-30 13:18                                                                         ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-30 11:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, larsi, monnier


>> Actually that doesn't work correctly.  A recipe:
>>
>> emacs -Q
>> M-: (progn (set-frame-width nil 119) (set-frame-height nil 38)) RET
>> C-x C-f dictionary.json RET y
>> C-s aan SPC
>>
>> Now you'll see that the last line at the bottom of the window, which 
>> does not contain "aan ", is highlighted.  Is the following okay from 
>> your point of view?
>
> I see the problem, but I don't understand how it could be related to 
> handle_fontified_prop.  The highlight in this case is the isearch 
> highlighting of matches, and those are shown via overlays, not via face 
> text properties, so AFAIK handle_fontified_prop doesn't handle them.
>
> Do you understand the relation of this to handle_fontified_prop?
>

I cannot claim that I fully understand what happens, but see below.  What I 
do know is that that problem wasn't present before 9c12c3b7c5, and that it 
disappears with:

diff --git a/src/xdisp.c b/src/xdisp.c
index b1ee7889d4..8c62f088b8 100644
--- a/src/xdisp.c
+++ b/src/xdisp.c
@@ -4412,13 +4412,8 @@ handle_fontified_prop (struct it *it)
           ptrdiff_t begv = it->narrowed_begv ? it->narrowed_begv : BEGV;
           ptrdiff_t zv = it->narrowed_zv;
           ptrdiff_t charpos = IT_CHARPOS (*it);
-         if (charpos < begv || charpos > zv)
-           {
-             begv = get_narrowed_begv (it->w, charpos);
-             if (!begv) begv = BEGV;
-             zv = get_narrowed_zv (it->w, charpos);
-           }
-         Fnarrow_to_region (make_fixnum (begv), make_fixnum (zv), Qt);
+         if (begv <= charpos && charpos <= zv)
+           Fnarrow_to_region (make_fixnum (begv), make_fixnum (zv), Qt);
         }

As far as I understand, what happens is this:

C-s aan SPC RET positions point at 730072.  When typing C-s aan SPC in a 
not yet fontified buffer, it->narrowed_begv and it->narrowed_zv are 
correctly positioned around that position, at 706860 and 732564 
respectively.  But the iterator is late, and is still at the position at 
which it was after C-s aan (without SPC); the first occurrence of "aan" is 
at 152274, the corresponding narrowing bounds are 128520 and 154224, which 
means that it has stopped at 154224.  So if we "reseat" the narrowing for 
fontification-functions around the position 154224 of the iterator, the 
narrowing becomes 141372-167076, which is well before the position of "aan 
", namely 730072.  At that point, isearch has found a match, and puts the 
match overlay at 167076 (the last possible position of the narrowed 
portion) and beyond, instead of putting it at 730072.

In short, it seems to me that using the position of the iterator is too 
fragile a solution, and that it is better not to apply a narrowing when the 
iterator is outside of the narrowing bounds.
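
The bounds quoted above are consistent with aligning the narrowing to 
fixed-size chunks around a position.  The C sketch below reproduces them.  
It rests on assumptions: the chunk length 12852 is inferred from the quoted 
numbers (it would correspond to 3 * 119 * 36), and the sketch_* function 
names are hypothetical stand-ins, not the code in xdisp.c.

```c
#include <stddef.h>

/* Hypothetical chunk length, inferred from the bounds quoted in this
   message; it is an assumption, not a value copied from xdisp.c.  */
#define CHUNK_LEN ((ptrdiff_t) 12852)

/* Start of the narrowing around POS: one full chunk before the chunk
   boundary at or below POS, clamped to the buffer start BEGV.  */
static ptrdiff_t
sketch_narrowed_begv (ptrdiff_t pos, ptrdiff_t begv)
{
  ptrdiff_t b = (pos / CHUNK_LEN - 1) * CHUNK_LEN;
  return b < begv ? begv : b;
}

/* End of the narrowing around POS: the first chunk boundary strictly
   above POS, clamped to the buffer end ZV.  */
static ptrdiff_t
sketch_narrowed_zv (ptrdiff_t pos, ptrdiff_t zv)
{
  ptrdiff_t e = (pos / CHUNK_LEN + 1) * CHUNK_LEN;
  return e > zv ? zv : e;
}
```

With these definitions and a sufficiently large buffer, position 152274 
gives 128520..154224, position 730072 gives 706860..732564, and re-centering 
on 154224 gives 141372..167076, which are the figures in the paragraph 
above.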






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-30 11:32                 ` Eli Zaretskii
@ 2022-07-30 11:36                   ` Gregory Heytings
  2022-07-30 12:05                     ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-30 11:36 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, monnier


>> is apparently not enough, because "2-3" seems to install an 
>> integer-or-marker-p check on the third argument, which raises a 
>> (wrong-type-argument integer-or-marker-p nil) or (wrong-type-argument 
>> integer-or-marker-p t) error when narrow-to-region is called from 
>> byte-compiled code.
>
> Where's the integer-or-marker-p test installed and/or called from?
>

It is called when narrow-to-region, which has its own opcode, is called 
from byte-compiled code.  But I have no idea where it is installed.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-30 11:36                   ` Gregory Heytings
@ 2022-07-30 12:05                     ` Gregory Heytings
  0 siblings, 0 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-07-30 12:05 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, monnier


>>> is apparently not enough, because "2-3" seems to install an 
>>> integer-or-marker-p check on the third argument, which raises a 
>>> (wrong-type-argument integer-or-marker-p nil) or (wrong-type-argument 
>>> integer-or-marker-p t) error when narrow-to-region is called from 
>>> byte-compiled code.
>> 
>> Where's the integer-or-marker-p test installed and/or called from?
>
> It is called when narrow-to-region, which has its own opcode, is called 
> from byte-compiled code.  But I have no idea where it is installed.
>

Got it, fixed on master.  make bootstrap is probably necessary.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-30  7:16           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-07-30  8:12             ` Eli Zaretskii
@ 2022-07-30 13:17             ` Gregory Heytings
  1 sibling, 0 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-07-30 13:17 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Eli Zaretskii


>>> I'm not opposed to reducing the size of the text that's considered, but
>>> doing it via narrowing is a blunt tool.
>>
>> It isn't.
>
> There's no point arguing about it.  I find it to be and you don't and 
> that's that.
>

Actually, with your explanation below that "blunt" means for you "cannot 
be fine-tuned", I agree with you.

>
> The reason why I find it to be is because it removes all possibilities 
> of making different choices for different elements depending on the cost 
> of those elements and the amount of available information.
>

With that I also agree.  It would be better to have a refined decision 
method, but at the moment it doesn't exist.

>
> What I think would be a better option is to (somehow) pass the 
> `beg..end` "limit" to jit-lock which can then pass it on to its own 
> clients (e.g. font-lock) so they each can make their own choices.
>

I sent a patch a few hours ago in this thread.

The limitation of what you suggest is that its effect would depend on the 
goodwill of major and minor mode authors, who could decide to ignore 
these beg..end recommendations altogether.  Whereas the point of that 
feature is more or less to protect Emacs users from major and minor modes, 
to make sure that Emacs remains responsive when the buffer contains long 
lines.
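
To make the difference concrete, here is a toy C model of an advisory limit 
versus an enforced one.  Everything in it is hypothetical (the struct, the 
two clients, and the run_* helpers are illustration only, not Emacs code): 
a client receives the region it wants to scan plus a limit, and returns the 
region it actually scans.

```c
#include <stddef.h>

struct range
{
  ptrdiff_t beg, end;
};

/* Hypothetical client: given the region it WANTED to scan and a LIMIT,
   return the region it actually scans.  */
typedef struct range (*client_fn) (struct range wanted, struct range limit);

/* Clamp both endpoints of R into LIMIT.  */
static struct range
clamp_range (struct range r, struct range limit)
{
  if (r.beg < limit.beg) r.beg = limit.beg;
  if (r.beg > limit.end) r.beg = limit.end;
  if (r.end < limit.beg) r.end = limit.beg;
  if (r.end > limit.end) r.end = limit.end;
  return r;
}

/* A client that honors the recommendation.  */
static struct range
cooperative_client (struct range wanted, struct range limit)
{
  return clamp_range (wanted, limit);
}

/* A client that ignores the recommendation altogether.  */
static struct range
careless_client (struct range wanted, struct range limit)
{
  (void) limit;
  return wanted;
}

/* Advisory limit: the result is whatever the client decides.  */
static struct range
run_advisory (client_fn f, struct range wanted, struct range limit)
{
  return f (wanted, limit);
}

/* Enforced limit: the client cannot reach outside LIMIT, whatever it
   asks for.  */
static struct range
run_enforced (client_fn f, struct range wanted, struct range limit)
{
  return clamp_range (f (wanted, limit), limit);
}
```

In this model the careless client scans the whole requested region under 
run_advisory, and only the limited region under run_enforced; the 
cooperative client behaves the same in both cases.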

Also, there are other users of fontification-functions besides 
jit-lock-function.

(And for some reason the patch I sent does not give the results I would 
have expected, font-locking is still too slow, but it's perhaps a bug in 
the patch.)

>
> E.g. the syntax-ppss part of the job performed by font-lock is heavily 
> cached, does not depend on lines, is theoretically always computed from 
> BOB but with a cache which makes it fast even when working near EOB (tho 
> it can still be somewhat slow when jumping from BOB to EOB, but that 
> depends on the size of the buffer, not the size of lines).  This part 
> *should* ignore your limits, which will make sure comments and strings 
> are recognized correctly at least in simple cases (i.e. cases which 
> don't depend on `syntax-propertize-function`).
>

I see.  Do you see a way to somehow extract the syntax-ppss part out of 
font-lock?  Would that be feasible?

And another question: can syntax-ppss not be used to determine a "good 
starting position" for the narrowing, outside of any comments or strings 
(if possible)?






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-30 11:34                                                                       ` Gregory Heytings
@ 2022-07-30 13:18                                                                         ` Eli Zaretskii
  2022-07-30 13:31                                                                           ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-30 13:18 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, larsi, monnier

> Date: Sat, 30 Jul 2022 11:34:03 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, larsi@gnus.org, 
>     monnier@iro.umontreal.ca
> 
> C-s aan SPC RET positions point at 730072.  When typing C-s aan SPC in a 
> not yet fontified buffer, it->narrowed_begv and it->narrowed_zv are 
> correctly positioned around that position, at 706860 and 732564 
> respectively.  But the iterator is late, and is still at the position at 
> which it was after C-s aan (without SPC); the first occurrence of "aan" is 
> at 152274, the corresponding narrowing bounds are 128520 and 154224, which 
> means that it has stopped at 154224.  So if we "reseat" the narrowing for 
> fontification-functions around the position 154224 of the iterator, the 
> narrowing becomes 141372-167076, which is well before the position of "aan 
> ", namely 730072.  At that point, isearch has found a match, and puts the 
> match overlay at 167076 (the last possible position of the narrowed 
> portion) and beyond, instead of putting it at 730072.

What I see is that isearch puts the match overlay on text that starts
at 167076 and ends at 730068 (the latter is the beginning of the
match), instead of on text between 730068 and 730072.

> In short, it seems to me that using the position of the iterator is too 
> fragile a solution, and that it is better not to apply a narrowing when the 
> iterator is outside of the narrowing bounds.

But this means we give up on the narrowing, which means the display
could be very slow.

And I've found the culprit: we weren't restoring point after lifting
the locked narrowing.  narrow-to-region can move point if the new
restriction puts point outside of the region.  So what was happening
is that isearch-update was calling pos-visible-in-window-group-p to
see whether the match is visible, and that call would move point from
under the feet of isearch-update, because pos-visible-in-window-p
calls display routines.  So any subsequent uses of point would use a
completely wrong value of point.

I've now made narrow-to-region preserve point across locked narrowing,
and the problem went away.

Ugh! this one was a bitch to debug!
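
For readers following along, the failure mode can be shown with a toy 
model in C.  This is only a sketch under stated assumptions: the toy_* 
names and the struct are hypothetical, and the real fix lives in the 
narrowing code, not in anything resembling this.  The model narrows a 
buffer temporarily, which drags point into the narrowed region, and then 
lifts the narrowing with or without restoring point.

```c
#include <stddef.h>

/* Toy buffer: accessible region [begv, zv] and point pt.  */
struct toy_buffer
{
  ptrdiff_t begv, zv, pt;
};

/* Narrow the buffer.  As with a real narrowing, point is moved into
   the new accessible region when it would otherwise fall outside.  */
static void
toy_narrow (struct toy_buffer *b, ptrdiff_t begv, ptrdiff_t zv)
{
  b->begv = begv;
  b->zv = zv;
  if (b->pt < begv)
    b->pt = begv;
  else if (b->pt > zv)
    b->pt = zv;
}

/* Apply a temporary narrowing and lift it again.  When RESTORE_POINT
   is zero, point stays wherever the narrowing left it, which is the
   bug described above; otherwise the saved point is put back.  */
static void
toy_with_narrowing (struct toy_buffer *b, ptrdiff_t begv, ptrdiff_t zv,
                    int restore_point)
{
  struct toy_buffer saved = *b;
  toy_narrow (b, begv, zv);
  /* ... display code would run here, inside the narrowing ... */
  b->begv = saved.begv;
  b->zv = saved.zv;
  if (restore_point)
    b->pt = saved.pt;
}
```

Starting with point at 730072 and narrowing to 141372..167076, the 
non-restoring variant leaves point at 167076 after the narrowing is 
lifted, while the restoring variant leaves it at 730072.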






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-30 13:18                                                                         ` Eli Zaretskii
@ 2022-07-30 13:31                                                                           ` Gregory Heytings
  2022-07-30 15:23                                                                             ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-30 13:31 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, larsi, monnier



>
> And I've found the culprit: we weren't restoring point after lifting the 
> locked narrowing.  narrow-to-region can move point if the new 
> restriction puts point outside of the region.  So what was happening is 
> that isearch-update was calling pos-visible-in-window-group-p to see 
> whether the match is visible, and that call would move point from under 
> the feet of isearch-update, because pos-visible-in-window-p calls 
> display routines.  So any subsequent uses of point would use a 
> completely wrong value of point.
>
> I've now made narrow-to-region preserve point across locked narrowing, 
> and the problem went away.
>
> Ugh! this one was a bitch to debug!
>

😉  Thanks, that's even better!

So the only remaining question is whether it is necessary to recompute 
narrowed_begv and narrowed_zv in init_iterator:

diff --git a/src/xdisp.c b/src/xdisp.c
index b1ee7889d4..e415320a52 100644
--- a/src/xdisp.c
+++ b/src/xdisp.c
@@ -3429,6 +3429,12 @@ init_iterator (struct it *it, struct window *w,
     {
       it->narrowed_begv = get_narrowed_begv (w, window_point (w));
       it->narrowed_zv = get_narrowed_zv (w, window_point (w));
+      if (charpos >= 0
+         && (charpos < it->narrowed_begv || charpos > it->narrowed_zv))
+       {
+         it->narrowed_begv = get_narrowed_begv (w, charpos);
+         it->narrowed_zv = get_narrowed_zv (w, charpos);
+       }
     }


* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-30 13:31                                                                           ` Gregory Heytings
@ 2022-07-30 15:23                                                                             ` Eli Zaretskii
  2022-07-30 18:13                                                                               ` Gregory Heytings
  2022-07-31  7:11                                                                               ` Eli Zaretskii
  0 siblings, 2 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-30 15:23 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, larsi, monnier

> Date: Sat, 30 Jul 2022 13:31:42 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, larsi@gnus.org, 
>     monnier@iro.umontreal.ca
> 
> So the only remaining question is whether it is necessary to recompute 
> narrowed_begv and narrowed_zv in init_iterator:

I tend to think we should, but let me think about this some more.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-30 15:23                                                                             ` Eli Zaretskii
@ 2022-07-30 18:13                                                                               ` Gregory Heytings
  2022-07-30 18:34                                                                                 ` Eli Zaretskii
  2022-07-31  7:11                                                                               ` Eli Zaretskii
  1 sibling, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-30 18:13 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, larsi, monnier


>> So the only remaining question is whether it is necessary to recompute 
>> narrowed_begv and narrowed_zv in init_iterator:
>
> I tend to think we should, but let me think about this some more.
>

Okay, I'll wait for your feedback.

Wouldn't it make sense to also limit the portion of the buffer to which 
pre-/post-command-hook have access (see below)?

With that patch, I was able to open and edit a file with a single 50 GB 
(!) line, in js-mode.  Does that still not qualify as "arbitrarily large"?

I compared that with a 50 GB JSON file with 80 character wide lines. 
With that file it was necessary to disable font-lock-mode, which took too 
much time.  Apart from that, I did not see any significant performance 
differences while editing the file, compared to the single line one.

diff --git a/src/keyboard.c b/src/keyboard.c
index 2863058d63..ce529222a3 100644
--- a/src/keyboard.c
+++ b/src/keyboard.c
@@ -1461,7 +1461,22 @@ command_loop_1 (void)
        }
        Vthis_command = cmd;
        Vreal_this_command = cmd;
-      safe_run_hooks (Qpre_command_hook);
+
+      if (current_buffer->long_line_optimizations_p)
+	{
+	  specpdl_ref count = SPECPDL_INDEX ();
+	  struct window *w = XWINDOW (selected_window);
+	  ptrdiff_t begv = get_narrowed_begv (w, PT);
+	  ptrdiff_t zv = get_narrowed_zv (w, PT);
+	  if (!begv) begv = BEGV;
+	  Fnarrow_to_region (make_fixnum (begv), make_fixnum (zv), Qt);
+	  safe_run_hooks (Qpre_command_hook);
+	  unbind_to (count, Qnil);
+	}
+      else
+	{
+	  safe_run_hooks (Qpre_command_hook);
+	}

        already_adjusted = 0;

@@ -1513,7 +1528,21 @@ command_loop_1 (void)
            }
        kset_last_prefix_arg (current_kboard, Vcurrent_prefix_arg);

-      safe_run_hooks (Qpost_command_hook);
+      if (current_buffer->long_line_optimizations_p)
+	{
+	  specpdl_ref count = SPECPDL_INDEX ();
+	  struct window *w = XWINDOW (selected_window);
+	  ptrdiff_t begv = get_narrowed_begv (w, PT);
+	  ptrdiff_t zv = get_narrowed_zv (w, PT);
+	  if (!begv) begv = BEGV;
+	  Fnarrow_to_region (make_fixnum (begv), make_fixnum (zv), Qt);
+	  safe_run_hooks (Qpost_command_hook);
+	  unbind_to (count, Qnil);
+	}
+      else
+	{
+	  safe_run_hooks (Qpost_command_hook);
+	}

        /* If displaying a message, resize the echo area window to fit
  	 that message's size exactly.  Do this only if the echo area
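
As a rough, standalone illustration of the pattern in the patch above
(save the accessible region, narrow around point, run the hook, restore),
here is a toy model in plain C.  Everything in it is an assumption made
for illustration: struct toy_buffer, run_hook_narrowed and the fixed
RADIUS are hypothetical, and the real get_narrowed_begv/get_narrowed_zv
computations are not reproduced.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a buffer: only the accessible region and point.  */
struct toy_buffer
{
  ptrdiff_t begv;	/* start of the accessible region */
  ptrdiff_t zv;		/* end of the accessible region */
  ptrdiff_t pt;		/* point */
};

typedef void (*toy_hook) (struct toy_buffer *);

/* Size of the region the most recent hook call was allowed to see.  */
ptrdiff_t last_visible_size;

/* Example hook: record how much text is accessible.  */
void
record_visible_size (struct toy_buffer *b)
{
  last_visible_size = b->zv - b->begv;
}

/* Run HOOK with the accessible region clamped to at most RADIUS
   characters on each side of point, then restore the old bounds.  */
void
run_hook_narrowed (struct toy_buffer *b, ptrdiff_t radius, toy_hook hook)
{
  assert (radius >= 0 && b->begv <= b->pt && b->pt <= b->zv);

  ptrdiff_t saved_begv = b->begv;
  ptrdiff_t saved_zv = b->zv;

  /* Compare distances rather than sums, so nothing can overflow.  */
  if (b->pt - b->begv > radius)
    b->begv = b->pt - radius;
  if (b->zv - b->pt > radius)
    b->zv = b->pt + radius;

  hook (b);

  /* Plays the role that unbind_to plays in the patch.  */
  b->begv = saved_begv;
  b->zv = saved_zv;
}
```

With a 1,000,000-character buffer, point in the middle and a radius of
10,000, the hook sees 20,000 characters, and the original bounds are back
in place once it returns.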





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-30 18:13                                                                               ` Gregory Heytings
@ 2022-07-30 18:34                                                                                 ` Eli Zaretskii
  2022-07-30 18:47                                                                                   ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-30 18:34 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, larsi, monnier

> Date: Sat, 30 Jul 2022 18:13:18 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, larsi@gnus.org, 
>     monnier@iro.umontreal.ca
> 
> Wouldn't it make sense to also limit the portion of the buffer to which 
> pre-/post-command-hook have access (see below)?

Those generally don't belong to the display department, so I'd
hesitate doing so.  Which pre-/post-command-hook functions did you
find that cause slowdown because of long lines?

Before considering these hooks, I'd consider window-scroll-functions
and window-*-change-functions, which _are_ run by redisplay.

> With that patch, I was able to open and edit a file with a single 50 GB 
> (!) line, in js-mode.  Does that still not qualify as "arbitrarily large"?

We don't even claim to be able to edit _files_ of arbitrary size
(because we are limited by fixnums).

> I compared that with a 50 GB JSON file with 80 character wide lines. 
> With that file it was necessary to disable font-lock-mode, which took too 
> much time.

How so?  We now restrict font-lock to a small region, so why does it
matter how much more stuff is there outside of the viewport?  What
other aspects of the line size still affect performance?





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-30 18:34                                                                                 ` Eli Zaretskii
@ 2022-07-30 18:47                                                                                   ` Gregory Heytings
  2022-07-30 19:02                                                                                     ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-30 18:47 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, larsi, monnier


>> Wouldn't it make sense to also limit the portion of the buffer to which 
>> pre-/post-command-hook have access (see below)?
>
> Those generally don't belong to the display department, so I'd hesitate 
> doing so.  Which pre-/post-command-hook functions did you find that 
> cause slowdown because of long lines?
>

jit-lock--antiblink-post-command

>> With that patch, I was able to open and edit a file with a single 50 GB 
>> (!) line, in js-mode.  Does that still not qualify as "arbitrarily 
>> large"?
>
> We don't even claim to be able to edit _files_ of arbitrary size 
> (because we are limited by fixnums).
>

That's theory, isn't it?  With 64-bit builds we are limited to files that 
are less than 2047 PB.  No computer on this planet has that much RAM.
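
The 2047 figure can be checked with a little arithmetic.  The sketch
below assumes that fixnums on 64-bit builds carry 62 bits including the
sign, so that the largest one is 2^61 - 1, and it counts in binary units
(2^50 bytes).  The helper names are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Largest value of a signed integer that has VALUE_BITS bits
   (sign bit included): 2^(VALUE_BITS - 1) - 1.  */
int64_t
max_signed (int value_bits)
{
  assert (value_bits >= 2 && value_bits <= 64);
  return (int64_t) ((UINT64_C (1) << (value_bits - 1)) - 1);
}

/* Number of whole 2^50-byte units that fit in BYTES.  */
int64_t
whole_pebi_units (int64_t bytes)
{
  assert (bytes >= 0);
  return bytes >> 50;
}
```

Under that assumption, max_signed (62) is 2305843009213693951, and
whole_pebi_units applied to it gives 2047: one byte short of 2048 such
units.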

>> I compared that with a 50 GB JSON file with 80 character wide lines. 
>> With that file it was necessary to disable font-lock-mode, which took 
>> too much time.
>
> How so?  We now restrict font-lock to a small region, so why does it 
> matter how much more stuff is there outside of the viewport?  What other 
> aspects of the line size still affect performance?
>

We do not restrict font-lock in large files _without long lines_, hence 
the difference.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-30 18:47                                                                                   ` Gregory Heytings
@ 2022-07-30 19:02                                                                                     ` Eli Zaretskii
  2022-07-30 19:11                                                                                       ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-30 19:02 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, larsi, monnier

> Date: Sat, 30 Jul 2022 18:47:04 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, larsi@gnus.org, 
>     monnier@iro.umontreal.ca
> 
> >> Wouldn't it make sense to also limit the portion of the buffer to which 
> >> pre-/post-command-hook have access (see below)?
> >
> > Those generally don't belong to the display department, so I'd hesitate 
> > doing so.  Which pre-/post-command-hook functions did you find that 
> > cause slowdown because of long lines?
> 
> jit-lock--antiblink-post-command

OK, but is it TRT to "punish" every one of these hooks for the
"crimes" of the few?  Maybe we should instead handle the problematic
ones locally, by exposing the long_line_optimizations_p flag to Lisp
(through an accessor), and then modifying those that misbehave to
"behave"?

> >> With that patch, I was able to open and edit a file with a single 50 GB 
> >> (!) line, in js-mode.  Does that still not qualify as "arbitrarily 
> >> large"?
> >
> > We don't even claim to be able to edit _files_ of arbitrary size 
> > (because we are limited by fixnums).
> 
> That's theory, isn't it?  With 64-bit builds we are limited to files that 
> are less than 2047 PB.  No computer on this planet has that much RAM.

You forget that we are talking about VM.

But let's not restart that argument, okay?

> >> I compared that with a 50 GB JSON file with 80 character wide lines. 
> >> With that file it was necessary to disable font-lock-mode, which took 
> >> too much time.
> >
> > How so?  We now restrict font-lock to a small region, so why does it 
> > matter how much more stuff is there outside of the viewport?  What other 
> > aspects of the line size still affect performance?
> >
> 
> We do not restrict font-lock in large files _without long lines_, hence 
> the difference.

Sorry, I thought you were talking about a single-line file.

If JS mode wants to access the entire buffer for fontifications, then
IMO the problem is in JS mode, and should be fixed there.
narrow-to-region is available to Lisp programs as well ;-)

IOW, it isn't an infrastructure problem that needs to be fixed in
display code.  (It is even possible that tree-sitter integration will
fix this, or at least alleviate it, as a side effect.)





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-30 19:02                                                                                     ` Eli Zaretskii
@ 2022-07-30 19:11                                                                                       ` Gregory Heytings
  2022-07-31  6:16                                                                                         ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-30 19:11 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, larsi, monnier


>>>> Wouldn't it make sense to also limit the portion of the buffer to 
>>>> which pre-/post-command-hook have access (see below)?
>>>
>>> Those generally don't belong to the display department, so I'd 
>>> hesitate doing so.  Which pre-/post-command-hook functions did you 
>>> find that cause slowdown because of long lines?
>>
>> jit-lock--antiblink-post-command
>
> OK, but is it TRT to "punish" every one of these hooks for the "crimes" 
> of the few?  Maybe we should instead handle the problematic ones 
> locally, by exposing the long_line_optimizations_p flag to Lisp (through 
> an accessor), and then modifying those that misbehave to "behave"?
>

It's the same problem as with fontification-functions.  We cannot know 
what all these hooks that are installed by major and minor modes will do, 
we cannot hope to fix them one by one, so it seems to me that with 
long_line_optimizations_p, which is an unusual case anyway, it makes sense 
to "punish" them all in the same way.

>> That's theory, isn't it?  With 64-bit builds we are limited to files 
>> that are less than 2047 PB.  No computer on this planet has that much 
>> RAM.
>
> You forget that we are talking about VM.
>
> But let's not restart that argument, okay?
>

Hmmm... okay 😉

>
> If JS mode wants to access the entire buffer for fontifications, then 
> IMO the problem is in JS mode, and should be fixed there. 
> narrow-to-region is available to Lisp programs as well ;-)
>
> IOW, it isn't an infrastructure problem that needs to be fixed in 
> display code.  (It is even possible that tree-sitter integration will 
> fix this, or at least alleviate it, as a side effect.)
>

Agreed.  My point was only that Emacs now behaves a bit better when 
editing a single-line very large file compared to a multi-line very large 
file.

^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-30 19:11                                                                                       ` Gregory Heytings
@ 2022-07-31  6:16                                                                                         ` Eli Zaretskii
  2022-07-31  8:22                                                                                           ` Lars Ingebrigtsen
  2022-07-31  8:30                                                                                           ` Gregory Heytings
  0 siblings, 2 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-31  6:16 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, larsi, monnier

> Date: Sat, 30 Jul 2022 19:11:57 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, larsi@gnus.org, 
>     monnier@iro.umontreal.ca
> 
> >> jit-lock--antiblink-post-command
> >
> > OK, but is it TRT to "punish" every one of these hooks for the "crimes" 
> > of the few?  Maybe we should instead handle the problematic ones 
> > locally, by exposing the long_line_optimizations_p flag to Lisp (through 
> > an accessor), and then modifying those that misbehave to "behave"?
> 
> It's the same problem as with fontification-functions.  We cannot know 
> what all these hooks that are installed by major and minor modes will do, 
> we cannot hope to fix them one by one, so it seems to me that with 
> long_line_optimizations_p, which is an unusual case anyway, it makes sense 
> to "punish" them all in the same way.

It sounds...too drastic.  Lars, WDYT?

I don't find it problematic to have to fix any such hooks in the core
that we discover as misbehaving with long lines.  How many of those
could we have?  And those in 3rd party packages can follow suit if
they want (and if the respective modes are relevant to files with long
lines, I expect to see pressure on their developers to do so).

Once again, IME it is impossible to fix such problems only in
low-level C infrastructure.  There will always be left-overs and
fallouts that should be fixed locally in Lisp where they happen.
There's no problem here, and I don't expect us to be able to fix
everything by a small number of quick fixes, and declare a victory
once and for all.

> Agreed.  My point was only that Emacs now behaves a bit better when 
> editing a single-line very large file compared to a multi-line very large 
> file.

Well, WDYT about a similar feature for very large files?  IOW, when
the buffer's size is above some threshold, turn on the
long_line_optimizations_p flag (which should perhaps be renamed to
better reflect its purpose) even if no long lines are seen?





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-30 15:23                                                                             ` Eli Zaretskii
  2022-07-30 18:13                                                                               ` Gregory Heytings
@ 2022-07-31  7:11                                                                               ` Eli Zaretskii
  2022-07-31 22:54                                                                                 ` Gregory Heytings
  1 sibling, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-31  7:11 UTC (permalink / raw)
  To: gregory, gerd.moellmann; +Cc: 56682, larsi, monnier

> Cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, larsi@gnus.org,
>  monnier@iro.umontreal.ca
> Date: Sat, 30 Jul 2022 18:23:12 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> 
> > Date: Sat, 30 Jul 2022 13:31:42 +0000
> > From: Gregory Heytings <gregory@heytings.org>
> > cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, larsi@gnus.org, 
> >     monnier@iro.umontreal.ca
> > 
> > So the only remaining question is whether it is necessary to recompute 
> > narrowed_begv and narrowed_zv in init_iterator:
> 
> I tend to think we should, but let me think about this some more.

Here are my thoughts.

First, I think the setting of narrowed_begv and narrowed_zv should be
done in 'reseat', not in init_iterator.  The latter always calls the
former when invoked to start iteration of buffer text, but we also
call 'reseat' from other places, when we "jump" the iterator to a new
place, potentially far from the last.  If nothing else, this should
help with truncate-lines, where using window_point is basically right
only for the point's line.  And currently, init_iterator computes and
sets the narrowing even when we iterate on strings, which is unneeded
and incorrect.

Whether always to correct narrowed_begv and narrowed_zv if we are
reseating to a position outside the narrowing, is a more complicated
question.  The basic problem here is that we don't have an easy way of
restoring the previous narrowing (except by unwind_protect), and the
display code sometimes calls init_iterator or start_display using the
iterator that already has these members set by previous code, a
situation which we currently cannot easily detect.  However, when this
code runs as part of redisplay, we generally don't expect the original
narrowing to be insufficient, except perhaps in the truncate-lines
case.

So I think we should correct narrowed_begv and narrowed_zv only if
either the 'redisplaying_p' flag is reset (meaning the display code is
being invoked outside of redisplay) or it->line_wrap == TRUNCATE.
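
One possible reading of that rule, written as a standalone predicate.
This is only a sketch: struct toy_it and the TOY_* enum are hypothetical
stand-ins for the real iterator, and treating "position outside the
current narrowing" as a precondition is an interpretation of the
paragraphs above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_line_wrap { TOY_TRUNCATE, TOY_WORD_WRAP, TOY_WINDOW_WRAP };

/* Only the iterator members that matter for the decision.  */
struct toy_it
{
  ptrdiff_t narrowed_begv;
  ptrdiff_t narrowed_zv;
  enum toy_line_wrap line_wrap;
};

/* Return true if reseating IT to POS should recompute the narrowing:
   POS lies outside the current narrowing, and either we are not inside
   redisplay or lines are truncated.  */
bool
should_recompute_narrowing (const struct toy_it *it, ptrdiff_t pos,
			    bool redisplaying_p)
{
  assert (it->narrowed_begv <= it->narrowed_zv);

  bool outside = pos < it->narrowed_begv || pos > it->narrowed_zv;
  return outside && (!redisplaying_p || it->line_wrap == TOY_TRUNCATE);
}
```

A position inside the narrowing never triggers a recomputation; a
position outside it does so only outside redisplay or with truncated
lines.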

Comments? thoughts?





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-30 10:52               ` Gregory Heytings
  2022-07-30 10:59                 ` Eli Zaretskii
  2022-07-30 11:32                 ` Eli Zaretskii
@ 2022-07-31  7:25                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-07-31  7:48                   ` Eli Zaretskii
  2022-07-31  8:08                   ` Gregory Heytings
  2 siblings, 2 replies; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-07-31  7:25 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii

> IIUC, what Stefan suggests is the following, which seems (almost) fine to
> me.  The only problem I see is that jit-lock-function is not the only user
> of fontification-functions.  It has at least two other users: in ELPA
> multi-mode.el sets fontification-functions to multi-fontify, and in MELPA
> poly-lock.el sets fontification-functions to poly-lock-function.

Good point.

> @@ -4412,9 +4412,9 @@ handle_fontified_prop (struct it *it)
>  	  ptrdiff_t begv = it->narrowed_begv ? it->narrowed_begv : BEGV;
>  	  ptrdiff_t zv = it->narrowed_zv;
>  	  ptrdiff_t charpos = IT_CHARPOS (*it);
>  	  if (begv <= charpos && charpos <= zv)
> -	    Fnarrow_to_region (make_fixnum (begv), make_fixnum (zv), Qt);
> +	    specbind (Qfontification_functions_restriction,
> +		      Fcons (make_fixnum (begv), make_fixnum (zv)));
>  	}

How 'bout we do the reverse then: set the narrowing, but let-bind
a variable to indicate that we're inside a line-length-induced
narrowing, together with the previous narrowing bounds, so jit-lock or
its clients can undo the narrowing when needed?


        Stefan






^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31  7:25                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-07-31  7:48                   ` Eli Zaretskii
  2022-07-31  8:08                   ` Gregory Heytings
  1 sibling, 0 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-31  7:48 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, gregory

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Eli Zaretskii <eliz@gnu.org>,  56682@debbugs.gnu.org
> Date: Sun, 31 Jul 2022 03:25:46 -0400
> 
> How 'bout we do the reverse then: set the narrowing, but let-bind
> a variable to indicate that we're inside a line-length-induced
> narrowing, together with the previous narrowing bounds, so jit-lock or
> its clients can undo the narrowing when needed?

I'd rather prefer an extension of 'widen' that would be able to undo
the locked narrowing "when needed".  That way, jit-lock clients should
actually do something to request that, instead of letting them keep
the current code that widens whenever they feel like it.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31  7:25                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-07-31  7:48                   ` Eli Zaretskii
@ 2022-07-31  8:08                   ` Gregory Heytings
  2022-07-31 10:41                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-31  8:08 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Eli Zaretskii


>
> How 'bout we do the reverse then: set the narrowing, but let-bind a 
> variable to indicate that we're inside a line-length-induced narrowing, 
> together with the previous narrowing bounds, so jit-lock or its clients 
> can undo the narrowing when needed?
>

That's not possible: the narrowing is (really) locked (with an uninterned 
symbol), so it cannot be undone.  What would be possible would be to add an 
optional "unlock" argument to widen.  But somehow I don't think that would 
be TRT, as mode authors who now do a (widen) would simply get into the 
habit of writing (widen t) instead, and the same problems would surface again.
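
To make the trade-off concrete, here is a toy model of a narrowing that
is locked with a private tag.  It is a sketch only and does not
reproduce the actual implementation: struct toy_restriction,
toy_narrow_locked and toy_widen are hypothetical names, and a pointer
stands in for the uninterned symbol.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_restriction
{
  ptrdiff_t beg, z;	/* bounds of the whole buffer */
  ptrdiff_t begv, zv;	/* bounds of the accessible region */
  const void *lock_tag;	/* NULL when the narrowing is not locked */
};

/* Narrow R to [BEGV, ZV] and lock the narrowing with TAG.  */
void
toy_narrow_locked (struct toy_restriction *r, ptrdiff_t begv, ptrdiff_t zv,
		   const void *tag)
{
  assert (tag != NULL && r->beg <= begv && begv <= zv && zv <= r->z);
  r->begv = begv;
  r->zv = zv;
  r->lock_tag = tag;
}

/* Try to widen R.  A locked narrowing is only undone when TAG is the
   tag it was locked with; otherwise nothing changes.  Return true if
   the buffer was widened.  */
bool
toy_widen (struct toy_restriction *r, const void *tag)
{
  if (r->lock_tag != NULL && r->lock_tag != tag)
    return false;
  r->begv = r->beg;
  r->zv = r->z;
  r->lock_tag = NULL;
  return true;
}
```

In this model a caller that does not hold the tag cannot widen, whereas a
plain boolean "unlock" argument would be available to every caller, which
is the habit problem described above.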

BTW, my tests show that syntax-ppss can be rather slow, when the file is 
large enough (say 1 GB).  I didn't look at what it does, but is it not 
possible to design a version of syntax-ppss that would approximate, with 
some heuristics, what syntax-ppss does, but on a smaller chunk of the 
buffer?  For example, I'd guess that '"' immediately followed by an 
alphanumeric character most likely starts a string, and '"' immediately 
preceded by an alphanumeric character most likely ends a string.
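
A minimal sketch of that guess, looking only at the characters next to
the quote.  It is an illustration of the heuristic as stated, not of
what syntax-ppss does; guess_quote and the enum are hypothetical names,
and the ambiguous cases (both neighbours alphanumeric, or neither) are
simply reported as unknown.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

enum quote_guess { QUOTE_OPENS, QUOTE_CLOSES, QUOTE_UNKNOWN };

/* Guess whether the '"' at index I of TEXT (LEN bytes long) starts or
   ends a string, using only its immediate neighbours.  */
enum quote_guess
guess_quote (const char *text, size_t len, size_t i)
{
  assert (text != NULL);

  if (i >= len || text[i] != '"')
    return QUOTE_UNKNOWN;

  bool next_alnum = i + 1 < len && isalnum ((unsigned char) text[i + 1]);
  bool prev_alnum = i > 0 && isalnum ((unsigned char) text[i - 1]);

  if (next_alnum && !prev_alnum)
    return QUOTE_OPENS;
  if (prev_alnum && !next_alnum)
    return QUOTE_CLOSES;
  return QUOTE_UNKNOWN;
}
```

On the 16-byte text {"key": "value"} the quotes at indices 1 and 8 are
classified as opening and those at 5 and 14 as closing.  An empty string
"" or a string that starts with punctuation falls into the unknown case,
which shows how approximate the heuristic is.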





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31  6:16                                                                                         ` Eli Zaretskii
@ 2022-07-31  8:22                                                                                           ` Lars Ingebrigtsen
  2022-07-31  8:38                                                                                             ` Eli Zaretskii
  2022-07-31  8:30                                                                                           ` Gregory Heytings
  1 sibling, 1 reply; 416+ messages in thread
From: Lars Ingebrigtsen @ 2022-07-31  8:22 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, Gregory Heytings, monnier

Eli Zaretskii <eliz@gnu.org> writes:

>> It's the same problem as with fontification-functions.  We cannot know 
>> what all these hooks that are installed by major and minor modes will do, 
>> we cannot hope to fix them one by one, so it seems to me that with 
>> long_line_optimizations_p, which is an unusual case anyway, it makes sense 
>> to "punish" them all in the same way.
>
> It sounds...too drastic.  Lars, WDYT?

I agree with Gregory that it makes sense to disable (some) fontification
stuff in long-line buffers.  It's much more important to be able to view
and edit these files than getting fontification details completely
correct.

Of course, it'd be better if everything just worked perfectly, but
that's very ambitious. 






^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31  6:16                                                                                         ` Eli Zaretskii
  2022-07-31  8:22                                                                                           ` Lars Ingebrigtsen
@ 2022-07-31  8:30                                                                                           ` Gregory Heytings
  2022-07-31  9:04                                                                                             ` Eli Zaretskii
  1 sibling, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-31  8:30 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, larsi, monnier


>
> It sounds...too drastic.
>

Are you sure?  The docstring already says "It is a bad idea to use this 
hook for expensive processing."  And Emacs already removes a function from 
the hook when it misbehaves.  Adding something like "In a too large buffer 
or in a buffer with long lines, the functions in this hook will only have 
access to a small portion of the buffer" seems coherent, at least to me.

>
> Once again, IME it is impossible to fix such problems only in low-level 
> C infrastructure.  There will always be left-overs and fallouts that 
> should be fixed locally in Lisp where they happen. There's no problem 
> here, and I don't expect us to be able to fix everything by a small 
> number of quick fixes, and declare a victory once and for all.
>

I both agree and disagree with that.  It is true that it is, strictly 
speaking, impossible to fix _all_ such problems _only_ in low-level C 
infrastructure, and that there will always be left-overs.  But it is 
possible to fix _most_ of these problems only in low-level C 
infrastructure, and we should do so, just like an operating system kernel 
in which everything is done to avoid crashing the system/leaving it in an 
unusable state (which includes killing a mis-behaving process when 
necessary).  And we should do so even more when the amount of code to do 
so in the low-level C infrastructure remains small.

>
> Well, WDYT about a similar feature for very large files?  IOW, when the 
> buffer's size is above some threshold, turn on the 
> long_line_optimizations_p flag (which should perhaps be renamed to 
> better reflect its purpose) even if no long lines are seen?
>

I was thinking about such a feature indeed.  But it would be separate from 
the long_line_optimizations_p one, because the optimizations to activate 
in both cases are different, and their thresholds are different, too.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31  8:22                                                                                           ` Lars Ingebrigtsen
@ 2022-07-31  8:38                                                                                             ` Eli Zaretskii
  2022-07-31  8:41                                                                                               ` Lars Ingebrigtsen
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-31  8:38 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: gerd.moellmann, 56682, gregory, monnier

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: Gregory Heytings <gregory@heytings.org>,  gerd.moellmann@gmail.com,
>   56682@debbugs.gnu.org,  monnier@iro.umontreal.ca
> Date: Sun, 31 Jul 2022 10:22:40 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> It's the same problem as with fontification-functions.  We cannot know 
> >> what all these hooks that are installed by major and minor modes will do, 
> >> we cannot hope to fix them one by one, so it seems to me that with 
> >> long_line_optimizations_p, which is an unusual case anyway, it makes sense 
> >> to "punish" them all in the same way.
> >
> > It sounds...too drastic.  Lars, WDYT?
> 
> I agree with Gregory that it makes sense to disable (some) fontification
> stuff in long-line buffers.  It's much more important to be able to view
> and edit these files than getting fontification details completely
> correct.

No disagreement here, but I was asking about the proposal to make the
locked narrowing in effect when any pre-command-hook or
post-command-hook runs.  These are usually unrelated to
fontifications, although Gregory found an example where it is.

IOW, the issue for which I wanted to hear your opinion was whether you
think it's okay to preclude, in a buffer with long lines, all
pre/post-command-hooks from accessing the entire buffer.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31  8:38                                                                                             ` Eli Zaretskii
@ 2022-07-31  8:41                                                                                               ` Lars Ingebrigtsen
  2022-07-31 22:45                                                                                                 ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Lars Ingebrigtsen @ 2022-07-31  8:41 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, gregory, monnier

Eli Zaretskii <eliz@gnu.org> writes:

> IOW, the issue for which I wanted to hear your opinion was whether you
> think it's okay to preclude, in a buffer with long lines, all
> pre/post-command-hooks from accessing the entire buffer.

Yes, I think that's reasonable.  There will inevitably be many modes out
there that'll wedge themselves in the presence of gargantuan lines, and
limiting these hooks is a reasonable line of defence.






^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31  8:30                                                                                           ` Gregory Heytings
@ 2022-07-31  9:04                                                                                             ` Eli Zaretskii
  2022-07-31 14:09                                                                                               ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-07-31  9:04 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, larsi, monnier

> Date: Sun, 31 Jul 2022 08:30:14 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, larsi@gnus.org, 
>     monnier@iro.umontreal.ca
> 
> > It sounds...too drastic.
> 
> Are you sure?  The docstring already says "It is a bad idea to use this 
> hook for expensive processing."  And Emacs already removes a function from 
> the hook when it misbehaves.  Adding something like "In a too large buffer 
> or in a buffer with long lines, the functions in this hook will only have 
> access to a small portion of the buffer" seems coherent, at least to me.

But that assumes these hooks will _always_ be "too expensive" in such
buffers.  Which is not necessarily true: I can think of many things a
hook can do in a long-line buffer without being expensive.

And since we already remove expensive hook functions, maybe that is
enough?  Or maybe we should use a different threshold for "expensive"
in buffers with long lines?

> > Once again, IME it is impossible to fix such problems only in low-level 
> > C infrastructure.  There will always be left-overs and fallouts that 
> > should be fixed locally in Lisp where they happen. There's no problem 
> > here, and I don't expect us to be able to fix everything by a small 
> > number of quick fixes, and declare a victory once and for all.
> 
> I both agree and disagree with that.  It is true that it is, strictly 
> speaking, impossible to fix _all_ such problems _only_ in low-level C 
> infrastructure, and that there will always be left-overs.  But it is 
> possible to fix _most_ of these problems only in low-level C 
> infrastructure, and we should do so, just like an operating system kernel 
> in which everything is done to avoid crashing the system/leaving it in an 
> unusable state (which includes killing a mis-behaving process when 
> necessary).  And we should do so even more when the amount of code to do 
> so in the low-level C infrastructure remains small.

I think we are in agreement: my point was that solving such problems
locally is not unthinkable.

> > Well, WDYT about a similar feature for very large files?  IOW, when the 
> > buffer's size is above some threshold, turn on the 
> > long_line_optimizations_p flag (which should perhaps be renamed to 
> > better reflect its purpose) even if no long lines are seen?
> >
> 
> I was thinking about such a feature indeed.  But it would be separate from 
> the long_line_optimizations_p one, because the optimizations to activate 
> in both cases are different, and their thresholds are different, too.

Different thresholds are easy to reconcile: the optimizations should
be turned on if either of the two thresholds is exceeded.  But why do
you say the optimizations will be different? what's wrong with using
the same optimizations, i.e. restrict the display code from accessing
the entire buffer?
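
The "either threshold" rule amounts to a simple disjunction.  A sketch,
with hypothetical names and no claim about what the actual threshold
values would be:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pair of limits, both in characters.  */
struct toy_thresholds
{
  ptrdiff_t long_line;		/* longest line tolerated */
  ptrdiff_t large_buffer;	/* largest buffer size tolerated */
};

/* Turn the optimizations on as soon as either limit is exceeded.  */
bool
needs_optimizations (ptrdiff_t longest_line, ptrdiff_t buffer_size,
		     struct toy_thresholds t)
{
  assert (t.long_line >= 0 && t.large_buffer >= 0);
  return longest_line > t.long_line || buffer_size > t.large_buffer;
}
```

With this shape a large buffer of short lines and a small buffer with
one huge line both enable the same flag; whether the same optimizations
suit both cases is the open question.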





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31  8:08                   ` Gregory Heytings
@ 2022-07-31 10:41                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-07-31 10:50                       ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-07-31 10:41 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii

> That's not possible: the narrowing is (really) locked (with an uninterned
> symbol), it cannot be undone.  What would be possible would be to add an
> optional "unlock" argument to widen.  But somehow I don't think that would
> be TRT, as mode authors who now do a (widen) would simply take the habit to
> write (widen t) instead, and the same problems would surface again.

Emacs is not in the business of preventing people from shooting
themselves in the foot.  If we need this narrowing to be enforced
because Emacs would otherwise crash, then it's OK, but if not, then we
*should* provide a way to undo it.

> BTW, my tests show that syntax-ppss can be rather slow, when the file is
>  large enough (say 1 GB).

No doubt.  But it's no slower with long lines than with short lines.
[ Note that it calls `syntax-propertize` internally, which would need
  to be considered separately since `syntax-propertize` does work
  line-by-line, and should hence obey the narrowing.  ]


        Stefan






^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31 10:41                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-07-31 10:50                       ` Gregory Heytings
  2022-07-31 21:41                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-07-31 22:03                         ` Dmitry Gutov
  0 siblings, 2 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-07-31 10:50 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Eli Zaretskii


>
> Emacs is not in the business of preventing people from shooting 
> themselves in the foot.  If we need this narrowing to be enforced 
> because Emacs would otherwise crash, then it's OK, but if not, then we 
> *should* provide a way to undo it.
>

And how do you define "crash"?  Is Emacs becoming unresponsive because an 
operation takes say two minutes to complete and cannot be interrupted a 
"crash"?  Or is a "crash" only a segfault?

>> BTW, my tests show that syntax-ppss can be rather slow, when the file 
>> is large enough (say 1 GB).
>
> No doubt.  But it's no slower with long lines than with short lines.
>

Yes, I wasn't clear enough, I should have written "when the file is large 
enough (say 1 GB), even without long lines".

But you didn't answer my question: is it not possible to design a version 
of syntax-ppss that would approximate, with some heuristics, what 
syntax-ppss does, but on a smaller chunk of the buffer?  With such a 
syntax-ppss-approximate function, we could do something like

(defun syntax-ppss (args)
   (if (narrow-to-region-locked)
       (syntax-ppss-approximate args)
     (syntax-ppss-accurate args)))





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31  9:04                                                                                             ` Eli Zaretskii
@ 2022-07-31 14:09                                                                                               ` Gregory Heytings
  0 siblings, 0 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-07-31 14:09 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, larsi, monnier


>
> And since we already remove expensive hook functions, maybe that is 
> enough?  Or maybe we should use a different threshold for "expensive" in 
> buffers with long lines?
>

Do we?  As far as I can see, there is no time limit in safe_run_hooks. 
It could make sense to add one (or at least to clear out hook functions 
that have taken too long, as we already do it with functions that return 
an error... but doing that after the hook function returns is already too 
late, if it has taken say 30 seconds).

>>> Well, WDYT about a similar feature for very large files?  IOW, when 
>>> the buffer's size is above some threshold, turn on the 
>>> long_line_optimizations_p flag (which should perhaps be renamed to 
>>> better reflect its purpose) even if no long lines are seen?
>>
>> I was thinking about such a feature indeed.  But it would be separate 
>> from the long_line_optimizations_p one, because the optimizations to 
>> activate in both cases are different, and their thresholds are 
>> different, too.
>
> Different thresholds are easy to reconcile: the optimizations should be 
> turned on if either of the two thresholds is exceeded.  But why do you 
> say the optimizations will be different? what's wrong with using the 
> same optimizations, i.e. restrict the display code from accessing the 
> entire buffer?
>

I don't know exactly yet.  It seems to me that some of the optimizations 
for large buffers would be similar to the ones for long lines, and that 
many of the specific optimizations for long lines are not necessary for 
large buffers.  I think it would be better/safer to only enable the 
optimizations that are really necessary in each case.  But let's start 
thinking in more detail about the large buffer optimizations once the long 
lines optimizations are done, okay?





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31 10:50                       ` Gregory Heytings
@ 2022-07-31 21:41                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-07-31 22:06                           ` Gregory Heytings
  2022-07-31 22:03                         ` Dmitry Gutov
  1 sibling, 1 reply; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-07-31 21:41 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii

Gregory Heytings [2022-07-31 10:50:18] wrote:
>> Emacs is not in the business of preventing people from shooting themselves
>> in the foot.  If we need this narrowing to be enforced because Emacs would
>> otherwise crash, then it's OK, but if not, then we *should* provide a way
>> to undo it.
> And how do you define "crash"?

Core dump.

> Is Emacs becoming unresponsive because an operation takes say two
> minutes to complete and cannot be interrupted a "crash"?  Or is
> a "crash" only a segfault?

Try `M-: (use-global-map (make-keymap)) RET`
Should we prevent users from doing that?

Let's focus on making it easy to make it work well, rather than making
it impossible to make it work poorly.

>>> BTW, my tests show that syntax-ppss can be rather slow, when the file is
>>> large enough (say 1 GB).
>> No doubt.  But it's no slower with long lines than with short lines.
>
> Yes, I wasn't clear enough, I should have written "when the file is large
> enough (say 1 GB), even without long lines".
>
> But you didn't answer my question: is it not possible to design a version of
> syntax-ppss that would approximate, with some heuristics, what syntax-ppss
> does, but on a smaller chunk of the buffer?

The answer is basically "no" but even before getting there, I have to
remind the reader that it hasn't really been requested.

In order to know if POS is within a string (which is one of the main
uses of `syntax-ppss`), you basically need to know if there's an odd or
even number of quotes before POS, which fundamentally needs to look at
all the chars between POS and BOB.  Of course we use a cache to try and
avoid looking at them over and over again, but the cache can't be of any
use the first time around.
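
A minimal sketch of that point, using `parse-partial-sexp` directly from
`point-min` (roughly what the uncached case amounts to).  The helper
`my/in-string-p` is a hypothetical name, not part of Emacs.

```elisp
(defun my/in-string-p (pos)
  "Return non-nil if POS is inside a string.
Parse from `point-min' up to POS, so the cost grows with POS."
  (save-excursion
    ;; Element 3 of the parser state is non-nil inside a string.
    (nth 3 (parse-partial-sexp (point-min) pos))))

(with-temp-buffer
  (set-syntax-table emacs-lisp-mode-syntax-table)
  (insert "(foo \"bar baz\" qux)")
  (list (my/in-string-p 8)     ; inside "bar baz": non-nil
        (my/in-string-p 17)))  ; after the closing quote: nil
```

Every character between `point-min` and POS is scanned here, which is why
the first call near the end of a very large buffer is expensive.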


        Stefan






^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31 10:50                       ` Gregory Heytings
  2022-07-31 21:41                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-07-31 22:03                         ` Dmitry Gutov
  2022-07-31 22:23                           ` Gregory Heytings
  1 sibling, 1 reply; 416+ messages in thread
From: Dmitry Gutov @ 2022-07-31 22:03 UTC (permalink / raw)
  To: Gregory Heytings, Stefan Monnier; +Cc: 56682, Eli Zaretskii

On 31.07.2022 13:50, Gregory Heytings wrote:
>>> BTW, my tests show that syntax-ppss can be rather slow, when the file 
>>> is large enough (say 1 GB).
>>
>> No doubt.  But it's no slower with long lines than with short lines.
>>
> 
> Yes, I wasn't clear enough, I should have written "when the file is 
> large enough (say 1 GB), even without long lines".

What kind of scenario are you thinking of that would exhibit this 
slowness in a 1 GB file?

If we're talking about syntax-ppss only, regular editing operations 
(typing and deleting code) limited to, say, one screen should trigger 
only a rescan of a limited area inside such buffer.

Both when the editing happens near the beginning or near the end of the 
buffer. Or in the middle -- no difference.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31 21:41                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-07-31 22:06                           ` Gregory Heytings
  2022-07-31 22:45                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-31 22:06 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Eli Zaretskii



>>> Emacs is not in the business of preventing people from shooting 
>>> themselves in the foot.  If we need this narrowing to be enforced 
>>> because Emacs would otherwise crash, then it's OK, but if not, then we 
>>> *should* provide a way to undo it.
>>
>> And how do you define "crash"?
>
> Core dump.
>

Aha, interesting!  So an infinite loop is not a crash, according to that 
definition?

>> Is Emacs becoming unresponsive because an operation takes say two 
>> minutes to complete and cannot be interrupted a "crash"?  Or is a 
>> "crash" only a segfault?
>
> Try `M-: (use-global-map (make-keymap)) RET`
>
> Should we prevent users from doing that?
>

It's a misleading question.  No "user" would ever do that.  Sure, it's a 
nice example, but only an Elisp hacker would do that, in the middle of a 
debugging session, and they would do that on purpose (although perhaps 
without knowing the effect in advance).  Which has nothing to do with a 
regular user who just opens a file.

>
> Let's focus on making it easy to make it work well, rather than making 
> it impossible to make it work poorly.
>

You lost me here.  I've read that sentence twenty times, and cannot 
understand what you mean.

>> But you didn't answer my question: is it not possible to design a 
>> version of syntax-ppss that would approximate, with some heuristics, 
>> what syntax-ppss does, but on a smaller chunk of the buffer?
>
> The answer is basically "no" but even before getting there, I have to 
> remind the reader that it hasn't really been requested.
>

It has, now 😉  Not "requested", however.  I respectfully, with all due 
respect, ask whether doing such a thing would be possible.

>
> In order to know if POS is within a string (which is one of the main 
> uses of `syntax-ppss`), you basically need to know if there's an odd or 
> even number of quotes before POS, which fundamentally needs to look at 
> all the chars between POS and BOB.  Of course we use a cache to try and 
> avoid looking at them over and over again, but the cache can't be of any 
> use the first time around.
>

But if you use heuristics, as I said, you don't need to look at all the 
chars between BOB and POS.  You try your best to guess, on a small (a few 
kilobytes) portion of the buffer, where the strings most likely start and 
stop.  And if you're only right in 95% of the cases, that's more than 
fine.

^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31 22:03                         ` Dmitry Gutov
@ 2022-07-31 22:23                           ` Gregory Heytings
  2022-07-31 22:42                             ` Dmitry Gutov
  2022-07-31 22:47                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 2 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-07-31 22:23 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, Eli Zaretskii, Stefan Monnier


>> Yes, I wasn't clear enough, I should have written "when the file is 
>> large enough (say 1 GB), even without long lines".
>
> What kind of scenario are you thinking of that would exhibit this 
> slowness in a 1 GB file?
>
> If we're talking about syntax-ppss only, regular editing operations 
> (typing and deleting code) limited to, say, one screen should trigger 
> only a rescan of a limited area inside such buffer.
>

That's true, but with such big files, the initial scan is slow.  So the 
scenario is simple: you open a big enough file, type M->, and C-p.  M-> 
will be instantaneous, and C-p will take a while, because of syntax-ppss.
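
One way to observe the cost of the initial scan is to time two consecutive
`syntax-ppss` calls at the end of a buffer.  This is only a sketch on a
synthetic buffer, far smaller than 1 GB; the helper name
`my/time-syntax-ppss-at-eob` is hypothetical, and the timings depend on
the machine and on the major mode.

```elisp
(require 'benchmark)

(defun my/time-syntax-ppss-at-eob ()
  "Return (COLD WARM): elapsed seconds for two `syntax-ppss' calls at EOB.
The first call has no cache to rely on; the second one does."
  (list (benchmark-elapse (syntax-ppss (point-max)))
        (benchmark-elapse (syntax-ppss (point-max)))))

(with-temp-buffer
  (set-syntax-table emacs-lisp-mode-syntax-table)
  (dotimes (_ 100000)
    (insert "(setq x \"some string\")\n"))
  (my/time-syntax-ppss-at-eob))
```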





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31 22:23                           ` Gregory Heytings
@ 2022-07-31 22:42                             ` Dmitry Gutov
  2022-07-31 22:50                               ` Gregory Heytings
  2022-07-31 23:00                               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-07-31 22:47                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 2 replies; 416+ messages in thread
From: Dmitry Gutov @ 2022-07-31 22:42 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, Stefan Monnier

On 01.08.2022 01:23, Gregory Heytings wrote:
> 
>>> Yes, I wasn't clear enough, I should have written "when the file is 
>>> large enough (say 1 GB), even without long lines".
>>
>> What kind of scenario are you thinking of that would exhibit this 
>> slowness in a 1 GB file?
>>
>> If we're talking about syntax-ppss only, regular editing operations 
>> (typing and deleting code) limited to, say, one screen should trigger 
>> only a rescan of a limited area inside such buffer.
>>
> 
> That's true, but with such big files, the initial scan is slow.  So the 
> scenario is simple: you open a big enough file, type M->, and C-p.  M-> 
> will be instantaneous, and C-p will take a while, because of syntax-ppss.

Yeah, ok. If you are going to visit EOB, a single full scan seems 
unavoidable. I don't think 'M->' should be instantaneous, though: to 
display the last page, you need to fontify it, and font-lock depends on 
syntax-ppss.

But one big slow scan (and how slow it is actually depends on a 
particular major mode) followed by responsive editing sounds much better 
than what we've had before.

So I would recommend against trying to solve this part right now. And 
yes, some faster approximations of syntax-propertize-rules are possible, 
especially if we ask individual language modes to provide "simpler" 
syntax rules.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31 22:06                           ` Gregory Heytings
@ 2022-07-31 22:45                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-07-31 23:12                               ` Gregory Heytings
                                                 ` (2 more replies)
  0 siblings, 3 replies; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-07-31 22:45 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii

>> Try `M-: (use-global-map (make-keymap)) RET`
>> Should we prevent users from doing that?
> It's a misleading question.  No "user" would ever do that.  Sure, it's
> a nice example, but only an Elisp hacker would do that, in the middle of
> a debugging session, and they would do that on purpose (although perhaps
> without knowing the effect in advance).  Which has nothing to do with
> a regular user who just opens a file.

FWIW, the above is my standard example because I ended up doing exactly
that by accident, locking myself out of the session I was trying to
debug, so in a sense, I'd have been happy (that one time) if Emacs had
prevented me from doing it.

>> Let's focus on making it easy to make it work well, rather than making it
>> impossible to make it work poorly.
> You lost me here.  I've read that sentence twenty times, and cannot
> understand what you mean.

Your current code makes it impossible for a major mode to make Emacs
slow by widening in a too-long-line.  I'd prefer if we made it easy
(i.e. the default) for Emacs to work well in that case, without making
it impossible for the major mode to mess things up.

E.g. use narrowing (and arrange for the known widening culprit to be
disabled) so that the default behavior is sane, but allow an ELisp
package from re-widening (possibly using a specific call to do that) if
it thinks it's a good idea (even if it may turn out not to be so).
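
For reference, this is what re-widening looks like with ordinary
(unlocked) narrowing.  It is only a sketch: `my/call-widened` is a
hypothetical helper, and with the locked narrowing discussed in this
thread the `widen` call cannot undo the restriction.

```elisp
(defun my/call-widened (fn)
  "Call FN with the buffer temporarily widened, then restore the narrowing."
  (save-restriction
    (widen)
    (funcall fn)))

(with-temp-buffer
  (insert "0123456789")
  (narrow-to-region 3 6)
  (list (- (point-max) (point-min))    ; accessible portion: 3
        (my/call-widened
         (lambda () (- (point-max) (point-min))))  ; whole buffer: 10
        (- (point-max) (point-min))))  ; narrowing restored: 3
```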

> But if you use heuristics, as I said, you don't need to look at all the
> chars between BOB and POS.  You try your best to guess, on a small (a few
> kilobytes) portion of the buffer, where the strings most likely start and
> stop.  And if you're only right in 95% of the cases, that's more than fine.

For specific languages, you can use various heuristics to guess which
quotes start and which quotes end a string (for some languages you can
even do it reliably), but `syntax-ppss` handles all kinds of languages
(and doesn't have access to such heuristics currently), such as ELisp
where it's hard to do it well.

I'd prefer to first see concrete examples where speeding up the
"syntax-ppss in a 1GB buffer" would make a significant difference to the
end-user's experience.  Then we can think about what's the better way to
solve the problem (which may be to just give up on font-lock altogether,
or maybe to refine the `syntax.el` code (maybe move some of it to C), or
to speed up `parse-partial-sexp`, or maybe let major modes provide
those heuristics to find a "safe point" again (these used to exist, see
`syntax-begin-function`, for example, but they tended to suck)).


        Stefan






^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31  8:41                                                                                               ` Lars Ingebrigtsen
@ 2022-07-31 22:45                                                                                                 ` Gregory Heytings
  0 siblings, 0 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-07-31 22:45 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: gerd.moellmann, 56682, Eli Zaretskii, monnier


>> IOW, the issue for which I wanted to hear your opinion was whether you 
>> think it's okay to preclude, in a buffer with long lines, all 
>> pre/post-command-hooks from accessing the entire buffer.
>
> Yes, I think that's reasonable.  There will inevitably be many modes out 
> there that'll wedge themselves in the presence of gargantuan lines, and 
> limiting these hooks is a reasonable line of defence.
>

I've now done so, in a new branch feature/long-lines-improvements (to 
avoid breaking master).  A nice bonus is that it is not necessary anymore 
to disable flyspell-mode or flymake-mode for example, even in huge files.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31 22:23                           ` Gregory Heytings
  2022-07-31 22:42                             ` Dmitry Gutov
@ 2022-07-31 22:47                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-07-31 23:15                               ` Gregory Heytings
  1 sibling, 1 reply; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-07-31 22:47 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, Dmitry Gutov

> That's true, but with such big files, the initial scan is slow.  So the
> scenario is simple: you open a big enough file, type M->, and C-p.  M-> will
> be instantaneous, and C-p will take a while, because of syntax-ppss.

Really?  I'd expect that `M->` is slow because of `syntax-ppss`
(called by font-lock) and then `C-p` is instantaneous.


        Stefan






^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31 22:42                             ` Dmitry Gutov
@ 2022-07-31 22:50                               ` Gregory Heytings
  2022-07-31 23:21                                 ` Gregory Heytings
                                                   ` (2 more replies)
  2022-07-31 23:00                               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 3 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-07-31 22:50 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, Eli Zaretskii, Stefan Monnier


>
> But one big slow scan (and how slow it is actually depends on a 
> particular major mode) followed by responsive editing sounds much better 
> than what we've had before.
>

Indeed.  But then the question is: is it possible to do that scan while 
opening the file, before it becomes editable?  It is way better to wait a 
few seconds more while the file is being opened than to wait before two 
basic motion commands when the file is already opened.

>
> So I would recommend against trying to solve this part right now.
>

It doesn't only solve the syntax-ppss problem, it also makes flyspell-mode 
usable in such files, for example.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31  7:11                                                                               ` Eli Zaretskii
@ 2022-07-31 22:54                                                                                 ` Gregory Heytings
  2022-08-01 12:38                                                                                   ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-31 22:54 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, larsi, monnier


>
> Here are my thoughts.
>

Thanks!

>
> First, I think the setting of narrowed_begv and narrowed_zv should be 
> done in 'reseat', not in init_iterator.  The latter always calls the 
> former when invoked to start iteration of buffer text, but we also call 
> 'reseat' from other places, when we "jump" the iterator to a new place, 
> potentially far from the last.  If nothing else, this should help with 
> truncate-lines, where using window_point is basically right only for the 
> point's line.  And currently, init_iterator computes and sets the 
> narrowing even when we iterate on strings, which is unneeded and 
> incorrect.
>
> Whether always to correct narrowed_begv and narrowed_zv if we are 
> reseating to a position outside the narrowing, is a more complicated 
> question.  The basic problem here is that we don't have an easy way of 
> restoring the previous narrowing (except by unwind_protect), and the 
> display code sometimes calls init_iterator or start_display using the 
> iterator that already has these members set by previous code, a 
> situation which we currently cannot easily detect.  However, when this 
> code runs as part of redisplay, we generally don't expect the original 
> narrowing to be insufficient, except perhaps in the truncate-line case.
>
> So I think we should correct narrowed_begv and narrowed_zv only if 
> either the 'redisplaying_p' flag is reset (meaning the display code is 
> being invoked outside of redisplay) or it->line_wrap == TRUNCATE.
>

I admit I do not really understand your last two paragraphs, but I tried 
to do what you suggested, and it doesn't seem to introduce regressions, so 
I pushed it to the new feature branch.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31 22:42                             ` Dmitry Gutov
  2022-07-31 22:50                               ` Gregory Heytings
@ 2022-07-31 23:00                               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 0 replies; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-07-31 23:00 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, Gregory Heytings, Eli Zaretskii

> But one big slow scan (and how slow it is actually depends on a particular
> major mode)

Indeed, I think the `syntax-ppss` part itself should be fast enough even
in very large files.  But the `syntax-propertize` part (which is called
by `syntax-ppss`) can take a long time in some major modes.

In those major modes where that's a problem (i.e. major modes that have
a complex `syntax-propertize-function` and that also happen to be used
in very large files) maybe it would be worth (re)introducing some sort
of `syntax(-propertize)-begin-function`.  But these kinds of heuristics
have proved problematic over the years (and they'd introduce extra
complexity since we won't be able to just rely on
a `syntax-propertize-done` high-watermark to know what's been
propertized and what hasn't, combined with the interaction with the
`syntax-ppss` cache), so we'd have to try a few different approaches.

In any case this is not a long-lines problem.


        Stefan






^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31 22:45                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-07-31 23:12                               ` Gregory Heytings
  2022-08-01  7:11                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-01 11:58                               ` Eli Zaretskii
  2022-08-01 18:09                               ` Gregory Heytings
  2 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-31 23:12 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Eli Zaretskii


>>> Let's focus on making it easy to make it work well, rather than making 
>>> it impossible to make it work poorly.
>>
>> You lost me here.  I've read that sentence twenty times, and cannot 
>> understand what you mean.
>
> Your current code makes it impossible for a major mode to make Emacs 
> slow by widening in a too-long-line.
>

Which is a good thing, isn't it?  Or do you think that it's okay for Emacs 
to become unresponsive just because it is busy highlighting characters in 
the buffer?

>
> I'd prefer if we made it easy (i.e. the default) for Emacs to work well 
> in that case, without making it impossible for the major mode to mess 
> things up.
>

Sure, I would also welcome a better solution.  Until it materializes, the 
only reasonable way is to use a less optimal solution.

>
> For specific languages, you can use various heuristics to guess which 
> quotes start and which quotes end a string (for some languages you can 
> even do it reliably), but `syntax-ppss` handles all kinds of languages 
> (and doesn't have access to such heuristics currently), such as ELisp 
> where it's hard to do it well.
>

Don't worry, I've not yet seen an Elisp file with long lines.  If using 
various heuristics is sometimes or often feasible, that's already a good 
thing.

>
> I'd prefer to first see concrete examples where speeding up the 
> "syntax-ppss in a 1GB buffer" would make a significant difference to the 
> end-user's experience.
>

I just sent one such example to Dmitry.  And I pointed to another possible 
solution, namely to scan the whole buffer while opening it (instead of 
scanning it lazily, which is IIUC what currently happens).  From a user 
viewpoint, it's understandable that opening a big file takes some time.

>
> Then we can think about what's the better way to solve the problem 
> (which may be to just give up on font-lock altogether,
>

That would be regrettable, given the amount of effort that has been put 
into making font-lock work "as much as possible".

>
> or maybe to refine the `syntax.el` code (maybe move some of it to C), or 
> to speed up `parse-partial-sexp`, or maybe let major modes provide those 
> heuristics to find a "safe point" again (these used to exist, see 
> `syntax-begin-function`, for example, but they tended to suck)).
>

All this is possible.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31 22:47                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-07-31 23:15                               ` Gregory Heytings
  2022-08-01  7:02                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-07-31 23:15 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Eli Zaretskii, Dmitry Gutov


>> That's true, but with such big files, the initial scan is slow.  So the 
>> scenario is simple: you open a big enough file, type M->, and C-p. 
>> M-> will be instantaneous, and C-p will take a while, because of 
>> syntax-ppss.
>
> Really?  I'd expect that `M->` is slow because of `syntax-ppss` (called 
> by font-lock) and then `C-p` is instantaneous.
>

Yes, really.  M-> is fast because syntax-ppss is called inside 
fontification-functions, which are evaluated in a small portion of the 
buffer (with locked narrowing).  And C-p is slow because post-command-hook 
is (or rather was) not subjected to the same locked narrowing.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31 22:50                               ` Gregory Heytings
@ 2022-07-31 23:21                                 ` Gregory Heytings
  2022-08-01  1:23                                 ` Dmitry Gutov
  2022-08-01 12:04                                 ` Eli Zaretskii
  2 siblings, 0 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-07-31 23:21 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, Eli Zaretskii, Stefan Monnier


>> But one big slow scan (and how slow it is actually depends on a 
>> particular major mode) followed by responsive editing sounds much 
>> better than what we've had before.
>
> Indeed.  But then the question is: is it possible to do that scan while 
> opening the file, before it becomes editable?  It is way better to wait 
> a few seconds more while the file is being opened than to wait before 
> two basic motion commands when the file is already opened.
>

Sorry, I meant "to wait _between_ two basic motion commands".






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31 22:50                               ` Gregory Heytings
  2022-07-31 23:21                                 ` Gregory Heytings
@ 2022-08-01  1:23                                 ` Dmitry Gutov
  2022-08-01 12:08                                   ` Eli Zaretskii
  2022-08-01 12:04                                 ` Eli Zaretskii
  2 siblings, 1 reply; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-01  1:23 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, Stefan Monnier

On 01.08.2022 01:50, Gregory Heytings wrote:
> 
>>
>> But one big slow scan (and how slow it is actually depends on a 
>> particular major mode) followed by responsive editing sounds much 
>> better than what we've had before.
>>
> 
> Indeed.  But then the question is: is it possible to do that scan while 
> opening the file, before it becomes editable?

IIUC this state of affairs is caused by your chosen approach to speeding 
up font-lock (hard narrowing while it is called), which makes the 
initial call to syntax-ppss happen inside that narrowing as well.

The alternative being that font-lock would call syntax-ppss right away 
with no restriction, but then only apply highlighting to limited parts 
of the buffer.

 > It is way better to wait
 > a few seconds more while the file is being opened than to wait before
 > two basic motion commands when the file is already opened.

I agree, yes.

>> So I would recommend against trying to solve this part right now.
>>
> 
> It doesn't only solve the syntax-ppss problem, it also makes 
> flyspell-mode usable in such files, for example.

Does flyspell-mode always scan the full buffer?






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31 23:15                               ` Gregory Heytings
@ 2022-08-01  7:02                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-01  8:38                                   ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-01  7:02 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, Dmitry Gutov

>>> That's true, but with such big files, the initial scan is slow.  So the
>>> scenario is simple: you open a big enough file, type M->, and C-p. M->
>>> will be instantaneous, and C-p will take a while, because of syntax-ppss.
>> Really?  I'd expect that `M->` is slow because of `syntax-ppss` (called by
>> font-lock) and then `C-p` is instantaneous.
> Yes, really.  M-> is fast because syntax-ppss is called inside
> fontification-functions, which are evaluated in a small portion of the
> buffer (with locked narrowing).

Ah, you're talking about a file with a long line.  I was thinking of just
a normal large file.

> And C-p is slow because post-command-hook is (or rather was) not
> subjected to the same locked narrowing.

I wonder what `C-p` has to do with `post-command-hook`.
After all, that same `post-command-hook` is also run after `M->` as well.


        Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31 23:12                               ` Gregory Heytings
@ 2022-08-01  7:11                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-01  7:51                                   ` Gregory Heytings
  2022-08-01 12:12                                   ` Eli Zaretskii
  0 siblings, 2 replies; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-01  7:11 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii

>> Your current code makes it impossible for a major mode to make Emacs slow
>> by widening in a too-long-line.
> Which is a good thing, isn't it?

No: we don't want to prevent people from shooting themselves in the foot.
Basically if it says "impossible" there's a good chance it's not a good thing.

> Or do you think that it's okay for Emacs to become unresponsive just
> because it is busy highlighting characters in the buffer?

Widening in a too long line will not necessarily lead to an unresponsive
Emacs, so by preventing it you're also preventing useful cases.

>> I'd prefer if we made it easy (i.e. the default) for Emacs to work well in
>> that case, without making it impossible for the major mode to mess
>> things up.
> Sure, I would also welcome a better solution.  Until it materializes, the
> only reasonable way is to use a less optimal solution.

It's not difficult to make it possible to re-widen after your narrowing.

> I just sent one such example to Dmitry.  And I pointed to another possible
> solution, namely to scan the whole buffer while opening it (instead of
> scanning it lazily, which is IIUC what currently happens).  From a user
> viewpoint, it's understandable that opening a big file takes some time.

We used to scan eagerly in the background with `jit-lock-stealth`, but that was
not very popular (eats up your battery for fairly little benefit).
We also have "lazier" highlighting via `jit-lock-defer`, but that hasn't
been adapted to `syntax-ppss`.  It might be worth investigating.


        Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01  7:11                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-08-01  7:51                                   ` Gregory Heytings
  2022-08-01 12:12                                   ` Eli Zaretskii
  1 sibling, 0 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-08-01  7:51 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Eli Zaretskii



>>> Your current code makes it impossible for a major mode to make Emacs 
>>> slow by widening in a too-long-line.
>>
>> Which is a good thing, isn't it?
>
> No: we don't want to prevent people from shooting themselves in the 
> foot. Basically if it says "impossible" there's a good chance it's not a 
> good thing.
>

Apparently you don't understand what I'm doing.  I'm trying to make Emacs 
handle exceptional cases gracefully, by:

1. adding as few hard limits as possible, and

2. keeping as many functionalities as possible (my task would have been _a 
lot_ easier if I had just stopped after saying "just open these files with 
M-x find-file-literally").

If you think there's anything wrong with that approach, I'll have to 
conclude that we live on different planets.  On Earth and on Theory, 
maybe?

And no, it's not even "impossible" for Emacs users to shoot themselves in 
the foot, so your argument collapses.  Those who prefer the "Traditional 
Emacs Approach"™ when it encounters such exceptional cases, that is, 
those who:

1. do not want to add any limits whatsoever in any circumstance, and

2. agree to take the risk of making Emacs completely nonfunctional,

can simply add (setq long-line-threshold nil) in their init file.
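
For illustration, a minimal init-file sketch of that opt-out.  It assumes 
an Emacs build that has the `long-line-threshold` variable discussed in 
this thread; the comment is an interpretation of the message above, not 
documentation.

```elisp
;; Assumption: a nil threshold turns off long-line detection, and with
;; it the locked narrowing applied around fontification.
(setq long-line-threshold nil)
```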

>> I just sent one such example to Dmitry.  And I pointed to another 
>> possible solution, namely to scan the whole buffer while opening it 
>> (instead of scanning it lazily, which is IIUC what currently happens). 
>> From a user viewpoint, it's understandable that opening a big file 
>> takes some time.
>
> We used to scan eagerly in the background with `jit-lock-stealth`, but 
> that was not very popular (eats up your battery for fairly little 
> benefit). We also have "lazier" highlighting via `jit-lock-defer`, but 
> that hasn't been adapted to `syntax-ppss`.  It might be worth 
> investigating.
>

It might be, indeed.  But I will most probably not do that myself.


* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01  7:02                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-08-01  8:38                                   ` Gregory Heytings
  2022-08-01  9:34                                     ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-01  8:38 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Eli Zaretskii, Dmitry Gutov


>>>> That's true, but with such big files, the initial scan is slow.  So 
>>>> the scenario is simple: you open a big enough file, type M->, and 
>>>> C-p. M-> will be instantaneous, and C-p will take a while, because of 
>>>> syntax-ppss.
>>>
>>> Really?  I'd expect that `M->` is slow because of `syntax-ppss` 
>>> (called by font-lock) and then `C-p` is instantaneous.
>>
>> Yes, really.  M-> is fast because syntax-ppss is called inside 
>> fontification-functions, which are evaluated in a small portion of the 
>> buffer (with locked narrowing).
>>
>> [...]
>> 
>> And C-p is slow because post-command-hook is (or rather was) not 
>> subjected to the same locked narrowing.
>
> I wonder what `C-p` has to do with `post-command-hook`.
>
> After all, that same `post-command-hook` is also run after `M->` as 
> well.
>

You're right here.  Technically it's M-> that is slow.  But it's not what 
the user sees (hence my own confusion): the effect of M-> is immediately 
visible.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01  8:38                                   ` Gregory Heytings
@ 2022-08-01  9:34                                     ` Gregory Heytings
  2022-08-01  9:46                                       ` Dmitry Gutov
                                                         ` (2 more replies)
  0 siblings, 3 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-08-01  9:34 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Eli Zaretskii, Dmitry Gutov


Given your and Dmitry's feedback, I just tried to add an explicit call to 
(syntax-ppss (point-max)), without narrowing, when the buffer is opened 
(see below).

The problem is that this is, as I said, slow.  On my laptop, opening a 1 
GB file takes about 6 seconds.  The call to syntax-ppss adds 70 seconds, 
so opening a large file becomes an order of magnitude slower (13 times 
slower).  Which I think is too much for the added benefit.

diff --git a/src/buffer.c b/src/buffer.c
index a07194aef7..bff6dce1d7 100644
--- a/src/buffer.c
+++ b/src/buffer.c
@@ -986,6 +986,7 @@ reset_buffer (register struct buffer *b)
    b->clip_changed = 0;
    b->prevent_redisplay_optimizations_p = 1;
    b->long_line_optimizations_p = 0;
+  b->long_line_syntax_ppss_done_p = 0;
    bset_backed_up (b, Qnil);
    bset_local_minor_modes (b, Qnil);
    BUF_AUTOSAVE_MODIFF (b) = 0;
@@ -2448,6 +2449,7 @@ #define swapfield_(field, type) \
    current_buffer->prevent_redisplay_optimizations_p = 1;
    other_buffer->prevent_redisplay_optimizations_p = 1;
    swapfield (long_line_optimizations_p, bool_bf);
+  swapfield (long_line_syntax_ppss_done_p, bool_bf);
    swapfield (overlays_before, struct Lisp_Overlay *);
    swapfield (overlays_after, struct Lisp_Overlay *);
    swapfield (overlay_center, ptrdiff_t);
diff --git a/src/buffer.h b/src/buffer.h
index 47b4bdf749..3e020f1953 100644
--- a/src/buffer.h
+++ b/src/buffer.h
@@ -686,6 +686,8 @@ #define BVAR(buf, field) ((buf)->field ## _)
       display optimizations must be used.  */
    bool_bf long_line_optimizations_p : 1;

+  bool_bf long_line_syntax_ppss_done_p : 1;
+
    /* List of overlays that end at or before the current center,
       in order of end-position.  */
    struct Lisp_Overlay *overlays_before;
diff --git a/src/xdisp.c b/src/xdisp.c
index 8a19b3bda9..d70ab6f9c1 100644
--- a/src/xdisp.c
+++ b/src/xdisp.c
@@ -4391,6 +4391,13 @@ handle_fontified_prop (struct it *it)

        if (current_buffer->long_line_optimizations_p)
  	{
+	  if (!current_buffer->long_line_syntax_ppss_done_p)
+	    {
+	      current_buffer->long_line_syntax_ppss_done_p = 1;
+	      specbind (Qinhibit_quit, Qt);
+	      CALLN (Ffuncall, intern ("syntax-ppss"), Fpoint_max ());
+	      unbind_to (count, Qnil);
+	    }
  	  ptrdiff_t begv = it->narrowed_begv ? it->narrowed_begv : BEGV;
  	  ptrdiff_t zv = it->narrowed_zv;
  	  ptrdiff_t charpos = IT_CHARPOS (*it);






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01  9:34                                     ` Gregory Heytings
@ 2022-08-01  9:46                                       ` Dmitry Gutov
  2022-08-01  9:56                                         ` Gregory Heytings
  2022-08-01 11:06                                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-02  3:01                                       ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2 siblings, 1 reply; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-01  9:46 UTC (permalink / raw)
  To: Gregory Heytings, Stefan Monnier; +Cc: 56682, Eli Zaretskii

On 01.08.2022 12:34, Gregory Heytings wrote:
> 
> Given your and Dmitry's feedback, I just tried to add an explicit call 
> to (syntax-ppss (point-max)), without narrowing, when the buffer is 
> opened (see below).
> 
> The problem is that this is, as I said, slow.  On my laptop, opening a 1 
> GB file takes about 6 seconds.  The call to syntax-ppss adds 70 seconds, 
> so opening a large file becomes an order of magnitude slower (13 times 
> slower).  Which I think is too much for the added benefit.

But that only has to happen when the buffer is scrolled to the bottom, 
right?

And syntax-ppss's speed depends on the rules applied by the particular 
major mode. Those could be sped up. Some optimization of this function's 
speed is not out of the question either.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01  9:46                                       ` Dmitry Gutov
@ 2022-08-01  9:56                                         ` Gregory Heytings
  2022-08-01 10:00                                           ` Gregory Heytings
  2022-08-01 10:46                                           ` Dmitry Gutov
  0 siblings, 2 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-08-01  9:56 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, Eli Zaretskii, Stefan Monnier



>> Given your and Dmitry's feedback, I just tried to add an explicit call 
>> to (syntax-ppss (point-max)), without narrowing, when the buffer is 
>> opened (see below).
>> 
>> The problem is that this is, as I said, slow.  On my laptop, opening a 
>> 1 GB file takes about 6 seconds.  The call to syntax-ppss adds 70 
>> seconds, so opening a large file becomes an order of magnitude slower 
>> (13 times slower).  Which I think is too much for the added benefit.
>
> But that only has to happen when the buffer is scrolled to the bottom, 
> right?
>

No, it happens when the buffer is opened.  Given the importance that you 
and Stefan seem to give to that function, it is, with the patch I sent in 
my previous post, called once on the whole buffer (without any narrowing) 
when the file is opened.  Later calls (inside fontification-functions or 
post-command-hook) are subject to a forced narrowing.

>
> And syntax-ppss's speed depends on the rules applied by the particular 
> major mode. Those could be sped up. Some optimization of this function's 
> speed is not out of the question either.
>

They would be more than welcome.  In fact, without such optimizations, it 
would be unreasonable to do what the patch does.


* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01  9:56                                         ` Gregory Heytings
@ 2022-08-01 10:00                                           ` Gregory Heytings
  2022-08-01 10:46                                           ` Dmitry Gutov
  1 sibling, 0 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-08-01 10:00 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, Eli Zaretskii, Stefan Monnier


>> And syntax-ppss's speed depends on the rules applied by the particular 
>> major mode. Those could be sped up. Some optimization of this 
>> function's speed is not out of the question either.
>
> They would be more than welcome.  In fact, without such optimizations, 
> it would be unreasonable to do what the patch does.
>

To clarify: it should become at least six times faster.  That would mean 
that opening a large file would take 3N seconds instead of N seconds, 
which would be okay.
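
The arithmetic behind the "13 times" and "3N" figures can be checked with 
a short sketch.  The helper name `my/open-slowdown` is hypothetical; the 6 
and 70 second inputs are the laptop measurements quoted earlier in the 
thread.

```elisp
(defun my/open-slowdown (open scan speedup)
  "Return how many times slower opening becomes with an eager scan.
OPEN is the time to open the file, SCAN the time of the full
`syntax-ppss' scan, and SPEEDUP the factor by which the scan is
assumed to be made faster (1 means no improvement)."
  (/ (+ open (/ scan speedup)) open))

;; Today: (6 + 70) / 6, roughly 12.7 times slower.
(my/open-slowdown 6.0 70.0 1)

;; With a six times faster scan: (6 + 70/6) / 6, roughly 2.9,
;; i.e. about 3N seconds instead of N.
(my/open-slowdown 6.0 70.0 6)
```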






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01  9:56                                         ` Gregory Heytings
  2022-08-01 10:00                                           ` Gregory Heytings
@ 2022-08-01 10:46                                           ` Dmitry Gutov
  2022-08-01 11:01                                             ` Gregory Heytings
  1 sibling, 1 reply; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-01 10:46 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, Stefan Monnier

On 01.08.2022 12:56, Gregory Heytings wrote:
> 
>>> Given your and Dmitry's feedback, I just tried to add an explicit 
>>> call to (syntax-ppss (point-max)), without narrowing, when the buffer 
>>> is opened (see below).
>>>
>>> The problem is that this is, as I said, slow.  On my laptop, opening 
>>> a 1 GB file takes about 6 seconds.  The call to syntax-ppss adds 70 
>>> seconds, so opening a large file becomes an order of magnitude slower 
>>> (13 times slower).  Which I think is too much for the added benefit.
>>
>> But that only has to happen when the buffer is scrolled to the bottom, 
>> right?
>>
> 
> No, it happens when the buffer is opened.  Given the importance that you 
> and Stefan seem to give to that function, it is, with the patch I sent 
> in my previous post, called once on the whole buffer (without any 
> narrowing) when the file is opened.

But if the buffer is not scrolled to the end, shouldn't it be called 
with a position that's close to the beginning?

That shouldn't force the full buffer scan, meaning this call should 
complete quickly.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 10:46                                           ` Dmitry Gutov
@ 2022-08-01 11:01                                             ` Gregory Heytings
  2022-08-02 14:53                                               ` Dmitry Gutov
  2022-08-02 21:18                                               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 2 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-08-01 11:01 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, Eli Zaretskii, Stefan Monnier



>> No, it happens when the buffer is opened.  Given the importance that 
>> you and Stefan seem to give to that function, it is, with the patch I 
>> sent in my previous post, called once on the whole buffer (without any 
>> narrowing) when the file is opened.
>
> But if the buffer is not scrolled to the end, shouldn't it be called 
> with a position that's close to the beginning?
>
> That shouldn't force the full buffer scan, meaning this call should 
> complete quickly.
>

I'm trying to reconcile two conflicting constraints.

It is necessary to add a locked narrowing around fontification-functions 
and pre/post-command-hook to ensure that Emacs remains responsive.

At the same time, you and Stefan tell me that syntax-ppss does an 
important job and will not do it correctly with such a locked narrowing, 
IOW, that at least syntax-ppss should be called without a locked 
narrowing.  But you also tell me that its result is cached so that a full 
buffer scan isn't necessary anymore when it has happened at least once.

So what I'm suggesting is to do a full buffer scan immediately, when the 
file is opened, without any narrowing.  If that happens, later calls to 
syntax-ppss inside fontification-functions and pre/post-command-hook will 
use the cached result of the initial scan, and will do their job correctly 
even with a locked narrowing.

Unless I misunderstand something, I think (and my tests seem to confirm) 
that that would be a workable solution, provided that the initial scan is 
reasonably fast, that is, at least six times faster than it is now.
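
The caching behaviour this proposal relies on can be observed with a small 
sketch, assuming nothing modifies the buffer between the two calls.  The 
function name `my/ppss-cold-vs-warm` and the generated test text are 
hypothetical, and the sketch does not exercise the locked narrowing 
itself, only the difference between a first and a repeated call.

```elisp
(require 'benchmark)

(defun my/ppss-cold-vs-warm (copies)
  "Return (COLD . WARM) timings, in seconds, of two `syntax-ppss' calls.
The temporary buffer holds COPIES copies of a small Lisp form.
COLD is the first call at `point-max', which has to scan the
buffer; WARM is an identical second call, which can reuse the
state cached by the first."
  (with-temp-buffer
    (emacs-lisp-mode)
    (dotimes (_ copies)
      (insert "(defun f (x) \"doc\" (list x)) ; comment\n"))
    (let ((cold (car (benchmark-run 1 (syntax-ppss (point-max)))))
          (warm (car (benchmark-run 1 (syntax-ppss (point-max))))))
      (cons cold warm))))

;; Example: compare the two timings on a few megabytes of text.
(my/ppss-cold-vs-warm 100000)
```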


* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01  9:34                                     ` Gregory Heytings
  2022-08-01  9:46                                       ` Dmitry Gutov
@ 2022-08-01 11:06                                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-01 11:23                                         ` Gregory Heytings
  2022-08-02  3:01                                       ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2 siblings, 1 reply; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-01 11:06 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, Dmitry Gutov

> The problem is that this is, as I said, slow.  On my laptop, opening a 1 GB
> file takes about 6 seconds.  The call to syntax-ppss adds 70 seconds, so
> opening a large file becomes an order of magnitude slower (13 times slower).

It's meaningless to talk about the time taken by `syntax-ppss` without specifying
the major mode that was in use.
You might also want to compare to the time to run

    (parse-partial-sexp (point-min) (point-max))

which is a kind of "speed of light" for `syntax-ppss`.


        Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 11:06                                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-08-01 11:23                                         ` Gregory Heytings
  2022-08-01 21:53                                           ` Dmitry Gutov
  2022-08-02 21:32                                           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 2 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-08-01 11:23 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Eli Zaretskii, Dmitry Gutov


>> The problem is that this is, as I said, slow.  On my laptop, opening a 
>> 1 GB file takes about 6 seconds.  The call to syntax-ppss adds 70 
>> seconds, so opening a large file becomes an order of magnitude slower 
>> (13 times slower).
>
> It's meaningless to talk about the time taken by `syntax-ppss` without 
> specifying the major mode that was in use.
>

It isn't.  The benchmark above was with a JSON file (js-mode), but you'll 
see the same ratio with an Elisp file for example:

for I in $(seq 1 2500); do cat lisp/simple.el; done > complex.el

That file opens in about 5 seconds, and (benchmark-run 1 (syntax-ppss 
(point-max))) takes about 45 seconds.

Sure, there are perhaps modes that are slower, but my tests seem to 
indicate that the 1/10 ratio is correct, or IOW that syntax-ppss is an 
order of magnitude slower than opening the file.

>
> You might also want to compare to the time to run
>
> (parse-partial-sexp (point-min) (point-max))
>
> which is a kind of "speed of light" for `syntax-ppss`.
>

What do you mean?  With the above file (benchmark-run 1 
(parse-partial-sexp (point-min) (point-max))) takes 55 seconds.
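
A sketch of how the two measurements above could be taken side by side. 
The function name `my/compare-ppss-timings` is hypothetical, and the 
buffer is put in `emacs-lisp-mode` on the assumption that the file is 
Elisp, as complex.el is; another major mode would have to be substituted 
for other files.

```elisp
(require 'benchmark)

(defun my/compare-ppss-timings (file)
  "Return a plist of elapsed seconds for two full scans of FILE.
The first entry times `syntax-ppss' at `point-max', the second
times a raw `parse-partial-sexp' over the whole buffer."
  (with-temp-buffer
    (insert-file-contents file)
    (emacs-lisp-mode)
    (list :syntax-ppss
          (car (benchmark-run 1 (syntax-ppss (point-max))))
          :parse-partial-sexp
          (car (benchmark-run 1
                 (parse-partial-sexp (point-min) (point-max)))))))

;; Example, with the file generated by the shell loop above:
;; (my/compare-ppss-timings "complex.el")
```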






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31 22:45                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-07-31 23:12                               ` Gregory Heytings
@ 2022-08-01 11:58                               ` Eli Zaretskii
  2022-08-02  8:10                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-05 12:59                                 ` Dmitry Gutov
  2022-08-01 18:09                               ` Gregory Heytings
  2 siblings, 2 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-01 11:58 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, gregory

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Eli Zaretskii <eliz@gnu.org>,  56682@debbugs.gnu.org
> Date: Sun, 31 Jul 2022 18:45:16 -0400
> 
> Your current code makes it impossible for a major mode to make Emacs
> slow by widening in a too-long-line.  I'd prefer if we made it easy
> (i.e. the default) for Emacs to work well in that case, without making
> it impossible for the major mode to mess things up.
> 
> E.g. use narrowing (and arrange for the known widening culprit to be
> disabled) so that the default behavior is sane, but allow an ELisp
> package to re-widen (possibly using a specific call to do that) if
> it thinks it's a good idea (even if it may turn out not to be so).

The problem is that too many popular major modes, notably including
those whose files tend to like having very long lines, are slow in
their fontifications.  In fact, I'd challenge you to find a major mode
that doesn't present such a degradation in behavior with long lines
(you should be able to measure it by comparing performance and
responsiveness in a buffer with and without font-lock).

Given this situation, it sounds reasonable to start by restricting
font-lock.

As I wrote elsewhere, I'm okay with extending 'widen' so that it could
"unlock" the locked narrowing, which could then be used in major modes
that convince us their performance is adequate (or clearly announce in
their docs that they don't care about files with long lines ;-).






* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31 22:50                               ` Gregory Heytings
  2022-07-31 23:21                                 ` Gregory Heytings
  2022-08-01  1:23                                 ` Dmitry Gutov
@ 2022-08-01 12:04                                 ` Eli Zaretskii
  2022-08-01 12:20                                   ` Gregory Heytings
  2022-08-02  7:48                                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2 siblings, 2 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-01 12:04 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, monnier, dgutov

> Date: Sun, 31 Jul 2022 22:50:23 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: 56682@debbugs.gnu.org, Eli Zaretskii <eliz@gnu.org>, 
>     Stefan Monnier <monnier@iro.umontreal.ca>
> 
> It is way better to wait a few seconds more while the file is being
> opened than to wait before two basic motion commands when the file
> is already opened.

I don't think I agree.  Having to wait for the initial display is
pretty annoying.  In fact, this was how the original font-lock worked:
it was triggered by find-file, and would fontify the entire buffer
before showing any of it.  That led very quickly to the likes of
lazy-lock and fast-lock, and was eventually fixed by jit-lock.

Try visiting a very large file encoded in something of the ISO-2022
family, and you will see what I mean.  It's an annoyance.

I have jit-lock-stealth enabled because I don't like the initial wait,
but don't want any subsequent waits, either.  We also have the (little
used, for some reason) jit-lock-defer feature: if you set
jit-lock-defer-time to a large enough value, M-> followed by C-p will
not be as slow as they are now, I think.
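
A minimal init-file sketch of the two jit-lock settings mentioned above. 
The numeric values are illustrative assumptions, not recommendations from 
this thread.

```elisp
;; Stealth fontification: after this many seconds of idle time, fontify
;; the not-yet-fontified parts of the buffer in the background.
;; nil (the default) disables it.
(setq jit-lock-stealth-time 16)

;; Deferred fontification: wait until Emacs has been idle this many
;; seconds before fontifying, so that a quick sequence of commands is
;; not blocked by fontification.  nil (the default) means no deferral.
(setq jit-lock-defer-time 0.25)
```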






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01  1:23                                 ` Dmitry Gutov
@ 2022-08-01 12:08                                   ` Eli Zaretskii
  2022-08-02  1:05                                     ` Dmitry Gutov
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-01 12:08 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, gregory, monnier

> Date: Mon, 1 Aug 2022 04:23:21 +0300
> Cc: 56682@debbugs.gnu.org, Eli Zaretskii <eliz@gnu.org>,
>  Stefan Monnier <monnier@iro.umontreal.ca>
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> IIUC this state of affairs is caused by your chosen approach to speeding 
> up font-lock (hard narrowing while it is called), which makes the 
> initial call to syntax-ppss happen inside that narrowing as well.
> 
> The alternative being that font-lock would call syntax-ppss right away 
> with no restriction, but then only apply highlighting to limited parts 
> of the buffer.

AFAIU, this seems to assume that highlighting is much faster than
syntax-ppss.  Is that a given?  If not, I don't think I understand how
this could help.

>  > It is way better to wait
>  > a few seconds more while the file is being opened than to wait before
>  > two basic motion commands when the file is already opened.
> 
> I agree, yes.

I don't.

> > It doesn't only solve the syntax-ppss problem, it also makes 
> > flyspell-mode usable in such files, for example.
> 
> Does flyspell-mode always scan the full buffer?

flyspell-mode doesn't scan anything but the text you type, as you type
it.  Anything else must be explicitly invoked by the user.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01  7:11                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-01  7:51                                   ` Gregory Heytings
@ 2022-08-01 12:12                                   ` Eli Zaretskii
  2022-08-01 21:54                                     ` Dmitry Gutov
  1 sibling, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-01 12:12 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, gregory

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Eli Zaretskii <eliz@gnu.org>,  56682@debbugs.gnu.org
> Date: Mon, 01 Aug 2022 03:11:43 -0400
> 
> >> Your current code makes it impossible for a major mode to make Emacs slow
> >> by widening in a too-long-line.
> > Which is a good thing, isn't it?
> 
> No: we don't want to prevent people from shooting themselves in the foot.

But the problem here is that it isn't "people shooting themselves in
the foot", it's that "major modes shoot their users in the foot".
IOW, the ones who shoot and the ones who get shot are not the same
people.

What do you want a user to do when he/she is faced with a mode which
makes Emacs very slow?  Such a user cannot blame him/herself; in many
cases the user doesn't even know enough to realize it's the major mode
and its fontifications that are the culprit.

> Widening in a too long line will not necessarily lead to an unresponsive
> Emacs, so by preventing it you're also preventing useful cases.

The experience up till now says otherwise.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 12:04                                 ` Eli Zaretskii
@ 2022-08-01 12:20                                   ` Gregory Heytings
  2022-08-01 13:04                                     ` Eli Zaretskii
  2022-08-02  7:51                                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-02  7:48                                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 2 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-08-01 12:20 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, monnier, dgutov


>> It is way better to wait a few seconds more while the file is being 
>> opened than to wait before two basic motion commands when the file is 
>> already opened.
>
> I don't think I agree.  Having to wait for the initial display is pretty 
> annoying.
>

It is, indeed.  But here we are talking about big files (say 1 GB).  In 
that case (and in that case only) it is much better, from a user 
viewpoint, to wait say 20 seconds before the file is opened and being at 
that point able to freely move through the file, instead of waiting only 6 
seconds, and then having to wait another 10 seconds between two motion 
commands like M-> C-p.  In fact, no user expects that a 1 GB file would 
open instantaneously.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31 22:54                                                                                 ` Gregory Heytings
@ 2022-08-01 12:38                                                                                   ` Eli Zaretskii
  2022-08-01 12:51                                                                                     ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-01 12:38 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, larsi, monnier

> Date: Sun, 31 Jul 2022 22:54:00 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, larsi@gnus.org, 
>     monnier@iro.umontreal.ca
> 
> > Whether always to correct narrowed_begv and narrowed_zv if we are 
> > reseating to a position outside the narrowing, is a more complicated 
> > question.  The basic problem here is that we don't have an easy way of 
> > restoring the previous narrowing (except by unwind_protect), and the 
> > display code sometimes calls init_iterator or start_display using the 
> > iterator that already has these members set by previous code, a 
> > situation which we currently cannot easily detect.  However, when this 
> > code runs as part of redisplay, we generally don't expect the original 
> > narrowing to be insufficient, except perhaps in the truncate-line case.
> >
> > So I think we should correct narrowed_begv and narrowed_zv only if 
> > either the 'redisplaying_p' flag is reset (meaning the display code is 
> > being invoked outside of redisplay) or it->line_wrap == TRUNCATE.
> >
> 
> I admit I do not really understand your last two paragraphs, but I tried 
> to do what you suggested, and it doesn't seem to introduce regressions, so 
> I pushed it to the new feature branch.

Thanks.  I will try to explain some more; feel free to ask questions
if something's still unclear.

You can see in the sources that both init_iterator and start_display
are many times called with 'struct it' that was used by the caller, so
it could already have the narrowed_begv and narrowed_zv members
initialized by the caller.  If we discover that we want to recalculate
these values, we'd then need to restore the previous value before we
return, so that the caller will see the same values it used before the
call.  But we have no easy way of doing that, and moreover, cannot
even detect that these members were initialized.  The inability to
detect that they were initialized is due to the fact that we don't
initialize 'struct it' before we call init_iterator, and so these
fields can originally have any arbitrary value.  Which means, for
example, that tests like this:

  if (current_buffer->long_line_optimizations_p)
    {
      if (!it->narrowed_begv  <<<<<<<<<<<<<<<<<<<<<<<<<<
         || ((pos.charpos < it->narrowed_begv || pos.charpos > it->narrowed_zv)
             && (!redisplaying_p || it->line_wrap == TRUNCATE)))

are not necessarily reliable, because we never initialize
narrowed_begv to zero, AFAICT.  Right?

The other part of my reasoning is that callers of the display code
which are outside redisplay could legitimately move the iterator far
from point: think about pos-visible-in-window-p and its ilk.  So, when
we are not called by redisplay, I think it would be preferable to
update the narrowing due to these considerations.
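
To make the point about the unreliable test concrete, here is a
minimal standalone sketch.  It is not the xdisp.c code: 'struct
it_model', 'it_model_clear' and 'narrowing_unset_p' are made-up names
for illustration.  The only thing carried over from the discussion is
the assumption that zero serves as the "not computed yet" marker,
which can work because buffer positions start at 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the two members discussed above; the real
   'struct it' in xdisp.c has many more fields.  */
struct it_model
{
  long narrowed_begv;   /* 0 means "not computed yet" */
  long narrowed_zv;
};

/* Give the sentinel a defined value.  Without this step, a struct
   that lives on the stack holds arbitrary bits, and the test below
   tells us nothing.  */
static void
it_model_clear (struct it_model *it)
{
  assert (it != NULL);
  memset (it, 0, sizeof *it);
}

/* The "was the narrowing ever computed?" test.  It is only reliable
   for a struct that went through it_model_clear first.  */
static bool
narrowing_unset_p (const struct it_model *it)
{
  return it->narrowed_begv == 0;
}
```

A caller would clear the struct once, before the first
initialization, and could then trust the zero test until a real
position is stored in narrowed_begv.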





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 12:38                                                                                   ` Eli Zaretskii
@ 2022-08-01 12:51                                                                                     ` Gregory Heytings
  2022-08-01 13:13                                                                                       ` Eli Zaretskii
  2022-08-01 13:24                                                                                       ` Eli Zaretskii
  0 siblings, 2 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-08-01 12:51 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, larsi, monnier


>
> You can see in the sources that both init_iterator and start_display are 
> many times called with 'struct it' that was used by the caller, so it 
> could already have the narrowed_begv and narrowed_zv members initialized 
> by the caller.  If we discover that we want to recalculate these values, 
> we'd then need to restore the previous value before we return, so that 
> the caller will see the same values it used before the call.  But we 
> have no easy way of doing that, and moreover, cannot even detect that 
> these members were initialized.  The inability to detect that they were 
> initialized is due to the fact that we don't initialize 'struct it' 
> before we call init_iterator, and so these fields can originally have 
> any arbitrary value.
>

Thanks, it's much clearer now.

>
> Which means, for example, that tests like this:
>
>  if (current_buffer->long_line_optimizations_p)
>    {
>      if (!it->narrowed_begv  <<<<<<<<<<<<<<<<<<<<<<<<<<
>         || ((pos.charpos < it->narrowed_begv || pos.charpos > it->narrowed_zv)
>             && (!redisplaying_p || it->line_wrap == TRUNCATE)))
>
> are not necessarily reliable, because we never initialize narrowed_begv 
> to zero, AFAICT.  Right?
>

Indeed.  That wasn't a problem with the previous code in which 
narrowed_begv was unconditionally assigned.  Now it is.  I think the 
following change should be enough to fix this.  Agreed?

diff --git a/src/xdisp.c b/src/xdisp.c
index 8a19b3bda9..9574d06bd5 100644
--- a/src/xdisp.c
+++ b/src/xdisp.c
@@ -3472,6 +3472,9 @@ init_iterator (struct it *it, struct window *w,
                         &it->bidi_it);
         }

+      if (current_buffer->long_line_optimizations_p)
+       it->narrowed_begv = 0;
+
        /* Compute faces etc.  */
        reseat (it, it->current.pos, true);
      }

>
> The other part of my reasoning is that callers of the display code which 
> are outside redisplay could legitimately move the iterator far from 
> point: think about pos-visible-in-window-p and its ilk.  So, when we are 
> not called by redisplay, I think it would be preferable to update the 
> narrowing due to these considerations.
>

Thanks, this too clarifies what you meant.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 12:20                                   ` Gregory Heytings
@ 2022-08-01 13:04                                     ` Eli Zaretskii
  2022-08-01 13:14                                       ` Gregory Heytings
  2022-08-02  7:51                                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-01 13:04 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, monnier, dgutov

> Date: Mon, 01 Aug 2022 12:20:39 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: dgutov@yandex.ru, 56682@debbugs.gnu.org, monnier@iro.umontreal.ca
> 
> > Having to wait for the initial display is pretty annoying.
> 
> It is, indeed.  But here we are talking about big files (say 1 GB).  In 
> that case (and in that case only) it is much better, from a user 
> viewpoint, to wait say 20 seconds before the file is opened and being at 
> that point able to freely move through the file, instead of waiting only 6 
> seconds, and then having to wait another 10 seconds between two motion 
> commands like M-> C-p.  In fact, no user expects that a 1 GB file would 
> open instantaneously.

You can maybe have that for C-p that follows M->, but wouldn't the
wait return, with a vengeance, if you insert a single character
(because then the buffer needs to be re-scanned)?  If so, we've gained
nothing, really.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 12:51                                                                                     ` Gregory Heytings
@ 2022-08-01 13:13                                                                                       ` Eli Zaretskii
  2022-08-01 13:30                                                                                         ` Gregory Heytings
  2022-08-01 13:24                                                                                       ` Eli Zaretskii
  1 sibling, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-01 13:13 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, larsi, monnier

> Date: Mon, 01 Aug 2022 12:51:47 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, larsi@gnus.org, 
>     monnier@iro.umontreal.ca
> 
> > Which means, for example, that tests like this:
> >
> >  if (current_buffer->long_line_optimizations_p)
> >    {
> >      if (!it->narrowed_begv  <<<<<<<<<<<<<<<<<<<<<<<<<<
> >         || ((pos.charpos < it->narrowed_begv || pos.charpos > it->narrowed_zv)
> >             && (!redisplaying_p || it->line_wrap == TRUNCATE)))
> >
> > are not necessarily reliable, because we never initialize narrowed_begv 
> > to zero, AFAICT.  Right?
> >
> 
> Indeed.  That wasn't a problem with the previous code in which 
> narrowed_begv was unconditionally assigned.  Now it is.  I think the 
> following change should be enough to fix this.  Agreed?

Yes, of course.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 13:04                                     ` Eli Zaretskii
@ 2022-08-01 13:14                                       ` Gregory Heytings
  2022-08-01 13:19                                         ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-01 13:14 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, monnier, dgutov


>> It is, indeed.  But here we are talking about big files (say 1 GB). 
>> In that case (and in that case only) it is much better, from a user 
>> viewpoint, to wait say 20 seconds before the file is opened and being 
>> at that point able to freely move through the file, instead of waiting 
>> only 6 seconds, and then having to wait another 10 seconds between two 
>> motion commands like M-> C-p.  In fact, no user expects that a 1 GB 
>> file would open instantaneously.
>
> You can maybe have that for C-p that follows M->, but wouldn't the wait 
> return, with a vengeance, if you insert a single character (because then 
> the buffer needs to be re-scanned)?  If so, we've gained nothing, 
> really.
>

Fortunately no: the buffer doesn't need to be rescanned, syntax-ppss 
caches its result, to avoid having to rescan the whole buffer again and 
again.  At least that's what Stefan and Dmitry told me, and that's what I 
see in my tests.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 13:14                                       ` Gregory Heytings
@ 2022-08-01 13:19                                         ` Eli Zaretskii
  2022-08-01 13:34                                           ` Gregory Heytings
  2022-08-01 21:50                                           ` Dmitry Gutov
  0 siblings, 2 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-01 13:19 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, monnier, dgutov

> Date: Mon, 01 Aug 2022 13:14:20 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: dgutov@yandex.ru, 56682@debbugs.gnu.org, monnier@iro.umontreal.ca
> 
> > You can maybe have that for C-p that follows M->, but wouldn't the wait 
> > return, with a vengeance, if you insert a single character (because then 
> > the buffer needs to be re-scanned)?  If so, we've gained nothing, 
> > really.
> 
> Fortunately no: the buffer doesn't need to be rescanned, syntax-ppss 
> caches its result, to avoid having to rescan the whole buffer again and 
> again.

But the buffer has changed, so the cache is not necessarily valid,
right?





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 12:51                                                                                     ` Gregory Heytings
  2022-08-01 13:13                                                                                       ` Eli Zaretskii
@ 2022-08-01 13:24                                                                                       ` Eli Zaretskii
  2022-08-01 13:38                                                                                         ` Gregory Heytings
  2022-08-01 13:45                                                                                         ` Eli Zaretskii
  1 sibling, 2 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-01 13:24 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, larsi, monnier

> Date: Mon, 01 Aug 2022 12:51:47 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, larsi@gnus.org, 
>     monnier@iro.umontreal.ca
> 
> diff --git a/src/xdisp.c b/src/xdisp.c
> index 8a19b3bda9..9574d06bd5 100644
> --- a/src/xdisp.c
> +++ b/src/xdisp.c
> @@ -3472,6 +3472,9 @@ init_iterator (struct it *it, struct window *w,
>                          &it->bidi_it);
>          }
> 
> +      if (current_buffer->long_line_optimizations_p)
> +       it->narrowed_begv = 0;
> +

Sorry, I wrote that this is OK, but it isn't: if init_iterator is
called with 'struct it' that was already initialized by a previous
call to 'reseat', the above will nuke the narrowing.

So we need something more complicated.  ATM I don't see how to solve
this without manually initializing narrowed_begv before the first call
to init_iterator or start_display.  Hmm...





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 13:13                                                                                       ` Eli Zaretskii
@ 2022-08-01 13:30                                                                                         ` Gregory Heytings
  0 siblings, 0 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-08-01 13:30 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, larsi, monnier


>>> Which means, for example, that tests like this:
>>>
>>>  if (current_buffer->long_line_optimizations_p)
>>>    {
>>>      if (!it->narrowed_begv  <<<<<<<<<<<<<<<<<<<<<<<<<<
>>>         || ((pos.charpos < it->narrowed_begv || pos.charpos > it->narrowed_zv)
>>>             && (!redisplaying_p || it->line_wrap == TRUNCATE)))
>>>
>>> are not necessarily reliable, because we never initialize 
>>> narrowed_begv to zero, AFAICT.  Right?
>>
>> Indeed.  That wasn't a problem with the previous code in which 
>> narrowed_begv was unconditionally assigned.  Now it is.  I think the 
>> following change should be enough to fix this.  Agreed?
>
> Yes, of course.
>

Thanks; now pushed.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 13:19                                         ` Eli Zaretskii
@ 2022-08-01 13:34                                           ` Gregory Heytings
  2022-08-01 21:50                                           ` Dmitry Gutov
  1 sibling, 0 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-08-01 13:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, monnier, dgutov


>>> You can maybe have that for C-p that follows M->, but wouldn't the 
>>> wait return, with a vengeance, if you insert a single character 
>>> (because then the buffer needs to be re-scanned)?  If so, we've gained 
>>> nothing, really.
>>
>> Fortunately no: the buffer doesn't need to be rescanned, syntax-ppss 
>> caches its result, to avoid having to rescan the whole buffer again and 
>> again.
>
> But the buffer has changed, so the cache is not necessarily valid, 
> right?
>

I didn't look at the internals of syntax-ppss, but I'd guess (and again my 
tests seem to confirm) that it was designed well enough, and doesn't need 
to rescan the whole buffer whenever a single character is inserted.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 13:24                                                                                       ` Eli Zaretskii
@ 2022-08-01 13:38                                                                                         ` Gregory Heytings
  2022-08-01 13:45                                                                                         ` Eli Zaretskii
  1 sibling, 0 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-08-01 13:38 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, larsi, monnier



>> diff --git a/src/xdisp.c b/src/xdisp.c
>> index 8a19b3bda9..9574d06bd5 100644
>> --- a/src/xdisp.c
>> +++ b/src/xdisp.c
>> @@ -3472,6 +3472,9 @@ init_iterator (struct it *it, struct window *w,
>>                          &it->bidi_it);
>>          }
>>
>> +      if (current_buffer->long_line_optimizations_p)
>> +       it->narrowed_begv = 0;
>> +
>
> Sorry, I wrote that this is OK, but it isn't: if init_iterator is called 
> with 'struct it' that was already initialized by a previous call to 
> 'reseat', the above will nuke the narrowing.
>
> So we need something more complicated.  ATM I don't see how to solve 
> this without manually initializing narrowed_begv before the first call 
> to init_iterator or start_display.  Hmm...
>

Hmmm...  So I wasn't completely wrong when I wasn't sure it was TRT 😉 
It's a chicken-and-egg problem.  On the other hand, even if we nuke the 
narrowing there, it will be recomputed two lines later, which should be 
okay.

^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 13:24                                                                                       ` Eli Zaretskii
  2022-08-01 13:38                                                                                         ` Gregory Heytings
@ 2022-08-01 13:45                                                                                         ` Eli Zaretskii
  2022-08-01 15:08                                                                                           ` Gregory Heytings
  1 sibling, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-01 13:45 UTC (permalink / raw)
  To: gregory; +Cc: gerd.moellmann, 56682, larsi, monnier

> Date: Mon, 01 Aug 2022 16:24:00 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, larsi@gnus.org,
>  monnier@iro.umontreal.ca
> 
> > diff --git a/src/xdisp.c b/src/xdisp.c
> > index 8a19b3bda9..9574d06bd5 100644
> > --- a/src/xdisp.c
> > +++ b/src/xdisp.c
> > @@ -3472,6 +3472,9 @@ init_iterator (struct it *it, struct window *w,
> >                          &it->bidi_it);
> >          }
> > 
> > +      if (current_buffer->long_line_optimizations_p)
> > +       it->narrowed_begv = 0;
> > +
> 
> Sorry, I wrote that this is OK, but it isn't: if init_iterator is
> called with 'struct it' that was already initialized by a previous
> call to 'reseat', the above will nuke the narrowing.
> 
> So we need something more complicated.  ATM I don't see how to solve
> this without manually initializing narrowed_begv before the first call
> to init_iterator or start_display.  Hmm...

I think we should simply unconditionally recompute the narrowing in
'reseat'.  At least I couldn't think of a situation where that would
cause trouble, and 'reseat' is called rarely enough not to make this
expensive.  Am I missing something?

And another nit:

  if (current_buffer->long_line_optimizations_p)
    {
      if (!it->narrowed_begv
	  || ((pos.charpos < it->narrowed_begv || pos.charpos > it->narrowed_zv)
	      && (!redisplaying_p || it->line_wrap == TRUNCATE)))
	{
	  it->narrowed_begv = get_narrowed_begv (it->w, window_point (it->w));
	  it->narrowed_zv = get_narrowed_zv (it->w, window_point (it->w));
	}
    }

I think this should pass pos.charpos as the 2nd argument to
get_narrowed_begv and get_narrowed_zv, otherwise it might not really
correct anything, right?  In particular, when lines are truncated,
that will definitely happen when we display any line but the very
first.

Or perhaps we should check that using window-point indeed brings
pos.charpos into the narrowed region, and only use pos.charpos if it
doesn't?





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 13:45                                                                                         ` Eli Zaretskii
@ 2022-08-01 15:08                                                                                           ` Gregory Heytings
  2022-08-01 15:49                                                                                             ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-01 15:08 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, larsi, monnier


>
> I think we should simply unconditionally recompute the narrowing in 
> 'reseat'.  At least I couldn't think of a situation where that would 
> cause trouble, and 'reseat' is called rarely enough not to make this 
> expensive.  Am I missing something?
>

I don't think you are missing anything, and that's what I suggested too. 
So there is nothing to change here, right?

>
> And another nit:
>
>  if (current_buffer->long_line_optimizations_p)
>    {
>      if (!it->narrowed_begv
> 	  || ((pos.charpos < it->narrowed_begv || pos.charpos > it->narrowed_zv)
> 	      && (!redisplaying_p || it->line_wrap == TRUNCATE)))
> 	{
> 	  it->narrowed_begv = get_narrowed_begv (it->w, window_point (it->w));
> 	  it->narrowed_zv = get_narrowed_zv (it->w, window_point (it->w));
> 	}
>    }
>
> I think this should pass pos.charpos as the 2nd argument to 
> get_narrowed_begv and get_narrowed_zv, otherwise it might not really 
> correct anything, right?  In particular, when lines are truncated, that 
> will definitely happen when we display any line but the very first.
>
> Or perhaps we should check that using window-point indeed brings 
> pos.charpos into the narrowed region, and only use pos.charpos if it 
> doesn't?
>

I changed this into:

   if (current_buffer->long_line_optimizations_p)
     {
       if (!it->narrowed_begv)
         {
           it->narrowed_begv = get_narrowed_begv (it->w, window_point (it->w));
           it->narrowed_zv = get_narrowed_zv (it->w, window_point (it->w));
         }
       else if ((pos.charpos < it->narrowed_begv || pos.charpos > it->narrowed_zv)
                 && (!redisplaying_p || it->line_wrap == TRUNCATE))
         {
           it->narrowed_begv = get_narrowed_begv (it->w, pos.charpos);
           it->narrowed_zv = get_narrowed_zv (it->w, pos.charpos);
         }
     }

which seems better indeed.  Is that okay from your point of view?
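
For readers who want to follow the two branches without the rest of
xdisp.c, here is a small self-contained model of that logic.  It is
only a sketch under assumptions: 'model_narrowed_begv' and
'model_narrowed_zv' are hypothetical replacements that build a
fixed-size region around a position (the real get_narrowed_begv and
get_narrowed_zv compute the region differently), and the window point,
buffer end and flags are passed in as plain arguments.

```c
#include <assert.h>
#include <stdbool.h>

/* Made-up half-size of the narrowed region, for illustration only.  */
enum { MODEL_HALF_WIDTH = 1000 };

struct it_model
{
  long narrowed_begv;   /* 0 means "not computed yet" */
  long narrowed_zv;
};

/* Hypothetical: start of a region centred on POS, never below 1.  */
static long
model_narrowed_begv (long pos)
{
  long begv = pos - MODEL_HALF_WIDTH;
  return begv < 1 ? 1 : begv;
}

/* Hypothetical: end of a region centred on POS, never past BUFFER_END.  */
static long
model_narrowed_zv (long pos, long buffer_end)
{
  long zv = pos + MODEL_HALF_WIDTH;
  return zv > buffer_end ? buffer_end : zv;
}

/* Model of the two branches: the first computation uses the window
   point; a later correction uses CHARPOS, but only outside redisplay
   or when lines are truncated.  */
static void
model_reseat (struct it_model *it, long charpos, long window_point,
              long buffer_end, bool redisplaying, bool truncate)
{
  assert (charpos >= 1 && window_point >= 1);

  if (!it->narrowed_begv)
    {
      it->narrowed_begv = model_narrowed_begv (window_point);
      it->narrowed_zv = model_narrowed_zv (window_point, buffer_end);
    }
  else if ((charpos < it->narrowed_begv || charpos > it->narrowed_zv)
           && (!redisplaying || truncate))
    {
      it->narrowed_begv = model_narrowed_begv (charpos);
      it->narrowed_zv = model_narrowed_zv (charpos, buffer_end);
    }
}
```

In this model, a position outside the region is left alone during a
normal redisplay with wrapped lines, and only moves the region when
the call comes from outside redisplay or the window truncates lines.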





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 15:08                                                                                           ` Gregory Heytings
@ 2022-08-01 15:49                                                                                             ` Eli Zaretskii
  0 siblings, 0 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-01 15:49 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, larsi, monnier

> Date: Mon, 01 Aug 2022 15:08:42 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, larsi@gnus.org, 
>     monnier@iro.umontreal.ca
> 
> > Or perhaps we should check that using window-point indeed brings 
> > pos.charpos into the narrowed region, and only use pos.charpos if it 
> > doesn't?
> >
> 
> I changed this into:
> 
>    if (current_buffer->long_line_optimizations_p)
>      {
>        if (!it->narrowed_begv)
>          {
>            it->narrowed_begv = get_narrowed_begv (it->w, window_point (it->w));
>            it->narrowed_zv = get_narrowed_zv (it->w, window_point (it->w));
>          }
>        else if ((pos.charpos < it->narrowed_begv || pos.charpos > it->narrowed_zv)
>                  && (!redisplaying_p || it->line_wrap == TRUNCATE))
>          {
>            it->narrowed_begv = get_narrowed_begv (it->w, pos.charpos);
>            it->narrowed_zv = get_narrowed_zv (it->w, pos.charpos);
>          }
>      }
> 
> which seems better indeed.  Is that okay from your point of view?

Yes, thanks.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-21 18:00 bug#56682: Fix the long lines font locking related slowdowns Gregory Heytings
  2022-07-21 18:04 ` Eli Zaretskii
@ 2022-08-01 16:34 ` Eli Zaretskii
  2022-08-01 16:49   ` Gregory Heytings
  1 sibling, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-01 16:34 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, larsi, monnier

This hunk:

> @@ -8894,7 +8891,7 @@ get_visually_first_element (struct it *it)
>  				find_newline_no_quit (IT_CHARPOS (*it),
>  						      IT_BYTEPOS (*it), -1,
>  						      &it->bidi_it.bytepos),
> -				it->narrowed_begv);
> +				get_closer_narrowed_begv (it->w, IT_CHARPOS (*it)));
>        bidi_paragraph_init (it->paragraph_embedding, &it->bidi_it, true);
>        do
>  	{

isn't quite right, I think.

Can you explain which, if any, problems you saw or had in mind, and
what and how did you want to fix them with this particular hunk?  I'd
then suggest a better change (or agree with your change, in case I
missed something).

Thanks.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 16:34 ` Eli Zaretskii
@ 2022-08-01 16:49   ` Gregory Heytings
  2022-08-01 17:08     ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-01 16:49 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, larsi, monnier


>
> Can you explain which, if any, problems you saw or had in mind, and what 
> and how did you want to fix them with this particular hunk?  I'd then 
> suggest a better change (or agree with your change, in case I missed 
> something).
>

You're correct, now that I think again about this, it was just wrong. 
Fixed.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 16:49   ` Gregory Heytings
@ 2022-08-01 17:08     ` Eli Zaretskii
  0 siblings, 0 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-01 17:08 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, larsi, monnier

> Date: Mon, 01 Aug 2022 16:49:55 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, larsi@gnus.org, 
>     monnier@iro.umontreal.ca
> 
> > Can you explain which, if any, problems you saw or had in mind, and what 
> > and how did you want to fix them with this particular hunk?  I'd then 
> > suggest a better change (or agree with your change, in case I missed 
> > something).
> 
> You're correct, now that I think again about this, it was just wrong. 
> Fixed.

OK, thanks.  But anyway: the possibly slow part there is not
find_newline_no_quit, it's the loop that calls
bidi_move_to_visually_next afterwards, in case the newline is very far
back.  So if we ever want to make that part faster, we should force
that loop to start from narrowed_begv, not from where we find the
previous newline.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-31 22:45                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-07-31 23:12                               ` Gregory Heytings
  2022-08-01 11:58                               ` Eli Zaretskii
@ 2022-08-01 18:09                               ` Gregory Heytings
  2022-08-02  8:12                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-01 18:09 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Eli Zaretskii



>>> Try `M-: (use-global-map (make-keymap)) RET`
>>>
>>> Should we prevent users from doing that?
>>
>> It's a misleading question.  No "user" would ever do that.  Sure, it's 
>> a nice example, but only an Elisp hacker would do that, in the middle 
>> of a debugging session, and they would do that on purpose (although 
>> perhaps without knowing the effect in advance).  Which has nothing to 
>> do with a regular user who just opens a file.
>
> FWIW, the above is my standard example because I ended up doing exactly 
> that by accident, locking myself out of the session I was trying to 
> debug, so in a sense, I'd have been happy (that one time) if Emacs had 
> prevented me from doing it.
>

So... what do you think of the following? 😉

diff --git a/src/keymap.c b/src/keymap.c
index 506b755e5d..e42bc64717 100644
--- a/src/keymap.c
+++ b/src/keymap.c
@@ -1881,6 +1881,17 @@ DEFUN ("use-global-map", Fuse_global_map, Suse_global_map, 1, 1, 0,
    (Lisp_Object keymap)
  {
    keymap = get_keymap (keymap, 1, 1);
+
+  /* Prevent locking Emacs if someone inadvertently evaluates
+     (use-global-map (make-keymap)) */
+  if (EQ (Fequal (keymap, Fmake_keymap (Qnil)), Qt))
+    {
+      Lisp_Object meta_colon_key = CALLN (Fvector, make_fixnum (134217786));
+      Lisp_Object default_key = CALLN (Fvector, Qt);
+      Fdefine_key (keymap, meta_colon_key, intern ("eval-expression"), Qnil);
+      Fdefine_key (keymap, default_key, intern ("self-insert-command"), Qnil);
+    }
+
    current_global_map = keymap;

    return Qnil;
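
For readers puzzled by the constant 134217786 in the patch: it is the character ':' with the meta modifier bit set, i.e. the key M-:.  A standalone check in C, assuming the meta modifier is bit 27 of a key code as in Emacs's key encoding (the macro and function names below are made up for this sketch and do not exist in the Emacs sources):

```c
/* Assumption: the meta modifier is bit 27 (2**27) of a key code,
   as in Emacs.  The names here are hypothetical.  */
#define META_MODIFIER_BIT (1L << 27)

/* Return the key code for Meta plus the ASCII character BASE.  */
long
meta_key_code (char base)
{
  return META_MODIFIER_BIT | (long) base;
}
```

Calling meta_key_code (':') gives 134217728 + 58 = 134217786, the value passed to make_fixnum above.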


* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 13:19                                         ` Eli Zaretskii
  2022-08-01 13:34                                           ` Gregory Heytings
@ 2022-08-01 21:50                                           ` Dmitry Gutov
  2022-08-02  2:27                                             ` Eli Zaretskii
  1 sibling, 1 reply; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-01 21:50 UTC (permalink / raw)
  To: Eli Zaretskii, Gregory Heytings; +Cc: 56682, monnier

On 01.08.2022 16:19, Eli Zaretskii wrote:
>> Date: Mon, 01 Aug 2022 13:14:20 +0000
>> From: Gregory Heytings<gregory@heytings.org>
>> cc:dgutov@yandex.ru,56682@debbugs.gnu.org,monnier@iro.umontreal.ca
>>
>>> You can maybe have that for C-p that follows M->, but wouldn't the wait
>>> return, with a vengeance, if you insert a single character (because then
>>> the buffer needs to be re-scanned)?  If so, we've gained nothing,
>>> really.
>> Fortunately no: the buffer doesn't need to be rescanned, syntax-ppss
>> caches its result, to avoid having to rescan the whole buffer again and
>> again.
> But the buffer has changed, so the cache is not necessarily valid,
> right?

syntax-ppss cache is a list of checkpoints spread along the buffer.

After a modification, only the checkpoints below it are invalidated (to 
be recomputed on-demand later).
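
A toy model of that invalidation rule, sketched in C.  This is not the actual syntax-ppss code: the struct, the function names and the fixed-size array are all hypothetical, and the real cache stores parser states rather than bare positions.

```c
#include <stddef.h>

#define MAX_CHECKPOINTS 64

/* Hypothetical cache: checkpoint positions in ascending order.  */
struct ppss_cache
{
  long pos[MAX_CHECKPOINTS];
  size_t count;
};

/* Drop every checkpoint located after MODIFIED_POS; checkpoints at or
   before it stay valid.  Return the number of checkpoints dropped.  */
size_t
cache_invalidate_after (struct ppss_cache *c, long modified_pos)
{
  size_t kept = 0;
  while (kept < c->count && c->pos[kept] <= modified_pos)
    kept++;
  size_t dropped = c->count - kept;
  c->count = kept;
  return dropped;
}

/* Number of characters that must be scanned to reach TARGET, starting
   from the nearest surviving checkpoint at or before it, or from the
   beginning of the buffer (position 1) if there is none.  */
long
cache_rescan_distance (const struct ppss_cache *c, long target)
{
  long start = 1;
  for (size_t i = 0; i < c->count && c->pos[i] <= target; i++)
    start = c->pos[i];
  return target - start;
}
```

With checkpoints at 1000, 2000, 3000 and 4000, an edit at 2500 drops two of them, and reaching position 4500 then means rescanning from 2000 rather than from 4000.  An edit near the beginning of the buffer drops them all.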






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 11:23                                         ` Gregory Heytings
@ 2022-08-01 21:53                                           ` Dmitry Gutov
  2022-08-02  7:34                                             ` Gregory Heytings
  2022-08-02 21:32                                           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-01 21:53 UTC (permalink / raw)
  To: Gregory Heytings, Stefan Monnier; +Cc: 56682, Eli Zaretskii

On 01.08.2022 14:23, Gregory Heytings wrote:
>> You might also want to compare to the time to run
>>
>> (parse-partial-sexp (point-min) (point-max))
>>
>> which is a kind of "speed of light" for `syntax-ppss`.
>>
> 
> What do you mean?  With the above file (benchmark-run 1 
> (parse-partial-sexp (point-min) (point-max))) takes 55 seconds.

Then this is the base price of knowing (with confidence) whether the EOB 
position is inside a paren, inside a string, or inside a comment, all of 
which may be positioned anywhere inside the buffer.
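
Why that price is linear in the buffer size can be seen with a toy scanner, sketched here in C.  It is not parse-partial-sexp: the three-rule syntax (parentheses, double-quoted strings with backslash escapes, ';' comments running to end of line) and all names are assumptions made for the illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical parser state: nesting depth plus string/comment flags.  */
struct scan_state
{
  int depth;
  bool in_string;
  bool in_comment;
};

/* Scan TEXT from its start up to (not including) index END and return
   the state reached there.  Every character before END is visited.  */
struct scan_state
scan_to (const char *text, size_t end)
{
  struct scan_state s = { 0, false, false };
  for (size_t i = 0; i < end && text[i] != '\0'; i++)
    {
      char ch = text[i];
      if (s.in_comment)
        {
          if (ch == '\n')
            s.in_comment = false;
        }
      else if (s.in_string)
        {
          if (ch == '\\' && i + 1 < end && text[i + 1] != '\0')
            i++;                /* Skip the escaped character.  */
          else if (ch == '"')
            s.in_string = false;
        }
      else if (ch == '"')
        s.in_string = true;
      else if (ch == ';')
        s.in_comment = true;
      else if (ch == '(')
        s.depth++;
      else if (ch == ')')
        s.depth--;
    }
  return s;
}
```

The state at any position depends on every character before it, so without a cached earlier state the only way to answer the question at EOB is to scan from the start.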

Perhaps someone will come up with further optimizations for the 
parse-partial-sexp implementation (without changing its behavior). 
Likely not by an order of magnitude, but who knows.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 12:12                                   ` Eli Zaretskii
@ 2022-08-01 21:54                                     ` Dmitry Gutov
  2022-08-02  2:31                                       ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-01 21:54 UTC (permalink / raw)
  To: Eli Zaretskii, Stefan Monnier; +Cc: 56682, gregory

On 01.08.2022 15:12, Eli Zaretskii wrote:
> But the problem here is that it isn't "people shooting themselves in
> the foot", it's that "major modes shoot their users in the foot".
> IOW, the ones who shoot and the ones who get shot are not the same
> people.
> 
> What do you want a user to do when he/she is faced with a mode which
> makes Emacs very slow?  Such a user cannot blame himself/herself; in many
> cases the user doesn't even know enough to realize it's the major mode
> and its fontifications that are the culprit.

Just like we do in such cases where an Emacs feature is not optimized 
enough for a given use case: wait for the user to realize the situation 
can and should be improved, and file a bug report/feature request.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 12:08                                   ` Eli Zaretskii
@ 2022-08-02  1:05                                     ` Dmitry Gutov
  2022-08-02  7:55                                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-02 12:35                                       ` Eli Zaretskii
  0 siblings, 2 replies; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-02  1:05 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, monnier

On 01.08.2022 15:08, Eli Zaretskii wrote:
>> Date: Mon, 1 Aug 2022 04:23:21 +0300
>> Cc: 56682@debbugs.gnu.org, Eli Zaretskii <eliz@gnu.org>,
>>   Stefan Monnier <monnier@iro.umontreal.ca>
>> From: Dmitry Gutov <dgutov@yandex.ru>
>>
>> IIUC this state of affairs is caused by your chosen approach to speeding
>> up font-lock (hard narrowing while it is called), which makes the
>> initial call to syntax-ppss happen inside that narrowing as well.
>>
>> The alternative being that font-lock would call syntax-ppss right away
>> with no restriction, but then only apply highlighting to limited parts
>> of the buffer.
> 
> AFAIU, this seems to assume that highlighting is much faster than
> syntax-ppss.  Is that a given?  If not, I don't think I understand how
> this could help.

I don't have the concrete numbers at hand, but from experience I'd say:

- syntax-ppss over the whole buffer is fast-ish. But it takes O(N) time 
of course, and the bigger the buffer is, the longer it'll take. Not much 
we can do about it.
- font-lock has to do more work, so over the whole buffer it will take 
an order of magnitude more time than syntax-ppss.

Further:

- syntax-ppss is also important for correctness: for commands to 
understand whether the point is inside a string, comments, etc. So it's 
better to avoid applying narrowing when calling it. Unless you're in a 
multiple-major-modes situation.
- font-lock calls syntax-ppss.

So ideally font-lock is either called with undo-able narrowing, or is 
simply passed a range of positions, and shouldn't fontify too far from them.

The latter seems to be the case already (if you open xdisp.c and press 
M->, only top and bottom of the buffer are fontified), with the caveat 
that font-lock always tries to backtrack to BOL when fontifying the 
current hunk. Which makes sense, of course, but could be tweaked for 
long lines to avoid re-fontifying the whole buffer again and again.

IOW, IIUC the fix for font-lock performance could be better implemented 
inside font-lock itself, as long as all the info about whether the 
current line is "long" is available to Lisp.
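
One possible shape for such a tweak, sketched in C for concreteness.  The function name and the max_backtrack parameter are hypothetical, and this is not how font-lock is implemented; it only illustrates bounding the search for BOL so that a very long line does not drag the start of fontification arbitrarily far back.

```c
#include <stddef.h>

/* Return the index at which fontification of a chunk starting at POS
   should begin: the beginning of POS's line, unless that is more than
   MAX_BACKTRACK characters away, in which case stop at that limit.  */
size_t
clamped_line_start (const char *text, size_t pos, size_t max_backtrack)
{
  size_t limit = pos > max_backtrack ? pos - max_backtrack : 0;
  size_t start = pos;
  while (start > limit && text[start - 1] != '\n')
    start--;
  return start;
}
```

With a generous limit this returns the true beginning of the line; with a small one it gives up after max_backtrack characters, at the price of possibly starting in the middle of a syntactic construct.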






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 21:50                                           ` Dmitry Gutov
@ 2022-08-02  2:27                                             ` Eli Zaretskii
  2022-08-02 14:10                                               ` Dmitry Gutov
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-02  2:27 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, gregory, monnier

> Date: Tue, 2 Aug 2022 00:50:10 +0300
> Cc: 56682@debbugs.gnu.org, monnier@iro.umontreal.ca
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> On 01.08.2022 16:19, Eli Zaretskii wrote:
> >> Date: Mon, 01 Aug 2022 13:14:20 +0000
> >> From: Gregory Heytings<gregory@heytings.org>
> >> cc:dgutov@yandex.ru,56682@debbugs.gnu.org,monnier@iro.umontreal.ca
> >>
> >>> You can maybe have that for C-p that follows M->, but wouldn't the wait
> >>> return, with a vengeance, if you insert a single character (because then
> >>> the buffer needs to be re-scanned)?  If so, we've gained nothing,
> >>> really.
> >> Fortunately no: the buffer doesn't need to be rescanned, syntax-ppss
> >> caches its result, to avoid having to rescan the whole buffer again and
> >> again.
> > But the buffer has changed, so the cache is not necessarily valid,
> > right?
> 
> syntax-ppss cache is a list of checkpoints spread along the buffer.
> 
> After a modification, only the checkpoints below it are invalidated (to 
> be recomputed on-demand later).

So a suitably-concocted replace command will still invalidate a lot of
that cache, right?






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 21:54                                     ` Dmitry Gutov
@ 2022-08-02  2:31                                       ` Eli Zaretskii
  2022-08-02 14:29                                         ` Dmitry Gutov
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-02  2:31 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, gregory, monnier

> Date: Tue, 2 Aug 2022 00:54:55 +0300
> Cc: 56682@debbugs.gnu.org, gregory@heytings.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> On 01.08.2022 15:12, Eli Zaretskii wrote:
> > But the problem here is that it isn't "people shooting themselves in
> > the foot", it's that "major modes shoot their users in the foot".
> > IOW, the ones who shoot and the ones who get shot are not the same
> > people.
> > 
> > What do you want a user to do when he/she is faced with a mode which
> > makes Emacs very slow?  Such a user cannot blame himself/herself; in many
> > cases the user doesn't even know enough to realize it's the major mode
> > and its fontifications that are the culprit.
> 
> Just like we do in such cases where an Emacs feature is not optimized 
> enough for a given use case: wait for the user to realize the situation 
> can and should be improved, and file a bug report/feature request.

The intent of this activity is to make Emacs reasonably performant and
responsive in the relevant use cases without asking users to wait for
something that likely won't happen.

IOW, in this case the Emacs developers, due to long-standing bug
reports about this situation, recognized that it _can_ be improved,
albeit in slightly unorthodox ways, and have taken the measures to
optimize Emacs for the users.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01  9:34                                     ` Gregory Heytings
  2022-08-01  9:46                                       ` Dmitry Gutov
  2022-08-01 11:06                                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-08-02  3:01                                       ` Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2 siblings, 0 replies; 416+ messages in thread
From: Po Lu via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-02  3:01 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, Stefan Monnier, Dmitry Gutov

Gregory Heytings <gregory@heytings.org> writes:

>
> +  bool_bf long_line_syntax_ppss_done_p : 1;

Please write a comment for every bitfield in that struct.

> +	      current_buffer->long_line_syntax_ppss_done_p = 1;

Why not write "true" instead of "1"?

> +	      CALLN (Ffuncall, intern ("syntax-ppss"), Fpoint_max ());

Why not write this instead:

  call1 (Qsyntax_ppss, Fpoint_max ());

of course, with the appropriate DEFSYM added to the right file?
I think calling "intern" with a static string in C code is an example of
lazy programming.  At the very least, it unduly wastes cycles.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 21:53                                           ` Dmitry Gutov
@ 2022-08-02  7:34                                             ` Gregory Heytings
  2022-08-02 11:07                                               ` Dmitry Gutov
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-02  7:34 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, Eli Zaretskii, Stefan Monnier


>
> Perhaps someone will come up with further optimizations for the 
> parse-partial-sexp implementation (without changing its behavior). 
> Likely not by an order of magnitude, but who knows.
>

Do I understand correctly that you do not plan to work on this?  If so, 
I'll add it to my TODO list.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 12:04                                 ` Eli Zaretskii
  2022-08-01 12:20                                   ` Gregory Heytings
@ 2022-08-02  7:48                                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 0 replies; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-02  7:48 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, Gregory Heytings, dgutov

> used, for some reason) jit-lock-defer feature: if you set
> jit-lock-defer-time to a large enough value, M-> followed by C-p will
> not be as slow as they are now, I think.

I think currently, with such a setting, in Gregory's example, the 50s of
`syntax-ppss` will still bite the users because they're uninterruptible,
although if the users happen to leave Emacs idle at the right time they
may not notice.

We could/should make sure the 50s are done "in the background" and can
be interrupted at any time.


        Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 12:20                                   ` Gregory Heytings
  2022-08-01 13:04                                     ` Eli Zaretskii
@ 2022-08-02  7:51                                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 0 replies; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-02  7:51 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, dgutov

> It is, indeed.  But here we are talking about big files (say 1 GB).  In that
> case (and in that case only) it is much better, from a user viewpoint, to
> wait say 20 seconds before the file is opened and being at that point able
> to freely move through the file, instead of waiting only 6 seconds, and then
> having to wait another 10 seconds between two motion commands like M-> C-p.
> In fact, no user expects that a 1 GB file would open instantaneously.

The problem will re-occur if you change something at the beginning of
the buffer and then go back to the end of the buffer.

I think something like `font-lock-defer` which displays the text
unfontified until we had time to apply the fontification should make the
1GB case much more usable (tho currently, it's probably not quite good
enough).


        Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02  1:05                                     ` Dmitry Gutov
@ 2022-08-02  7:55                                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-02 14:08                                         ` Dmitry Gutov
  2022-08-02 12:35                                       ` Eli Zaretskii
  1 sibling, 1 reply; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-02  7:55 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, Eli Zaretskii, gregory

> only top and bottom of the buffer are fontified), with the caveat that
> font-lock always tries to backtrack to BOL when fontifying the current
> hunk. Which makes sense, of course, but could be tweaked for long lines to
> avoid re-fontifying the whole buffer again and again.

It has already been (recently) tweaked for this on `master`.


        Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 11:58                               ` Eli Zaretskii
@ 2022-08-02  8:10                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-02  8:22                                   ` Gregory Heytings
  2022-08-05 12:59                                 ` Dmitry Gutov
  1 sibling, 1 reply; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-02  8:10 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory

> Given this situation, it sounds reasonable to start by restricting
> font-lock.

I fully agree.  The current behavior is the right default.

> As I wrote elsewhere, I'm okay with extending 'widen' so that it could
> "unlock" the locked narrowing, which could then be used in major modes
> that convince us their performance is adequate (or clearly announce in
> their docs that they don't care about files with long lines ;-).

`post-command-hook` also has been abused in all kinds of ways that suck
for the user if they have too-large buffers, or too many buffers, or too
many frames, or ...
That has never prompted calls to ban it completely.

So, I think it's important to provide this option to "unlock the
widening".  Even for purely philosophical reasons.

We can even add a user-option to "re-lock" the widening which would
prevent the "unlock the widening" from working, so that users can
override a poorly-thought-out use of widening which makes their large
file unusable (tho I'd argue that you can get the same result with an
`advice-add`).

Also, let's not forget that the speed impact of large buffers is not
limited to the redisplay, so trying to work extra-hard to eliminate all
possible cases of the redisplay spending too much time in large buffers
won't prevent "apparent lockups" where the time is spent in the command
(or some hook run at that occasion) rather than in the redisplay itself.


        Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 18:09                               ` Gregory Heytings
@ 2022-08-02  8:12                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-02  8:12 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii

> So... what do you think of the following? 😉

:-)


        Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02  8:10                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-08-02  8:22                                   ` Gregory Heytings
  2022-08-02  9:25                                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-02  8:22 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Eli Zaretskii


>
> `post-command-hook` also has been abused in all kinds of ways that suck 
> for the user if they have too-large buffers, or too many buffers, or too 
> many frames, or ...
>

Indeed.  Which is why you'll see on the feature branch that in buffers 
with long lines post-command-hook is now also subjected to a locked 
narrowing.

>
> We can even add a user-option to "re-lock" the widening which would 
> prevent the "unlock the widening" from working, so that users can 
> override a poorly-thought-out use of widening which makes their large 
> file unusable (tho I'd argue that you can get the same result with an 
> `advice-add`).
>

I don't understand what you have in mind here.  How would such a user 
option be different from (setq long-line-threshold nil)?  Do you mean that 
we should make it possible for users to fine-tune each and every aspect of 
the optimizations, with a bunch of user configurable options?

>
> Also, let's not forget that the speed impact of large buffers is not 
> limited to the redisplay, so trying to work extra-hard to eliminate all 
> possible cases of the redisplay spending too much time in large buffers 
> won't prevent "apparent lockups" where the time is spent in the command 
> (or some hook run at that occasion) rather than in the redisplay itself.
>

It is not limited to redisplay only, but by far the largest fraction of 
the speed impact is (or rather was) in redisplay, and asymptotically so. 
Commands that used to take minutes on a reasonably recent computer now 
take a fraction of a second, only because redisplay is now faster.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02  8:22                                   ` Gregory Heytings
@ 2022-08-02  9:25                                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-02 10:00                                       ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-02  9:25 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii

>> `post-command-hook` also has been abused in all kinds of ways that suck
>> for the user if they have too-large buffers, or too many buffers, or too
>> many frames, or ...
> Indeed.  Which is why you'll see on the feature branch that in buffers with
> long lines post-command-hook is now also subjected to a locked narrowing.

But the problems can also affect buffers without long lines (just with
many lines).  Or the case of having a thousand buffers opened.  Or ...

>> We can even add a user-option to "re-lock" the widening which would
>> prevent the "unlock the widening" from working, so that users can override
>> a poorly-thought-out use of widening which makes their large file unusable
>> (tho I'd argue that you can get the same result with an `advice-add`).
> I don't understand what you have in mind here.

If the major mode overrides the locked narrowing, which causes the user's
experience to be unbearable in long buffers, the users could set this
variable to override the override (while sending a bug report to the
major mode's maintainers and waiting for the bug to be fixed).

> It is not limited to redisplay only, but by far the largest fraction of the
> speed impact is (or rather was) in redisplay, and asymptotically
> so. Commands that used to take minutes on a reasonably recent computer now
> take a fraction of a second, only because redisplay is now faster.

Yes, your work is *really* appreciated in this area.  I'm just pointing
out that there's no point trying to make it technically *impossible* for
users to shoot themselves in the foot making redisplay too slow,
because there are still plenty of other ways they can shoot themselves
in the foot, and because every time we make something undesirable
impossible, we *also* make it impossible to do some desirable things
(even if we can't yet imagine what those might be).


        Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02  9:25                                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-08-02 10:00                                       ` Gregory Heytings
  2022-08-02 21:40                                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-02 10:00 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Eli Zaretskii


>
> But the problems can also affect buffers without long lines (just with 
> many lines).
>

Yes, and that's what sits next on my list: "too large" buffers.

>
> Or the case of having a thousand buffers opened.  Or ...
>

I don't think the "too many buffers" case is worth investigating, 
because it won't lock Emacs suddenly, like what we had.  It could make 
Emacs gradually slower, in which case the user can take some appropriate 
measures.

>
> If the major mode overrides the locked narrowing, which causes the user's 
> experience to be unbearable in long buffers, the users could set this 
> variable to override the override (while sending a bug report to the 
> major mode's maintainers and waiting for the bug to be fixed).
>

I see what you mean now.  But I don't think it would work.  What I want is 
to take reasonable measures to ensure that Emacs remains responsive "in 
spite of" mode maintainers.

>
> Yes, your work is *really* appreciated in this area.  I'm just pointing 
> out that there's no point trying to make it technically *impossible* for 
> users to shoot themselves in the foot making redisplay too slow,
>

Yes, with that I agree.  But as Eli said, it's not users who shoot 
themselves in the foot, it's innocent users who are shot in the foot 
by syntax-ppss and/or by mode maintainers who are not careful enough. 
And as I said, it's not technically impossible for this to happen: users can 
freely unset long-line-threshold (although that's not recommended, of 
course).

>
> because there are still plenty of other ways they can shoot themselves 
> in the foot,
>

I'm curious, are there other ways for a regular user (*not* an Elisp 
hacker!) to make Emacs completely unresponsive with regular editing 
commands, starting with emacs -Q?

>
> and because every time we make something undesirable impossible, we 
> *also* make it impossible to do some desirable things (even if we can't 
> yet imagine what those might be).
>

I'm curious again, because I cannot imagine what that could be either.

Also note that you can blame yourself if you don't like the locked 
narrowing idea.  See the following three lines which you added ten years 
ago in eval.c:

/* Don't export this variable to Elisp, so no one can mess with it
    (Just imagine if someone makes it buffer-local).  */
Funintern (Qinternal_interpreter_environment, Qnil);

and which make it technically impossible to do something, incidentally 
without providing a way for Elisp hackers to escape that impossibility.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02  7:34                                             ` Gregory Heytings
@ 2022-08-02 11:07                                               ` Dmitry Gutov
  0 siblings, 0 replies; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-02 11:07 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, Stefan Monnier

On 02.08.2022 10:34, Gregory Heytings wrote:
>>
>> Perhaps someone will come up with further optimizations for the 
>> parse-partial-sexp implementation (without changing its behavior). 
>> Likely not by an order of magnitude, but who knows.
>>
> 
> Do I understand correctly that you do not plan to work on this?  If so, 
> I'll add it to my TODO list.

Please go ahead. The C part of Emacs is far from my area of expertise.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02  1:05                                     ` Dmitry Gutov
  2022-08-02  7:55                                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-08-02 12:35                                       ` Eli Zaretskii
  2022-08-02 14:47                                         ` Dmitry Gutov
                                                           ` (2 more replies)
  1 sibling, 3 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-02 12:35 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, gregory, monnier

> Date: Tue, 2 Aug 2022 04:05:57 +0300
> Cc: 56682@debbugs.gnu.org, gregory@heytings.org, monnier@iro.umontreal.ca
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> >> The alternative being that font-lock would call syntax-ppss right away
> >> with no restriction, but then only apply highlighting to limited parts
> >> of the buffer.
> > 
> > AFAIU, this seems to assume that highlighting is much faster than
> > syntax-ppss.  Is that a given?  If not, I don't think I understand how
> > this could help.
> 
> I don't have the concrete numbers at hand, but from experience I'd say:
> 
> - syntax-ppss over the whole buffer is fast-ish. But it takes O(N) time 
> of course, and the bigger the buffer is, the longer it'll take. Not much 
> we can do about it.
> - font-lock has to do more work, so over the whole buffer it will take 
> an order of magnitude more time than syntax-ppss.
> 
> Further:
> 
> - syntax-ppss is also important for correctness: for commands to 
> understand whether the point is inside a string, comments, etc. So it's 
> better to avoid applying narrowing when calling it. Unless you're in a 
> multiple-major-modes situation.
> - font-lock calls syntax-ppss.

I believe I was talking about syntax-ppss being called from font-lock,
indeed.  Before Gregory's changes, if you visit a large file with very
long lines, and interrupt Emacs while it is non-responsive, you will
in many/most cases find yourself in syntax-propertize or its
subroutines, and you will see that it is almost always called to
traverse the entire long line.

> So ideally font-lock is either called with undo-able narrowing, or is 
> simply passed a range of positions, and shouldn't fontify too far from them.

Many major-modes do widen the buffer, though.

> The latter seems to be the case already (if you open xdisp.c and press 
> M->, only top and bottom of the buffer are fontified)

It is not enough to look for faces in order to realize how much of the
buffer was scanned.

> with the caveat that font-lock always tries to backtrack to BOL when
> fontifying the current hunk. Which makes sense, of course, but could
> be tweaked for long lines to avoid re-fontifying the whole buffer
> again and again.

"Tweaked" how?

> IOW, IIUC the fix for font-lock performance could be better implemented 
> inside font-lock itself, as long as all the info about whether the 
> current line is "long" is available to Lisp.

No one will object to making font-lock faster.  But the experts who
can do that are few and far between, and seem to have other itches
to scratch, since these issues have been known for a long time, and
have even been discussed at length several times.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02  7:55                                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-08-02 14:08                                         ` Dmitry Gutov
  0 siblings, 0 replies; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-02 14:08 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Eli Zaretskii, gregory

On 02.08.2022 10:55, Stefan Monnier wrote:
>> only top and bottom of the buffer are fontified), with the caveat that
>> font-lock always tries to backtrack to BOL when fontifying the current
>> hunk. Which makes sense, of course, but could be tweaked for long lines to
>> avoid re-fontifying the whole buffer again and again.
> It has already been (recently) tweaked for this on `master`.

Sorry I missed that. Commit 15b2138719b34083967001c3903e7560d5e0947c, right?

Then, as long as redisplay avoids narrowing when running 
fontification-functions, everything should work well.

Or at least much better than before.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02  2:27                                             ` Eli Zaretskii
@ 2022-08-02 14:10                                               ` Dmitry Gutov
  2022-08-02 15:46                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-02 14:10 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, monnier

On 02.08.2022 05:27, Eli Zaretskii wrote:
>> syntax-ppss cache is a list of checkpoints spread along the buffer.
>>
>> After a modification, only the checkpoints below it are invalidated (to
>> be recomputed on-demand later).
> So a suitably-concocted replace command will still invalidate a lot of
> that cache, right?

For any cache, one can invent an operation that would result in 
thrashing it repeatedly.

A regular search-replace should work well enough, though. Because when 
the buffer is long, the user is likely, on average, to spend a lot more 
time examining the occurrences and deciding whether to replace each one. 
And since the operation goes from top to bottom, this will likely 
invalidate the list of caches once, and then rebuild it from the 
beginning (or from wherever the first replacement was).
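
A minimal sketch of how this cache behaviour could be observed,
assuming a large buffer is current.  It only uses syntax-ppss,
syntax-ppss-flush-cache and benchmark-run; the timings depend entirely
on the buffer and the major mode, so none are shown:

    ;; First call: parses from the start of the buffer and fills the cache.
    (benchmark-run 1 (syntax-ppss (point-max)))
    ;; Second call: should be answered from the cached checkpoints.
    (benchmark-run 1 (syntax-ppss (point-max)))
    ;; Simulate a modification in the middle of the buffer: checkpoints
    ;; from that position onwards are dropped, earlier ones are kept.
    (syntax-ppss-flush-cache (/ (+ (point-min) (point-max)) 2))
    ;; This call should only have to re-parse roughly the second half.
    (benchmark-run 1 (syntax-ppss (point-max)))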






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02  2:31                                       ` Eli Zaretskii
@ 2022-08-02 14:29                                         ` Dmitry Gutov
  2022-08-02 14:57                                           ` Gregory Heytings
  2022-08-02 16:02                                           ` Eli Zaretskii
  0 siblings, 2 replies; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-02 14:29 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, monnier

On 02.08.2022 05:31, Eli Zaretskii wrote:
>> Just like we do in such cases where an Emacs feature is not optimized
>> enough for a given use case: wait for the user to realize the situation
>> can and should be improved, and file a bug report/feature request.
> The intent of this activity is to make Emacs reasonably performant and
> responsive in the relevant use cases without asking them to wait for
> something that likely won't happen.
> 
> IOW, in this case the Emacs developers, due to long-standing bug
> reports about this situation, recognized that it _can_ be improved,
> albeit in slightly unorthodox ways, and have taken the measures to
> optimize Emacs for the users.

It would be a shame to have the better-behaving (faster) major modes 
exhibit worse behavior than they could have because of the approach we 
choose to solve the long-lines problem.

Regarding the long-standing bug reports, we did solve a bunch of issues 
already. One major one, IIUC, was redisplay of already fontified text on 
long lines. Another piece of the puzzle was added by Stefan in 
15b2138719b340.

So perhaps we should re-evaluate the testing scenario to see where the 
current bottlenecks are. If the current main issue is the 55s spent in 
syntax-ppss, a more constructive approach would be to look into 
optimizing parse-partial-sexp. Or even give up on certain scenarios, 
admitting that waiting 55s once to visit the end of a 1 GB buffer is not 
so bad (and that part could also be sped up by setting 
syntax-propertize-function to nil and using a very simple syntax table, 
for instance).
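
A minimal sketch of that last idea, assuming it is acceptable to lose
mode-specific syntax in the buffer; my/cheap-syntax is a hypothetical
name, and whether the standard syntax table counts as "very simple"
enough is an assumption:

    (defun my/cheap-syntax ()
      "Hypothetical helper: make syntax parsing in this buffer cheap."
      (interactive)
      ;; No syntax-propertize pass at all.
      (setq-local syntax-propertize-function nil)
      ;; A fresh table that inherits only from the standard syntax table.
      (set-syntax-table (make-syntax-table))
      ;; Forget the positions parsed under the previous settings.
      (syntax-ppss-flush-cache (point-min)))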






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 12:35                                       ` Eli Zaretskii
@ 2022-08-02 14:47                                         ` Dmitry Gutov
  2022-08-02 15:06                                           ` Gregory Heytings
  2022-08-02 16:11                                           ` Eli Zaretskii
  2022-08-02 21:46                                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-02 21:49                                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2 siblings, 2 replies; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-02 14:47 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, monnier

On 02.08.2022 15:35, Eli Zaretskii wrote:

>> - syntax-ppss is also important for correctness: for commands to
>> understand whether the point is inside a string, comments, etc. So it's
>> better to avoid applying narrowing when calling it. Unless you're in a
>> multiple-major-modes situation.
>> - font-lock calls syntax-ppss.
> 
> I believe I was talking about syntax-ppss being called from font-lock,
> indeed.  Before Gregory's changes, if you visit a large file with very
> long lines, and interrupt Emacs while it is non-responsive, you will
> in many/most cases find yourself in syntax-propertize or its
> subroutines, and you will see that it is almost always called to
> traverse the entire long line.

Interrupt it right after pressing 'M->'? Or at any time during editing 
the buffer later?

The latter really shouldn't happen. If it does, perhaps it's the result 
of narrowing during redisplay, which might blow syntax-ppss's caches.

In any case, if you could point to a scenario and the revision to test 
it on, I'll be sure to take a look.

>> So ideally font-lock is either called with undo-able narrowing, or is
>> simply passed a range of positions, and shouldn't fontify too far from them.
> 
> Many major-modes do widen the buffer, though.

Whether they do or not, font-lock widens by default, see 
font-lock-dont-widen.

>> The latter seems to be the case already (if you open xdisp.c and press
>> M->, only top and bottom of the buffer are fontified)
> 
> It is not enough to look for faces in order to realize how much of the
> buffer was scanned.

I evaluated (next-single-property-change 1 'fontified), when near BOB 
and when near EOB.
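
A sketch of the same check done over the whole buffer, rather than
from position 1 only; my/fontified-ranges is a hypothetical helper,
and it only reports where the `fontified' property is set, not how
much text was scanned to get there:

    (defun my/fontified-ranges ()
      "Return a list of (START . END) ranges with non-nil `fontified'."
      (save-restriction
        (widen)
        (let ((pos (point-min))
              ranges)
          (while (< pos (point-max))
            ;; Find where the property next changes, or stop at EOB.
            (let ((next (or (next-single-property-change pos 'fontified)
                            (point-max))))
              (when (get-text-property pos 'fontified)
                (push (cons pos next) ranges))
              (setq pos next)))
          (nreverse ranges))))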

>> with the caveat that font-lock always tries to backtrack to BOL when
>> fontifying the current hunk. Which makes sense, of course, but could
>> be tweaked for long lines to avoid re-fontifying the whole buffer
>> again and again.
> 
> "Tweaked" how?

15b2138719b34083 is one example.

>> IOW, IIUC the fix for font-lock performance could be better implemented
>> inside font-lock itself, as long as all the info about whether the
>> current line is "long" is available to Lisp.
> 
> No one will object to making font-lock faster.  But the experts who
> can do that are few and far between, and seem to have other itches
> to scratch, since these issues have been known for a long time, and
> were even discussed at length several times.

The fact that we have +1 contributor to the C part of Emacs (the display 
engine, etc), and a successful one at that, does nothing about the fact 
that Lisp is easier to write and debug.

If we're able to demonstrate that the remaining bottlenecks are inside 
font-lock, it should be easier to improve there.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 11:01                                             ` Gregory Heytings
@ 2022-08-02 14:53                                               ` Dmitry Gutov
  2022-08-02 15:09                                                 ` Gregory Heytings
  2022-08-02 21:18                                               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-02 14:53 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, Stefan Monnier

On 01.08.2022 14:01, Gregory Heytings wrote:
> 
>>> No, it happens when the buffer is opened.  Given the importance that 
>>> you and Stefan seem to give to that function, it is, with the patch I 
>>> sent in my previous post, called once on the whole buffer (without 
>>> any narrowing) when the file is opened.
>>
>> But if the buffer is not scrolled to the end, shouldn't it be called 
>> with a position that's close to the beginning?
>>
>> That shouldn't force the full buffer scan, meaning this call should 
>> complete quickly.
>>
> 
> I'm trying to reconcile two conflicting constraints.
> 
> It is necessary to add a locked narrowing around fontification-functions 
> and pre/post-command-hook to ensure that Emacs remains responsive.

Perhaps tweaking the value of syntax-wholeline-max could have a similar 
enough effect?

> At the same time, you and Stefan tell me that syntax-ppss does an 
> important job and will not do it correctly with such a locked narrowing, 
> IOW, that at least syntax-ppss should be called without a locked 
> narrowing.  But you also tell me that its result is cached so that a 
> full buffer scan isn't necessary anymore when it has happened at least 
> once.
> 
> So what I'm suggesting is to do a full buffer scan immediately, when the 
> file is opened, without any narrowing.  If that happens, later calls to 
> syntax-ppss inside fontification-functions and pre/post-command-hook 
> will use the cached result of the initial scan, and will do their job 
> correctly even with a locked narrowing.

syntax-ppss has two caches: one for the widened buffer (when point-min 
equals 1), and another for narrowings. The latter cache is always 
cleared when point-min changes.

The latter cache works okay-ish for multiple-major-mode scenario, but if 
redisplay narrows with different limits, that would blow the cache often. 
And if the whole chunk is big enough (i.e. bigger than the average 
distance between the cached syntax-ppss positions), that could actually 
result in syntax-ppss working slower than it could.

> Unless I misunderstand something, I think (and my tests seem to confirm) 
> that that would be a workable solution, provided that the initial scan 
> is reasonably fast, that is, at least six times faster than it is now.

I'm afraid this approach is too blunt to be a solution.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 14:29                                         ` Dmitry Gutov
@ 2022-08-02 14:57                                           ` Gregory Heytings
  2022-08-02 16:14                                             ` Eli Zaretskii
                                                               ` (2 more replies)
  2022-08-02 16:02                                           ` Eli Zaretskii
  1 sibling, 3 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-08-02 14:57 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, Eli Zaretskii, monnier


>
> Regarding the long-standing bug reports, we did solve a bunch of issues 
> already. One major one, IIUC, was redisplay of already fontified text on 
> long lines.
>

Try to open the dictionary.json with Emacs on master a month ago.  It's a 
small file (only 18 MB).  On my computer just opening the file with emacs 
-Q takes 220 seconds.  220 seconds during which Emacs is completely 
locked, because of font-lock mode.  If you're not convinced, turn 
font-lock mode off, open the file, and turn font-lock mode on.

>
> Another piece of the puzzle was added by Stefan in 15b2138719b340.
>

That looked promising, but sadly it had only a very limited effect.

>
> So perhaps we should re-evaluate the testing scenario to see where the 
> current bottlenecks are. If the current main issue is the 55s spent in 
> syntax-ppss, a more constructive approach would be to look into 
> optimizing parse-partial-sexp. Or even give up on certain scenarios, 
> admitting that waiting 55s once to visit the end of a 1 GB buffer is not 
> so bad (and that part could also be sped up by setting 
> syntax-propertize-function to nil and using a very simple syntax table, 
> for instance).
>

It is bad, especially now that it became clear that in fact it's not 
"waiting 55s once" but "waiting 55s each time the buffer is modified and 
you move to another position in the buffer".






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 14:47                                         ` Dmitry Gutov
@ 2022-08-02 15:06                                           ` Gregory Heytings
  2022-08-02 21:51                                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-02 16:11                                           ` Eli Zaretskii
  1 sibling, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-02 15:06 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, Eli Zaretskii, monnier


>
> Whether they do or not, font-lock widens by default, see 
> font-lock-dont-widen.
>

And even if one sets font-lock-dont-widen to a non-nil value, mode authors are 
completely free to ignore that setting (assuming they're aware it exists).

>
> If we're able to demonstrate that the remaining bottlenecks are inside 
> font-lock, it should be easier to improve there.
>

It has been demonstrated, as far as I can see, and repeatedly so.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 14:53                                               ` Dmitry Gutov
@ 2022-08-02 15:09                                                 ` Gregory Heytings
  2022-08-02 21:19                                                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-02 15:09 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, Eli Zaretskii, Stefan Monnier


>> It is necessary to add a locked narrowing around 
>> fontification-functions and pre/post-command-hook to ensure that Emacs 
>> remains responsive.
>
> Perhaps tweaking the value of syntax-wholeline-max could have a similar 
> enough effect?
>

It doesn't, alas.

>> Unless I misunderstand something, I think (and my tests seem to 
>> confirm) that that would be a workable solution, provided that the 
>> initial scan is reasonably fast, that is, at least six times faster 
>> than it is now.
>
> I'm afraid this approach is too blunt to be a solution.
>

Indeed, that's my conclusion too.  So until syntax-ppss (at least) is made 
an order of magnitude faster, the right thing to do is to use the forced 
narrowing method.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 14:10                                               ` Dmitry Gutov
@ 2022-08-02 15:46                                                 ` Eli Zaretskii
  2022-08-04  1:08                                                   ` Dmitry Gutov
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-02 15:46 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, gregory, monnier

> Date: Tue, 2 Aug 2022 17:10:53 +0300
> Cc: 56682@debbugs.gnu.org, gregory@heytings.org, monnier@iro.umontreal.ca
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> On 02.08.2022 05:27, Eli Zaretskii wrote:
> >> syntax-ppss cache is a list of checkpoints spread along the buffer.
> >>
> >> After a modification, only the checkpoints below it are invalidated (to
> >> be recomputed on-demand later).
> > So a suitably-concocted replace command will still invalidate a lot of
> > that cache, right?
> 
> For any cache, one can invent an operation that would result in 
> thrashing it repeatedly.

Yes, but when thrashing causes delays of dozens of seconds, the result
is not just a rare delay, the result is simply unacceptable.

> A regular search-replace should work well enough, though. Because when 
> the buffer is long, the user is likely, on average, to spend a lot more 
> time examining the occurrences and deciding whether to replace each one. 
> And since the operation goes from top to bottom, this will likely 
> invalidate the list of caches once, and then rebuild it from the 
> beginning (or from wherever the first replacement was).

We want every basic operation in such buffers to perform reasonably
well.  That's the goal of this activity.  Because partial solutions
that sometimes work we already have: there's so-long-mode, there's
longlines.el, and a couple of other tricks up our sleeve.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 14:29                                         ` Dmitry Gutov
  2022-08-02 14:57                                           ` Gregory Heytings
@ 2022-08-02 16:02                                           ` Eli Zaretskii
  1 sibling, 0 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-02 16:02 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, gregory, monnier

> Date: Tue, 2 Aug 2022 17:29:47 +0300
> Cc: monnier@iro.umontreal.ca, 56682@debbugs.gnu.org, gregory@heytings.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> > IOW, in this case the Emacs developers, due to long-standing bug
> > reports about this situation, recognized that it _can_ be improved,
> > albeit in slightly unorthodox ways, and have taken the measures to
> > optimize Emacs for the users.
> 
> It would be a shame to have the better-behaving (faster) major modes 
> exhibit worse behavior than they could have because of the approach we 
> choose to solve the long-lines problem.

Like I said: I'm okay with extending 'widen' to allow it to optionally
break the lock on the narrowing.  Then those modes which will indeed
show reasonable performance in these cases can use that optional
behavior.  Otherwise, I think we've waited for such improvements in
the modes long enough.

> Regarding the long-standing bug reports, we did solve a bunch of issues 
> already. One major one, IIUC, was redisplay of already fontified text on 
> long lines. Another piece of the puzzle was added by Stefan in 
> 15b2138719b340.

I invite you and Stefan to show that the improvement is performant
enough in the cases we are using as examples of long lines.

> So perhaps we should re-evaluate the testing scenario to see where the 
> current bottlenecks are. If the current main issue is the 55s spent in 
> syntax-ppss, a more constructive approach would be to look into 
> optimizing parse-partial-sexp. Or even give up on certain scenarios, 
> admitting that waiting 55s once to visit the end of a 1 GB buffer is not 
> so bad (and that part could also be sped up by setting 
> syntax-propertize-function to nil and using a very simple syntax table, 
> for instance).

I reported the problem with syntax-propertize to Stefan 1.5 month ago,
see https://debbugs.gnu.org/cgi/bugreport.cgi?bug=45898#92.  There was
an improvement in that department, but AFAICT the net result is still
unsatisfactory if you disable the recent long-line optimizations
(e.g., by setting long-line-threshold to nil).

So, unless I'm mistaken, the same scenario described there is still
relevant, and doesn't yet need to be reevaluated.  But more data
points are always welcome, so feel free (you and everyone else) to
test the performance with and without this feature, on different files
with different major modes, and report back what you have found.  At
least I don't think the work on this is completed, we still have a lot
of turf to cover and probably deal with fallout we haven't yet heard
about.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 14:47                                         ` Dmitry Gutov
  2022-08-02 15:06                                           ` Gregory Heytings
@ 2022-08-02 16:11                                           ` Eli Zaretskii
  1 sibling, 0 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-02 16:11 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, gregory, monnier

> Date: Tue, 2 Aug 2022 17:47:12 +0300
> Cc: 56682@debbugs.gnu.org, gregory@heytings.org, monnier@iro.umontreal.ca
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> > I believe I was talking about syntax-ppss being called from font-lock,
> > indeed.  Before Gregory's changes, if you visit a large file with very
> > long lines, and interrupt Emacs while it is non-responsive, you will
> > in many/most cases find yourself in syntax-propertize or its
> > subroutines, and you will see that it is almost always called to
> > traverse the entire long line.
> 
> Interrupt it right after pressing 'M->'? Or at any time during editing 
> the buffer later?

Just after "C-x C-f" and answering YES to the question whether to
visit it normally.

> The latter really shouldn't happen. If it does, perhaps it's the result 
> of narrowing during redisplay, which might blow syntax-ppss's caches.

I'm talking about running with narrowing disabled.

> In any case, if you could point to a scenario and the revision to test 
> it on, I'll be sure to take a look.

See my other message I sent a few minutes ago.  And I believe Gregory
also pointed to it.

> > Many major-modes do widen the buffer, though.
> 
> Whether they do or not, font-lock widens by default, see 
> font-lock-dont-widen.

Which is a breach of contract with jit-lock, if you ask me.  It's no
accident that jit-lock invokes fontification-functions only on a small
region of the buffer: that's how this feature was designed and
implemented, and it's entirely consistent with the overall design of
the basic principle of the Emacs display engine: examine only as much
of the buffer as needed to be displayed.

> >> The latter seems to be the case already (if you open xdisp.c and press
> >> M->, only top and bottom of the buffer are fontified)
> > 
> > It is not enough to look for faces in order to realize how much of the
> > buffer was scanned.
> 
> I evaluated (next-single-property-change 1 'fontified), when near BOB 
> and when near EOB.

That's not enough either.  Try running under a debugger with a
watchpoint on the position of point, while fontification-functions run,
and be amazed how many times it moves point waaay out of the region
that is about to be displayed.

> >> with the caveat that font-lock always tries to backtrack to BOL when
> >> fontifying the current hunk. Which makes sense, of course, but could
> >> be tweaked for long lines to avoid re-fontifying the whole buffer
> >> again and again.
> > 
> > "Tweaked" how?
> 
> 15b2138719b34083 is one example.

It's a good improvement, but much more is needed.

Again, why don't you try this yourself, after disabling the recently
added optimizations?

> > No one will object to making font-lock faster.  But the experts who
> > can do that are few and far between, and seem to have other itches
> > to scratch, since these issues have been known for a long time, and
> > were even discussed at length several times.
> 
> The fact that we have +1 contributor to the C part of Emacs (the display 
> engine, etc), and a successful one at that, does nothing about the fact 
> that Lisp is easier to write and debug.
> 
> If we're able to demonstrate that the remaining bottlenecks are inside 
> font-lock, it should be easier to improve there.

And it will be very easy to disable these optimizations by default
when we decide that they are no longer needed.  Meanwhile, Emacs users
get to edit long lines with reasonable performance.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 14:57                                           ` Gregory Heytings
@ 2022-08-02 16:14                                             ` Eli Zaretskii
  2022-08-02 16:19                                               ` Gregory Heytings
  2022-08-02 22:04                                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-05  2:03                                             ` Dmitry Gutov
  2 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-02 16:14 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, monnier, dgutov

> Date: Tue, 02 Aug 2022 14:57:25 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: Eli Zaretskii <eliz@gnu.org>, 56682@debbugs.gnu.org, 
>     monnier@iro.umontreal.ca
> 
> > Regarding the long-standing bug reports, we did solve a bunch of issues 
> > already. One major one, IIUC, was redisplay of already fontified text on 
> > long lines.
> 
> Try to open the dictionary.json with Emacs on master a month ago.

I believe it can also be done with the current master, just after
setting long-line-threshold to the nil value.  Right?
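
A sketch of that recipe, to be evaluated with M-: in emacs -Q,
assuming a build that has long-line-threshold and a dictionary.json
in the current directory (the file location is an assumption):

    ;; Disable the long-line optimizations.
    (setq long-line-threshold nil)
    ;; Visit the file and force a redisplay, so that font-lock runs on
    ;; the visible part.  The first element of the result is the
    ;; elapsed time in seconds.
    (benchmark-run 1
      (find-file "dictionary.json")
      (redisplay t))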






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 16:14                                             ` Eli Zaretskii
@ 2022-08-02 16:19                                               ` Gregory Heytings
  2022-08-03  0:00                                                 ` Dmitry Gutov
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-02 16:19 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, monnier, dgutov


>> Try to open the dictionary.json with Emacs on master a month ago.
>
> I believe it can also be done with the current master, just after 
> setting long-line-threshold to the nil value.  Right?
>

Indeed.  With master from one month ago it's even more crystal-clear that 
you see the statu quo ante.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 11:01                                             ` Gregory Heytings
  2022-08-02 14:53                                               ` Dmitry Gutov
@ 2022-08-02 21:18                                               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-02 21:26                                                 ` Gregory Heytings
  1 sibling, 1 reply; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-02 21:18 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, Dmitry Gutov

> It is necessary to add a locked narrowing around
> fontification-functions and pre/post-command-hook to ensure that Emacs
> remains responsive.

It's neither necessary (it's perfectly possible to do something quickly
that just needs to look at the first few lines of the buffer to decide
in which way to parse the nearby surrounding bytes, for example) nor
sufficient (it's easy to spend minutes wasting time running in circles
because of a bug, e.g. a bug triggered by the "unusual" nature of the
visible part of the buffer after an arbitrary narrowing).

It's very useful to get closer to this goal like your code does, but
let's keep things in perspective.


        Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 15:09                                                 ` Gregory Heytings
@ 2022-08-02 21:19                                                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-03  2:30                                                     ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-02 21:19 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, Dmitry Gutov

> Indeed, that's my conclusion too.  So until syntax-ppss (at least) is made
> an order of magnitude faster, the right thing to do is to use the forced
> narrowing method.

Tying this to "long lines" is wrong, since it has nothing to do with
long lines, only with large buffers.


        Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 21:18                                               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-08-02 21:26                                                 ` Gregory Heytings
  2022-08-02 22:26                                                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-02 21:26 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Eli Zaretskii, Dmitry Gutov


>> It is necessary to add a locked narrowing around 
>> fontification-functions and pre/post-command-hook to ensure that Emacs 
>> remains responsive.
>
> It's neither necessary (it's perfectly possible to do something quickly 
> that just needs to look at the first few lines of the buffer to decide 
> in which way to parse the nearby surrounding bytes, for example) nor 
> sufficient (it's easy to spend minutes wasting time running in circles 
> because of a bug, e.g. a bug triggered by the "unusual" nature of the 
> visible part of the buffer after an arbitrary narrowing).
>

You're a mathematician, aren't you?  I'm not speaking of mathematical 
(i.e. absolute) necessity here, but of a practical necessity.  And I did 
not say (or think) that it is sufficient.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 11:23                                         ` Gregory Heytings
  2022-08-01 21:53                                           ` Dmitry Gutov
@ 2022-08-02 21:32                                           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-02 21:43                                             ` Gregory Heytings
  1 sibling, 1 reply; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-02 21:32 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, Dmitry Gutov

Gregory Heytings [2022-08-01 11:23:52] wrote:

>>> The problem is that this is, as I said, slow.  On my laptop, opening
>>> a 1 GB file takes about 6 seconds.  The call to syntax-ppss adds 70
>>> seconds, so opening a large file becomes an order of magnitude slower (13
>>> times slower).
>>
>> It's meaningless to talk about the time taken by `syntax-ppss` without
>> specifying the major mode that was in use.
>>
>
> It isn't.  The benchmark above was with a JSON file (js-mode), but you'll
> see the same ratio with an Elisp file for example:

Hmmm was that using the GNU ELPA `json-mode` package?
If so, then indeed there's no `syntax-propertize-function` set up there
and `syntax-ppss` should ideally be not much slower than
`parse-partial-sexp`.

> for I in $(seq 1 2500); do cat lisp/simple.el; done > complex.el
>
> That file opens in about 5 seconds, and (benchmark-run 1 (syntax-ppss
> (point-max))) takes about 45 seconds.
>
> Sure, there are perhaps modes that are slower, but my tests seem to indicate
> that the 1/10 ratio is correct, or IOW that syntax-ppss is an order of
> magnitude slower than opening the file.

You might be right.
But there are still significant differences between different major modes:

    ELISP> (benchmark-run 1 (fundamental-mode) (parse-partial-sexp (point-min) (point-max)))
    (0.276774213 0 0.0)
    
    ELISP> (benchmark-run 1 (fundamental-mode) (syntax-ppss (point-max)))
    (0.329234636 0 0.0)
    
    ELISP> (benchmark-run 1 (emacs-lisp-mode) (syntax-ppss (point-max)))
    (0.392759479 0 0.0)
    
    ELISP> (benchmark-run 1 (js-mode) (syntax-ppss (point-max)))
    (1.036089104 7 0.20054423700000001)
    
    ELISP> (benchmark-run 1 (nxml-mode) (syntax-ppss (point-max)))
    (1.169055192 7 0.15886504199999996)
    
    ELISP> (benchmark-run 1 (cperl-mode) (syntax-ppss (point-max)))
    (1.857638439 9 0.19724271499999996)

(this was in a 5MB buffer).
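
To repeat this kind of comparison on another file, something along the
following lines could be used.  This is only a sketch:
`my/bench-syntax-ppss` is a hypothetical helper name, not part of Emacs,
and unlike the figures above it times `syntax-ppss` alone, with the major
mode already enabled in a temporary copy of the buffer text.

```elisp
(require 'benchmark)

(defun my/bench-syntax-ppss (modes)
  "Time `syntax-ppss' over a copy of the current buffer in each of MODES.
Return an alist of (MODE . SECONDS).  Hypothetical helper, for illustration."
  (let ((text (buffer-substring-no-properties (point-min) (point-max))))
    (mapcar
     (lambda (mode)
       (with-temp-buffer
         (insert text)
         ;; Enable the major mode first, so only the parse itself is timed.
         (funcall mode)
         ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-TIME); keep ELAPSED.
         (cons mode (car (benchmark-run 1 (syntax-ppss (point-max)))))))
     modes)))
```

It would be called from the buffer under test, e.g.
(my/bench-syntax-ppss '(fundamental-mode emacs-lisp-mode js-mode)).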


        Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 10:00                                       ` Gregory Heytings
@ 2022-08-02 21:40                                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-02 22:14                                           ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-02 21:40 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii

> I see what you mean now.  But I don't think it would work.  What I want is
> to take reasonable measures to ensure that Emacs remains responsive "in
> spite of" mode maintainers.

"Reasonable measures" is good.
But once you step into "impossible to circumvent", you start going
directly against the goals of Free Software (and Emacs) to empower the user.

>> because there are still plenty of other ways they can shoot themselves in
>> the foot,
> I'm curious, are there other ways for a regular user (*not* an Elisp
> hacker!) to make Emacs completely unresponsive with regular editing
> commands, starting with emacs -Q?

We've had enough bug reports over the years (and I experienced such
things many times as well), yes.  Usually linked to bugs in some
package, of course, but in many cases it wasn't a clear-cut bug, just
some bad interaction of various elements.

> I'm curious again, because I cannot imagine what that could be either.

I suffer from the same lack of imagination, as I'm sure many others here
do.  To make up for that, we've learned to follow a philosophy of
empowering the users (and not imposing arbitrary limits just because we
couldn't think of good reasons to go beyond them).

> Also note that you can blame yourself if you don't like the locked narrowing
> idea.  See the following three lines which you added ten years ago in
> eval.c:
>
> /* Don't export this variable to Elisp, so no one can mess with it
>    (Just imagine if someone makes it buffer-local).  */
> Funintern (Qinternal_interpreter_environment, Qnil);
>
> and which make it technically impossible to do something, incidentally
> without providing a way for Elisp hackers to escape that impossibility.

:-)


        Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 21:32                                           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-08-02 21:43                                             ` Gregory Heytings
  2022-08-03 21:26                                               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-02 21:43 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Eli Zaretskii, Dmitry Gutov


>
> Hmmm was that using the GNU ELPA `json-mode` package?
>

No, it's with emacs -Q.

>> Sure, there are perhaps modes that are slower, but my tests seem to 
>> indicate that the 1/10 ratio is correct, or IOW that syntax-ppss is an 
>> order of magnitude slower than opening the file.
>
> You might be right.
>
> But there are still significant differences between different major 
> modes:
>
> ELISP> (benchmark-run 1 (fundamental-mode) (parse-partial-sexp (point-min) (point-max)))
> (0.276774213 0 0.0)
>
> ELISP> (benchmark-run 1 (fundamental-mode) (syntax-ppss (point-max)))
> (0.329234636 0 0.0)
>
> ELISP> (benchmark-run 1 (emacs-lisp-mode) (syntax-ppss (point-max)))
> (0.392759479 0 0.0)
>
> ELISP> (benchmark-run 1 (js-mode) (syntax-ppss (point-max)))
> (1.036089104 7 0.20054423700000001)
>
> ELISP> (benchmark-run 1 (nxml-mode) (syntax-ppss (point-max)))
> (1.169055192 7 0.15886504199999996)
>
> ELISP> (benchmark-run 1 (cperl-mode) (syntax-ppss (point-max)))
> (1.857638439 9 0.19724271499999996)
>
> (this was in a 5MB buffer).
>

Yes, that's correct.  (But did you test each mode with the same 5 MB 
buffer?  If so, that's perhaps not representative of what happens in 
reality.)  The general idea is that syntax-ppss is currently an order of 
magnitude too slow for "too large" buffers.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 12:35                                       ` Eli Zaretskii
  2022-08-02 14:47                                         ` Dmitry Gutov
@ 2022-08-02 21:46                                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-03  2:33                                           ` Eli Zaretskii
  2022-08-02 21:49                                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2 siblings, 1 reply; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-02 21:46 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, Dmitry Gutov

>> So ideally font-lock is either called with undo-able narrowing, or is 
>> simply passed a range of positions, and shouldn't fontify too far from them.
> Many major-modes do widen the buffer, though.

Actually, IIUC this should be considered a bug (it breaks the use of
that major mode in mmm-mode and friends).

[ And `grep` suggests that less than half of progmodes/*.el use `widen`.  ]


        Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 12:35                                       ` Eli Zaretskii
  2022-08-02 14:47                                         ` Dmitry Gutov
  2022-08-02 21:46                                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-08-02 21:49                                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-02 21:53                                           ` Gregory Heytings
  2 siblings, 1 reply; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-02 21:49 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, Dmitry Gutov

> No one will object to making font-lock faster.  But the experts who
> can do that are few and far between, and seem to have other itches
> to scratch, since these issues have been known for a long time, and
> were even discussed at length several times.

With a locked narrowing that can't be circumvented, this itch can't be
scratched any more :-(


        Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 15:06                                           ` Gregory Heytings
@ 2022-08-02 21:51                                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-02 21:51 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, Dmitry Gutov

> And even if one sets font-lock-dont-widen to nil, mode authors are
> completely free to ignore that setting (assuming they're aware it exists).

Hmm... `font-lock-dont-widen` is not about telling the major mode's
font-lock code to refrain from narrowing.  It's there so a major mode
(like mmm-mode) can tell font-lock.el to refrain from widening.

The major-mode's own font-lock rules should never need to widen
(because font-lock itself does it, normally).  If they do, it's a bug.


        Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 21:49                                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-08-02 21:53                                           ` Gregory Heytings
  2022-08-03  8:42                                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-02 21:53 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Eli Zaretskii, Dmitry Gutov


>
> With a locked narrowing that can't be circumvented, this itch can't be 
> scratched any more :-(
>

Once again, it can!  Just M-: (setq long-line-threshold nil) RET.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 14:57                                           ` Gregory Heytings
  2022-08-02 16:14                                             ` Eli Zaretskii
@ 2022-08-02 22:04                                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-05  2:03                                             ` Dmitry Gutov
  2 siblings, 0 replies; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-02 22:04 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, Dmitry Gutov

>> Another piece of the puzzle was added by Stefan in 15b2138719b340.
> That looked promising, but sadly it had only a very limited effect.

FWIW, that was also my experience :-(

>> So perhaps we should re-evaluate the testing scenario to see where the
>> current bottlenecks are. If our current main issue is the 55s spent in
>> syntax-ppss, a more constructive approach would be to look into optimizing
>> parse-partial-sexp. Or even give up on certain scenarios, admitting that
>> waiting 55s once to visit the end of a 1 GB buffer is not so bad (and that
>> part could also be sped up by setting syntax-propertize-function to
>> nil and using a very simple syntax table, for instance).

Those 55s were with a `syntax-propertize-function` set to nil already.

We could do a bit better in some modes, tho, e.g. in fundamental mode no
char has string or comment syntax, IIRC, so we could arguably make
`syntax-ppss` return a value without parsing anything at all (except
that `syntax-ppss` is also supposed to count parentheses, which *are*
present in fundamental-mode).

> It is bad, especially now that it became clear that in fact it's not
> "waiting 55s once" but "waiting 55s each time the buffer is modified and you
> move to another position in the buffer".

Actually, if 55s is the time it takes from BOB to EOB, then the time for
"change at POS1 plus move to POS2" should be approximately

    55s * (POS2 - POS1) / (buffer-size)
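
As a rough worked example of this estimate, here is a sketch.  The
function name is made up, and the model rests on two assumptions: parsing
cost is linear in the distance covered, and a move to a position before
the change is treated as free.

```elisp
(defun my/estimated-reparse-seconds (full-scan-seconds pos1 pos2 buffer-size)
  "Estimate seconds to reparse after a change at POS1 and a move to POS2.
FULL-SCAN-SECONDS is the time measured to parse from BOB to EOB of a
buffer of BUFFER-SIZE characters.  Hypothetical helper, for illustration."
  (* full-scan-seconds
     ;; Fraction of the buffer between the change and the new position;
     ;; clamped at 0 when POS2 lies before POS1 (an assumption of this sketch).
     (/ (float (max 0 (- pos2 pos1))) buffer-size)))

;; 55s for a full scan of a 1 GB buffer, moving 100 MB past the change:
;; 55 * (100 MB / 1 GB), i.e. about 5.5 seconds.
(my/estimated-reparse-seconds 55 400000000 500000000 1000000000)
```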


-- Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 21:40                                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-08-02 22:14                                           ` Gregory Heytings
  2022-08-03  8:39                                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-02 22:14 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Eli Zaretskii


>> I see what you mean now.  But I don't think it would work.  What I want 
>> is to take reasonable measures to ensure that Emacs remains responsive 
>> "in spite of" mode maintainers.
>
> "Reasonable measures" is good.
>
> But once you step into "impossible to circumvent", you start going 
> directly against the goals of Free Software (and Emacs) to empower the 
> user.
>

But it is *not* impossible to circumvent.  Just add (setq 
long-line-threshold nil) in your init file, or evaluate it with M-:, or...

>
> I suffer from the same lack of imagination, as I'm sure many others here 
> do.  To make up for that, we've learned to follow a philosophy of 
> empowering the users (and not imposing arbitrary limits just because we 
> couldn't think of good reasons to go beyond them).
>

I'm all for empowering the users who want power, and the changeset is, 
intentionally, fully backward compatible.  In fact, strictly speaking, 
this changeset does exactly that: it empowers users, they can do things 
they couldn't do earlier.  But I'm also for protecting regular users who 
want a powerful editor that "just works".  In the same vein, we are 
marking a number of local variables as risky, to protect regular users, 
without preventing users from shooting themselves in the foot by adding

(defalias 'risky-local-variable-p 'ignore)

in their init file if they see fit.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 21:26                                                 ` Gregory Heytings
@ 2022-08-02 22:26                                                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-02 22:52                                                     ` Gregory Heytings
  2022-08-03 11:54                                                     ` Eli Zaretskii
  0 siblings, 2 replies; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-02 22:26 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, Dmitry Gutov

> You're a mathematician, aren't you?  I'm not speaking of mathematical
> (i.e. absolute) necessity here, but of a practical necessity.

I agree with the practical necessity to narrow.
I strongly disagree with the necessity to make re-widening
technically impossible, I find it fundamentally incompatible with
Emacs's philosophy and can't see any practical justification here.
Just narrow and make sure jit-lock.el and font-lock.el don't
accidentally widen it.  Any other accidental widening should be
considered as a bug anyway (and we could even easily cook up some ad-hoc
advice to try and detect those cases for people like me who like to run
their Emacs with lots of extra runtime debugging checks).
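
Such debugging advice might look roughly like the sketch below.  All the
`my/...` names are hypothetical.  One caveat: advice on a primitive such
as `widen` is not seen by calls made from C code, and byte-compiled Lisp
may call the primitive through a dedicated bytecode rather than through
the symbol, so this would only catch some callers.

```elisp
(defvar my/widen-watch-enabled nil
  "Non-nil means record calls to `widen' made while the buffer is narrowed.")

(defvar my/widen-watch-log nil
  "List of buffer names in which a watched `widen' call was seen.")

(defun my/widen-watch (&rest _args)
  "Record the current buffer if it is narrowed and the watch is enabled."
  (when (and my/widen-watch-enabled (buffer-narrowed-p))
    (push (buffer-name) my/widen-watch-log)
    (message "widen called in narrowed buffer %s" (buffer-name))))

;; Run the check before every advised call to `widen'.
(advice-add 'widen :before #'my/widen-watch)
```

Code that sets up the narrowing would bind `my/widen-watch-enabled` to t
around the call being checked, and (advice-remove 'widen #'my/widen-watch)
undoes the advice.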


        Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 22:26                                                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-08-02 22:52                                                     ` Gregory Heytings
  2022-08-03  8:34                                                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-03 11:54                                                     ` Eli Zaretskii
  1 sibling, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-02 22:52 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Eli Zaretskii, Dmitry Gutov


>
> I strongly disagree with the necessity to make re-widening technically 
> impossible, I find it fundamentally incompatible with Emacs's philosophy 
> and can't see any practical justification here.
>

As I said earlier, offering a way to escape the locked narrowing right 
away simply means that the current solution isn't one anymore.  Elisp 
programmers would get into the habit of using (widen-unlock) instead of 
(widen) in their programs, and in a couple of years we would again see bug 
reports by users who cannot edit buffers with long lines.

>
> Just narrow and make sure jit-lock.el and font-lock.el don't 
> accidentally widen it.
>

Now I'm lost.  Isn't this what is happening right now: "narrow and make 
sure jit-lock and font-lock don't accidentally widen it"?  What am I 
missing?

>
> Any other accidental widening should be considered as a bug anyway (and 
> we could even easily cook up some ad-hoc advice to try and detect those 
> cases for people like me who like to run their Emacs with lots of extra 
> runtime debugging checks).
>

There are, I fear, too many bugs related to that problem (you just said 
that half of the modes in core do use widen), and it does not seem 
reasonable to hope that they will all be fixed anytime soon.  If at some 
point they are, the current solution will not be necessary anymore, and it 
will be very easy to reset the default value of long-line-threshold to nil.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 16:19                                               ` Gregory Heytings
@ 2022-08-03  0:00                                                 ` Dmitry Gutov
  2022-08-03  0:26                                                   ` Dmitry Gutov
  0 siblings, 1 reply; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-03  0:00 UTC (permalink / raw)
  To: Gregory Heytings, Eli Zaretskii; +Cc: 56682, monnier

On 02.08.2022 19:19, Gregory Heytings wrote:
> 
>>> Try to open the dictionary.json with Emacs on master a month ago.
>>
>> I believe it can also be done with the current master, just after 
>> setting long-line-threshold to the nil value.  Right?
>>
> 
> Indeed.  With master from one month ago it's even more crystal-clear 
> that you see the statu quo ante.

If I set long-line-threshold to nil, does that also disable the 
redisplay optimizations related to long lines?

Ones that caused scrolling delays even after the buffer has been fully 
fontified.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-03  0:00                                                 ` Dmitry Gutov
@ 2022-08-03  0:26                                                   ` Dmitry Gutov
  2022-08-03  8:11                                                     ` Gregory Heytings
  2022-08-03 11:56                                                     ` Eli Zaretskii
  0 siblings, 2 replies; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-03  0:26 UTC (permalink / raw)
  To: Gregory Heytings, Eli Zaretskii; +Cc: 56682, monnier

On 03.08.2022 03:00, Dmitry Gutov wrote:
> On 02.08.2022 19:19, Gregory Heytings wrote:
>>
>>>> Try to open the dictionary.json with Emacs on master a month ago.
>>>
>>> I believe it can also be done with the current master, just after 
>>> setting long-line-threshold to the nil value.  Right?
>>>
>>
>> Indeed.  With master from one month ago it's even more crystal-clear 
>> that you see the statu quo ante.
> 
> If I set long-line-threshold to nil, does that also disable the 
> redisplay optimizations related to long lines?
> 
> Ones that caused scrolling delays even after the buffer has been fully 
> fontified.

I mean those that *fixed* the said scrolling delays, of course.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 21:19                                                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-08-03  2:30                                                     ` Eli Zaretskii
  2022-08-03  8:37                                                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-03  2:30 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, gregory, dgutov

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Dmitry Gutov <dgutov@yandex.ru>,  56682@debbugs.gnu.org,  Eli Zaretskii
>  <eliz@gnu.org>
> Date: Tue, 02 Aug 2022 17:19:53 -0400
> 
> > Indeed, that's my conclusion too.  So until syntax-ppss (at least) is made
> > an order of magnitude faster, the right thing to do is to use the forced
> > narrowing method.
> 
> Tying this to "long lines" is wrong, since it has nothing to do with
> long lines, only with large buffers.

I thought you told me once that syntax-propertize needs to consider
complete lines in some (frequent) situations?






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 21:46                                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-08-03  2:33                                           ` Eli Zaretskii
  0 siblings, 0 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-03  2:33 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, gregory, dgutov

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Dmitry Gutov <dgutov@yandex.ru>,  56682@debbugs.gnu.org,
>   gregory@heytings.org
> Date: Tue, 02 Aug 2022 17:46:49 -0400
> 
> >> So ideally font-lock is either called with undo-able narrowing, or is 
> >> simply passed a range of positions, and shouldn't fontify too far from them.
> > Many major-modes do widen the buffer, though.
> 
> Actually, IIUC this should be considered a bug (it breaks the use of
> that major mode in mmm-mode and friends).
> 
> [ And `grep` suggests that less than half of progmodes/*.el use `widen`.  ]

That fits my definition of "many".

And we also need to consider specifically those modes which are likely
to be used in files with long lines.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-03  0:26                                                   ` Dmitry Gutov
@ 2022-08-03  8:11                                                     ` Gregory Heytings
  2022-08-03 11:56                                                     ` Eli Zaretskii
  1 sibling, 0 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-08-03  8:11 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, Eli Zaretskii, monnier



>>>>> Try to open the dictionary.json with Emacs on master a month ago.
>>>> 
>>>> I believe it can also be done with the current master, just after 
>>>> setting long-line-threshold to the nil value.  Right?
>>> 
>>> Indeed.  With master from one month ago it's even more crystal-clear 
>>> that you see the statu quo ante.
>> 
>> If I set long-line-threshold to nil, does that also disable the 
>> redisplay optimizations related to long lines?
>> 
>> Ones that caused scrolling delays even after the buffer has been fully 
>> fontified.
>
> I mean those that *fixed* the said scrolling delays, of course.
>

To clarify, you need to (setq long-line-threshold nil syntax-wholeline-max 
most-positive-fixnum) to "recover" (more or less) how master was behaving 
a month ago.  Otherwise you'll see the effect of syntax-wholeline-max, 
which can be either positive or negative.

Three recipes (with today's master):

emacs -Q
M-: (setq long-line-threshold nil syntax-wholeline-max most-positive-fixnum) RET
C-x C-f dictionary.json RET y ;; takes 160 seconds
C-e ;; takes 200 seconds

emacs -Q
M-: (setq long-line-threshold nil) RET
C-x C-f dictionary.json RET y ;; immediate
C-e ;; not finished after 1200 seconds (20 minutes), I killed Emacs

emacs -Q
C-x C-f dictionary.json RET y ;; immediate
C-e ;; immediate


* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 22:52                                                     ` Gregory Heytings
@ 2022-08-03  8:34                                                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-03  9:04                                                         ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-03  8:34 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, Dmitry Gutov

> As I said earlier, offering a way to escape the locked narrowing right away
> simply means that the current solution isn't one anymore.
> Elisp programmers would take the habit of using (widen-unlock) instead of
> (widen) in their programs, and in a couple of years we'll see again bug
> reports by users who cannot edit buffers with long lines.

I don't think we need such a paternalistic view of ELisp programmers.
ELisp programmers aren't out there looking for ways to mess things up.
If we give them good tools that make it easy to solve the usual problems
with needing `widen-unlock`, they won't start using it recklessly everywhere.


        Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-03  2:30                                                     ` Eli Zaretskii
@ 2022-08-03  8:37                                                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-03 12:08                                                         ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-03  8:37 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, dgutov

>> > Indeed, that's my conclusion too.  So until syntax-ppss (at least) is made
>> > an order of magnitude faster, the right thing to do is to use the forced
>> > narrowing method.
>> 
>> Tying this to "long lines" is wrong, since it has nothing to do with
>> long lines, only with large buffers.
>
> I thought you told me once that syntax-propertize needs to consider
> complete lines in some (frequent) situations?

Yes, but we're talking about `syntax-ppss` here.  Admittedly,
`syntax-ppss` uses `syntax-propertize` internally, but I think the two
need to be considered separately (and `syntax-propertize` already tries
to bound its work via `syntax-wholeline-max`).


        Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 22:14                                           ` Gregory Heytings
@ 2022-08-03  8:39                                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-03  8:39 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii

>> But once you step into "impossible to circumvent", you start going
>> directly against the goals of Free Software (and Emacs) to empower
>> the user.
> But it is *not* impossible to circumvent.  Just add (setq
> long-line-threshold nil) in your init file, or evaluate it with M-:, or...

That's like saying you can fix the ugly color on strings by turning off font-lock.


        Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 21:53                                           ` Gregory Heytings
@ 2022-08-03  8:42                                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-03  8:42 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, Dmitry Gutov

>> With a locked narrowing that can't be circumvented, this itch can't be
>> scratched any more :-(
> Once again, it can!  Just M-: (setq long-line-threshold nil) RET.

The itch is to make long-lines work better than with your code.
Not to make them completely unusable.


        Stefan







* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-03  8:34                                                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-08-03  9:04                                                         ` Gregory Heytings
  2022-08-03 20:33                                                           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-03  9:04 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Eli Zaretskii, Dmitry Gutov


>
> I don't think we need such a paternalistic view of ELisp programmers. 
> ELisp programmers aren't out there looking for ways to mess things up. 
> If we give them good tools that make it easy to solve the usual problems 
> with needing `widen-unlock`, they won't start using it recklessly 
> everywhere.
>

I'd say it is realistic, based on the observation that half of the 
programming language modes in core do use widen, which they weren't 
supposed to do.

But this discussion is leading nowhere.  If you could point to an 
actual (or even potential) problem caused by this locked narrowing, apart 
from an occasional mis-fontification, that would perhaps help it to 
advance.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 22:26                                                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-02 22:52                                                     ` Gregory Heytings
@ 2022-08-03 11:54                                                     ` Eli Zaretskii
  2022-08-03 20:36                                                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-03 11:54 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, gregory, dgutov

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Dmitry Gutov <dgutov@yandex.ru>,  56682@debbugs.gnu.org,  Eli Zaretskii
>  <eliz@gnu.org>
> Date: Tue, 02 Aug 2022 18:26:51 -0400
> 
> I strongly disagree with the necessity to make re-widening
> technically impossible, I find it fundamentally incompatible with
> Emacs's philosophy and can't see any practical justification here.
> Just narrow and make sure jit-lock.el and font-lock.el don't
> accidentally widen it.

What do you mean by "accidentally" in this context?  A program rarely
does anything "accidentally", it almost always does what the
programmer wanted it to do.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-03  0:26                                                   ` Dmitry Gutov
  2022-08-03  8:11                                                     ` Gregory Heytings
@ 2022-08-03 11:56                                                     ` Eli Zaretskii
  2022-08-04  1:08                                                       ` Dmitry Gutov
  1 sibling, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-03 11:56 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, gregory, monnier

> Date: Wed, 3 Aug 2022 03:26:02 +0300
> From: Dmitry Gutov <dgutov@yandex.ru>
> Cc: 56682@debbugs.gnu.org, monnier@iro.umontreal.ca
> 
> On 03.08.2022 03:00, Dmitry Gutov wrote:
> > On 02.08.2022 19:19, Gregory Heytings wrote:
> >>
> >>>> Try to open the dictionary.json with Emacs on master a month ago.
> >>>
> >>> I believe it can also be done with the current master, just after 
> >>> setting long-line-threshold to the nil value.  Right?
> >>>
> >>
> >> Indeed.  With master from one month ago it's even more crystal-clear 
> >> that you see the statu quo ante.
> > 
> > If I set long-line-threshold to nil, does that also disable the 
> > redisplay optimizations related to long lines?
> > 
> > Ones that caused scrolling delays even after the buffer has been fully 
> > fontified.
> 
> I mean those that *fixed* the said scrolling delays, of course.

I don't think I understand which changes you had in mind (could you
give a less vague pointer?), but I think the answer is NO: that
variable disables only the recent changes related to long lines.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-03  8:37                                                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-08-03 12:08                                                         ` Eli Zaretskii
  2022-08-03 20:38                                                           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-03 12:08 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, gregory, dgutov

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: gregory@heytings.org,  dgutov@yandex.ru,  56682@debbugs.gnu.org
> Date: Wed, 03 Aug 2022 04:37:55 -0400
> 
> >> > Indeed, that's my conclusion too.  So until syntax-ppss (at least) is made
> >> > an order of magnitude faster, the right thing to do is to use the forced
> >> > narrowing method.
> >> 
> >> Tying this to "long lines" is wrong, since it has nothing to do with
> >> long lines, only with large buffers.
> >
> > I thought you told me once that syntax-propertize needs to consider
> > complete lines in some (frequent) situations?
> 
> Yes, but we're talking about `syntax-ppss` here.  Admittedly,
> `syntax-ppss` uses `syntax-propertize` internally, but I think the two
> need to be considered separately (and `syntax-propertize` already tries
> to bound its work via `syntax-wholeline-max`).

What about parse-partial-sexp, which calls scan_sexps_forward?  It
looks like I've misremembered, and that was the culprit in the
scenario we discussed, see
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=45898#92.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-03  9:04                                                         ` Gregory Heytings
@ 2022-08-03 20:33                                                           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-03 21:37                                                             ` Gregory Heytings
  2022-08-03 22:10                                                             ` Gregory Heytings
  0 siblings, 2 replies; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-03 20:33 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, Dmitry Gutov

> I'd say it is realistic, based on the observation that half of the
> programming language modes in core do use widen, which they weren't supposed
> to do.

AFAICT, most uses of widen in lisp/progmodes are *not* within code
related to font-lock (unsurprisingly, since font-lock.el has been
widening "for ever", so widening in the major mode has always been
redundant).

> But this discussion is leading nowhere.  If you could point to an
> actual (or even potential) problem caused by this locked narrowing,
> apart from an occasional mis-fontification, that would perhaps help it
> to advance.

If I were maintainer I'd just refuse such a change without corresponding
escape hatch, based on the experience gained over the years about how
Emacs is and should be designed.

As a package contributor I just find it offensive that the C code would
go to extra trouble in order to cut me out under the premise that "Elisp
programmers would take the habit of using (widen-unlock) instead of
(widen) in their programs", which I read as "we have to protect ELisp
contributors from themselves".

Also, I'm trying to imagine a scenario that leads to such an abuse:
- under normal circumstances, there are no long lines, so they'll never
  hit a "locked" narrowing and it will thus never occur to them to use
  a `widen-unlock`.
- when they get a bug report with a locked narrowing because of long
  lines, using `widen-unlock` naively is likely to lead to an immediate
  performance problem, so it's unlikely they'll use it.
I just don't buy it.


        Stefan






^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-03 11:54                                                     ` Eli Zaretskii
@ 2022-08-03 20:36                                                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-04  5:30                                                         ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-03 20:36 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, dgutov

> What do you mean by "accidentally" in this context?  A program rarely
> does anything "accidentally", it almost always does what the
> programmer wanted it to do.

I mean that currently font-lock just blindly widens all the time, so we
should fix it so it only widens when it's expected to (i.e. not when
the narrowing is installed by the LLT code).
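
One existing way to keep font-lock inside a restriction is the
`font-lock-dont-widen` variable.  The following is only a minimal sketch,
assuming that variable is the mechanism a fix would build on (nothing in
this thread confirms that); `my/fontify-within-restriction` is a
hypothetical helper, not code from jit-lock.el or font-lock.el:

```elisp
(require 'font-lock)

(defun my/fontify-within-restriction (beg end)
  "Fontify the text between BEG and END without widening the buffer.
Hypothetical helper for illustration: it narrows to BEG..END and
binds `font-lock-dont-widen' so font-lock keeps that restriction."
  (save-restriction
    (narrow-to-region beg end)
    ;; With this bound to non-nil, font-lock works on the
    ;; non-widened buffer instead of calling `widen' first.
    (let ((font-lock-dont-widen t))
      (font-lock-fontify-region (point-min) (point-max)))))
```

Text outside BEG..END is neither examined nor refontified by this call,
which is also why a construct that starts before BEG can end up
mis-fontified.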


        Stefan






^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-03 12:08                                                         ` Eli Zaretskii
@ 2022-08-03 20:38                                                           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-04  5:40                                                             ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-03 20:38 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, dgutov

> What about parse-partial-sexp, which calls scan_sexps_forward?  It
> looks like I've misremembered, and that was the culprit in the
> scenario we discussed, see
> https://debbugs.gnu.org/cgi/bugreport.cgi?bug=45898#92.

The wholelines problem did not kick in because of PPS nor `syntax-ppss`
but because of font-lock (which then called `syntax-ppss` which then
called PPS).


        Stefan






^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 21:43                                             ` Gregory Heytings
@ 2022-08-03 21:26                                               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-03 21:42                                                 ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-03 21:26 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, Dmitry Gutov

Gregory Heytings [2022-08-02 21:43:35] wrote:
>> Hmmm was that using the GNU ELPA `json-mode` package?
> No, it's with emacs -Q.

Sorry, I'm not sure how I missed that you said you used `js-mode`.

> Yes, that's correct.  (But did you test each mode with the same 5 MB buffer?

Yup, a 5MB binary file.

> If so, that's perhaps not representative of what happens in reality.)

Indeed, tho it gives a rough idea of how it can change.

> The general idea is that syntax-ppss is currently an order of
> magnitude too slow for "too large" buffers.

And that it can be almost another order of magnitude worse in some major
modes because of `syntax-propertize`.


        Stefan






^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-03 20:33                                                           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-08-03 21:37                                                             ` Gregory Heytings
  2022-08-03 22:42                                                               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-03 22:10                                                             ` Gregory Heytings
  1 sibling, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-03 21:37 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Eli Zaretskii, Dmitry Gutov


>
> Also, I'm trying to imagine a scenario that leads to such an abuse:
>
> - under normal circumstances, there are no long lines, so they'll never 
> hit a "locked" narrowing and it will thus never occur to them to use a 
> `widen-unlock`.
>
> - when they get a bug report with a locked narrowing because of long 
> lines, using `widen-unlock` naively is likely to lead to an immediate 
> performance problem, so it's unlikely they'll use it.
>

When I read this, I thought you had a point, but there's a fallacy in your 
reasoning: using widen-unlock is in fact not likely to lead to an 
immediate performance problem.  The long-line-threshold limit is 
sufficiently high to never be reached in "normal" files, but nothing would 
happen if you cross that limit by a small amount, and nothing would even 
happen at twice or even thrice that limit.

If a mode author gets a bug report that is caused by locked narrowing, 
there is something wrong in the way the mode fontifies the buffer.  There 
is no reason to require access to the whole buffer to fontify a small chunk 
of that buffer.  IOW, using widen-unlock there is nearly always wrong (I 
add "nearly" to leave open the possibility that there might be an 
exception).

This is becoming so litigious (you're now telling me that you're offended) 
that I start to believe that the right thing might in fact be to 
completely disable font locking in such buffers.  Would "no highlighting" 
be better than "occasional mis-highlighting" from your point of view?





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-03 21:26                                               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-08-03 21:42                                                 ` Gregory Heytings
  2022-08-03 22:43                                                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-03 21:42 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Eli Zaretskii, Dmitry Gutov


>> The general idea is that syntax-ppss is currently an order of magnitude 
>> too slow for "too large" buffers.
>
> And that it can be almost another order of magnitude worse in some major 
> modes because of `syntax-propertize`.
>

That's possible, I don't know.  I might take a look at that later.  But 
note that your reactions are not exactly encouraging me to do so.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-03 20:33                                                           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-03 21:37                                                             ` Gregory Heytings
@ 2022-08-03 22:10                                                             ` Gregory Heytings
  1 sibling, 0 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-08-03 22:10 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Eli Zaretskii, Dmitry Gutov


By the way, I found at least one other case in which Emacs silently 
refuses to do what the programmer asked it to do.

A recipe:

emacs -Q
M-: (setq max-specpdl-size 100) RET
M-: max-specpdl-size RET ;; this returns 100, as expected
C-x C-f lisp/simple.el RET
M-: max-specpdl-size RET ;; this returns 400, why?
M-: (setq max-specpdl-size 100) RET
M-: max-specpdl-size RET ;; this returns 100 again, as expected
C-v ;; lean on the key until the end of the buffer
M-: max-specpdl-size RET ;; this returns 265, why?





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-03 21:37                                                             ` Gregory Heytings
@ 2022-08-03 22:42                                                               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-04  1:29                                                                 ` Gregory Heytings
  2022-08-04  6:08                                                                 ` Eli Zaretskii
  0 siblings, 2 replies; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-03 22:42 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, Dmitry Gutov

>> - when they get a bug report with a locked narrowing because of long
>> lines, using `widen-unlock` naively is likely to lead to an immediate
>> performance problem, so it's unlikely they'll use it.
> When I read this, I thought you had a point, but there's a fallacy in your
> reasoning: using widen-unlock is in fact not likely to lead to an immediate
> performance problem.  The long-line-threshold limit is sufficiently high to
> never be reached in "normal" files, but nothing would happen if you cross
> that limit by a small amount, and nothing would even happen at twice or even
> thrice that limit.

That's a valid point.  A bit like Alan's bug report, where he gets
a regression for 10K-long lines where the performance would be tolerable.

> If a mode author gets a bug report that is caused by locked narrowing, there
> is something wrong in the way the mode fontifies the buffer.  There is no
> reason to require access to the whole buffer to fontify a small chunk of that
> buffer.  IOW, using widen-unlock there is nearly always wrong (I add
> "nearly" to leave open the possibility that there might be an exception).

As I explained already, it's basically always wrong for a major mode's
font-lock rules to widen, regardless if the narrowing is due to
something like LLT or MMM-mode.

> This is becoming so litigious (you're now telling me that you're offended)
> that I start to believe that the right thing might in fact be to completely
> disable font locking in such buffers.  Would "no highlighting" be better
> than "occasional mis-highlighting" from your point of view?

I don't care about the mishighlighting and find the current behavior
perfectly acceptable from an end-user point of view.  I only care about
the extra enforcement done in C code without providing any mechanism to
circumvent it.  Especially since this discussion seems to suggest that
if I were to propose a patch that makes this locking a bit more "soft",
it might be rejected on the grounds that it opens the door to abuse, so
not only do I strongly dislike this design but I can't even try and
improve it.


        Stefan






^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-03 21:42                                                 ` Gregory Heytings
@ 2022-08-03 22:43                                                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-04  1:30                                                     ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-03 22:43 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, Dmitry Gutov

>>> The general idea is that syntax-ppss is currently an order of magnitude
>>> too slow for "too large" buffers.
>> And that it can be almost another order of magnitude worse in some major
>> modes because of `syntax-propertize`.
> That's possible, I don't know.  I might take a look at that later.  But note
> that your reactions are not exactly encouraging me to do so.

I don't see any need to look into it, so don't waste your time there,
because whether what I said above is true or not is mostly irrelevant
(e.g. what would you do with that info anyway?).


        Stefan






^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-07-28  7:49                                 ` Gregory Heytings
@ 2022-08-04  0:49                                   ` Dmitry Gutov
  2022-08-04  1:26                                     ` Gregory Heytings
  2022-08-04  7:29                                     ` Eli Zaretskii
  0 siblings, 2 replies; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-04  0:49 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, Eli Zaretskii, monnier

[-- Attachment #1: Type: text/plain, Size: 2900 bytes --]

On 28.07.2022 10:49, Gregory Heytings wrote:
> 
>>> Yes, occasional mis-fontification is expected.  It's a compromise 
>>> between "no fontification" and "slow fontification".
>>
>> I wonder now if the majority of the slowdown was caused by the 
>> redisplay, whereas font-lock (which only has to run once per 
>> screenful) was actually "fast enough".
>>
> 
> Those two statements are not mutually exclusive.  The majority of the 
> slowdown was indeed caused by redisplay, but font-lock was not fast 
> enough.  Try to open a sufficiently large file (e.g. the dictionary.json 
> one) with the code on master, and type M->.

Where does one obtain dictionary.json from?

> You'll see that Emacs needs 
> about five seconds (on my laptop) to display the end of the buffer.

Does that come from the long lines, or solely from the size of the buffer?

> Now 
> compare that with the feature branch, with which the end of the buffer 
> is displayed instantaneously.  That five seconds delay is caused by 
> fontification-functions.

At some point we should accept that visiting a huge file might take some 
time (5 seconds doesn't sound terrible, depending on the context). 
Because the alternative is mis-fontifications and broken display.

>> Could you clarify what you mean by "access ... to the place where ... 
>> is defined"? "new Downloadify.Container" is highlighted by a regular 
>> regexp matcher, not some custom elisp code which has to visit the 
>> position where the identifier is defined.
>>
> 
> Sorry, I cannot be more precise, I don't have the "downloadify.js" file 
> here.  It was just a guess, based on what I saw on the screenshot, that 
> one function called by fontification-functions collects all class 
> definitions and highlights their identifiers elsewhere in the buffer 
> with a specific face.  When the buffer is narrowed, that function may 
> not see the Downloadify.Container definition (which is, I guess, placed 
> near the beginning of the file) anymore.

Here I'm attaching a version of downloadify.js we can use for comparison 
(please rename the extension from .sj to .js locally; Gmail was not 
letting it through otherwise). It's not a huge file, just about 88K.

As long as I keep my Emacs window/frame width half of the desktop, I can 
reliably reproduce the problem with the lack of highlighting for 
"Downloadify.Container" while other tokens are still highlighted.

I'm also attaching a screenshot of another problem: suddenly the bottom 
several screens of the buffer are mis-highlighted as if starting inside 
a string. That very much looks like a result of breaking syntax-ppss's 
visibility of the buffer.

So the buffer scrolls quickly but looks bad.

Branch feature/long-lines-and-font-locking, revision cd41ce8c6c1079 from 
July 25. That branch is not there anymore, so let me know if I should 
re-test this with some later version of your work.

[-- Attachment #2: downloadify-min-example.sj --]
[-- Type: application/x-javascript, Size: 90077 bytes --]

[-- Attachment #3: Screenshot from 2022-08-04 03-32-22.png --]
[-- Type: image/png, Size: 1050723 bytes --]

^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-03 11:56                                                     ` Eli Zaretskii
@ 2022-08-04  1:08                                                       ` Dmitry Gutov
  2022-08-04  1:34                                                         ` Gregory Heytings
  2022-08-04  6:40                                                         ` Eli Zaretskii
  0 siblings, 2 replies; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-04  1:08 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, monnier

On 03.08.2022 14:56, Eli Zaretskii wrote:
>> Date: Wed, 3 Aug 2022 03:26:02 +0300
>> From: Dmitry Gutov <dgutov@yandex.ru>
>> Cc: 56682@debbugs.gnu.org, monnier@iro.umontreal.ca
>>
>> On 03.08.2022 03:00, Dmitry Gutov wrote:
>>> On 02.08.2022 19:19, Gregory Heytings wrote:
>>>>>> Try to open the dictionary.json with Emacs on master a month ago.
>>>>> I believe it can also be done with the current master, just after
>>>>> setting long-line-threshold to the nil value.  Right?
>>>>>
>>>> Indeed.  With master from one month ago it's even more crystal-clear
>>>> that you see the statu quo ante.
>>> If I set long-line-threshold to nil, does that also disable the
>>> redisplay optimizations related to long lines?
>>>
>>> Ones that caused scrolling delays even after the buffer has been fully
>>> fontified.
>> I mean those that *fixed* the said scrolling delays, of course.
> I don't think I understand which changes you had in mind (could you
> give a less vague pointer?), but I think the answer is NO: that
> variable disables only the recent changes related to long lines.

I'm not sure if there are separate/specific changes to speak of (sorry, 
I'm flying blind), but previously I remarked in this discussion that on 
master pushing C-p can still result in sluggish response (at the end of 
a long line), even after the current window has been fontified (meaning 
font-lock has finished its work), and Gregory remarked that I should try 
the branch feature/long-lines-and-font-locking where this was fixed.

I did try the branch and indeed did not experience that sluggishness 
anymore. But that probably is a separate issue from slow font-lock.

So to try to debug any remaining speed issues with font-lock, it would 
be great to eliminate other sources of slow display first. In particular 
ones coming from a low-level subsystem I cannot as easily 
benchmark/debug/etc as Lisp code.

The setting

(setq long-line-threshold nil syntax-wholeline-max most-positive-fixnum)

indeed makes all the speed issues come back, including the 
aforementioned sluggishness.
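
For switching back and forth while comparing, the setting above can be
wrapped in a small toggle.  This is only a sketch:
`my/toggle-long-line-optimizations` and `my/long-line-saved` are
hypothetical names, and it assumes both variables are used with their
global values:

```elisp
(require 'syntax)

(defvar my/long-line-saved nil
  "Saved (LONG-LINE-THRESHOLD . SYNTAX-WHOLELINE-MAX), or nil.
Hypothetical helper state for illustration.")

(defun my/toggle-long-line-optimizations ()
  "Disable the long-line limits, or restore the values saved earlier."
  (interactive)
  (if my/long-line-saved
      ;; Second call: put the saved values back.
      (setq long-line-threshold (car my/long-line-saved)
            syntax-wholeline-max (cdr my/long-line-saved)
            my/long-line-saved nil)
    ;; First call: remember the current values, then lift both limits.
    (setq my/long-line-saved (cons (bound-and-true-p long-line-threshold)
                                   syntax-wholeline-max)
          long-line-threshold nil
          syntax-wholeline-max most-positive-fixnum))
  (message "long-line-threshold=%S syntax-wholeline-max=%S"
           long-line-threshold syntax-wholeline-max))
```

The file probably has to be revisited after each toggle for the
comparison to be meaningful.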

You can try it yourself with downloadify.js I attached to the previous 
email. It's not a huge file either: the line is 90077 characters long.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 15:46                                                 ` Eli Zaretskii
@ 2022-08-04  1:08                                                   ` Dmitry Gutov
  2022-08-04  1:41                                                     ` Gregory Heytings
  2022-08-04  7:45                                                     ` Eli Zaretskii
  0 siblings, 2 replies; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-04  1:08 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, monnier

On 02.08.2022 18:46, Eli Zaretskii wrote:
>> Date: Tue, 2 Aug 2022 17:10:53 +0300
>> Cc: 56682@debbugs.gnu.org, gregory@heytings.org, monnier@iro.umontreal.ca
>> From: Dmitry Gutov <dgutov@yandex.ru>
>>
>> On 02.08.2022 05:27, Eli Zaretskii wrote:
>>>> syntax-ppss cache is a list of checkpoints spread along the buffer.
>>>>
>>>> After a modification, only the checkpoints below it are invalidated (to
>>>> be recomputed on-demand later).
>>> So a suitably-concocted replace command will still invalidate a lot of
>>> that cache, right?
>>
>> For any cache, one can invent an operation that would result in
>> thrashing it repeatedly.
> 
> Yes, but when thrashing causes delays of dozens of seconds, the result
> is not just a rare delay, the result is simply unacceptable.

What is unacceptable is the behavior I see from the narrowing solution. 
See the screenshots I attached in this thread.

>> A regular search-replace should work well enough, though. Because when
>> the buffer is long, the user is likely, on average, to spend a lot more
>> time examining the occurrences and deciding whether to replace each one.
>> And since the operation goes from top to bottom, this will likely
>> invalidate the list of caches once, and then rebuild it from the
>> beginning (or from wherever the first replacement was).
> 
> We want every basic operation in such buffers to perform reasonably
> well.  That's the goal of this activity.  Because partial solutions
> that sometimes work we already have: there's so-long-mode, there's
> longlines.el, and a couple of other tricks up our sleeve.

We cannot perform every basic operation in fixed time for any 
arbitrarily sized file. There are limits of what we can possibly do.

If we narrow willy-nilly, we step on the toes of syntax parsing and get 
other weird behaviors as a result.

Which means we got another partial solution.

so-long-mode, longlines, etc, were all targeted at buffers with long 
lines. I'd really like it if we could scope this discussion to solving 
that particular problem. Not the speed of operations in large files in 
general.

The long lines problem is caused by pathologic complexity of some 
operations (like O(N^2) of line length, I guess). syntax-ppss's 
performance is nothing like that: it's O(N) for initial full scan, and 
O(1) for most operations afterward.
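
A rough way to observe this is to time two consecutive `syntax-ppss`
calls at the end of a freshly filled buffer.  This is a sketch under
assumptions, not a proper benchmark: `my/syntax-ppss-timings` is a
hypothetical helper, the buffer content is synthetic, and no
`syntax-propertize` function is involved:

```elisp
(require 'benchmark)

(defun my/syntax-ppss-timings (lines)
  "Return (COLD . WARM) timings in seconds for `syntax-ppss' at point-max.
The temporary buffer holds LINES copies of a short Lisp form.
Hypothetical helper for illustration only."
  (with-temp-buffer
    (set-syntax-table emacs-lisp-mode-syntax-table)
    (dotimes (_ lines)
      (insert "(foo \"bar\" (baz))\n"))
    ;; The first call has to scan from the start of the buffer; the
    ;; second one can reuse what the first call cached.
    (let ((cold (car (benchmark-run 1 (syntax-ppss (point-max)))))
          (warm (car (benchmark-run 1 (syntax-ppss (point-max))))))
      (cons cold warm))))
```

Evaluating something like `(my/syntax-ppss-timings 200000)` with
increasing arguments should show the first number growing with buffer
size while the second stays small, as long as nothing invalidates the
cache between the two calls.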

You can't really get better than that. Maybe get a better multiplier with 
tree-sitter or a more optimized version of parse-partial-sexp, but take 
a 10x bigger file (or 100x bigger, or 1000x bigger) - and voila, the 
delay can be observed again.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04  0:49                                   ` Dmitry Gutov
@ 2022-08-04  1:26                                     ` Gregory Heytings
  2022-08-04  7:50                                       ` Eli Zaretskii
  2022-08-04 10:35                                       ` Dmitry Gutov
  2022-08-04  7:29                                     ` Eli Zaretskii
  1 sibling, 2 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-08-04  1:26 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: gerd.moellmann, 56682, Eli Zaretskii, monnier

[-- Attachment #1: Type: text/plain, Size: 2346 bytes --]


>
> Where does one obtain dictionary.json from?
>

From here, for example: https://www.heytings.org/data/dictionary.json .

>> You'll see that Emacs needs about five seconds (on my laptop) to 
>> display the end of the buffer.
>
> Does that come from the long lines, or solely from the size of the 
> buffer?
>

You're replying to a week old email.  IIRC that delay comes solely from 
font locking.

>> Now compare that with the feature branch, with which the end of the 
>> buffer is displayed instantaneously.  That five seconds delay is caused 
>> by fontification-functions.
>
> At some point we should accept that visiting a huge file might take some 
> time (5 seconds doesn't sound terrible, depending on the context). 
> Because the alternative is mis-fontifications and broken display.
>

If you dislike mis-fontification, turn font-lock mode off.

>
> Here I'm attaching a version of downloadify.js we can use for comparison 
> (please rename the extension from .sj to .js locally; Gmail was not 
> letting it through otherwise). It's not a huge file, just about 88K.
>

It's a tiny file, not in any way representative of the ones we're dealing 
with.  But amusingly, even with that tiny file, you can see the problem at 
hand.  Do M-: (setq long-line-threshold nil) RET, and open it in a large 
enough window (e.g. 160 characters).  Type M->, and try to move point 
there with C-p or C-n.  You'll see that Emacs is already sluggish.

>
> I'm also attaching a screenshot of another problem: suddenly the bottom 
> several screens of the buffer are mis-highlighted as if starting inside 
> a string. That very much looks like a result of breaking syntax-ppss's 
> visibility of the buffer.
>
> So the buffer scrolls quickly but looks bad.
>

If you dislike mis-fontification, turn font-lock mode off.  It's as easy 
as that.  Mis-fontification is expected in such cases.  The docstring of 
syntax-wholeline-max also mentions that "misfontification may then occur". 
Why did you not protest at that time?

>
> Branch feature/long-lines-and-font-locking, revision cd41ce8c6c1079 from 
> July 25. That branch is not there anymore, so let me know if I should 
> re-test this with some later version of your work.
>

That branch doesn't exist anymore, it has been merged in master.

^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-03 22:42                                                               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-08-04  1:29                                                                 ` Gregory Heytings
  2022-08-04  6:08                                                                 ` Eli Zaretskii
  1 sibling, 0 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-08-04  1:29 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Eli Zaretskii, Dmitry Gutov


>
> I only care about the extra enforcement done in C code without providing 
> any mechanism to circumvent it.  Especially since this discussion seems 
> to suggest that if I were to propose a patch that makes this locking a 
> bit more "soft", it might be rejected on the grounds that it opens the 
> door to abuse, so not only I strongly dislike this design but I can't 
> even try and improve it.
>

I did not say that.  I do consider that it's not the right thing to do, 
for all the reasons I already gave, but if you have something concrete in 
mind, please show it.  I'm not the one who decides anyway.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-03 22:43                                                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-08-04  1:30                                                     ` Gregory Heytings
  2022-08-04 21:24                                                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-04  1:30 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Eli Zaretskii, Dmitry Gutov


>>>> The general idea is that syntax-ppss is currently an order of 
>>>> magnitude too slow for "too large" buffers.
>>>
>>> And that it can be almost another order of magnitude worse in some 
>>> major modes because of `syntax-propertize`.
>>
>> That's possible, I don't know.  I might take a look at that later. 
>> But note that your reactions are not exactly encouraging me to do so.
>
> I don't see any need to look into it, so don't waste your time there,
>

To make syntax-ppss faster, if possible?  Is that not a sensible thing to 
do?






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04  1:08                                                       ` Dmitry Gutov
@ 2022-08-04  1:34                                                         ` Gregory Heytings
  2022-08-04  6:40                                                         ` Eli Zaretskii
  1 sibling, 0 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-08-04  1:34 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, Eli Zaretskii, monnier


>
> So to try to debug any remaining speed issues with font-lock, it would 
> be great to eliminate other sources of slow display first. In particular 
> ones coming from a low-level subsystem I cannot as easily 
> benchmark/debug/etc as Lisp code.
>
> The setting
>
> (setq long-line-threshold nil syntax-wholeline-max most-positive-fixnum)
>
> indeed makes all the speed issues come back, including the 
> aforementioned sluggishness.
>

If you want to see only the slowdowns caused by font locking, just comment
out the "if (current_buffer->long_line_optimizations_p)" in 
xdisp.c:handle_fontified_prop.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04  1:08                                                   ` Dmitry Gutov
@ 2022-08-04  1:41                                                     ` Gregory Heytings
  2022-08-05 12:28                                                       ` Dmitry Gutov
  2022-08-04  7:45                                                     ` Eli Zaretskii
  1 sibling, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-04  1:41 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, Eli Zaretskii, monnier


>
> We cannot perform every basic operation in fixed time for any 
> arbitrarily sized file. There are limits of what we can possibly do.
>

Apparently the limits are lower than what you think.  Provided that we 
accept some compromises, such as mis-fontification, which is also what 
syntax-wholeline-max does, and against which you didn't protest.

>
> I'd really like it if we could scope this discussion to solving that 
> particular problem. Not the speed of operations in large files in 
> general.
>

I don't understand what you mean.  Which "particular problem"?  The point 
of this discussion is of course the speed of operations in large files in 
general.  If you take that out of the picture, everything is of course 
possible.  I'm not even sure what remains, in fact: Emacs is an editor, not
a displayer.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-03 20:36                                                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-08-04  5:30                                                         ` Eli Zaretskii
  0 siblings, 0 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-04  5:30 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, gregory, dgutov

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: gregory@heytings.org,  dgutov@yandex.ru,  56682@debbugs.gnu.org
> Date: Wed, 03 Aug 2022 16:36:05 -0400
> 
> > What do you mean by "accidentally" in this context?  A program rarely
> > does anything "accidentally", it almost always does what the
> > programmer wanted it to do.
> 
> I mean that currently font-lock just blindly widens all the time, so we
> should fix it so it only widens when it's expected to (i.e. not when
> the widening is installed by the LLT code).

So let's please fix that, and take it from there.  We have plenty of
time to make followup decisions after we see what that does.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-03 20:38                                                           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-08-04  5:40                                                             ` Eli Zaretskii
  2022-08-04 22:35                                                               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-04  5:40 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, gregory, dgutov

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: gregory@heytings.org,  dgutov@yandex.ru,  56682@debbugs.gnu.org
> Date: Wed, 03 Aug 2022 16:38:57 -0400
> 
> > What about parse-partial-sexp, which calls scan_sexps_forward?  It
> > looks like I've misremembered, and that was the culprit in the
> > scenario we discussed, see
> > https://debbugs.gnu.org/cgi/bugreport.cgi?bug=45898#92.
> 
> The wholelines problem did not kick in because of PPS nor `syntax-ppss`
> but because of font-lock (which then called `syntax-ppss` which then
> called PPS).

If it's font-lock that forces syntax-ppss to examine the whole huge
line, then what is your proposal for avoiding that which doesn't
involve some more-or-less arbitrary restrictions on the part of the
buffer that can be examined by syntax-ppss?






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-03 22:42                                                               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-04  1:29                                                                 ` Gregory Heytings
@ 2022-08-04  6:08                                                                 ` Eli Zaretskii
  2022-08-04  6:23                                                                   ` Lars Ingebrigtsen
  1 sibling, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-04  6:08 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, gregory, dgutov

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: 56682@debbugs.gnu.org,  Eli Zaretskii <eliz@gnu.org>,  Dmitry Gutov
>  <dgutov@yandex.ru>
> Date: Wed, 03 Aug 2022 18:42:03 -0400
> 
> I don't care about the mishighlighting and find the current behavior
> perfectly acceptable from an end-user point of view.  I only care about
> the extra enforcement done in C code without providing any mechanism to
> circumvent it.  Especially since this discussion seems to suggest that
> if I were to propose a patch that makes this locking a bit more "soft",
> it might be rejected on the grounds that it opens the door to abuse

"Show me the code", instead of imagining what would be the reaction to
that.  (To say nothing of the fact that what I already wrote several
times explicitly contradicts your impression.)






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04  6:08                                                                 ` Eli Zaretskii
@ 2022-08-04  6:23                                                                   ` Lars Ingebrigtsen
  2022-08-04 11:21                                                                     ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Lars Ingebrigtsen @ 2022-08-04  6:23 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, Stefan Monnier, dgutov

By the way, playing with Alan's example here a bit...  To recap, this is
the test case (in a .cc file):

---
char long_line[] = R"foo(

)foo"
---

If I insert a 1M long line there (with `C-y'), Emacs will hang
indefinitely.  Wasn't the long-line stuff supposed to trigger in these
situations?  Or is it hanging in some cc-mode stuff before we get that
far?

`C-g' a number of times will eventually get out of the hang, but then
Emacs hangs again.  `C-g' again a few times breaks out of that, and then
finally Emacs becomes responsive.  If I load the file with the 1M line
from the start, then Emacs is responsive all the time.
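
For the "load the file with the 1M line from the start" variant, a test file of that shape could be generated with a form like the one below.  This is a minimal sketch: the file name and the filler character are assumptions, since the content of the long line is not specified above.

```elisp
;; Write a .cc file whose raw string literal contains one 1M-character line.
;; The file name and the filler character ?x are hypothetical choices.
(with-temp-file "long-line-test.cc"
  (insert "char long_line[] = R\"foo(\n"
          (make-string 1000000 ?x) "\n"
          ")foo\"\n"))
```

Visiting the resulting file exercises only the "loaded from the start" case; the hang after `C-y' still has to be reproduced interactively.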







* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04  1:08                                                       ` Dmitry Gutov
  2022-08-04  1:34                                                         ` Gregory Heytings
@ 2022-08-04  6:40                                                         ` Eli Zaretskii
  1 sibling, 0 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-04  6:40 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, gregory, monnier

> Date: Thu, 4 Aug 2022 04:08:20 +0300
> Cc: 56682@debbugs.gnu.org, gregory@heytings.org, monnier@iro.umontreal.ca
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> On 03.08.2022 14:56, Eli Zaretskii wrote:
> >> Date: Wed, 3 Aug 2022 03:26:02 +0300
> >> From: Dmitry Gutov<dgutov@yandex.ru>
> >> Cc:56682@debbugs.gnu.org,monnier@iro.umontreal.ca
> >>
> >> On 03.08.2022 03:00, Dmitry Gutov wrote:
> >>> On 02.08.2022 19:19, Gregory Heytings wrote:
> >>>>>> Try to open the dictionary.json with Emacs on master a month ago.
> >>>>> I believe it can also be done with the current master, just after
> >>>>> setting long-line-threshold to the nil value.  Right?
> >>>>>
> >>>> Indeed.  With master from one month ago it's even more crystal-clear
> >>>> that you see the statu quo ante.
> >>> If I set long-line-threshold to nil, does that also disable the
> >>> redisplay optimizations related to long lines?
> >>>
> >>> Ones that caused scrolling delays even after the buffer has been fully
> >>> fontified.
> >> I mean those that*fixed*  the said scrolling delays, of course.
> > I don't think I understand which changes you had in mind (could you
> > give a less vague pointer?), but I think the answer is NO: that
> > variable disables only the recent changes related to long lines.
> 
> I'm not sure if there are separate/specific changes to speak of (sorry, 
> I'm flying blind), but previously I remarked in this discussion that on 
> master pushing C-p can still result in sluggish response (at the end of 
> a long line), even after the current window has been fontified (meaning 
> font-lock has finished its work), and Gregory remarked that I should try 
> the branch feature/long-lines-and-font-locking where this was fixed.
> 
> I did try the branch and indeed did not experience that sluggishness 
> anymore. But that probably is a separate issue from slow font-lock.
> 
> So to try to debug any remaining speed issues with font-lock, it would 
> be great to eliminate other sources of slow display first. In particular 
> ones coming from a low-level subsystem I cannot as easily 
> benchmark/debug/etc as Lisp code.
> 
> The setting
> 
> (setq long-line-threshold nil syntax-wholeline-max most-positive-fixnum)
> 
> indeed makes all the speed issues come back, including the 
> aforementioned sluggishness.

OK, so I guess my answer to your question was accurate, and you now
have a way to reproduce those issues by disabling the recent speedups.

The changes in what was the long-lines-and-font-locking branch are now
on master, AFAIK.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04  0:49                                   ` Dmitry Gutov
  2022-08-04  1:26                                     ` Gregory Heytings
@ 2022-08-04  7:29                                     ` Eli Zaretskii
  1 sibling, 0 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-04  7:29 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: gerd.moellmann, 56682, gregory, monnier

> Date: Thu, 4 Aug 2022 03:49:43 +0300
> Cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org,
>  Eli Zaretskii <eliz@gnu.org>, monnier@iro.umontreal.ca
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> > Now 
> > compare that with the feature branch, with which the end of the buffer 
> > is displayed instantaneously.  That five seconds delay is caused by 
> > fontification-functions.
> 
> At some point we should accept that visiting a huge file might take some 
> time (5 seconds doesn't sound terrible, depending on the context). 
> Because the alternative is mis-fontifications and broken display.
> 
> >> Could you clarify what you mean by "access ... to the place where ... 
> >> is defined"? "new Downloadify.Container" is highlighted by a regular 
> >> regexp matcher, not some custom elisp code which has to visit the 
> >> position where the identifier is defined.
> >>
> > 
> > Sorry, I cannot be more precise, I don't have the "downloadify.js" file 
> > here.  It was just a guess, based on what I saw on the screenshot, that 
> > one function called by fontification-functions collects all class 
> > definitions and highlights their identifiers elsewhere in the buffer 
> > with a specific face.  When the buffer is narrowed, that function may 
> > not see the Downloadify.Container definition (which is, I guess, placed 
> > near the beginning of the file) anymore.
> 
> Here I'm attaching a version of downloadify.js we can use for comparison 
> (please rename the extension from .sj to .js locally; Gmail was not 
> letting it through otherwise). It's not a huge file, just about 88K.
> 
> As long as I keep my Emacs window/frame width half of the desktop, I can 
> reliably reproduce the problem with the lack of highlighting for 
> "Downloadify.Container" while other tokens are still highlighted.
> 
> I'm also attaching a screenshot of another problem: suddenly the bottom 
> several screens of the buffer are mis-highlighted as if starting inside 
> a string. That very much looks like a result of breaking syntax-ppss's
> visibility of the buffer.
> 
> So the buffer scrolls quickly but looks bad.

I don't understand the point you are trying to make.

On the one hand, you say "At some point we should accept that visiting
a huge file might take some time", which seems to imply that you agree
in general with some (hopefully graceful) degradation when editing
files with such long lines.  But OTOH you object to having that
degradation in the fontification?  IOW, you prefer Emacs to become
much slower, but still fontify correctly?  If so, just enlarge the
value of long-line-threshold, with the effect that Emacs will become
more sluggish before the long-line optimizations kick in.  If this is
your point, then maybe lobby for enlarging the default value of that
variable.
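
A minimal sketch of such a customization, assuming it goes in the user's init file.  The number is an arbitrary illustration, not a recommended value.

```elisp
;; Raise the line length above which the long-line optimizations kick in.
;; 100000 is a hypothetical value chosen only for illustration.
(setq long-line-threshold 100000)
```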

But if you are saying that Emacs should behave as it does with that
variable being nil, then I don't understand your position.  With that
variable nil, Emacs becomes _unusable_, not just slow, with files that
are not even too large by any modern measure, just because the lines
are very long.  And in those cases, why is it wrong to decide that
occasional glitches in fontifications are a lesser evil than a
complete lockup of the Emacs UI, which usually results in users
killing the session?  We decided that some imperfections in
fontification are the "graceful degradation" we are willing to endure
in order to make Emacs reasonably performant in those cases.  What is
wrong with that tradeoff?  And what alternative tradeoff would you
suggest instead?

(You mention "broken display" in addition to inaccurate
fontifications, but I don't understand what that alludes to.
Which instances of broken display did you see, and how to reproduce
them?)






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04  1:08                                                   ` Dmitry Gutov
  2022-08-04  1:41                                                     ` Gregory Heytings
@ 2022-08-04  7:45                                                     ` Eli Zaretskii
  1 sibling, 0 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-04  7:45 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, gregory, monnier

> Date: Thu, 4 Aug 2022 04:08:20 +0300
> Cc: 56682@debbugs.gnu.org, gregory@heytings.org, monnier@iro.umontreal.ca
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> > Yes, but when thrashing causes delays of dozens of seconds, the result
> > is not just a rare delay, the result is simply unacceptable.
> 
> What is unacceptable is the behavior I see from the narrowing solution. 
> See the screenshots I attached in this thread.

So that behavior is unacceptable, but declaring we are unable to allow
editing of such files is acceptable?  If that is your position, we
will have to agree to disagree.  We have decided that we want a more
graceful degradation in these cases, and in particular to sacrifice
some accuracy of font-lock to allow reasonable editing of such files.
If you consider that unacceptable, you can still have the lockups you
prefer by setting long-line-threshold to nil.  But we disagree with
having nil be the default value of that variable.

> > We want every basic operation in such buffers to perform reasonably
> > well.  That's the goal of this activity.  Because partial solutions
> > that sometimes work we already have: there's so-long-mode, there's
> longlines.el, and a couple of other tricks up our sleeve.
> 
> We cannot perform every basic operation in fixed time for any 
> arbitrarily sized file. There are limits of what we can possibly do.

I said "reasonably well", not "in fixed time".  Some slowdown is
acceptable.  But when a simple editing operation takes several
minutes, we cannot in good faith claim that Emacs is still usable.  If
you prefer to have that to some degradation in font-lock, then we
disagree.

> If we narrow willy-nilly, we step on the toes of syntax parsing and get 
> other weird behaviors as a result.
> 
> Which means we got another partial solution.

The experience till now seems to indicate that the degradation caused
by this solution is much milder and rarer than any other solution, and
allows users to edit files with much longer lines.  It certainly
sounds like a better solution than completely disabling font-lock in
such buffers (but that is still an option, if someone prefers it).
Reporting specific problems with that solution will allow us to solve
at least some of them, with the goal of making the degradation even
more graceful -- this is what this bug report is for.

But if you want to claim that it is better to have Emacs lock up for
minutes in these cases, you will have to do that in your local
customization, because we disagree with having Emacs behave like that
when it can be avoided.

> so-long-mode, longlines, etc, were all targeted at buffers with long
> lines. I'd really like it if we could scope this discussion to solving 
> that particular problem. Not the speed of operations in large files in 
> general.

The "long lines problem" is directly related to the speed of
operations.  We didn't yet make any changes that affect large files in
general, only files with long lines.  In those cases, the speed of
operations becomes unacceptably slow beyond some threshold, and we
want Emacs to remain usable beyond that threshold, even if that makes
fontifications sometimes inaccurate.

> The long lines problem is caused by pathologic complexity of some 
> operations (like O(N^2) of line length, I guess).

That assumption is incorrect, at least according to my analysis of the
relevant bottlenecks.  The slowdown is mostly linear in the number of
buffer positions redisplay and its subroutines need to traverse.

> syntax-ppss's performance is nothing like that: it's O(N) for
> initial full scan, and O(1) for most operations afterward.
> 
> You can't really get better than that. Maybe get a better multiplier with
> tree-sitter or a more optimized version of parse-partial-sexp, but take 
> a 10x bigger file (or 100x bigger, or 1000x bigger) - and voila, the 
> delay can be observed again.

No one denies that beyond some threshold the performance will be too
slow again.  We just want to make that threshold much farther.

In addition, the idea of using narrowing is a good one precisely
because it is unaffected by the buffer size.  So its effect doesn't
deteriorate when the buffer or the line length becomes larger.
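
The cost of syntax-ppss discussed above (a full scan on the first call, cached results afterwards) can be observed with a rough measurement like the one below.  This is only a sketch under stated assumptions: the buffer is a temporary one in Fundamental mode holding a single artificial 1M-character line, so no major mode's `syntax-propertize' work is included, and the figures will vary between machines.

```elisp
(require 'benchmark)

;; Build a buffer holding a single very long line and time two
;; consecutive calls to `syntax-ppss' at the end of the buffer.
;; The buffer content is an artificial example, not a real-world file.
(with-temp-buffer
  (insert (make-string 1000000 ?a))
  (let ((first  (car (benchmark-run 1 (syntax-ppss (point-max)))))
        (second (car (benchmark-run 1 (syntax-ppss (point-max))))))
    (message "first call: %.4fs, second call: %.4fs" first second)))
```

The first call has to scan from the beginning of the buffer, while the second can reuse the cached state; with narrowing in effect, the scanned region is bounded regardless of the buffer size.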






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04  1:26                                     ` Gregory Heytings
@ 2022-08-04  7:50                                       ` Eli Zaretskii
  2022-08-04  9:24                                         ` Gregory Heytings
  2022-08-04 10:35                                       ` Dmitry Gutov
  1 sibling, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-04  7:50 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, monnier, dgutov

> Date: Thu, 04 Aug 2022 01:26:04 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, 
>     Eli Zaretskii <eliz@gnu.org>, monnier@iro.umontreal.ca
> 
> > Branch feature/long-lines-and-font-locking, revision cd41ce8c6c1079 from 
> > July 25. That branch is not there anymore, so let me know if I should 
> > re-test this with some later version of your work.
> 
> That branch doesn't exist anymore, it has been merged in master.

Right.

Gregory, is there a reason why the long-lines-improvements branch is
not yet merged?  I think it already includes important improvements
that should be exposed to a larger population for testing and
feedback.

If you for some reason prefer to keep that branch active, can you
please merge the current master to it, so that the changes in
narrow-to-region will be on the branch?  As things are now, the
problems you solved on master in that part are not yet solved on the
branch, and so working on the branch runs the risk of hitting problems
unrelated to changes in display code.

Thanks.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04  7:50                                       ` Eli Zaretskii
@ 2022-08-04  9:24                                         ` Gregory Heytings
  2022-08-04  9:36                                           ` Eli Zaretskii
                                                             ` (2 more replies)
  0 siblings, 3 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-08-04  9:24 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, monnier, dgutov


>
> Gregory, is there a reason why the long-lines-improvements branch is not 
> yet merged?  I think it already includes important improvements that 
> should be exposed to a larger population for testing and feedback.
>

I was planning to add more improvements, but indeed merging that one and 
creating yet another one seems better at this point.

I just pushed an improvement for Bidi.  I did not do so earlier because I 
was trying to improve it further, but somehow I'm hitting a brick wall 
here.  Could you please check, and tell me if what I already did is okay?

In this case I did not find a good test case in the wild (or more 
precisely, I had no idea how to find one), so I concocted one myself.
You'll find it here: https://www.heytings.org/data/locales.json .  Emacs 
is still a bit sluggish with that file, but behaves much better than 
before.

>
> If you for some reason prefer to keep that branch active, can you please 
> merge the current master to it, so that the changes in narrow-to-region 
> will be on the branch?
>

Is that feasible?  Non fast forwards are not allowed in the Emacs 
repository, so rebasing a feature branch is not possible (without using 
the workaround of deleting and recreating the branch).






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04  9:24                                         ` Gregory Heytings
@ 2022-08-04  9:36                                           ` Eli Zaretskii
  2022-08-04  9:43                                             ` Gregory Heytings
  2022-08-04  9:40                                           ` Eli Zaretskii
  2022-08-04  9:52                                           ` Stefan Kangas
  2 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-04  9:36 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, monnier, dgutov

> Date: Thu, 04 Aug 2022 09:24:00 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: dgutov@yandex.ru, gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, 
>     monnier@iro.umontreal.ca
> 
> I just pushed an improvement for Bidi.  I did not do so earlier because I 
> was trying to improve it further, but somehow I'm hitting a brick wall 
> here.  Could you please check, and tell me if what I already did is okay?

Will do, thanks.

> > If you for some reason prefer to keep that branch active, can you please 
> > merge the current master to it, so that the changes in narrow-to-region 
> > will be on the branch?
> 
> Is that feasible?  Non fast forwards are not allowed in the Emacs 
> repository, so rebasing a feature branch is not possible (without using 
> the workaround of deleting and recreating the branch).

I didn't mean a rebase, I meant a merge.  A merge from master to a
feature branch will work exactly as a merge from emacs-28 to master
or, indeed, a merge from a feature branch to master.  There are no
problems here.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04  9:24                                         ` Gregory Heytings
  2022-08-04  9:36                                           ` Eli Zaretskii
@ 2022-08-04  9:40                                           ` Eli Zaretskii
  2022-08-04  9:46                                             ` Gregory Heytings
  2022-08-04  9:52                                           ` Stefan Kangas
  2 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-04  9:40 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, monnier, dgutov

> Date: Thu, 04 Aug 2022 09:24:00 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: dgutov@yandex.ru, gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, 
>     monnier@iro.umontreal.ca
> 
> I just pushed an improvement for Bidi.  I did not do so earlier because I 
> was trying to improve it further, but somehow I'm hitting a brick wall 
> here.  Could you please check, and tell me if what I already did is okay?

Hmm... which part(s) of the recent commit(s) on the branch are related
to bidi?  I only see changes related to composed characters.  What did
I miss?






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04  9:36                                           ` Eli Zaretskii
@ 2022-08-04  9:43                                             ` Gregory Heytings
  0 siblings, 0 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-08-04  9:43 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, monnier, dgutov


>>> If you for some reason prefer to keep that branch active, can you 
>>> please merge the current master to it, so that the changes in 
>>> narrow-to-region will be on the branch?
>>
>> Is that feasible?  Non fast forwards are not allowed in the Emacs 
>> repository, so rebasing a feature branch is not possible (without using 
>> the workaround of deleting and recreating the branch).
>
> I didn't mean a rebase, I meant a merge.
>

Indeed, I did not read what you wrote with enough attention.  Sorry for 
that.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04  9:40                                           ` Eli Zaretskii
@ 2022-08-04  9:46                                             ` Gregory Heytings
  2022-08-04  9:57                                               ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-04  9:46 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, monnier, dgutov


>> I just pushed an improvement for Bidi.  I did not do so earlier because 
>> I was trying to improve it further, but somehow I'm hitting a brick 
>> wall here.  Could you please check, and tell me if what I already did 
>> is okay?
>
> Hmm... which part(s) of the recent commit(s) on the branch are related 
> to bidi?  I only see changes related to composed characters.  What did I 
> miss?
>

82b602dc2f improves bidi in long lines, without indeed touching bidi.c.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04  9:24                                         ` Gregory Heytings
  2022-08-04  9:36                                           ` Eli Zaretskii
  2022-08-04  9:40                                           ` Eli Zaretskii
@ 2022-08-04  9:52                                           ` Stefan Kangas
  2 siblings, 0 replies; 416+ messages in thread
From: Stefan Kangas @ 2022-08-04  9:52 UTC (permalink / raw)
  To: Gregory Heytings, Eli Zaretskii; +Cc: gerd.moellmann, 56682, monnier, dgutov

Gregory Heytings <gregory@heytings.org> writes:

>> If you for some reason prefer to keep that branch active, can you please
>> merge the current master to it, so that the changes in narrow-to-region
>> will be on the branch?
>
> Is that feasible?  Non fast forwards are not allowed in the Emacs
> repository, so rebasing a feature branch is not possible (without using
> the workaround of deleting and recreating the branch).

Rebasing is not possible, but merging is.






* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04  9:46                                             ` Gregory Heytings
@ 2022-08-04  9:57                                               ` Eli Zaretskii
  2022-08-04 10:33                                                 ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-04  9:57 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, monnier, dgutov

> Date: Thu, 04 Aug 2022 09:46:17 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: dgutov@yandex.ru, gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, 
>     monnier@iro.umontreal.ca
> 
> 
> >> I just pushed an improvement for Bidi.  I did not do so earlier because 
> >> I was trying to improve it further, but somehow I'm hitting a brick 
> >> wall here.  Could you please check, and tell me if what I already did 
> >> is okay?
> >
> > Hmm... which part(s) of the recent commit(s) on the branch are related 
> > to bidi?  I only see changes related to composed characters.  What did I 
> > miss?
> >
> 
> 82b602dc2f improves bidi in long lines, without indeed touching bidi.c.

The changes are related to compositions, not to bidi.  Displaying
Arabic (and maybe also other characters in that file) requires
character composition, but it has nothing in particular to do with
bidi per se.

Are you saying that if you replace the Arabic text there with some
other script that also requires composition processing (like one of
the Indic scripts, see lisp/language/indian.el), editing this file is
significantly faster?

Btw, I'm unable to edit that file on the branch, because
show-paren--default causes an assertion violation.  I'm pretty sure
that's due to the issues in narrow-to-region that were already fixed
on master.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04  9:57                                               ` Eli Zaretskii
@ 2022-08-04 10:33                                                 ` Gregory Heytings
  2022-08-04 13:10                                                   ` Eli Zaretskii
  2022-08-04 14:14                                                   ` Eli Zaretskii
  0 siblings, 2 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-08-04 10:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, monnier, dgutov


>
> The changes are related to compositions, not to bidi.  Displaying Arabic 
> (and maybe also other characters in that file) requires character 
> composition, but it has nothing in particular to do with bidi per se.
>

I know next to nothing about bidi, so it's very well possible indeed that 
I confused "bidi" and "composition" (or "bidi composition"?).  Anyway, 
navigating through the locales.json file was slow (at some positions) 
before the change and is now reasonably fast (but alas not instantaneous).

>
> Are you saying that if you replace the Arabic text there with some other 
> script that also requires composition processing (like one of the Indic 
> scripts, see lisp/language/indian.el), editing this file is 
> significantly faster?
>

I think the locales.json file contains samples of pretty much all 
available scripts.  Devanagari for example is around position 3260000. 
As far as I can tell, navigating in that part of the file is not 
significantly faster with the change.  It is only in the parts of the file 
that contain e.g. Arabic text that the speedup is visible, around position 
70000 for example.

>
> Btw, I'm unable to edit that file on the branch, because 
> show-paren--default causes an assertion violation.  I'm pretty sure 
> that's due to the issues in narrow-to-region that were already fixed on 
> master.
>

I just merged master into the feature branch.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04  1:26                                     ` Gregory Heytings
  2022-08-04  7:50                                       ` Eli Zaretskii
@ 2022-08-04 10:35                                       ` Dmitry Gutov
  2022-08-04 11:29                                         ` Gregory Heytings
  2022-08-04 13:09                                         ` Eli Zaretskii
  1 sibling, 2 replies; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-04 10:35 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, Eli Zaretskii, monnier

On 04.08.2022 04:26, Gregory Heytings wrote:

>> Here I'm attaching a version of downloadify.js we can use for 
>> comparison (please rename the extension from .sj to .js locally; Gmail 
>> was not letting it through otherwise). It's not a huge file, just 
>> about 88K.
>>
> 
> It's a tiny file, not in any way representative of the ones we're 
> dealing with.  But amusingly, even with that tiny file, you can see the 
> problem at hand.  Do M-: (setq long-line-threshold nil) RET, and open it 
> in a large enough window (e.g. 160 characters).  Type M->, and try to 
> move point there with C-p or C-n.  You'll see that Emacs is already 
> sluggish.

That's the scenario I described, and that's my point: this file's 
display is sluggish. Even though font-lock has already finished its 
work. And it didn't have to spend any significant time in syntax-ppss.

So there is a particular performance problem with the display of 
fontified buffers which I'd really like your help in fixing.

Fixing in a way that doesn't add narrowing around 
fontification-functions, because as we can see it's not necessary in 
examples like this.

Then it would be much easier to evaluate font-lock's effect on 
performance in larger files.

>> I'm also attaching a screenshot of another problem: suddenly the 
>> bottom several screens of the buffer are mis-highlighted as if 
>> starting inside a string. That very much looks like a result of 
>> breaking syntax-ppss's visibility of the buffer.
>>
>> So the buffer scrolls quickly but looks bad.
>>
> 
> If you dislike mis-fontification, turn font-lock mode off.  It's as easy 
> as that.  Mis-fontification is expected in such cases.  The docstring of 
> syntax-wholeline-max also mentions that "misfontification may then 
> occur". Why did you not protest at that time?

I think we could have both speed and correctness, at least for files of 
this size.

>> Branch feature/long-lines-and-font-locking, revision cd41ce8c6c1079 
>> from July 25. That branch is not there anymore, so let me know if I 
>> should re-test this with some later version of your work.
>>
> 
> That branch doesn't exist anymore, it has been merged in master.

Thanks.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04  6:23                                                                   ` Lars Ingebrigtsen
@ 2022-08-04 11:21                                                                     ` Gregory Heytings
  0 siblings, 0 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-08-04 11:21 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 56682, Eli Zaretskii, Stefan Monnier, dgutov


>
> By the way, playing with Alan's example here a bit...  To recap, this is 
> the test case (in a .cc file):
>
> ---
> char long_line[] = R"foo(
>
> )foo"
> ---
>
> If I insert a 1M long line there (with `C-y'), Emacs will hang 
> indefinitely.  Wasn't the long-line stuff supposed to trigger in these 
> situations?  Or is it hanging in some cc-mode stuff before we get that 
> far?
>

No wonder.  CC Mode is a slow mode, and one of the worst offenders here. 
In this case, IIUC, what you see is because CC Mode adds c-after-change to 
after-change-functions, which has the effect that

(put-text-property 27 1000028 'face 'font-lock-string-face)

is called no less than 1335 times after that C-y.
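
A count like the one above can be reproduced with temporary advice.  The 
following is only a minimal sketch: the my/ names are hypothetical helpers 
(not part of Emacs or CC Mode), and the advice sees only calls made from 
Lisp, not calls made directly from C.

```elisp
;; Hypothetical helpers for illustration; nothing here is part of Emacs.

(defvar my/ptp-call-count 0
  "Number of `put-text-property' calls seen by `my/ptp-count-call'.")

(defun my/ptp-count-call (&rest _args)
  "Increment `my/ptp-call-count'; meant to be used as :before advice."
  (setq my/ptp-call-count (1+ my/ptp-call-count)))

(defun my/count-put-text-property-calls (thunk)
  "Call THUNK and return how many Lisp-level `put-text-property' calls it made."
  (setq my/ptp-call-count 0)
  (advice-add 'put-text-property :before #'my/ptp-count-call)
  (unwind-protect
      (funcall thunk)
    ;; Always remove the advice again, even if THUNK signals an error.
    (advice-remove 'put-text-property #'my/ptp-count-call))
  my/ptp-call-count)
```

With the long line in the kill ring, evaluating 
(my/count-put-text-property-calls #'yank) with M-: in the .cc buffer should 
then report how many calls happened during the yank itself.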





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04 10:35                                       ` Dmitry Gutov
@ 2022-08-04 11:29                                         ` Gregory Heytings
  2022-08-04 11:59                                           ` Stefan Kangas
  2022-08-04 13:09                                         ` Eli Zaretskii
  1 sibling, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-04 11:29 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: gerd.moellmann, 56682, Eli Zaretskii, monnier


>
> So there is a particular performance problem with the display of 
> fontified buffers which I'd really like your help in fixing.
>

Thank you, but that problem has, as you can see, already been fixed.

>
> Fixing in a way that doesn't add narrowing around 
> fontification-functions, because as we can see it's not necessary in 
> examples like this.
>

As I said, your 88K example file is not representative.  So the fact that 
it's not necessary in that tiny example file says nothing about much 
larger files.

>
> I think we could have both speed and correctness, at least for files of 
> this size.
>

Feel free to enlarge long-line-threshold in your init file.
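
As a minimal sketch, such an init file setting could look as follows.  The 
number is an arbitrary illustration, not a recommended value.

```elisp
;; In the init file.  100000 is only an example value.
(setq long-line-threshold 100000)

;; Or disable the long-line optimizations entirely, as in the recipe
;; quoted earlier in this thread:
;; (setq long-line-threshold nil)
```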





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04 11:29                                         ` Gregory Heytings
@ 2022-08-04 11:59                                           ` Stefan Kangas
  2022-08-04 12:05                                             ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Stefan Kangas @ 2022-08-04 11:59 UTC (permalink / raw)
  To: Gregory Heytings, Dmitry Gutov
  Cc: gerd.moellmann, 56682, Eli Zaretskii, monnier

Gregory Heytings <gregory@heytings.org> writes:

>> I think we could have both speed and correctness, at least for files of
>> this size.
>
> Feel free to enlarge long-line-threshold in your init file.

Just a curious question: How did we arrive at the current value of
`long-line-threshold'?

I've been trying to follow the discussion but I don't think this has
been mentioned anywhere.  My apologies if I missed it.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04 11:59                                           ` Stefan Kangas
@ 2022-08-04 12:05                                             ` Gregory Heytings
  2022-08-04 12:40                                               ` Eli Zaretskii
  2022-08-04 21:37                                               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 2 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-08-04 12:05 UTC (permalink / raw)
  To: Stefan Kangas; +Cc: gerd.moellmann, 56682, Eli Zaretskii, monnier, Dmitry Gutov


>>> I think we could have both speed and correctness, at least for files 
>>> of this size.
>>
>> Feel free to enlarge long-line-threshold in your init file.
>
> Just a curious question: How did we arrive at the current value of 
> `long-line-threshold'?
>

I took the same value that Stefan had chosen for syntax-wholeline-max.

>
> I've been trying to follow the discussion but I don't think this has 
> been mentioned anywhere.  My apologies if I missed it.
>

No need to apologize, you're a careful reader 😃, it hasn't been mentioned 
indeed.

^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04 12:05                                             ` Gregory Heytings
@ 2022-08-04 12:40                                               ` Eli Zaretskii
  2022-08-04 13:10                                                 ` Gregory Heytings
  2022-08-04 21:37                                               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-04 12:40 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, stefankangas, monnier, dgutov

> Date: Thu, 04 Aug 2022 12:05:14 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: Dmitry Gutov <dgutov@yandex.ru>, gerd.moellmann@gmail.com, 
>     56682@debbugs.gnu.org, Eli Zaretskii <eliz@gnu.org>, 
>     monnier@iro.umontreal.ca
> 
> >> Feel free to enlarge long-line-threshold in your init file.
> >
> > Just a curious question: How did we arrive at the current value of 
> > `long-line-threshold'?
> >
> 
> I took the same value that Stefan had chosen for syntax-wholeline-max.

And it isn't sacred in any way.  If we decide a different value
strikes a better balance, we will change it.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04 10:35                                       ` Dmitry Gutov
  2022-08-04 11:29                                         ` Gregory Heytings
@ 2022-08-04 13:09                                         ` Eli Zaretskii
  2022-08-05  1:39                                           ` Dmitry Gutov
  1 sibling, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-04 13:09 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: gerd.moellmann, 56682, gregory, monnier

> Date: Thu, 4 Aug 2022 13:35:39 +0300
> Cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org,
>  Eli Zaretskii <eliz@gnu.org>, monnier@iro.umontreal.ca
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> > It's a tiny file, not in any way representative of the ones we're 
> > dealing with.  But amusingly, even with that tiny file, you can see the 
> > problem at hand.  Do M-: (setq long-line-threshold nil) RET, and open it 
> > in a large enough window (e.g. 160 characters).  Type M->, and try to 
> > move point there with C-p or C-n.  You'll see that Emacs is already 
> > sluggish.
> 
> That's the scenario I described, and that's my point: this file's 
> display is sluggish. Even though font-lock has already finished its 
> work. And it didn't have to spend any significant time in syntax-ppss.
> 
> So there is a particular performance problem with the display of 
> fontified buffers which I'd really like your help in fixing.

Maybe it is in display, and maybe it isn't.  Do you have any evidence
that the sluggish response is due to redisplay?  C-n, for example, is
mostly not redisplay, but Lisp code in simple.el and occasional calls
to vertical-motion.

But even if the slow response is due to redisplay, we just have
another cause that we need to investigate and try fixing.  It says
nothing about the measures we've already taken on master.  They
definitely make even this case faster, and with an unoptimized build I
can now reasonably edit this file, something I couldn't do before.

> Fixing in a way that doesn't add narrowing around 
> fontification-functions, because as we can see it's not necessary in 
> examples like this.

If that is possible, sure.  No one said that from now on every problem
in Emacs that causes slow responses will be handled by narrowing.  But
if, for example, it turns out that the slow response is due to the time
it takes some code to traverse a long stretch of fontified buffer,
what other solution would you suggest except making the portion to be
traversed shorter?

> > If you dislike mis-fontification, turn font-lock mode off.  It's as easy 
> > as that.  Mis-fontification is expected in such cases.  The docstring of 
> > syntax-wholeline-max also mentions that "misfontification may then 
> > occur". Why did you not protest at that time?
> 
> I think we could have both speed and correctness, at least for files of 
> this size.

That is not a given, and the experience till now suggests otherwise.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04 10:33                                                 ` Gregory Heytings
@ 2022-08-04 13:10                                                   ` Eli Zaretskii
  2022-08-04 14:14                                                   ` Eli Zaretskii
  1 sibling, 0 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-04 13:10 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, monnier, dgutov

> Date: Thu, 04 Aug 2022 10:33:48 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, monnier@iro.umontreal.ca, 
>     dgutov@yandex.ru
> 
> I know next to nothing about bidi, so it's very well possible indeed that 
> I confused "bidi" and "composition" (or "bidi composition"?).  Anyway, 
> navigating through the locales.json file was slow (at some positions) 
> before the change and is now reasonably fast (but alas not instantaneous).
> 
> >
> > Are you saying that if you replace the Arabic text there with some other 
> > script that also requires composition processing (like one of the Indic 
> > scripts, see lisp/language/indian.el), editing this file is 
> > significantly faster?
> >
> 
> I think the locales.json file contains samples of pretty much all 
> available scripts.  Devanagari for example is around position 3260000. 
> As far as I can tell, navigating in that part of the file is not 
> significantly faster with the change.  It is only in the parts of the file 
> that contain e.g. Arabic text that the speedup is visible, around position 
> 70000 for example.

OK, I will take a look.

> > Btw, I'm unable to edit that file on the branch, because 
> > show-paren--default causes an assertion violation.  I'm pretty sure 
> > that's due to the issues in narrow-to-region that were already fixed on 
> > master.
> 
> I just merged master into the feature branch.

Thanks.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04 12:40                                               ` Eli Zaretskii
@ 2022-08-04 13:10                                                 ` Gregory Heytings
  0 siblings, 0 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-08-04 13:10 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, dgutov, stefankangas, monnier


>>>> Feel free to enlarge long-line-threshold in your init file.
>>>
>>> Just a curious question: How did we arrive at the current value of 
>>> `long-line-threshold'?
>>
>> I took the same value that Stefan had chosen for syntax-wholeline-max.
>
> And it isn't sacred in any way.  If we decide a different value strikes 
> a better balance, we will change it.
>

Indeed (although I fear bikeshedding here).





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04 10:33                                                 ` Gregory Heytings
  2022-08-04 13:10                                                   ` Eli Zaretskii
@ 2022-08-04 14:14                                                   ` Eli Zaretskii
  2022-08-04 14:31                                                     ` Eli Zaretskii
  2022-08-04 15:08                                                     ` Gregory Heytings
  1 sibling, 2 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-04 14:14 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, monnier, dgutov

> Date: Thu, 04 Aug 2022 10:33:48 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, monnier@iro.umontreal.ca, 
>     dgutov@yandex.ru
> 
> navigating through the locales.json file was slow (at some positions) 
> before the change and is now reasonably fast (but alas not instantaneous).

Which navigation commands were slow, as compared to the same commands
in other portions of this file?





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04 14:14                                                   ` Eli Zaretskii
@ 2022-08-04 14:31                                                     ` Eli Zaretskii
  2022-08-04 15:25                                                       ` Gregory Heytings
  2022-08-04 15:08                                                     ` Gregory Heytings
  1 sibling, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-04 14:31 UTC (permalink / raw)
  To: gregory, gerd.moellmann, 56682, monnier, dgutov

> Cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, monnier@iro.umontreal.ca,
>  dgutov@yandex.ru
> Date: Thu, 04 Aug 2022 17:14:58 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> 
> > Date: Thu, 04 Aug 2022 10:33:48 +0000
> > From: Gregory Heytings <gregory@heytings.org>
> > cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, monnier@iro.umontreal.ca, 
> >     dgutov@yandex.ru
> > 
> > navigating through the locales.json file was slow (at some positions) 
> > before the change and is now reasonably fast (but alas not instantaneous).
> 
> Which navigation commands were slow, as compared to the same commands
> in other portions of this file?

Just before the Devanagari portion of the file, there's the Hebrew
portion, starting around buffer position 3243400.  If you go there and
try the same navigation commands that were slow with Arabic, are they
as slow with Hebrew (which is also a right-to-left script, but doesn't
use character compositions nearly as heavily as Arabic)?  Here it
looks like Hebrew is noticeably faster, as fast as Devanagari.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04 14:14                                                   ` Eli Zaretskii
  2022-08-04 14:31                                                     ` Eli Zaretskii
@ 2022-08-04 15:08                                                     ` Gregory Heytings
  2022-08-04 16:00                                                       ` Eli Zaretskii
  1 sibling, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-04 15:08 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, monnier, dgutov


>> navigating through the locales.json file was slow (at some positions) 
>> before the change and is now reasonably fast (but alas not 
>> instantaneous).
>
> Which navigation commands were slow, as compared to the same commands in 
> other portions of this file?
>

For example C-p, C-n, C-v, M-v.  C-f and C-b also, but much less so.

For example, open the file and do M-g c 70000 RET.  This took about 5 
seconds.  Now do C-n, this again took about 5 seconds.  With the 
optimizations, M-g c 70000 RET is almost immediate, and C-n there takes 
less than a second.

But you're right, this slowdown has little to do with bidi.  A file with a 
sufficiently long single line of Arabic text has the same problem.  (But 
not one with a line of Hebrew text.)
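
Timings like the ones above can be taken more precisely than with a 
stopwatch.  The following is a minimal sketch, assuming that forcing a 
redisplay right after the command captures most of the cost; 
my/time-command is a hypothetical helper, not an existing Emacs function.

```elisp
(require 'benchmark)

(defun my/time-command (command)
  "Return the seconds taken to run COMMAND interactively and redisplay."
  ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-ELAPSED); keep ELAPSED.
  (car (benchmark-run 1
         (call-interactively command)
         ;; Force a thorough redisplay so that its cost is included.
         (redisplay t))))
```

For example, after M-g c 70000 RET, evaluate (my/time-command #'next-line) 
with M-: to time a single C-n at that position.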





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04 14:31                                                     ` Eli Zaretskii
@ 2022-08-04 15:25                                                       ` Gregory Heytings
  0 siblings, 0 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-08-04 15:25 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, monnier, dgutov


>
> Just before the Devanagari portion of the file, there's the Hebrew 
> portion, starting around buffer position 3243400.  If you go there and 
> try the same navigation commands that were slow with Arabic, are they as 
> slow with Hebrew (which is also a right-to-left script, but doesn't use 
> character compositions nearly as heavily as Arabic)?  Here it looks like 
> Hebrew is noticeably faster, as fast as Devanagari.
>

Sorry, I hadn't seen this post before replying to the previous one. 
Indeed, as I said in my previous reply navigation commands in Hebrew texts 
are much faster than in Arabic texts.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04 15:08                                                     ` Gregory Heytings
@ 2022-08-04 16:00                                                       ` Eli Zaretskii
  2022-08-04 16:25                                                         ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-04 16:00 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, monnier, dgutov

> Date: Thu, 04 Aug 2022 15:08:07 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, monnier@iro.umontreal.ca, 
>     dgutov@yandex.ru
> 
> For example C-p, C-n, C-v, M-v.  C-f and C-b also, but much less so.
> 
> For example, open the file and do M-g c 70000 RET.  This took about 5 
> seconds.  Now do C-n, this again took about 5 seconds.  With the 
> optimizations, M-g c 70000 RET is almost immediate, and C-n there takes 
> less than a second.

I'm on the branch, so I am _after_ the optimizations.  I thought you
said even after that the navigation is sluggish.  I see somewhat
slower response than in "normal" files, but my build is unoptimized,
so where it takes 3 or 4 seconds, an optimized build should be almost
instantaneous.  And that looks good enough to me, since being a bit
slower in such files is IMO fine.

> But you're right, this slowdown has little to do with bidi.  A file with a 
> sufficiently long single line of Arabic text has the same problem.  (But 
> not one with a line of Hebrew text.)

OK.  Text that goes through character compositions is expected to be
slower in redisplay, because character composition in Emacs works by
calling into Lisp.  So I think we are good here, do you agree?





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04 16:00                                                       ` Eli Zaretskii
@ 2022-08-04 16:25                                                         ` Gregory Heytings
  2022-08-04 17:06                                                           ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-04 16:25 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, monnier, dgutov


>> For example C-p, C-n, C-v, M-v.  C-f and C-b also, but much less so.
>>
>> For example, open the file and do M-g c 70000 RET.  This took about 5 
>> seconds.  Now do C-n, this again took about 5 seconds.  With the 
>> optimizations, M-g c 70000 RET is almost immediate, and C-n there takes 
>> less than a second.
>
> I'm on the branch, so I am _after_ the optimizations.  I thought you 
> said even after that the navigation is sluggish.
>

Yes, that's what I said indeed.

>
> I see somewhat slower response than in "normal" files, but my build is 
> unoptimized, so where it takes 3 or 4 seconds, an optimized build 
> should be almost instantaneous. And that looks good enough to me, since 
> being a bit slower in such files is IMO fine.
>

It's not really "almost instantaneous", moving point can take (depending 
on factors I do not understand at the moment) something between 0.2 
seconds and 2 seconds.

>
> OK.  Text that goes through character compositions is expected to be 
> slower in redisplay, because character composition in Emacs works by 
> calling into Lisp.
>

So Arabic text goes through character compositions and Hebrew text 
doesn't, is that correct?

>
> So I think we are good here, do you agree?
>

Hmmm...  I still think it would be possible to do better.  With the above 
recipe (M-g c 70000 RET C-n), composition_compute_stop_pos is called 
627663 times and uses about 2 seconds of CPU time.  What surprises me (and 
makes me believe it's perhaps possible to do better) is that it is called 
repeatedly with the same arguments.  For example, when doing C-n it is 
called 26 times with charpos = 69980.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04 16:25                                                         ` Gregory Heytings
@ 2022-08-04 17:06                                                           ` Eli Zaretskii
  2022-08-04 18:16                                                             ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-04 17:06 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, monnier, dgutov

> Date: Thu, 04 Aug 2022 16:25:28 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, monnier@iro.umontreal.ca, 
>     dgutov@yandex.ru
> 
> > I'm on the branch, so I am _after_ the optimizations.  I thought you 
> > said even after that the navigation is sluggish.
> 
> Yes, that's what I said indeed.
> 
> > I see somewhat slower response than in "normal" files, but my build is 
> > unoptimized, so where it takes 3 or 4 seconds, an optimized build 
> > should be almost instantaneous. And that looks good enough to me, since 
> > being a bit slower in such files is IMO fine.
> 
> It's not really "almost instantaneous", moving point can take (depending 
> on factors I do not understand at the moment) something between 0.2 
> seconds and 2 seconds.

I saw slower response when point was at a parenthesis or a brace -- due
to show-paren-mode.  Try disabling it and see if that affects
performance.  Another problem could be the BPA algorithm in bidi.c,
but in my testing setting bidi-inhibit-bpa non-nil didn't affect
performance in any tangible way, with this file.

> > OK.  Text that goes through character compositions is expected to be 
> > slower in redisplay, because character composition in Emacs works by 
> > calling into Lisp.
> 
> So Arabic text goes through character compositions and Hebrew text 
> doesn't, is that correct?

More accurately, Arabic text needs to call the shaping engine
(HarfBuzz) for all the characters, whereas Hebrew does that only
rarely (and not at all in locales.json).  You can see from the
composition rules in, respectively, lisp/language/misc-lang.el and
lisp/language/hebrew.el that for Arabic, the entire range of Arabic
characters is populated with composition rules in
composition-function-table, whereas for Hebrew, only some relatively
rare characters have non-nil rules.
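
The difference described above can be checked directly in 
composition-function-table.  This is a minimal sketch, assuming a default 
session where the language files are loaded; the my/ functions are 
hypothetical helpers, not part of Emacs.

```elisp
(defun my/char-composition-rules (char)
  "Return the entry for CHAR in `composition-function-table', or nil."
  (aref composition-function-table char))

(defun my/count-chars-with-rules (string)
  "Return how many characters of STRING have composition rules."
  (let ((count 0))
    (dolist (char (string-to-list string) count)
      (when (my/char-composition-rules char)
        (setq count (1+ count))))))
```

Going by the description above, (my/char-composition-rules #x0628) (ARABIC 
LETTER BEH) should return a list of rules, whereas 
(my/char-composition-rules #x05D0) (HEBREW LETTER ALEF) should return nil.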

> > So I think we are good here, do you agree?
> 
> Hmmm...  I still think it would be possible to do better.  With the above 
> recipe (M-g c 70000 RET C-n), composition_compute_stop_pos is called 
> 627663 times and uses about 2 seconds of CPU time.  What surprises me (and 
> makes me believe it's perhaps possible to do better) is that it is called 
> repeatedly with the same arguments.  For example, when doing C-n it is 
> called 26 times with charpos = 69980.

I'll see where these come from and whether some of them could be
avoided.

Are you capable of running under perf and producing the profile of
these commands?  Because I wonder whether we correctly identify the
main bottlenecks in these scenarios.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04 17:06                                                           ` Eli Zaretskii
@ 2022-08-04 18:16                                                             ` Gregory Heytings
  2022-08-04 18:52                                                               ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-04 18:16 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, monnier, dgutov


>> It's not really "almost instantaneous", moving point can take 
>> (depending on factors I do not understand at the moment) something 
>> between 0.2 seconds and 2 seconds.
>
> I saw slower response when point was at a parenthesis or a brace -- due 
> to show-paren-mode.  Try disabling it and see if that affects 
> performance.
>

I forgot to mention that this is with show-paren-mode disabled.

>
> Another problem could be the BPA algorithm in bidi.c, but in my testing 
> setting bidi-inhibit-bpa non-nil didn't affect performance in any 
> tangible way, with this file.
>

Indeed.

>
> More accurately, Arabic text needs to call the shaping engine (HarfBuzz) 
> for all the characters, whereas Hebrew does that only rarely (and not at 
> all in locales.json).  You can see from the composition rules in, 
> respectively, lisp/language/misc-lang.el and lisp/language/hebrew.el 
> that for Arabic, the entire range of Arabic characters is populated with 
> composition rules in composition-function-table, whereas for Hebrew, 
> only some relatively rare characters have non-nil rules.
>

Okay, thanks!

>> Hmmm...  I still think it would be possible to do better.  With the 
>> above recipe (M-g g 70000 RET C-n), composition_compute_stop_pos is 
>> called 627663 times and uses about 2 seconds of CPU time.  What 
>> surprises me (and makes me believe it's perhaps possible to do better) 
>> is that it is called repeatedly with the same arguments.  For example, 
>> when doing C-n it is called 26 times with charpos = 69980.
>
> I'll see where these come from and whether some of them could be 
> avoided.
>
> Are you capable of running under perf and producing the profile of these 
> commands?  Because I wonder whether we correctly identify the main 
> bottlenecks in these scenarios.
>

I can't speak in general, but in this particular scenario, the bottleneck 
is clearly composition_compute_stop_pos.

You didn't tell me whether it's okay to merge the branch with the latest 
changes?





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04 18:16                                                             ` Gregory Heytings
@ 2022-08-04 18:52                                                               ` Eli Zaretskii
  2022-08-04 19:26                                                                 ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-04 18:52 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, monnier, dgutov

> Date: Thu, 04 Aug 2022 18:16:02 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, monnier@iro.umontreal.ca, 
>     dgutov@yandex.ru
> 
> >> Hmmm...  I still think it would be possible to do better.  With the 
> >> above recipe (M-g g 70000 RET C-n), composition_compute_stop_pos is 
> >> called 627663 times and uses about 2 seconds of CPU time.  What 
> >> surprises me (and makes me believe it's perhaps possible to do better) 
> >> is that it is called repeatedly with the same arguments.  For example, 
> >> when doing C-n it is called 26 times with charpos = 69980.
> >
> > I'll see where these come from and whether some of them could be 
> > avoided.
> >
> > Are you capable of running under perf and producing the profile of these 
> > commands?  Because I wonder whether we correctly identify the main 
> > bottlenecks in these scenarios.
> >
> 
> I can't speak in general, but in this particular scenario, the bottleneck 
> is clearly composition_compute_stop_pos.

I looked into this, and I don't see how this could be avoided,
unfortunately, not without disabling auto-composition-mode (which I'm
told produces display of Arabic that makes readers of that script turn
away in disgust).  Disabling auto-composition-mode makes navigation
there about twice as fast, on par with the other parts of the file.

The problem here is that bidi iteration through buffer text is
non-linear (it follows the visual order, not the order of buffer
positions), so the iterator frequently finds itself out of sync with
the next known "stop position", and needs to resync, to be able to
find which faces, overlays, and invisible and display properties are
in effect.  That's what handle_stop_backwards does, and that involves
going back and rescanning text.  With Arabic, this is exacerbated by
the fact that every Arabic character is "composable" (in the sense of
the CHAR_COMPOSED_P macro), and triggers a call to
composition_compute_stop_pos to find the next (or previous) one.

And on top of that, C-n calls pos-visible-in-window-p 2 or 3 times,
posn-at-point 2 times, and then vertical-motion.  Each one of these
needs to scan text around point starting from the beginning of the
previous visible line.  The narrowing limits that to some reasonable
distance, but it is still several thousand characters back.

So if composition_compute_stop_pos is the bottleneck, perhaps some
simple caching could help?  But note that when this function is called
twice with the same character position, it is called to search in
different directions -- once forward and another time back.  So even
such low-hanging fruit is not simple to reap, as these two calls will
return two different results.
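
To make the direction issue concrete, here is a minimal standalone C sketch
of a cache keyed on both the character position and the search direction.
It is not Emacs code: scan_for_stop, cached_stop, and the '*' marker for
"composable" characters are hypothetical stand-ins, and the real
composition_compute_stop_pos has a different signature.  A cache keyed on
charpos alone would hand the forward result to the backward caller.

```c
#include <stdbool.h>
#include <stddef.h>

/* Counts how often the expensive scan actually runs.  */
static int n_scans;

/* Hypothetical stand-in for the expensive search.  Returns the index
   of the nearest '*' strictly after CHARPOS (forward) or strictly
   before it (backward), or -1 if there is none.  */
static ptrdiff_t
scan_for_stop (const char *text, ptrdiff_t len, ptrdiff_t charpos,
               bool backward)
{
  n_scans++;
  if (backward)
    {
      for (ptrdiff_t i = charpos - 1; i >= 0; i--)
        if (text[i] == '*')
          return i;
    }
  else
    {
      for (ptrdiff_t i = charpos + 1; i < len; i++)
        if (text[i] == '*')
          return i;
    }
  return -1;
}

/* One cache slot per direction: slot 0 is forward, slot 1 backward.
   A real cache would also have to be flushed whenever the text or
   the composition rules change; that is omitted here.  */
struct stop_cache
{
  bool valid;
  ptrdiff_t charpos;
  ptrdiff_t result;
};
static struct stop_cache cache[2];

static ptrdiff_t
cached_stop (const char *text, ptrdiff_t len, ptrdiff_t charpos,
             bool backward)
{
  struct stop_cache *c = &cache[backward ? 1 : 0];
  if (!(c->valid && c->charpos == charpos))
    {
      c->result = scan_for_stop (text, len, charpos, backward);
      c->charpos = charpos;
      c->valid = true;
    }
  return c->result;
}
```

With this layout, repeated calls at the same position in the same direction
hit the cache, while a forward and a backward call at the same position
each get their own answer.  Whether that would pay off in the real code
depends on how the repeated calls are interleaved, which this sketch does
not model.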

For now, I don't see how to speed this up, without producing woefully
incorrect display.  I will keep thinking, but I'm not too worried
about this case, since the current performance is tolerable enough,
even if somewhat sluggish.

> You didn't tell me whether it's okay to merge the branch with the latest 
> changes?

I think you can merge.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04 18:52                                                               ` Eli Zaretskii
@ 2022-08-04 19:26                                                                 ` Gregory Heytings
  2022-08-05  6:05                                                                   ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-04 19:26 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, monnier, dgutov


>
> So if composition_compute_stop_pos is the bottleneck, perhaps some 
> simple caching could help?
>

This is exactly what I tried...

>
> But note that when this function is called twice with the same character 
> position, it is called to search in different directions -- once forward 
> and another time back.
>

... but I missed that important piece of the puzzle, thanks!

>
> For now, I don't see how to speed this up, without producing woefully 
> incorrect display.  I will keep thinking, but I'm not too worried about 
> this case, since the current performance is tolerable enough, even if 
> somewhat sluggish.
>

Alas, it becomes much more sluggish with a larger frame (160 columns 
instead of 80), and/or with only Arabic characters.  Emacs takes ~10 
seconds to open a file with only 6000 characters on a single line, and 
motion commands are slow.

>> You didn't tell me whether it's okay to merge the branch with the 
>> latest changes?
>
> I think you can merge.
>

Done.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04  1:30                                                     ` Gregory Heytings
@ 2022-08-04 21:24                                                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-04 21:24 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, Dmitry Gutov

> To make syntax-ppss faster, if possible?
> Is that not a sensible thing to do?

It's surely possible and useful, regardless of whether what I said is true
or not :-)


        Stefan






^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04 12:05                                             ` Gregory Heytings
  2022-08-04 12:40                                               ` Eli Zaretskii
@ 2022-08-04 21:37                                               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 0 replies; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-04 21:37 UTC (permalink / raw)
  To: Gregory Heytings
  Cc: gerd.moellmann, 56682, Eli Zaretskii, Stefan Kangas, Dmitry Gutov

>> Just a curious question: How did we arrive at the current value of
>> `long-line-threshold'?
> I took the same value that Stefan had chosen for syntax-wholeline-max.

For the record, I chose 10K based on ... a wild guess that 10K is
"definitely" in the camp of "very long lines" (it's not just a normal
line that happens to be a bit long) yet at the same time it is hopefully
short enough that it shouldn't take too much time to process.


        Stefan






^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04  5:40                                                             ` Eli Zaretskii
@ 2022-08-04 22:35                                                               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-05  6:20                                                                 ` Eli Zaretskii
  2022-08-05 10:00                                                                 ` Gregory Heytings
  0 siblings, 2 replies; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-04 22:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, dgutov

>> The wholelines problem did not kick in because of PPS nor `syntax-ppss`
>> but because of font-lock (which then called `syntax-ppss` which then
>> called PPS).
> If it's font-lock that forces syntax-ppss to examine the whole huge
> line, then what is your proposal for avoiding that which doesn't
> involve some more-or-less arbitrary restrictions on the part of the
> buffer that can be examined by syntax-ppss?

The use of `syntax-wholeline-max` in
`font-lock-extend-region-wholelines` supposedly fixed this problem since
it changed `font-lock` so it doesn't ask `syntax-ppss` to compute the
whole line/buffer.
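
The idea of capping the whole-line extension can be sketched in standalone
C as follows.  This is a simplified illustration under assumptions, not the
actual Lisp logic of `font-lock-extend-region-wholelines`: the names
extend_to_lines_capped and max_extend are hypothetical, and the cap is
applied per edge here only to keep the example short.

```c
#include <stddef.h>

struct region
{
  size_t beg;   /* inclusive start */
  size_t end;   /* exclusive end */
};

/* Extend R outward to line boundaries in TEXT[0..LEN), but never move
   either edge by more than MAX_EXTEND characters.  Assumes
   r.beg <= r.end <= len.  */
static struct region
extend_to_lines_capped (const char *text, size_t len,
                        struct region r, size_t max_extend)
{
  size_t beg = r.beg;
  size_t end = r.end;

  /* Walk back toward the beginning of the line, up to the cap.  */
  while (beg > 0 && text[beg - 1] != '\n' && r.beg - beg < max_extend)
    beg--;

  /* Walk forward toward the end of the line, up to the cap.  */
  while (end < len && text[end] != '\n' && end - r.end < max_extend)
    end++;

  return (struct region) { beg, end };
}
```

On a short line the region grows to the full line; on a very long line it
stops at the cap, so whatever runs on the region afterwards is not handed
the whole line.  The price is that the region may start or end mid-line,
which is where misfontification can come from.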


        Stefan






^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04 13:09                                         ` Eli Zaretskii
@ 2022-08-05  1:39                                           ` Dmitry Gutov
  2022-08-05  7:38                                             ` Eli Zaretskii
  2022-08-05  8:21                                             ` Gregory Heytings
  0 siblings, 2 replies; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-05  1:39 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, gregory, monnier

On 04.08.2022 16:09, Eli Zaretskii wrote:

>> That's the scenario I described, and that's my point: this file's
>> display is sluggish. Even though font-lock has already finished its
>> work. And it didn't have to spend any significant time in syntax-ppss.
>>
>> So there is a particular performance problem with the display of
>> fontified buffers which I'd really like your help in fixing.
> 
> Maybe it is in display, and maybe it isn't.  Do you have any evidence
> that the sluggish response is due to redisplay?  C-n, for example, is
> mostly not redisplay, but Lisp code in simple.el and occasional calls
> to vertical-motion.

I came to that conclusion by observing said sluggish movement in a 
buffer that is fully fontified. And yet the delays after pressing 'C-n' 
or 'C-p' seemed noticeable and very similar to delays on 
PgUp/PgDown.

Are these delays in fontification-functions? That seems unlikely given 
the buffer is fontified already in full, and none of the commands are 
causing modifications (which would force syntax-ppss cache and 
fontifications to be recalculated).

If the delays are in font-lock anyway, that might be in the code which 
checks that the area between window-start and window-end is fontified. I 
don't see why that code has to be slow (long lines or not), so it should 
be fixable easily enough. Unless 'get-text-property' or 
'next-single-property-change' can exhibit pathologic performance in the 
presence of long lines, of course. But that doesn't show up in my testing.

> But even if the slow response is due to redisplay, we just have
> another cause that we need to investigate and try fixing.

It seems to me that that cause actually has larger impact than 
font-lock, because it does show itself in a moderately-sized (88K) 
buffer, where font-lock doesn't feel like a problem. It stands to reason 
that the same "cause" might have a proportionally bigger impact in large 
buffers as well, and only after we remove it (alone), then we can 
evaluate how font-lock itself affects user experience, and how much of 
its correctness (and for buffers of which size) we want to sacrifice.

> It says
> nothing about the measures we've already taken on master.  They
> definitely make even this case faster, and with an unoptimized build I
> can now reasonably edit this file, something I couldn't do before.

If my guess is right, the fix on master whammied all over the redisplay 
with narrowing, both fixing the "cause" and restricting font-lock to the 
same narrowed region. The latter part might be unnecessary in the usual 
case (we might still decide to do that later for much larger buffers, 
but that should be decided by a separate threshold variable).

>> Fixing in a way that doesn't add narrowing around
>> fontification-functions, because as we can see it's not necessary in
>> examples like this.
> 
> If that is possible, sure.  No one said that from now on every problem
> in Emacs that causes slow responses will be handled by narrowing.  But
> if, for example, it turns out that the slow response is due to the time
> it takes some code to traverse a long stretch of fontified buffer,
> what other solution would you suggest except making the portion to be
> traversed shorter?

For all I know, the most optimal fix might still be implemented through 
narrowing, but it would be temporarily widened while 
fontification-functions are run.

>>> If you dislike mis-fontification, turn font-lock mode off.  It's as easy
>>> as that.  Mis-fontification is expected in such cases.  The docstring of
>>> syntax-wholeline-max also mentions that "misfontification may then
>>> occur". Why did you not protest at that time?
>>
>> I think we could have both speed and correctness, at least for files of
>> this size.
> 
> That is not a given, and the experience till now suggests otherwise.

I have commented out the code which applies the narrowing in 
'handle_fontified_prop' and recompiled.

The result:

- My 88K file is fontified correctly now. The redisplay and scrolling 
performance seem unaffected (meaning still fast).
- dictionary.json (18M) seems to be fontified correctly as well now 
(it's a mess by default on master), its scrolling performance is 
unaffected too. The difference: I have to wait ~2 seconds the first time 
I press 'M->'.

BTW, 'M-> M-<' triggers some puzzling long wait (~3 seconds) both on 
master and with my change, every time I issue this sequence of commands.

diff --git a/src/xdisp.c b/src/xdisp.c
index 099efed2db..02d7f6c562 100644
--- a/src/xdisp.c
+++ b/src/xdisp.c
@@ -4391,19 +4391,19 @@ handle_fontified_prop (struct it *it)

        eassert (it->end_charpos == ZV);

-      if (current_buffer->long_line_optimizations_p)
-	{
-	  ptrdiff_t begv = it->narrowed_begv;
-	  ptrdiff_t zv = it->narrowed_zv;
-	  ptrdiff_t charpos = IT_CHARPOS (*it);
-	  if (charpos < begv || charpos > zv)
-	    {
-	      begv = get_narrowed_begv (it->w, charpos);
-	      zv = get_narrowed_zv (it->w, charpos);
-	    }
-	  narrow_to_region_internal (make_fixnum (begv), make_fixnum (zv), true);
-	  specbind (Qrestrictions_locked, Qt);
-	}
+      /* if (current_buffer->long_line_optimizations_p) */
+      /* 	{ */
+      /* 	  ptrdiff_t begv = it->narrowed_begv; */
+      /* 	  ptrdiff_t zv = it->narrowed_zv; */
+      /* 	  ptrdiff_t charpos = IT_CHARPOS (*it); */
+      /* 	  if (charpos < begv || charpos > zv) */
+      /* 	    { */
+      /* 	      begv = get_narrowed_begv (it->w, charpos); */
+      /* 	      zv = get_narrowed_zv (it->w, charpos); */
+      /* 	    } */
+      /* 	  narrow_to_region_internal (make_fixnum (begv), make_fixnum (zv), true); */
+      /* 	  specbind (Qrestrictions_locked, Qt); */
+      /* 	} */

        /* Don't allow Lisp that runs from 'fontification-functions'
  	 clear our face and image caches behind our back.  */





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-02 14:57                                           ` Gregory Heytings
  2022-08-02 16:14                                             ` Eli Zaretskii
  2022-08-02 22:04                                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-08-05  2:03                                             ` Dmitry Gutov
  2022-08-05  7:43                                               ` Eli Zaretskii
  2022-08-05  8:23                                               ` Gregory Heytings
  2 siblings, 2 replies; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-05  2:03 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, monnier

On 02.08.2022 17:57, Gregory Heytings wrote:
> 
>>
>> Regarding the long-standing bug reports, we did solve a bunch of 
>> issues already. One major one, IIUC, was redisplay of already 
>> fontified text on long lines.
>>
> 
> Try to open the dictionary.json with Emacs on master a month ago.  It's 
> a small file (only 18 MB).  On my computer just opening the file with 
> emacs -Q takes 220 seconds.  220 seconds during which Emacs is 
> completely locked, because of font-lock mode.  If you're not convinced, 
> turn font-lock mode off, open the file, and turn font-lock mode on.

I downloaded that file, and I commented out the code in 
'handle_fontified_prop' which performs the narrowing on master.

And recompiled, leaving all the other settings the same.

Visiting dictionary.json takes about 1 second.

M-> takes ~2 seconds, but only the first time (and until the next 
modification near the beginning of the buffer, I guess).

Scrolling is as fast as without my change. All fontification seems 
correct (which is not the case on master).

>> Another piece of the puzzle was added by Stefan in 15b2138719b340.
>>
> 
> That looked promising, but sadly it had only a very limited effect.
> 
>>
>> So perhaps we should re-evaluate the testing scenario to see where the 
>> current bottlenecks are. If the current main issue is the 55s spent in 
>> syntax-ppss, a more constructive approach would be to look into 
>> optimizing parse-partial-sexp. Or even give up on certain scenarios, 
>> admitting that waiting 55s once to visit the end of a 1 GB buffer is 
>> not so bad (and that part could also be sped up by setting 
>> syntax-propertize-function to nil and using a very simple syntax 
>> table, for instance).
>>
> 
> It is bad, especially now that it became clear that in fact it's not 
> "waiting 55s once" but "waiting 55s each time the buffer is modified and 
> you move to another position in the buffer".

That was about a 1 GB buffer, right?

Let's take care of buffers with more reasonable sizes first, and then we 
can consider extremes. A separate threshold for syntax-ppss to avoid 
parsing the whole buffer might fit the bill.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04 19:26                                                                 ` Gregory Heytings
@ 2022-08-05  6:05                                                                   ` Eli Zaretskii
  2022-08-05  9:37                                                                     ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-05  6:05 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, monnier, dgutov

> Date: Thu, 04 Aug 2022 19:26:03 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, monnier@iro.umontreal.ca, 
>     dgutov@yandex.ru
> 
> Alas, it becomes much more sluggish with a larger frame (160 columns 
> instead of 80), and/or with only Arabic characters.  Emacs takes ~10 
> seconds to open a file with only 6000 characters on a single line, and 
> motion commands are slow.

It doesn't surprise me.  If you disable font-lock and show-paren-mode,
does it become significantly faster?  And how does disabling font-lock
measure vs disabling auto-composition-mode with that file?

Can you post that file?  I'd like to try some ideas I might have with
it.

> >> You didn't tell me whether it's okay to merge the branch with the 
> >> latest changes?
> >
> > I think you can merge.
> 
> Done.

Thanks.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04 22:35                                                               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-08-05  6:20                                                                 ` Eli Zaretskii
  2022-08-05  9:03                                                                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-05 10:00                                                                 ` Gregory Heytings
  1 sibling, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-05  6:20 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, gregory, dgutov

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: gregory@heytings.org,  dgutov@yandex.ru,  56682@debbugs.gnu.org
> Date: Thu, 04 Aug 2022 18:35:36 -0400
> 
> >> The wholelines problem did not kick in because of PPS nor `syntax-ppss`
> >> but because of font-lock (which then called `syntax-ppss` which then
> >> called PPS).
> > If it's font-lock that forces syntax-ppss to examine the whole huge
> > line, then what is your proposal for avoiding that which doesn't
> > involve some more-or-less arbitrary restrictions on the part of the
> > buffer that can be examined by syntax-ppss?
> 
> The use of `syntax-wholeline-max` in
> `font-lock-extend-region-wholelines` supposedly fixed this problem since
> it changed `font-lock` so it doesn't ask `syntax-ppss` to compute the
> whole line/buffer.

It did?

And if it did, how is that better or different from a locked
narrowing?





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05  1:39                                           ` Dmitry Gutov
@ 2022-08-05  7:38                                             ` Eli Zaretskii
  2022-08-05  8:21                                             ` Gregory Heytings
  1 sibling, 0 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-05  7:38 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: gerd.moellmann, 56682, gregory, monnier

> Date: Fri, 5 Aug 2022 04:39:46 +0300
> Cc: gregory@heytings.org, gerd.moellmann@gmail.com, 56682@debbugs.gnu.org,
>  monnier@iro.umontreal.ca
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> >> So there is a particular performance problem with the display of
> >> fontified buffers which I'd really like your help in fixing.
> > 
> > Maybe it is in display, and maybe it isn't.  Do you have any evidence
> > that the sluggish response is due to redisplay?  C-n, for example, is
> > mostly not redisplay, but Lisp code in simple.el and occasional calls
> > to vertical-motion.
> 
> I came to that conclusion by observing said sluggish movement in a 
> buffer that is fully fontified. And yet the delays after pressing 'C-n' 
> or 'C-p' seemed noticeable and very similar to delays on 
> PgUp/PgDown.

There's much more to C-n/C-p/C-v/M-v than just fontifications and
redisplay.  If fontification-functions are out of the picture, it
doesn't yet follow that the only factor that is left is redisplay.

In general, redisplay _is_ slower when there are non-default faces in
the buffer, but (a) that is inevitable due to additional processing,
and (b) I hope you are not arguing against having font-lock faces, are
you?  The question, therefore, is whether the additional processing
due to font-lock faces, by itself, is or isn't a significant factor in
the slowdown you observe?  I don't see how that question could be
answered except by profiling.

> Are these delays in fontification-functions? That seems unlikely given 
> the buffer is fontified already in full

When the buffer is fully fontified, Emacs will not call
fontification-functions at all, unless the buffer becomes modified.

> If the delays are in font-lock anyway, that might be in the code which 
> checks that the area between window-start and window-end is fontified.

Unlikely.  That test uses the interval tree, and is reasonably fast.
But again, only profiling will tell the truth.

> I 
> don't see why that code has to be slow (long lines or not), so it should 
> be fixable easily enough. Unless 'get-text-property' or 
> 'next-single-property-change' can exhibit pathologic performance in the 
> presence of long lines, of course. But that doesn't show up in my testing.

get-text-property and its ilk don't treat newlines as special
characters, and don't even access buffer text at all.  So no, that is
most probably not a significant factor here.

> > But even if the slow response is due to redisplay, we just have
> > another cause that we need to investigate and try fixing.
> 
> It seems to me that that cause actually has larger impact than 
> font-lock, because it does show itself in a moderately-sized (88K) 
> buffer, where font-lock doesn't feel like a problem.

"Doesn't feel" like a problem?  On what is that feeling based?

We need profiling and other hard data.  It is impossible to argue
based on feelings.

> It stands to reason 
> that the same "cause" might have a proportionally bigger impact in large 
> buffers as well, and only after we remove it (alone), then we can 
> evaluate how font-lock itself affects user experience, and how much of 
> its correctness (and for buffers of which size) we want to sacrifice.

The buffer size shouldn't matter if lines are not too long, because
redisplay always examines a small portion of the buffer that fits in
the window, and sometimes also a couple of lines above and below.  If
lines _are_ long, then yes, their length (not the buffer size) is a
significant factor, because the redisplay algorithms many times need
to start from a previous line's beginning, to anchor their layout
calculations (because only there the X coordinate is known in
advance).
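
As a rough illustration of why line length rather than buffer size drives
this cost, consider the following standalone C sketch.  It is not Emacs
code: steps_back_to_line_start and the flat char array are assumptions made
for illustration, whereas the real display code works on the buffer gap and
iterator state and does much more per character than a comparison.

```c
#include <stddef.h>

/* Count how many characters must be stepped over to get from POS back
   to the beginning of the line containing it in TEXT.  The amount of
   work depends only on the distance to the previous newline, not on
   the total size of TEXT.  */
static size_t
steps_back_to_line_start (const char *text, size_t pos)
{
  size_t steps = 0;
  while (pos > 0 && text[pos - 1] != '\n')
    {
      pos--;
      steps++;
    }
  return steps;
}
```

In a buffer of any size made of short lines, the count stays small wherever
the position is.  In a buffer that is one single long line, it grows with
the position, and any layout work that must be redone from the line start
grows with it.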

That is the root cause which these changes we are discussing are
trying to eliminate or at least alleviate.

There are no other causes we know about.  If you claim that there are
such causes, you need to show that, not just by reasoning, but by
measurements, after you eliminate the already-known effects, like the
length of a physical line, character composition (when the script of
the characters requires it), etc.

And please keep in mind that without long lines, any slowdown factors
that we could have are unlikely to cause unreasonably slow responses,
because otherwise we'd have complaints about that long ago.  (In fact
we did have such complaints, and the problems they reported were fixed
in past releases.)  The only known situation with large buffers that
is yet unsolved in core is when the buffer is larger than the
available memory, so it causes paging when buffer text is accessed for
display or navigation.

The mental model you are building and on whose basis you are trying to
reason about the ways to solve this problem should take all of the
above into account.  Only after that you may have a chance of
identifying some hidden factor that eluded us, whose elimination could
then allow us to solve these problems without the narrowing we are now
using.

> > It says
> > nothing about the measures we've already taken on master.  They
> > definitely make even this case faster, and with an unoptimized build I
> > can now reasonably edit this file, something I couldn't do before.
> 
> If my guess is right, the fix on master whammied all over the redisplay 
> with narrowing, both fixing the "cause" and restricting font-lock to the 
> same narrowed region. The latter part might be unnecessary in the usual 
> case (we might still decide to do that later for much larger buffers, 
> but that should be decided by a separate threshold variable).

We need hard data, not guesses.  It is impractical to argue about
guesses, especially when they are based on incomplete or inaccurate
understanding of how the relevant code really behaves.  If you produce
measurements or other facts that contradict our understanding, then we
will be forced to reconsider and adapt to those facts.  For now, you
didn't yet say or show anything that amounts to such a contradiction.

> >> Fixing in a way that doesn't add narrowing around
> >> fontification-functions, because as we can see it's not necessary in
> >> examples like this.
> > 
> > If that is possible, sure.  No one said that from now on every problem
> > in Emacs that causes slow responses will be handled by narrowing.  But
> > if, for example, it turns out that the slow response is due to the time
> > it takes some code to traverse a long stretch of fontified buffer,
> > what other solution would you suggest except making the portion to be
> > traversed shorter?
> 
> For all I know, the most optimal fix might still be implemented through 
> narrowing, but it would be temporarily widened while 
> fontification-functions are run.

The first step of these changes didn't narrow when
fontification-functions were run.  It was still insufficient, because
font-lock still made Emacs extremely slow with long lines, whereas
disabling font-lock removed that slowness.  The next step then applied
narrowing to fontification-functions as well, and that solved the slow
cases.  This is how this activity proceeded, and this is why we are
reasonably sure at this point that fontification-functions _are_
indeed a significant slowdown factor when very long lines are
involved.

> >> I think we could have both speed and correctness, at least for files of
> >> this size.
> > 
> > That is not a given, and the experience till now suggests otherwise.
> 
> I have commented out the code which applies the narrowing in 
> 'handle_fontified_prop' and recompiled.
> 
> The result:
> 
> - My 88K file is fontified correctly now. The redisplay and scrolling 
> performance seem unaffected (meaning still fast).
> - dictionary.json (18M) seems to be fontified correctly as well now 
> (it's a mess by default on master), its scrolling performance is 
> unaffected too. The difference: I have to wait ~2 seconds the first time 
> I press 'M->'.

We consider 2 seconds of wait in this case to be "too slow".

But if all you are saying is that the value of long-line-threshold
should be changed, or that perhaps the portion of the buffer around
the window used for narrowing should be enlarged and/or exposed to
control of Lisp programs, we can discuss that.  My impression, though,
was that your arguments are much more basic: that they argue against
the very methods of solving this problem that we currently have on
master.  And that is an entirely different discussion than the one
about the default values of these thresholds.
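
For anyone who wants to experiment with the threshold discussed above,
here is a minimal sketch.  The helper name my/set-long-line-threshold
and the value 100000 are assumptions made for illustration only; the
only name taken from this thread is long-line-threshold, and nil is the
value used in the recipes in this thread to turn the optimizations off.

```elisp
;; Sketch only.  `long-line-threshold' is the variable named in this
;; thread; the helper name and the numeric value are hypothetical.
(defvar long-line-threshold)   ; defined in C in builds that have it

(defun my/set-long-line-threshold (value)
  "Set `long-line-threshold' to VALUE, a positive integer or nil.
A nil VALUE disables the long-line optimizations."
  (unless (or (null value) (and (integerp value) (> value 0)))
    (error "Expected a positive integer or nil, got %S" value))
  (setq long-line-threshold value))

;; Arbitrary example value, not a recommendation.
(my/set-long-line-threshold 100000)
```

Whether a larger value gives a better balance between response time and
font-lock correctness has to be measured on real files.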

> BTW, 'M-> M-<' triggers some puzzling long wait (~3 seconds) both on 
> master and with my change, every time I issue this sequence of commands.

It would be useful to look into the reasons for that, thanks.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05  2:03                                             ` Dmitry Gutov
@ 2022-08-05  7:43                                               ` Eli Zaretskii
  2022-08-05 11:34                                                 ` Dmitry Gutov
  2022-08-05  8:23                                               ` Gregory Heytings
  1 sibling, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-05  7:43 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, gregory, monnier

> Date: Fri, 5 Aug 2022 05:03:39 +0300
> Cc: 56682@debbugs.gnu.org, Eli Zaretskii <eliz@gnu.org>,
>  monnier@iro.umontreal.ca
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> That was about a 1 GB buffer, right?
> 
> Let's take care of buffers with more reasonable sizes first, and then we 
> can consider extremes.

We want to solve all of them.

> A separate threshold for syntax-ppss to avoid parsing the whole
> buffer might fit the bill.

Don't we already have such a threshold?

But again, if you are just saying that the current balance between
response time and font-lock correctness is sub-optimal, and should be
made better by changing the values of the thresholds, that's a
different discussion.  For that discussion, we'd need a representative
enough sample of real-life files with long lines and in as many major
modes as possible, to make our balance really close to optimal.  I,
for one, will welcome more examples of such files, especially if they
use major modes we didn't consider until now.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05  1:39                                           ` Dmitry Gutov
  2022-08-05  7:38                                             ` Eli Zaretskii
@ 2022-08-05  8:21                                             ` Gregory Heytings
  2022-08-05 10:49                                               ` Eli Zaretskii
  1 sibling, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-05  8:21 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: gerd.moellmann, 56682, Eli Zaretskii, monnier


>
> If my guess is right, the fix on master whammied all over the redisplay 
> with narrowing, both fixing the "cause" and restricting font-lock to the 
> same narrowed region.
>

No, as Eli told you, the restriction around font-lock was added after the 
other fixes, when it became clear that fontification-functions were the 
main cause of the remaining slowdowns.

>
> - dictionary.json (18M) seems to be fontified correctly as well now 
> (it's a mess by default on master), its scrolling performance is 
> unaffected too. The difference: I have to wait ~2 seconds the first time 
> I press 'M->'.
>

You have to wait 2 seconds.  I have to wait 4, and Eli has to wait 8. 
And for someone else, with older hardware, it might be 20.  And you'll see 
similar slowdowns when you modify the buffer at one place and move 
somewhere else.  This is hardly acceptable for a still relatively small 
file, if the only reason for that additional wait is to put colors on the 
buffer characters.

>
> BTW, 'M-> M-<' triggers some puzzling long wait (~3 seconds) both on 
> master and with my change, every time I issue this sequence of commands.
>

This is caused by show-paren-mode.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05  2:03                                             ` Dmitry Gutov
  2022-08-05  7:43                                               ` Eli Zaretskii
@ 2022-08-05  8:23                                               ` Gregory Heytings
  2022-08-05 12:19                                                 ` Dmitry Gutov
  1 sibling, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-05  8:23 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, Eli Zaretskii, monnier


>> "waiting 55s once" but "waiting 55s each time the buffer is modified 
>> and you move to another position in the buffer".
>
> That was about a 1 GB buffer, right?
>
> Let's take care of buffers with more reasonable sizes first, and then we 
> can consider extremes.
>

No, because the point of considering extreme cases is that they reveal on 
your computer what happens on other people's less powerful computers with 
much smaller files.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05  6:20                                                                 ` Eli Zaretskii
@ 2022-08-05  9:03                                                                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-05 10:57                                                                     ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-05  9:03 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, dgutov

>> The use of `syntax-wholeline-max` in
>> `font-lock-extend-region-wholelines` supposedly fixed this problem since
>> it changed `font-lock` so it doesn't ask `syntax-ppss` to compute the
>> whole line/buffer.
>
> It did?
> And if it did, how is that better or different from a locked
> narrowing?

In terms of end-user behavior, it's very similar: it can break the
`font-lock-keywords` part of font-lock but it still lets `syntax-ppss`
look at the whole buffer and will thus still provide correct recognition
of strings and comments, except when the major mode relies on
`syntax-propertize-function` since that one also obeys
`syntax-wholeline-max` and can thus misbehave in a similar way to the
narrowing.

The more important difference is that it can be
tweaked/changed/broken/improved by any ELisp package without
necessitating a recompilation of Emacs's C code, or ugly workarounds to
escape the narrowing, like postponing the actual font-lock to a timer or
some such.
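
As an illustration of that point, a minimal sketch of how a package
could adjust the limit from Lisp.  The function name
my/relax-wholeline-limit, the choice of js-mode-hook and the value
100000 are assumptions made for the example; only syntax-wholeline-max
comes from this discussion.

```elisp
;; Sketch only: raise the whole-line limit buffer-locally in one mode.
(defvar syntax-wholeline-max)  ; defined in syntax.el in recent Emacs

(defun my/relax-wholeline-limit ()
  "Raise `syntax-wholeline-max' in the current buffer only.
The value is an arbitrary assumption, not a recommendation."
  (setq-local syntax-wholeline-max 100000))

;; Hypothetical choice of hook; any major mode hook would do.
(add-hook 'js-mode-hook #'my/relax-wholeline-limit)
```

Because the setting is buffer-local, other buffers keep the default
value.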


        Stefan






^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05  6:05                                                                   ` Eli Zaretskii
@ 2022-08-05  9:37                                                                     ` Gregory Heytings
  2022-08-05 11:40                                                                       ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-05  9:37 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, monnier, dgutov


>
> It doesn't surprise me.  If you disable font-lock and show-paren-mode, 
> does it become significantly faster?  And how does disabling font-lock 
> measure vs disabling auto-composition-mode with that file?
>

Disabling show-paren-mode and font-lock has no effect, no.  What is 
surprising is that the speed seems to depend both on the mode and on the 
presence of bidirectionality (so it seems that after all bidi is involved 
in one way or another).

>
> Can you post that file?  I'd like to try some ideas I might have with 
> it.
>

There are now eight files:

1. https://www.heytings.org/data/arabic-large.json
2. https://www.heytings.org/data/arabic-large.json.txt
3. https://www.heytings.org/data/arabic-large.txt
4. https://www.heytings.org/data/arabic-large.txt.json
5. https://www.heytings.org/data/arabic-small.json
6. https://www.heytings.org/data/arabic-small.json.txt
7. https://www.heytings.org/data/arabic-small.txt
8. https://www.heytings.org/data/arabic-small.txt.json

Files 1 and 2, 3 and 4, 5 and 6, and 7 and 8 are pairwise identical; the 
only difference is the added extension.  Within each group (1 to 4 on the 
one hand, 5 to 8 on the other hand) the files are almost the same; the 
only difference is that in 1, 2, 5 and 6 the Arabic text is enclosed in 
'{"ar":"<arabic text>"}'.

Now what you'll see is that 5 is slow, 6 is also slow (so it's not only 
js-mode which is doing something wrong), 7 is fast, and 8 is again slow 
(so js-mode is perhaps doing something wrong).  Also, the motion commands 
C-n and C-p do not work as expected in 5, 6 and 8.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04 22:35                                                               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-05  6:20                                                                 ` Eli Zaretskii
@ 2022-08-05 10:00                                                                 ` Gregory Heytings
  1 sibling, 0 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-08-05 10:00 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Eli Zaretskii, dgutov


>> If it's font-lock that forces syntax-ppss to examine the whole huge 
>> line, then what is your proposal for avoiding that which doesn't 
>> involve some more-or-less arbitrary restrictions on the part of the 
>> buffer that can be examined by syntax-ppss?
>
> The use of `syntax-wholeline-max` in 
> `font-lock-extend-region-wholelines` supposedly fixed this problem since 
> it changed `font-lock` so it doesn't ask `syntax-ppss` to compute the 
> whole line/buffer.
>

Actually, it didn't: it made things better in some cases, but (much) worse 
in other cases.  You may have seen the recipes I sent to Dmitry a few days 
ago:

emacs -Q
M-: (setq long-line-threshold nil syntax-wholeline-max most-positive-fixnum) RET
C-x C-f dictionary.json RET y ;; takes 160 seconds
C-e ;; takes 200 seconds

emacs -Q
M-: (setq long-line-threshold nil) RET
C-x C-f dictionary.json RET y ;; immediate
C-e ;; not finished after 1200 seconds (20 minutes), I killed Emacs

emacs -Q
C-x C-f dictionary.json RET y ;; immediate
C-e ;; immediate
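
To reproduce timings like the ones above without a stopwatch, a small
helper along the following lines might be used.  It is only a sketch:
the macro name my/time-command is hypothetical, and forcing a redisplay
with (redisplay t) is an assumption about what should be included in
the measurement.

```elisp
;; Sketch only: time a command plus the redisplay that follows it.
(require 'benchmark)

(defmacro my/time-command (&rest body)
  "Run BODY once, force a redisplay, and return the elapsed seconds."
  `(car (benchmark-run 1 ,@body (redisplay t))))

;; Example use, e.g. from M-: in the buffer visiting dictionary.json:
;;   (my/time-command (end-of-line))
```

The value returned is wall-clock time in seconds as reported by
benchmark-run, so it includes any garbage collection that happens
during the command.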





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05  8:21                                             ` Gregory Heytings
@ 2022-08-05 10:49                                               ` Eli Zaretskii
  0 siblings, 0 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-05 10:49 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, monnier, dgutov

> Date: Fri, 05 Aug 2022 08:21:30 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: Eli Zaretskii <eliz@gnu.org>, gerd.moellmann@gmail.com, 
>     56682@debbugs.gnu.org, monnier@iro.umontreal.ca
> 
> > BTW, 'M-> M-<' triggers some puzzling long wait (~3 seconds) both on 
> > master and with my change, every time I issue this sequence of commands.
> >
> 
> This is caused by show-paren-mode.

Interesting.  I guess the problematic part is the code in
show-paren-mode which searches for the matching parenthesis, not the
part that puts the overlay on the buffer?  IOW, does making
blink-matching-paren-distance smaller shorten the delay?
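
One way to check this is sketched below.  It assumes that calling
show-paren-function directly is representative of what show-paren-mode
does when Emacs becomes idle; the helper name my/time-show-paren and
the distances in the usage comment are hypothetical.

```elisp
;; Sketch only: time the paren search with a given search distance.
(require 'benchmark)
(require 'paren)

(defun my/time-show-paren (distance)
  "Return the seconds `show-paren-function' takes at point.
`blink-matching-paren-distance' is bound to DISTANCE for the call."
  (let ((blink-matching-paren-distance distance))
    (car (benchmark-run 1 (show-paren-function)))))

;; Example use at the first character of dictionary.json:
;;   (my/time-show-paren 1000)
;;   (my/time-show-paren 102400)
```

If the two calls take a similar time, the search distance is probably
not the dominant factor.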





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05  9:03                                                                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-08-05 10:57                                                                     ` Eli Zaretskii
  2022-08-05 12:06                                                                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-05 10:57 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, gregory, dgutov

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: gregory@heytings.org,  dgutov@yandex.ru,  56682@debbugs.gnu.org
> Date: Fri, 05 Aug 2022 05:03:25 -0400
> 
> >> The use of `syntax-wholeline-max` in
> >> `font-lock-extend-region-wholelines` supposedly fixed this problem since
> >> it changed `font-lock` so it doesn't ask `syntax-ppss` to compute the
> >> whole line/buffer.
> >
> > It did?
> > And if it did, how is that better or different from a locked
> > narrowing?
> 
> In terms of end-user behavior, it's very similar: it can break the
> `font-lock-keywords` part of font-lock but it still lets `syntax-ppss`
> look at the whole buffer and will thus still provide correct recognition
> of strings and comments, except when the major mode relies on
> `syntax-propertize-function` since that one also obeys
> `syntax-wholeline-max` and can thus misbehave in a similar way to the
> narrowing.
> 
> The more important difference is that it can be
> tweaked/changed/broken/improved by any ELisp package without
> necessitating a recompilation of Emacs's C code, or ugly workarounds to
> escape the narrowing, like postponing the actual font-lock to a timer or
> some such.

AFAIK, the problem is not entirely solved by syntax-wholeline-max.  If
and when it is solved, we could revisit this issue.  However, since
syntactic fontifications are invoked by a major-mode's font-lock
setup, there's still a problem of how to prevent the rest of font-lock
from causing significant slowdown.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05  7:43                                               ` Eli Zaretskii
@ 2022-08-05 11:34                                                 ` Dmitry Gutov
  2022-08-05 11:48                                                   ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-05 11:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, monnier

On 05.08.2022 10:43, Eli Zaretskii wrote:
>> Date: Fri, 5 Aug 2022 05:03:39 +0300
>> Cc: 56682@debbugs.gnu.org, Eli Zaretskii <eliz@gnu.org>,
>>   monnier@iro.umontreal.ca
>> From: Dmitry Gutov <dgutov@yandex.ru>
>>
>> That was about a 1 GB buffer, right?
>>
>> Let's take care of buffers with more reasonable sizes first, and then we
>> can consider extremes.
> 
> We want to solve all of them.

I didn't say we don't. But different issues call for different solutions.

>> A separate threshold for syntax-ppss to avoid parsing the whole
>> buffer might fit the bill.
> 
> Don't we already have such a threshold?

Not exactly: the buffer is still fully parsed by parse-partial-sexp 
(once). AFAICT, the variable makes the application of 
syntax-propertize-rules more lax, but at least it keeps counting the 
simple parens/quotes from the beginning of the buffer. That's why the 
fontification remained correct in both examples that I posted.

The time it takes to (parse-partial-sexp 1 (point-max)) accounts for the 
whole fontification delay at the end of dictionary.json.
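
That measurement can be repeated with a sketch like the following; the
function name my/time-full-parse is hypothetical, and the form is meant
to be evaluated with M-: in the buffer of interest.

```elisp
;; Sketch only: time a full parse of the current buffer.
(require 'benchmark)

(defun my/time-full-parse ()
  "Return the seconds `parse-partial-sexp' needs for the whole buffer."
  (save-excursion
    (save-restriction
      (widen)
      (car (benchmark-run 1
             (parse-partial-sexp (point-min) (point-max)))))))
```

Comparing the result with the delay seen on the first 'M->' gives an
idea of how much of that delay is spent in the parser.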

Now, if nobody manages to speed up parse-partial-sexp itself further, we 
can add an additional tweak/size threshold, after which syntax-ppss 
won't parse the whole buffer anymore. But if we do that in Lisp, we can 
later improve that bit of logic so that the result is not entirely 
arbitrary, like it is now on master with dictionary.json.

> But again, if you are just saying that the current balance between
> response time and font-lock correctness is sub-optimal, and should be
> made better by changing the values of the thresholds, that's a
> different discussion.  For that discussion, we'd need a representative
> enough sample of real-life files with long lines and in as many major
> modes as possible, to make our balance really close to optimal.  I,
> for one, will welcome more examples of such files, especially if they
> use major modes we didn't consider until now.

We really have different problems and thus need different solutions for 
them. Not just one blunt instrument.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05  9:37                                                                     ` Gregory Heytings
@ 2022-08-05 11:40                                                                       ` Eli Zaretskii
  2022-08-05 11:50                                                                         ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-05 11:40 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, monnier, dgutov

> Date: Fri, 05 Aug 2022 09:37:35 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, monnier@iro.umontreal.ca, 
>     dgutov@yandex.ru
> 
> > It doesn't surprise me.  If you disable font-lock and show-paren-mode, 
> > does it become significantly faster?  And how does disabling font-lock 
> > measure vs disabling auto-composition-mode with that file?
> 
> Disabling show-paren-mode and font-lock has no effect, no.  What is 
> surprising is that the speed seems to depend both on the mode and on the 
> presence of bidirectionality (so it seems that after all bidi is involved 
> in one way or another).

That the presence of R2L characters makes this slower shouldn't
surprise.  The bidi iteration through buffer text _can_ be non-linear,
but it's only _actually_ non-linear when R2L characters are present.
So the complications with handle_stop_backwards and
compute_stop_backwards actually happen only when there are R2L
characters in the buffer; otherwise the iterator behaves exactly like
the unidirectional display in Emacs 23 and before did: it examines
characters one by one in strict increasing order of buffer positions,
and skips or shortcuts some processing that is only needed for truly
bidirectional text.

IOW, the display engine is specially optimized for the very frequent
case of buffers that don't include R2L characters, and is expected to
be slower otherwise.

If font-lock doesn't produce any tangible difference, it probably
means that the amount of stop-positions added due to faces is
negligible when compared to the stop-positions due to character
composition.  Which I guess is also reasonable with a script like
Arabic.

> 1. https://www.heytings.org/data/arabic-large.json
> 2. https://www.heytings.org/data/arabic-large.json.txt
> 3. https://www.heytings.org/data/arabic-large.txt
> 4. https://www.heytings.org/data/arabic-large.txt.json
> 5. https://www.heytings.org/data/arabic-small.json
> 6. https://www.heytings.org/data/arabic-small.json.txt
> 7. https://www.heytings.org/data/arabic-small.txt
> 8. https://www.heytings.org/data/arabic-small.txt.json
> 
> 1 and 2, 3 and 4, 5 and 6, 7 and 8 are the same file, the only difference 
> is the added extension.  1, 2, 3 and 4 on the one hand, and 5, 6, 7 and 8 
> on the other hand, are almost the same file, the only difference is that 
> in 1 and 2 and 5 and 6 the arabic text is enclosed into '{"ar":"<arabic 
> text>"}'.
> 
> Now what you'll see is that 5 is slow, 6 is also slow (so it's not only 
> js-mode which is doing something wrong), 7 is fast, and 8 is again slow 
> (so js-mode is perhaps doing something wrong).  Also, the motion commands 
> C-n and C-p do not work as expected in 5, 6 and 8.

Thanks, I will use these.

There's (at least) one more aspect of this, as long as Text mode is
being used: Text mode doesn't force bidi-paragraph-direction to be
left-to-right, whereas all descendants of prog-mode, including
js-mode, do.  Leaving bidi-paragraph-direction at nil means Emacs
needs to determine the base paragraph direction each time it's about
to redisplay a window, and that might be expensive, especially in a
large buffer without any paragraph breaks (by default, an empty line),
because that is determined by the first strong directional character
of the paragraph.  So for a more fair comparison with Text mode, you
should set bidi-paragraph-direction to the value left-to-right in
text-mode buffers.
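
A minimal sketch of that setting, for use in an init file or a test
session.  The function name my/force-ltr-paragraphs is hypothetical;
bidi-paragraph-direction becomes buffer-local when set, so a plain setq
in the mode hook is enough.

```elisp
;; Sketch only: make Text mode buffers use a fixed base direction.
(defun my/force-ltr-paragraphs ()
  "Force left-to-right base paragraph direction in the current buffer."
  (setq bidi-paragraph-direction 'left-to-right))

(add-hook 'text-mode-hook #'my/force-ltr-paragraphs)
```

Buffers already in Text mode are not affected until the mode is
re-enabled or the variable is set in them by hand.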





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 11:34                                                 ` Dmitry Gutov
@ 2022-08-05 11:48                                                   ` Eli Zaretskii
  2022-08-05 12:08                                                     ` Dmitry Gutov
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-05 11:48 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, gregory, monnier

> Date: Fri, 5 Aug 2022 14:34:12 +0300
> Cc: gregory@heytings.org, 56682@debbugs.gnu.org, monnier@iro.umontreal.ca
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> We really have different problems and thus need different solutions for 
> them. Not just one blunt instrument.

The current opinion of both the head maintainers and of Gregory is
that these are all parts of the same problem, and a single class of
solutions can solve most of them.  The problem being that many
portions of Emacs code involved in navigation and redisplay don't
expect lines to be too long, and therefore employ algorithms that
don't scale well with line length.  Preventing such code from going
far back to the beginning of the previous line, and then coming back
through all that text, is therefore an idea that should appear very
reasonable.  It also works surprisingly well in practice, at least
according to what we know at this point.

I get it that you disagree, but I haven't seen any real data behind
your dissenting opinions, and thus I don't yet see any reason to
reconsider changing the direction of development in this regard.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 11:40                                                                       ` Eli Zaretskii
@ 2022-08-05 11:50                                                                         ` Gregory Heytings
  2022-08-05 13:43                                                                           ` Eli Zaretskii
                                                                                             ` (2 more replies)
  0 siblings, 3 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-08-05 11:50 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gerd.moellmann, 56682, monnier, dgutov


>
> There's (at least) one more aspect of this, as long as Text mode is 
> being used: Text mode doesn't force bidi-paragraph-direction to be 
> left-to-right, whereas all descendants of prog-mode, including js-mode, 
> do.  Leaving bidi-paragraph-direction at nil means Emacs needs to 
> determine the base paragraph direction each time it's about to redisplay 
> a window, and that might be expensive, especially in a large buffer 
> without any paragraph breaks (by default, an empty line), because that 
> is determined by the first strong directional character of the 
> paragraph.  So for a more fair comparison with Text mode, you should set 
> bidi-paragraph-direction to the value left-to-right in text-mode 
> buffers.
>

Indeed, that seems to be the culprit here; I didn't know that text-mode 
was an exception.  If I set bidi-paragraph-direction to 
'left-to-right after visiting the arabic-small.txt file, Emacs 
(mis)behaves like it does for the other Arabic files: it becomes slow, and 
C-n and C-p do not work correctly anymore.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 10:57                                                                     ` Eli Zaretskii
@ 2022-08-05 12:06                                                                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-05 12:16                                                                         ` Gregory Heytings
  2022-08-05 13:05                                                                         ` Eli Zaretskii
  0 siblings, 2 replies; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-05 12:06 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, dgutov

> AFAIK, the problem is not entirely solved by syntax-wholeline-max.

No, indeed.

> If and when it is solved, we could revisit this issue.

The locked narrowing currently in effect makes it impossible to
investigate this problem or improve the behavior :-(


        Stefan






^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 11:48                                                   ` Eli Zaretskii
@ 2022-08-05 12:08                                                     ` Dmitry Gutov
  2022-08-05 12:20                                                       ` Gregory Heytings
  2022-08-05 14:16                                                       ` Eli Zaretskii
  0 siblings, 2 replies; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-05 12:08 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, monnier

On 05.08.2022 14:48, Eli Zaretskii wrote:
>> Date: Fri, 5 Aug 2022 14:34:12 +0300
>> Cc: gregory@heytings.org, 56682@debbugs.gnu.org, monnier@iro.umontreal.ca
>> From: Dmitry Gutov <dgutov@yandex.ru>
>>
>> We really have different problems and thus need different solutions for
>> them. Not just one blunt instrument.
> 
> The current opinion of both the head maintainers and of Gregory is
> that these are all parts of the same problem, and a single class of
> solutions can solve most of them.

This kind of approach fails to optimize for the behavior in medium-sized 
files, like the downloadify.js I showed previously.

Simply customizing long-line-threshold to a much higher value will bring 
back redisplay stutters on C-n/C-p/etc, which *were* a real problem that 
some of Gregory's changes solved.

> The problem being that many
> portions of Emacs code involved in navigation and redisplay don't
> expect lines to be too long, and therefore employ algorithms that
> don't scale well with line length.

As I demonstrated, font-lock itself doesn't have that issue.

Furthermore, the performance problem with syntax-ppss which we are 
talking about now doesn't have anything to do with long lines.

Go ahead and pretty-print dictionary.json (you can use 'M-x 
json-pretty-print', write the buffer to a new file, then re-visit it). 
There won't be any long lines in the resulting file, but 'M->' will 
still make you wait a few seconds the first time.
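
The same steps can be scripted; the following is only a sketch, with a
hypothetical function name and hypothetical file names in the usage
comment.  It uses json-pretty-print-buffer from json.el.

```elisp
;; Sketch only: write a pretty-printed copy of a JSON file.
(require 'json)

(defun my/json-pretty-print-to-file (source target)
  "Write a pretty-printed copy of the JSON file SOURCE to TARGET."
  (with-temp-buffer
    (insert-file-contents source)
    (json-pretty-print-buffer)
    (write-region (point-min) (point-max) target nil 'silent)))

;; Example use (file names are assumptions):
;;   (my/json-pretty-print-to-file "dictionary.json"
;;                                 "dictionary-pretty.json")
```

Visiting the resulting file then shows whether the wait on 'M->'
depends on line length at all.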

> Preventing such code from going
> far back to the beginning of the previous line, and then coming back
> through all that text, is therefore an idea that should appear very
> reasonable.  It also works surprisingly well in practice, at least
> according to what we know at this point.
> 
> I get it that you disagree, but I haven't seen any real data behind
> your dissenting opinions, and thus I don't yet see any reason to
> reconsider changing the direction of development in this regard.

I don't understand why you dismiss the more subtle approach which still 
seems to reach the stated goals.

Gregory's changes, along with my suggested tweak, indeed work 
"surprisingly well" already. All without breaking font-lock in the 
common case.

Like, we're going from a 255 (?) second delay to 2 second delay already 
without breaking fontification. And yet you're eager to go from 2 
seconds down to ~0 and sacrifice highlighting correctness?





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 12:06                                                                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-08-05 12:16                                                                         ` Gregory Heytings
  2022-08-05 12:21                                                                           ` Dmitry Gutov
  2022-08-05 13:05                                                                         ` Eli Zaretskii
  1 sibling, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-05 12:16 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Eli Zaretskii, dgutov


>> If and when it is solved, we could revisit this issue.
>
> The locked narrowing currently in effect makes it impossible to 
> investigate this problem or improve the behavior :-(
>

How so?  It is fully backward compatible, just (setq long-line-threshold 
nil).  If what you want is only to remove the locked narrowing, comment 
out the relevant line in xdisp.c.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05  8:23                                               ` Gregory Heytings
@ 2022-08-05 12:19                                                 ` Dmitry Gutov
  2022-08-05 14:18                                                   ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-05 12:19 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, monnier

On 05.08.2022 11:23, Gregory Heytings wrote:
> No, because the point of considering extreme cases is that they reveal 
> on your computer what happens on other people's less powerful computers 
> with much smaller files.

The problem with considering extreme cases, however, is that one can 
also provide an even more extreme case no matter how we optimize our 
implementations, which, with the chosen approach, will force us to 
cripple or drop most features outright.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 12:08                                                     ` Dmitry Gutov
@ 2022-08-05 12:20                                                       ` Gregory Heytings
  2022-08-05 12:50                                                         ` Dmitry Gutov
  2022-08-05 14:16                                                       ` Eli Zaretskii
  1 sibling, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-05 12:20 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, Eli Zaretskii, monnier


>
> Like, we're going from a 255 (?) second delay to 2 second delay already 
> without breaking fontification. And yet you're eager to go from 2 
> seconds down to ~0 and sacrifice highlighting correctness?
>

Yes.  Because as I told you your 2 seconds are 4 for me and 8 for Eli and 
20 for someone else.  And that's in a relatively small file.

Note that if it were 2/4/8/20 seconds once, and then no further slowdowns 
while editing the file, that would perhaps be okay.  But that's not the 
case, you will regularly see a similar 2/4/8/20 seconds delay.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 12:16                                                                         ` Gregory Heytings
@ 2022-08-05 12:21                                                                           ` Dmitry Gutov
  2022-08-05 12:42                                                                             ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-05 12:21 UTC (permalink / raw)
  To: Gregory Heytings, Stefan Monnier; +Cc: 56682, Eli Zaretskii

On 05.08.2022 15:16, Gregory Heytings wrote:
> 
>>> If and when it is solved, we could revisit this issue.
>>
>> The locked narrowing currently in effect makes it impossible to 
>> investigate this problem or improve the behavior :-(
>>
> 
> How so?  It is fully backward compatible, just (setq long-line-threshold 
> nil).  If what you want is only to remove the locked narrowing, comment 
> out the relevant line in xdisp.c.

That brings back the performance problems in redisplay, which have a 
more pronounced effect, overshadowing both issues and potential 
improvements in font-lock.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-04  1:41                                                     ` Gregory Heytings
@ 2022-08-05 12:28                                                       ` Dmitry Gutov
  0 siblings, 0 replies; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-05 12:28 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, monnier

On 04.08.2022 04:41, Gregory Heytings wrote:
> 
>>
>> We cannot perform every basic operation in fixed time for any 
>> arbitrarily sized file. There are limits of what we can possibly do.
>>
> 
> Apparently the limits are lower than what you think.  Provided that we 
> accept some compromises, such as mis-fontification, which is also what 
> syntax-wholeline-max does, and against which you didn't protest.

syntax-wholeline-max can indeed potentially cause problems too, but in 
a much narrower range of situations (and only with major modes which have 
a non-trivial syntax-propertize-function).

In any case, I would be happy to investigate further improvements in 
Lisp which would be implemented in the same area of code that 
syntax-wholeline-max lives in.

>> I'd really like it if we could scope this discussion to solving that 
>> particular problem. Not the speed of operations in large files in 
>> general.
>>
> 
> I don't understand what you mean.  Which "particular problem"?  The 
> point of this discussion is of course the speed of operations in large 
> files in general.

The particular problem is long lines. It is not the same as "large files 
in general".

> If you take that out of the picture, everything is of 
> course possible. I'm not even sure what remains in fact, Emacs is an 
> editor, not a displayer.

The well-known problem we have had for a while is that Emacs screeches 
to a halt even on medium-sized buffers as soon as it encounters a long line.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 12:21                                                                           ` Dmitry Gutov
@ 2022-08-05 12:42                                                                             ` Gregory Heytings
  0 siblings, 0 replies; 416+ messages in thread
From: Gregory Heytings @ 2022-08-05 12:42 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, Eli Zaretskii, Stefan Monnier


>>> The locked narrowing currently in effect makes it impossible to 
>>> investigate this problem or improve the behavior :-(
>> 
>> How so?  It is fully backward compatible, just (setq 
>> long-line-threshold nil).  If what you want is only to remove the 
>> locked narrowing, comment out the relevant line in xdisp.c.
>
> That brings back the performance problems in redisplay, which have a 
> more pronounced effect, overshadowing both issues and potential 
> improvements in font-lock.
>

The first option does, the second does not.

^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 12:20                                                       ` Gregory Heytings
@ 2022-08-05 12:50                                                         ` Dmitry Gutov
  2022-08-05 13:00                                                           ` Gregory Heytings
  2022-08-05 13:17                                                           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 2 replies; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-05 12:50 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, monnier

On 05.08.2022 15:20, Gregory Heytings wrote:
> 
>>
>> Like, we're going from a 255 (?) second delay to 2 second delay 
>> already without breaking fontification. And yet you're eager to go 
>> from 2 seconds down to ~0 and sacrifice highlighting correctness?
>>
> 
> Yes.  Because as I told you your 2 seconds are 4 for me and 8 for Eli 
> and 20 for someone else.  And that's in a relatively small file.

Is it a "relatively small file" if we've effectively been unable to edit 
such files for all of 40 years of Emacs's existence?

> Note that if it were 2/4/8/20 seconds once, and then no further 
> slowdowns while editing the file, that would perhaps be okay.  But 
> that's not the case, you will regularly see a similar 2/4/8/20 seconds 
> delay.

Do our users regularly edit 30MB files? And do a lot of changes in them? 
In different areas?

Note that, again, to see the same delay you would have to edit that file 
near the beginning, and then visit its end again.

If I did that, though, I'm not sure whether I would be more 
inconvenienced by performance, or by broken syntax highlighting and sexp 
navigation. font-lock is not just eye candy: it also assists you when 
editing code.

For instance, if I were to edit dictionary.json, I might have needed to 
look for a certain key and change it somewhere. But if the said key is 
highlighted as a part of a string value in some places, and only as a 
key in some others, that can look and feel very puzzling, and slow down 
my work just the same.

Similarly, if I'm editing a large JSON file, I might want to write a 
small Lisp program which searches for a word, checks that it's inside a 
string (or, conversely, outside and thus looks like a key), makes all 
the necessary changes in an automated fashion, and saves the buffer. A 
broken syntax-ppss wouldn't let me do that.
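(Editorial sketch, not part of the original message: a minimal version of 
the kind of small Lisp program described above, assuming the buffer's 
syntax table treats the double quote as a string delimiter. The name 
my/replace-in-strings is hypothetical. It relies on (nth 3 (syntax-ppss)) 
being non-nil inside a string, which is exactly the information that 
becomes unreliable when the parse cannot start from the beginning of the 
buffer.)

```elisp
(defun my/replace-in-strings (from to)
  "Replace FROM with TO wherever the match starts inside a string.
Hypothetical helper for illustration.  Return the number of replacements."
  (save-excursion
    (goto-char (point-min))
    (let ((count 0))
      (while (search-forward from nil t)
        ;; `syntax-ppss' moves point and may clobber the match data,
        ;; so protect both before asking for the parse state.
        (when (save-match-data
                (save-excursion
                  ;; Element 3 of the state is non-nil inside a string.
                  (nth 3 (syntax-ppss (match-beginning 0)))))
          (replace-match to t t)
          (setq count (1+ count))))
      count)))
```

Calling (my/replace-in-strings "foo" "bar") in the buffer to edit leaves 
occurrences outside strings untouched; inverting the test would select 
the occurrences outside strings instead.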

Finally, yes, for some buffer size the initial wait is going to be too 
much. But that can have a separate solution with a separate threshold.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-01 11:58                               ` Eli Zaretskii
  2022-08-02  8:10                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-08-05 12:59                                 ` Dmitry Gutov
  2022-08-05 14:20                                   ` Eli Zaretskii
  1 sibling, 1 reply; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-05 12:59 UTC (permalink / raw)
  To: Eli Zaretskii, Stefan Monnier; +Cc: 56682, gregory

On 01.08.2022 14:58, Eli Zaretskii wrote:

> As I wrote elsewhere, I'm okay with extending 'widen' so that it could
> "unlock" the locked narrowing, which could then be used in major modes
> that convince us their performance is adequate (or clearly announce in
> their docs that they don't care about files with long lines ;-).

And to address the idea of "unlocking" the narrowing: I think I have 
demonstrated that the remaining slowdown can be caused purely by the 
length of the buffer and how long 'parse-partial-sexp' takes to parse 
it. That part doesn't have much to do with individual modes.

And of course the more, let's say, *complex* modes like CC Mode will opt 
for unlocking narrowing right away because its font-lock logic has to 
jump around to previously-saved positions in its syntax cache, which 
will inevitably spam errors here and there when those positions are not 
accessible.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 12:50                                                         ` Dmitry Gutov
@ 2022-08-05 13:00                                                           ` Gregory Heytings
  2022-08-05 13:11                                                             ` Dmitry Gutov
  2022-08-05 13:17                                                           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-05 13:00 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, Eli Zaretskii, monnier


>> Yes.  Because as I told you your 2 seconds are 4 for me and 8 for Eli 
>> and 20 for someone else.  And that's in a relatively small file.
>
> Is it a "relatively small file" if we've effectively been unable to edit 
> such files for all of 40 years of Emacs's existence?
>

There are many other editors out there, aren't there?

>> Note that if it were 2/4/8/20 seconds once, and then no further 
>> slowdowns while editing the file, that would perhaps be okay.  But 
>> that's not the case, you will regularly see a similar 2/4/8/20 seconds 
>> delay.
>
> Do our users regularly edit 30MB files? And do a lot of changes in them? 
> In different areas?
>

You seem to think they do not.  Then why is it a problem if such files are 
mis-fontified?

^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 12:06                                                                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-05 12:16                                                                         ` Gregory Heytings
@ 2022-08-05 13:05                                                                         ` Eli Zaretskii
  1 sibling, 0 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-05 13:05 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, gregory, dgutov

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: gregory@heytings.org,  dgutov@yandex.ru,  56682@debbugs.gnu.org
> Date: Fri, 05 Aug 2022 08:06:32 -0400
> 
> > If and when it is solved, we could revisit this issue.
> 
> The locked narrowing currently in effect makes it impossible to
> investigate this problem or improve the behavior :-(

??? Just set long-line-threshold to nil, and you can investigate all
you want.
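
(Editorial sketch, not part of the original message: the setting referred 
to above, as it could be evaluated with M-: or placed in an init file. It 
is an assumption here that a buffer in which long lines were already 
detected may need to be killed and re-visited before the change takes 
effect.)

```elisp
;; Turn off the long-line optimizations, and with them the locked
;; narrowing, for the rest of the session.
(setq long-line-threshold nil)
```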





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 13:00                                                           ` Gregory Heytings
@ 2022-08-05 13:11                                                             ` Dmitry Gutov
  0 siblings, 0 replies; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-05 13:11 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, monnier

On 05.08.2022 16:00, Gregory Heytings wrote:
>>> Note that if it were 2/4/8/20 seconds once, and then no further 
>>> slowdowns while editing the file, that would perhaps be okay.  But 
>>> that's not the case, you will regularly see a similar 2/4/8/20 
>>> seconds delay.
>>
>> Do our users regularly edit 30MB files? And do a lot of changes in 
>> them? In different areas?
>>
> 
> You seem to think they do not.  Then why is it a problem if such files 
> are mis-fontified?

From experience, I might visit a medium-to-large sized file, and I 
might search for a particular identifier inside it. Much more rarely, I 
would apply some edit inside it, in one place, and then save the buffer.

On balance, there is likely more reading than editing involved. 
That's why I think font-lock should be assigned more importance.

In all likelihood, the file I would visit would be even smaller than 
30MB, but big enough that the majority of your improvements will be 
noticeable and welcome there. Just not the part that breaks font-lock.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 12:50                                                         ` Dmitry Gutov
  2022-08-05 13:00                                                           ` Gregory Heytings
@ 2022-08-05 13:17                                                           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-05 13:30                                                             ` Dmitry Gutov
  1 sibling, 1 reply; 416+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-08-05 13:17 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, Gregory Heytings, Eli Zaretskii

> Do our users regularly edit 30MB files?

Thanks to Gregory's changes, it may start becoming more common.

> And do a lot of changes in them?

In my experience, not very much, no.  Usually large files are
machine-generated and rarely edited by hand.  But there are still
relevant use-cases like opening a large JSON/XML/younameit file and
applying a search&replace or a keyboard macro to it.

I think my machines are slow enough (and my Emacs has enough extra
sluggishness thanks to the many assertion checks compiled into it)
that I'd be better served with an approach like that of so-long where
large files use a dumbed down major mode to avoid most sources of extra
slowdown like font-lock.

What I'm not sure of is how useful is a "font-lock with arbitrary
narrowing", where portions will be highlighted as strings rather than
code (and vice-versa).  I don't have enough experience with it yet to
be sure.  Taking a step back, I suspect that the only "real" solution is
something like `jit-lock-defer` coupled with a way to perform the
font-lock (and syntax-ppss/propertize) in the background.
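
(Editorial sketch, not part of the original message: the existing jit-lock 
options that come closest to this idea, assuming `jit-lock-defer` above 
refers to the user option jit-lock-defer-time. Stealth fontification runs 
from idle timers in the main thread, so it only approximates real 
background work. The numeric values are arbitrary examples.)

```elisp
;; Defer on-the-fly fontification until Emacs has been idle for a
;; quarter of a second, so that typing and scrolling are not blocked.
(setq jit-lock-defer-time 0.25)

;; Fontify the not-yet-displayed parts of the buffer while Emacs is
;; idle, starting after 16 seconds of idle time.
(setq jit-lock-stealth-time 16)
```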


        Stefan






^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 13:17                                                           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-08-05 13:30                                                             ` Dmitry Gutov
  2022-08-05 13:41                                                               ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-05 13:30 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 56682, Gregory Heytings, Eli Zaretskii

On 05.08.2022 16:17, Stefan Monnier wrote:
> What I'm not sure of is how useful is a "font-lock with arbitrary
> narrowing", where portions will be highlighted as strings rather than
> code (and vice-versa).  I don't have enough experience with it yet to
> be sure.  Taking a step back, I suspect that the only "real" solution is
> something like `jit-lock-defer` coupled with a way to perform the
> font-lock (and syntax-ppss/propertize) in the background.

IIRC I heard that "some other editors" take the approach of restricting 
syntax-highlighting to just the beginning of a large file.

10000 seems too low, but if the 2 seconds in the dictionary.json example 
feels too much to people (and the file is 18 MB), maybe restrict syntax 
highlighting to the first 1 MB of each file? At least until someone 
optimizes parse-partial-sexp to work much faster.

Or 10MB. Not too important as long as the value is separately customizable.

Anyway, I think I'd prefer no highlighting at the end of those large 
files, rather than arbitrarily incorrect one.
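
(Editorial sketch, not part of the original message: one way the "only 
highlight the beginning of the file" idea could be prototyped in Lisp, 
under the assumption that wrapping font-lock-default-fontify-region is 
enough for the major mode in question. The names 
my/font-lock-size-limit and my/fontify-region-below-limit are 
hypothetical, and the limit is counted in characters rather than bytes.)

```elisp
(defvar my/font-lock-size-limit (* 1024 1024)
  "Hypothetical limit: buffer positions at or above this are not fontified.")

(defun my/fontify-region-below-limit (beg end &optional loudly)
  "Fontify from BEG to END, but never beyond `my/font-lock-size-limit'.
Regions that start at or after the limit are left without highlighting."
  (when (< beg my/font-lock-size-limit)
    (font-lock-default-fontify-region
     beg (min end my/font-lock-size-limit) loudly)))
```

To try it in a buffer, one could evaluate 
(setq-local font-lock-fontify-region-function 
#'my/fontify-region-below-limit) there. Text after the limit then stays 
unhighlighted instead of being highlighted from an arbitrary starting 
point.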





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 13:30                                                             ` Dmitry Gutov
@ 2022-08-05 13:41                                                               ` Gregory Heytings
  2022-08-05 14:00                                                                 ` Dmitry Gutov
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-05 13:41 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, Eli Zaretskii, Stefan Monnier


>> What I'm not sure of is how useful is a "font-lock with arbitrary 
>> narrowing", where portions will be highlighted as strings rather than 
>> code (and vice-versa).  I don't have enough experience with it yet to 
>> be sure.  Taking a step back, I suspect that the only "real" solution 
>> is something like `jit-lock-defer` coupled with a way to perform the 
>> font-lock (and syntax-ppss/propertize) in the background.
>
> IIRC I heard that "some other editors" take the approach of restricting 
> syntax-highlighting to just the beginning of a large file.
>

Other editors just give up syntax highlighting altogether even with the 18 
MB file (and you cannot edit really large files with them).

But if you're so annoyed by mis-fontification, why don't you just turn 
font-lock mode off?

Also, why did you not protest vehemently when Stefan added 
syntax-wholeline-max, which also causes occasional mis-fontification?





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 11:50                                                                         ` Gregory Heytings
@ 2022-08-05 13:43                                                                           ` Eli Zaretskii
  2022-08-06 13:28                                                                           ` Eli Zaretskii
  2022-08-06 14:05                                                                           ` Eli Zaretskii
  2 siblings, 0 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-05 13:43 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: gerd.moellmann, 56682, monnier, dgutov

> Date: Fri, 05 Aug 2022 11:50:56 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: gerd.moellmann@gmail.com, 56682@debbugs.gnu.org, monnier@iro.umontreal.ca, 
>     dgutov@yandex.ru
> 
> > There's (at least) one more aspect of this, as long as Text mode is 
> > being used: Text mode doesn't force bidi-paragraph-direction to be 
> > left-to-right, whereas all descendants of prog-mode, including js-mode, 
> > do.  Leaving bidi-paragraph-direction at nil means Emacs needs to 
> > determine the base paragraph direction each time it's about to redisplay 
> > a window, and that might be expensive, especially in a large buffer 
> > without any paragraph breaks (by default, an empty line), because that 
> > is determined by the first strong directional character of the 
> > paragraph.  So for a more fair comparison with Text mode, you should set 
> > bidi-paragraph-direction to the value left-to-right in text-mode 
> > buffers.
> 
> Indeed, that seems to be the culprit here, I didn't know that text-mode 
> was an exception here.

Actually, it's the other way around: prog-mode and its descendants are
the exception.  The default value of bidi-paragraph-direction is nil,
but in prog-mode and its descendants we _know_ that program source code
is written left-to-right, even if some strings and comments could
include R2L text, so we force bidi-paragraph-direction in these modes.

> If I set bidi-paragraph-direction to 'left-to-right after visiting
> the arabic-small.txt file, Emacs (mis) behaves like it does for the
> other Arabic files: it becomes slow, and C-n C-p do not work
> correctly anymore.

OK, so there's one more subtle issue about bidi, and it definitely
affects some of these files: whether the file actually includes both
L2R and R2L characters.  If all of the characters are L2R or all of
them are R2L (the latter is the case with arabic-small.txt), and
bidi-paragraph-direction is either nil or left-to-right for L2R text
and right-to-left for R2L text, then from the display engine's POV the
text is not truly bidirectional, it actually has a single direction.
Text that has a single direction doesn't require (expensive)
reordering for display, and we have optimizations in place for such
situations, to make redisplay faster in those cases, because that is
the natural situation with human-readable text: text that is
predominantly R2L is relatively rarely mixed with L2R text and is
normally displayed in right-to-left paragraphs.

By adding the "ar" prefix to the files, you made what was a
unidirectional text be bidirectional.  The effect on display is
dramatic: the first Arabic letter will now be displayed _last_,
because the paragraph now has left-to-right direction (the first
strong character in it is 'a', a L2R character), and the entire Arabic
string that follows needs to be reversed on display.

Compare arabic-small.txt with arabic-small.json after setting
bidi-paragraph-direction to be right-to-left in the latter, and you
should see a similar performance in both.  (And watch what happens on
display when you change the value of bidi-paragraph-direction in
arabic-small.json.)
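
(Editorial sketch, not part of the original message: how the base 
paragraph direction could be forced in Text mode buffers for the 
comparison described above. Using text-mode-hook for this is an 
assumption made for illustration; the same variable can be set 
interactively in a single buffer.)

```elisp
;; Force left-to-right paragraphs in Text mode buffers, as prog-mode
;; already does, so that redisplay need not search each paragraph for
;; its first strong directional character.
(add-hook 'text-mode-hook
          (lambda ()
            (setq-local bidi-paragraph-direction 'left-to-right)))
```

For a file that is entirely R2L, such as arabic-small.txt, the value 
right-to-left would be the one that keeps the text unidirectional.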





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 13:41                                                               ` Gregory Heytings
@ 2022-08-05 14:00                                                                 ` Dmitry Gutov
  2022-08-05 14:09                                                                   ` Gregory Heytings
  0 siblings, 1 reply; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-05 14:00 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, Stefan Monnier

On 05.08.2022 16:41, Gregory Heytings wrote:
> 
>>> What I'm not sure of is how useful is a "font-lock with arbitrary 
>>> narrowing", where portions will be highlighted as strings rather than 
>>> code (and vice-versa).  I don't have enough experience with it yet to 
>>> be sure.  Taking a step back, I suspect that the only "real" solution 
>>> is something like `jit-lock-defer` coupled with a way to perform the 
>>> font-lock (and syntax-ppss/propertize) in the background.
>>
>> IIRC I heard that "some other editors" take the approach of 
>> restricting syntax-highlighting to just the beginning of a large file.
>>
> 
> Other editors just give up syntax highlighting altogether even with the 
> 18 MB file (and you cannot edit really large files with them).

In my testing, with my patch, the 18 MB file works reasonably well and 
has good syntax highlighting.

> But if you're so annoyed by mis-fontification, why don't you just turn 
> font-lock mode off?

"If you're annoyed by Emacs's performance with large files, why don't 
you just never open them?"

I like font-lock and the visual cues that come with it. Only 
font-locking the first 1 MB of a large file seems like a good 
compromise: show correct highlighting where we can with reasonable 
performance, and omit it in the rest of the file.

> Also, why did you not protest vehemently when Stefan added 
> syntax-wholeline-max, which also causes occasional mis-fontification?

I have replied to this exact question in an earlier email. We can 
continue this line of inquiry in that subthread.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 14:00                                                                 ` Dmitry Gutov
@ 2022-08-05 14:09                                                                   ` Gregory Heytings
  2022-08-05 22:38                                                                     ` Dmitry Gutov
  0 siblings, 1 reply; 416+ messages in thread
From: Gregory Heytings @ 2022-08-05 14:09 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, Eli Zaretskii, Stefan Monnier


>> But if you're so annoyed by mis-fontification, why don't you just turn 
>> font-lock mode off?
>
> "If you're annoyed by Emacs's performance with large files, why don't 
> you just never open them?"
>

That's just wrong: in one case we're talking about your personal 
feelings/preferences, in the other one about Emacs' capabilities for all 
its users.

>
> I like font-lock and the visual cues that come with it. Only 
> font-locking the first 1 MB of a large file seems like a good 
> compromise: show correct highlighting where we can with reasonable 
> performance, and omit it in the rest of the file.
>

So what you prefer IIUC would be to call fontification-functions with a 
locked narrowing to 1 MB if point is before that threshold, and to not 
call fontification-functions at all after that threshold?  That might be 
another doable approach.

>> Also, why did you not protest vehemently when Stefan added 
>> syntax-wholeline-max, which also causes occasional mis-fontification?
>
> I have replied to this exact question in an earlier email. We can 
> continue this line of inquiry in that subthread.
>

Sorry, I missed that part of your earlier post:

>
> syntax-wholeline-max can indeed potentially cause problems too, but in 
> a much narrower range of situations (and only with major modes which have 
> a non-trivial syntax-propertize-function).
>

You forget to mention that syntax-wholeline-max can in fact make things 
much worse, see the recipes I sent you a few days ago.  So it doesn't seem 
like it's the right approach.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 12:08                                                     ` Dmitry Gutov
  2022-08-05 12:20                                                       ` Gregory Heytings
@ 2022-08-05 14:16                                                       ` Eli Zaretskii
  1 sibling, 0 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-05 14:16 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, gregory, monnier

> Date: Fri, 5 Aug 2022 15:08:22 +0300
> Cc: gregory@heytings.org, 56682@debbugs.gnu.org, monnier@iro.umontreal.ca
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> > The current opinion of both the head maintainers and of Gregory is
> > that these are all parts of the same problem, and a single class of
> > solutions can solve most of them.
> 
> This kind of approach fails to optimize for the behavior in medium-sized 
> files, like the downloadify.js I showed previously.
> 
> Simply customizing long-line-threshold to a much higher value will bring 
> back redisplay stutters on C-n/C-p/etc, which *were* a real problem that 
> some of Gregory's changes solved.

You want a separate threshold for calling font-lock from redisplay?
We could discuss that.

> > The problem being that many
> > portions of Emacs code involved in navigation and redisplay don't
> > expect lines to be too long, and therefore employ algorithms that
> > don't scale well with line length.
> 
> As I demonstrated, font-lock itself doesn't have that issue.

What do you mean by "font-lock itself"?

> Furthermore, the performance problem with syntax-ppss which we are 
> talking about now doesn't have anything to do with long lines.

The facts are that it has problems in large files, but it also has
problems in not-so-large files with long lines, due to whole-line
approach.

> Go ahead and pretty-print dictionary.json (you can use 'M-x 
> json-pretty-print', write the buffer to a new file, then re-visit it). 
> There won't be any long lines in the resulting file, but 'M->' will 
> still make you wait a few seconds the first time.

No one said that the changes being discussed solve all the problems in
Emacs.

> > I get it that you disagree, but I haven't seen any real data behind
> > your dissenting opinions, and thus I don't yet see any reason to
> > reconsider changing the direction of development in this regard.
> 
> I don't understand why you dismiss the more subtle approach which still 
> seems to reach the stated goals.
> 
> Gregory's changes, along with my suggested tweak, indeed work 
> "surprisingly well" already. All without breaking font-lock in the 
> common case.

I don't think I understand what "tweak" you are suggesting.  Can you
show a patch relative to the current master?

> Like, we're going from a 255 (?) second delay to 2 second delay already 
> without breaking fontification. And yet you're eager to go from 2 
> seconds down to ~0 and sacrifice highlighting correctness?

Yes, because 2 sec (in an optimized build) is a very long time.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 12:19                                                 ` Dmitry Gutov
@ 2022-08-05 14:18                                                   ` Eli Zaretskii
  0 siblings, 0 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-05 14:18 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, gregory, monnier

> Date: Fri, 5 Aug 2022 15:19:45 +0300
> Cc: 56682@debbugs.gnu.org, Eli Zaretskii <eliz@gnu.org>,
>  monnier@iro.umontreal.ca
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> On 05.08.2022 11:23, Gregory Heytings wrote:
> > No, because the point of considering extreme cases is that they reveal 
> > on your computer what happens on other people's less powerful computers 
> > with much smaller files.
> 
> The problem with considering extreme cases, however, is that one can 
> also provide an even more extreme case no matter how we optimize our 
> implementations, which, with the chosen approach, will force us to 
> cripple or drop most features outright.

The chosen approach does the same in a 10KB file and in 100GB file, so
I don't understand what you have in mind here.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 12:59                                 ` Dmitry Gutov
@ 2022-08-05 14:20                                   ` Eli Zaretskii
  2022-08-05 14:41                                     ` Dmitry Gutov
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-05 14:20 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, gregory, monnier

> Date: Fri, 5 Aug 2022 15:59:57 +0300
> Cc: 56682@debbugs.gnu.org, gregory@heytings.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> On 01.08.2022 14:58, Eli Zaretskii wrote:
> 
> > As I wrote elsewhere, I'm okay with extending 'widen' so that it could
> > "unlock" the locked narrowing, which could then be used in major modes
> > that convince us their performance is adequate (or clearly announce in
> > their docs that they don't care about files with long lines ;-).
> 
> And to address the idea of "unlocking" the narrowing: I think I have 
> demonstrated that the remaining slowdown can be caused purely by the 
> length of the buffer and how long 'parse-partial-sexp' takes to parse 
> it.

No, you haven't demonstrated that.

> And of course the more, let's say, *complex* modes like CC Mode will opt 
> for unlocking narrowing right away because its font-lock logic has to 
> jump around to previously-saved positions in its syntax cache, which 
> will inevitably spam errors here and there when those positions are not 
> accessible.

CC Mode is extremely unlikely to be used in files with such long lines,
so what it does is largely irrelevant to this discussion.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 14:20                                   ` Eli Zaretskii
@ 2022-08-05 14:41                                     ` Dmitry Gutov
  2022-08-05 15:33                                       ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-05 14:41 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, monnier

On 05.08.2022 17:20, Eli Zaretskii wrote:
>> Date: Fri, 5 Aug 2022 15:59:57 +0300
>> Cc: 56682@debbugs.gnu.org, gregory@heytings.org
>> From: Dmitry Gutov <dgutov@yandex.ru>
>>
>> On 01.08.2022 14:58, Eli Zaretskii wrote:
>>
>>> As I wrote elsewhere, I'm okay with extending 'widen' so that it could
>>> "unlock" the locked narrowing, which could then be used in major modes
>>> that convince us their performance is adequate (or clearly announce in
>>> their docs that they don't care about files with long lines ;-).
>>
>> And to address the idea of "unlocking" the narrowing: I think I have
>> demonstrated that the remaining slowdown can be caused purely by the
>> length of the buffer and how long 'parse-partial-sexp' takes to parse
>> it.
> 
> No, you haven't demonstrated that.

Apply the patch for xdisp.c that I have sent previously (it will be at 
the end of this email too) and recompile Emacs.

Now try two different scenarios. 1,2a and 1,2b.

1. Visit dictionary.json. It will ask you whether to open such a big file 
literally, but after you answer 'y', it will display the beginning of 
the file quickly.
2a) Evaluate (benchmark 1 '(save-excursion (parse-partial-sexp 1 
(point-max)))), note the reported delay.

Kill and re-visit the file.

1. (same as before)
2b) Press M->, note the delay you see.

The delays in scenarios 1,2a and 1,2b should be ~the same. They are so 
in my testing.

Or try this scenario: 1,2a,2b. Step 2b should work instantly here.
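
Step 2a can also be wrapped in a small helper so the number is easy to
compare between runs. This is only a sketch: `my/time-full-parse' is a
made-up name, and it assumes that `benchmark-run' from benchmark.el is
an acceptable way to take the measurement.

```elisp
(require 'benchmark)

(defun my/time-full-parse ()
  "Return the seconds `parse-partial-sexp' needs to scan the accessible buffer.
`my/time-full-parse' is a hypothetical helper name, not part of Emacs."
  (save-excursion
    ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-ELAPSED); keep ELAPSED.
    (car (benchmark-run 1
           (parse-partial-sexp (point-min) (point-max))))))
```

Evaluate (my/time-full-parse) with M-: in the freshly visited buffer and
compare the result with the delay seen in step 2b.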

The patch:

diff --git a/src/xdisp.c b/src/xdisp.c
index 099efed2db..02d7f6c562 100644
--- a/src/xdisp.c
+++ b/src/xdisp.c
@@ -4391,19 +4391,19 @@ handle_fontified_prop (struct it *it)

        eassert (it->end_charpos == ZV);

-      if (current_buffer->long_line_optimizations_p)
-	{
-	  ptrdiff_t begv = it->narrowed_begv;
-	  ptrdiff_t zv = it->narrowed_zv;
-	  ptrdiff_t charpos = IT_CHARPOS (*it);
-	  if (charpos < begv || charpos > zv)
-	    {
-	      begv = get_narrowed_begv (it->w, charpos);
-	      zv = get_narrowed_zv (it->w, charpos);
-	    }
-	  narrow_to_region_internal (make_fixnum (begv), make_fixnum (zv), true);
-	  specbind (Qrestrictions_locked, Qt);
-	}
+      /* if (current_buffer->long_line_optimizations_p) */
+      /* 	{ */
+      /* 	  ptrdiff_t begv = it->narrowed_begv; */
+      /* 	  ptrdiff_t zv = it->narrowed_zv; */
+      /* 	  ptrdiff_t charpos = IT_CHARPOS (*it); */
+      /* 	  if (charpos < begv || charpos > zv) */
+      /* 	    { */
+      /* 	      begv = get_narrowed_begv (it->w, charpos); */
+      /* 	      zv = get_narrowed_zv (it->w, charpos); */
+      /* 	    } */
+      /* 	  narrow_to_region_internal (make_fixnum (begv), make_fixnum (zv), true); */
+      /* 	  specbind (Qrestrictions_locked, Qt); */
+      /* 	} */

        /* Don't allow Lisp that runs from 'fontification-functions'
  	 clear our face and image caches behind our back.  */





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 14:41                                     ` Dmitry Gutov
@ 2022-08-05 15:33                                       ` Eli Zaretskii
  2022-08-05 17:32                                         ` Dmitry Gutov
  2022-08-05 18:02                                         ` Dmitry Gutov
  0 siblings, 2 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-05 15:33 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, gregory, monnier

> Date: Fri, 5 Aug 2022 17:41:51 +0300
> Cc: monnier@iro.umontreal.ca, 56682@debbugs.gnu.org, gregory@heytings.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> >> And to address the idea of "unlocking" the narrowing: I think I have
> >> demonstrated that the remaining slowdown can be caused purely by the
> >> length of the buffer and how long 'parse-partial-sexp' takes to parse
> >> it.
> > 
> > No, you haven't demonstrated that.
> 
> Apply the patch for xdisp.c that I have sent previously (it will be at 
> the end of this email too) and recompile Emacs.
> 
> Now try two different scenarios. 1,2a and 1,2b.
> 
> 1. Visit dictionary.json. It will ask you whether to open such a big file 
> literally, but after you answer 'y', it will display the beginning of 
> the file quickly.
> 2a) Evaluate (benchmark 1 '(save-excursion (parse-partial-sexp 1 
> (point-max)))), note the reported delay.
> 
> Kill and re-visit the file.
> 
> 1. (same as before)
> 2b) Press M->, note the delay you see.
> 
> The delays in scenarios 1,2a and 1,2b should be ~the same. They are so 
> in my testing.
> 
> Or try this scenario: 1,2a,2b. Step 2b should work instantly here.

How is (parse-partial-sexp 1 (point-max)) related to the issue at
hand?





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 15:33                                       ` Eli Zaretskii
@ 2022-08-05 17:32                                         ` Dmitry Gutov
  2022-08-05 18:09                                           ` Eli Zaretskii
  2022-08-05 18:02                                         ` Dmitry Gutov
  1 sibling, 1 reply; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-05 17:32 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, monnier

On 05.08.2022 18:33, Eli Zaretskii wrote:
>> Date: Fri, 5 Aug 2022 17:41:51 +0300
>> Cc: monnier@iro.umontreal.ca, 56682@debbugs.gnu.org, gregory@heytings.org
>> From: Dmitry Gutov <dgutov@yandex.ru>
>>
>>>> I think I have
>>>> demonstrated that the remaining slowdown can be caused purely by the
>>>> length of the buffer and how long 'parse-partial-sexp' takes to parse
>>>> it.
>>>
>>> No, you haven't demonstrated that.
>>
>> Apply the patch for xdisp.c that I have sent previously (it will be at
>> the end of this email too) and recompile Emacs.
>>
>> Now try two different scenarios. 1,2a and 1,2b.
>>
>> 1. Visit dictionary.json. It will ask you whether to open such a big file
>> literally, but after you answer 'y', it will display the beginning of
>> the file quickly.
>> 2a) Evaluate (benchmark 1 '(save-excursion (parse-partial-sexp 1
>> (point-max)))), note the reported delay.
>>
>> Kill and re-visit the file.
>>
>> 1. (same as before)
>> 2b) Press M->, note the delay you see.
>>
>> The delays in scenarios 1,2a and 1,2b should be ~the same. They are so
>> in my testing.
>>
>> Or try this scenario: 1,2a,2b. Step 2b should work instantly here.
> 
> How is (parse-partial-sexp 1 (point-max)) related to the issue at
> hand?

I said:

 >>>> I think I have
 >>>> demonstrated that the remaining slowdown can be caused purely by the
 >>>> length of the buffer and how long 'parse-partial-sexp' takes to parse
 >>>> it.

You said:

 >>> No, you haven't demonstrated that.

...and now you are asking why we are talking about parse-partial-sexp?





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 15:33                                       ` Eli Zaretskii
  2022-08-05 17:32                                         ` Dmitry Gutov
@ 2022-08-05 18:02                                         ` Dmitry Gutov
  2022-08-05 18:14                                           ` Eli Zaretskii
  1 sibling, 1 reply; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-05 18:02 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, monnier

On 05.08.2022 18:33, Eli Zaretskii wrote:
>> Date: Fri, 5 Aug 2022 17:41:51 +0300
>> Cc: monnier@iro.umontreal.ca, 56682@debbugs.gnu.org, gregory@heytings.org
>> From: Dmitry Gutov <dgutov@yandex.ru>
>>
>>>> And to address the idea of "unlocking" the narrowing: I think I have
>>>> demonstrated that the remaining slowdown can be caused purely by the
>>>> length of the buffer and how long 'parse-partial-sexp' takes to parse
>>>> it.
>>> No, you haven't demonstrated that.
>> Apply the patch for xdisp.c that I have sent previously (it will be at
>> the end of this email too) and recompile Emacs.
>>
>> Now try two different scenarios. 1,2a and 1,2b.
>>
>> 1. Visit dictionary.json. It will ask you whether to open such a big file
>> literally, but after you answer 'y', it will display the beginning of
>> the file quickly.
>> 2a) Evaluate (benchmark 1 '(save-excursion (parse-partial-sexp 1
>> (point-max)))), note the reported delay.
>>
>> Kill and re-visit the file.
>>
>> 1. (same as before)
>> 2b) Press M->, note the delay you see.
>>
>> The delays in scenarios 1,2a and 1,2b should be ~the same. They are so
>> in my testing.
>>
>> Or try this scenario: 1,2a,2b. Step 2b should work instantly here.
> How is (parse-partial-sexp 1 (point-max)) related to the issue at
> hand?

Or perhaps I should answer this way:

We move to near EOB.
fontification-functions are called.

jit-lock
calls
(font-lock-fontify-region point-near-buffer-end (point-max))
which calls
font-lock-fontify-syntactically-region
which calls both
   (syntax-propertize (point-max))
   and
   (syntax-ppss point-near-buffer-end) -> and it calls parse-partial-sexp

syntax-propertize will also likely call syntax-ppss itself, probably 
through the major mode's syntax-propertize-function. But if 
syntax-propertize-function is nil, parse-partial-sexp gets called 
anyway, over the whole buffer, which makes it the main workload in 
fontifying near EOB.

Now, if syntax-propertize-function is non-nil, parse-partial-sexp will 
also call it, and it adds its overhead (sometimes a multiple of p-p-s), 
which also scales linearly with the length of the buffer.

So if one can demonstrate that (parse-partial-sexp (point-min) 
(point-max)) takes about the same time as it takes to fontify the last 
screen-ful of a buffer, then that says that everything else that 
jit-lock does to fontify, is negligible, time-wise.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 17:32                                         ` Dmitry Gutov
@ 2022-08-05 18:09                                           ` Eli Zaretskii
  0 siblings, 0 replies; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-05 18:09 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, gregory, monnier

> Date: Fri, 5 Aug 2022 20:32:32 +0300
> Cc: 56682@debbugs.gnu.org, gregory@heytings.org, monnier@iro.umontreal.ca
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> On 05.08.2022 18:33, Eli Zaretskii wrote:
> >> Date: Fri, 5 Aug 2022 17:41:51 +0300
> >> Cc: monnier@iro.umontreal.ca, 56682@debbugs.gnu.org, gregory@heytings.org
> >> From: Dmitry Gutov <dgutov@yandex.ru>
> >>
> >>>> I think I have
> >>>> demonstrated that the remaining slowdown can be caused purely by the
> >>>> length of the buffer and how long 'parse-partial-sexp' takes to parse
> >>>> it.
> >>>
> >>> No, you haven't demonstrated that.
> >>
> >> Apply the patch for xdisp.c that I have sent previously (it will be at
> >> the end of this email too) and recompile Emacs.
> >>
> >> Now try two different scenarios. 1,2a and 1,2b.
> >>
> >> 1. Visit dictionary.json. It will ask you whether to open such a big file
> >> literally, but after you answer 'y', it will display the beginning of
> >> the file quickly.
> >> 2a) Evaluate (benchmark 1 '(save-excursion (parse-partial-sexp 1
> >> (point-max)))), note the reported delay.
> >>
> >> Kill and re-visit the file.
> >>
> >> 1. (same as before)
> >> 2b) Press M->, note the delay you see.
> >>
> >> The delays in scenarios 1,2a and 1,2b should be ~the same. They are so
> >> in my testing.
> >>
> >> Or try this scenario: 1,2a,2b. Step 2b should work instantly here.
> > 
> > How is (parse-partial-sexp 1 (point-max)) related to the issue at
> > hand?
> 
> I said:
> 
>  >>>> I think I have
>  >>>> demonstrated that the remaining slowdown can be caused purely by the
>  >>>> length of the buffer and how long 'parse-partial-sexp' takes to parse
>  >>>> it.
> 
> You said:
> 
>  >>> No, you haven't demonstrated that.
> 
> ...and now you are asking why we are talking about parse-partial-sexp?

Yes, because a user who visits a file doesn't invoke
parse-partial-sexp.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 18:02                                         ` Dmitry Gutov
@ 2022-08-05 18:14                                           ` Eli Zaretskii
  2022-08-05 19:01                                             ` Dmitry Gutov
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-05 18:14 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, gregory, monnier

> Date: Fri, 5 Aug 2022 21:02:35 +0300
> Cc: 56682@debbugs.gnu.org, gregory@heytings.org, monnier@iro.umontreal.ca
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> > How is (parse-partial-sexp 1 (point-max)) related to the issue at
> > hand?
> 
> Or perhaps I should answer this way:
> 
> We move to near EOB.
> fontification-functions are called.
> 
> jit-lock
> calls
> (font-lock-fontify-region point-near-buffer-end (point-max))
> which calls
> font-lock-fontify-syntactically-region
> which calls both
>    (syntax-propertize (point-max))
>    and
>    (syntax-ppss point-near-buffer-end) -> and it calls parse-partial-sexp
> 
> syntax-propertize will also likely call syntax-ppss itself, probably 
> through the major mode's syntax-propertize-function. But if 
> syntax-propertize-function is nil, parse-partial-sexp gets called 
> anyway, over the whole buffer, which makes it the main workload in 
> fontifying near EOB.
> 
> Now, if syntax-propertize-function is non-nil, parse-partial-sexp will 
> also call it, and it adds its overhead (sometimes a multiple of p-p-s), 
> which also scales linearly with the length of the buffer.
> 
> So if one can demonstrate that (parse-partial-sexp (point-min) 
> (point-max)) takes about the same time as it takes to fontify the last 
> screen-ful of a buffer, then that says that everything else that 
> jit-lock does to fontify, is negligible, time-wise.

So you have demonstrated that, if visiting a file and moving inside it
calls parse-partial-sexp to scan the entire buffer, then this could be
some, perhaps a large, part of the slowdown.

First, we need to establish that indeed parse-partial-sexp is called
in that manner in the relevant major modes (not just one of them), or
by font-lock itself regardless of the mode.

Second, we need to establish that indeed this takes a large portion of
the time in the slow operations.  Not just one particular operation,
but most or all of them.

And after that, we may have some food for thought.





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 18:14                                           ` Eli Zaretskii
@ 2022-08-05 19:01                                             ` Dmitry Gutov
  2022-08-05 19:14                                               ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-05 19:01 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, monnier

On 05.08.2022 21:14, Eli Zaretskii wrote:
>> Date: Fri, 5 Aug 2022 21:02:35 +0300
>> Cc: 56682@debbugs.gnu.org, gregory@heytings.org, monnier@iro.umontreal.ca
>> From: Dmitry Gutov <dgutov@yandex.ru>
>>
>>> How is (parse-partial-sexp 1 (point-max)) related to the issue at
>>> hand?
>>
>> Or perhaps I should answer this way:
>>
>> We move to near EOB.
>> fontification-functions are called.
>>
>> jit-lock
>> calls
>> (font-lock-fontify-region point-near-buffer-end (point-max))
>> which calls
>> font-lock-fontify-syntactically-region
>> which calls both
>>     (syntax-propertize (point-max))
>>     and
>>     (syntax-ppss point-near-buffer-end) -> and it calls parse-partial-sexp
>>
>> syntax-propertize will also likely call syntax-ppss itself, probably
>> through the major mode's syntax-propertize-function. But if
>> syntax-propertize-function is nil, parse-partial-sexp gets called
>> anyway, over the whole buffer, which makes it the main workload in
>> fontifying near EOB.
>>
>> Now, if syntax-propertize-function is non-nil, parse-partial-sexp will
>> also call it, and it adds its overhead (sometimes a multiple of p-p-s),
>> which also scales linearly with the length of the buffer.
>>
>> So if one can demonstrate that (parse-partial-sexp (point-min)
>> (point-max)) takes about the same time as it takes to fontify the last
>> screen-ful of a buffer, then that says that everything else that
>> jit-lock does to fontify, is negligible, time-wise.
> 
> So you have demonstrated that, if visiting a file and moving inside it
> calls parse-partial-sexp to scan the entire buffer, then this could be
> some, perhaps a large, part of the slowdown.

Yes.

> First, we need to establish that indeed parse-partial-sexp is called
> in that manner in the relevant major modes (not just one of them), or
> by font-lock itself regardless of the mode.

It is called by font-lock itself, which ends up calling syntax-ppss, 
which does its job with parse-partial-sexp. I have outlined the chain of 
calls in the previous message, you can verify it by looking at the sources.

> Second, we need to establish that indeed this takes a large portion of
> the time in the slow operations.  Not just one particular operation,
> but most or all of them.

To establish that, I have described the experiment in the grandparent 
email (with scenarios 1,2a;1,2b;1,2a,2b), and performed it myself as well.

But I'm talking about the slowdown observed when doing 'M->'. Not about 
any operations one might try to perform. Having said that, after the 
initial 'M->' most navigation operations look snappy to me. So that's 
the slowdown I decided to investigate.

> And after that, we may have some food for thought.

Here's some more:

All major modes we can currently use for JSON (the built-in js-mode and 
the two json-mode's in ELPA) inherit the value of 
syntax-propertize-function from js-mode. But there's no need for it: 
JSON doesn't have division, or regexps, or preprocessor directives, or 
embedded JSX structures.

Setting syntax-propertize-function to nil speeds up parse-partial-sexp 
significantly. Here's a patch you can try to evaluate the effect on 
dictionary.json of that change combined with the previous tweak I 
suggested. Now it is about 5x faster to fontify the last screenful, 
on my machine. Meaning, 'M->' feels almost (but not quite) instant. And 
the fontification is still correct.

A "proper" change would involve creating a new major mode, probably, 
rather than regexp-matching against buffer-file-name. But I'm not sure 
what name to pick: 'json-mode' would step on the toes of two existing 
packages now. 'js-json-mode', maybe? Or we bring in json-mode from GNU 
ELPA (with a similar change).
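
For illustration, the "new major mode" option might look roughly like
the sketch below. The name `my/js-json-mode' is a placeholder (the
naming question above is still open), and the sketch assumes that
resetting `syntax-propertize-function' in the derived mode's body is
enough, since that body runs after `js-mode' has set it.

```elisp
(require 'js)

(define-derived-mode my/js-json-mode js-mode "JSON"
  "Hypothetical JSON mode derived from `js-mode'.
It drops `syntax-propertize-function', which JSON does not need."
  ;; The body of a derived mode runs after the parent mode's body,
  ;; so this overrides the value installed by `js-mode'.
  (setq-local syntax-propertize-function nil))
```

Everything else (font-lock keywords, indentation) is inherited from
js-mode unchanged in this sketch.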

Anyway, try this please:

diff --git a/lisp/progmodes/js.el b/lisp/progmodes/js.el
index eb2a1e4fcc..ae8e980125 100644
--- a/lisp/progmodes/js.el
+++ b/lisp/progmodes/js.el
@@ -3418,7 +3418,8 @@ js-mode
                (list js--font-lock-keywords nil nil nil nil
                      '(font-lock-syntactic-face-function
                        . js-font-lock-syntactic-face-function)))
-  (setq-local syntax-propertize-function #'js-syntax-propertize)
+  (unless (and buffer-file-name (string-match-p "\\.json\\'" buffer-file-name))
+    (setq-local syntax-propertize-function #'js-syntax-propertize))
    (add-hook 'syntax-propertize-extend-region-functions
              #'syntax-propertize-multiline 'append 'local)
    (add-hook 'syntax-propertize-extend-region-functions
diff --git a/src/xdisp.c b/src/xdisp.c
index 099efed2db..fcb2be8768 100644
--- a/src/xdisp.c
+++ b/src/xdisp.c
@@ -4391,20 +4391,6 @@ handle_fontified_prop (struct it *it)

        eassert (it->end_charpos == ZV);

-      if (current_buffer->long_line_optimizations_p)
-	{
-	  ptrdiff_t begv = it->narrowed_begv;
-	  ptrdiff_t zv = it->narrowed_zv;
-	  ptrdiff_t charpos = IT_CHARPOS (*it);
-	  if (charpos < begv || charpos > zv)
-	    {
-	      begv = get_narrowed_begv (it->w, charpos);
-	      zv = get_narrowed_zv (it->w, charpos);
-	    }
-	  narrow_to_region_internal (make_fixnum (begv), make_fixnum (zv), true);
-	  specbind (Qrestrictions_locked, Qt);
-	}
-
        /* Don't allow Lisp that runs from 'fontification-functions'
  	 clear our face and image caches behind our back.  */
        it->f->inhibit_clear_image_cache = true;





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 19:01                                             ` Dmitry Gutov
@ 2022-08-05 19:14                                               ` Eli Zaretskii
  2022-08-05 20:23                                                 ` Dmitry Gutov
  0 siblings, 1 reply; 416+ messages in thread
From: Eli Zaretskii @ 2022-08-05 19:14 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 56682, gregory, monnier

> Date: Fri, 5 Aug 2022 22:01:24 +0300
> Cc: 56682@debbugs.gnu.org, gregory@heytings.org, monnier@iro.umontreal.ca
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> > First, we need to establish that indeed parse-partial-sexp is called
> > in that manner in the relevant major modes (not just one of them), or
> > by font-lock itself regardless of the mode.
> 
> It is called by font-lock itself, which ends up calling syntax-ppss, 
> which does its job with parse-partial-sexp.

In all modes, regardless of what the mode wants to highlight?

> > Second, we need to establish that indeed this takes a large portion of
> > the time in the slow operations.  Not just one particular operation,
> > but most or all of them.
> 
> To establish that, I have described the experiment in the grandparent 
> email (with scenarios 1,2a;1,2b;1,2a,2b), and performed it myself as well.
> 
> But I'm talking about the slowdown observed when doing 'M->'. Not about 
> any operations one might try to perform. Having said that, after the 
> initial 'M->' most navigation operations look snappy to me. So that's 
> the slowdown I decided to investigate.

We need to look at more than just M->.  C-n/C-p, C-v/M-v, C-l are also
important, as are the time it takes from typing M-x or M-: until you
see the prompt in the minibuffer, and the time to update the display
after inserting or deleting a single character.

> All major modes we can currently use for JSON (the built-in js-mode and 
> the two json-mode's in ELPA) inherit the value of 
> syntax-propertize-function from js-mode. But there's no need for it: 
> JSON doesn't have division, or regexps, or preprocessor directives, or 
> embedded JSX structures.
> 
> Setting syntax-propertize-function to nil speeds up parse-partial-sexp 
> significantly.

Thanks, so I guess we may have a solution for JSON files, if disabling
syntax-propertize-function doesn't have any downsides.  What about
other modes that we see in files with long lines, like XML?

And how scalable is the solution you propose, i.e. how does it behave
in JSON files with much longer lines?





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 19:14                                               ` Eli Zaretskii
@ 2022-08-05 20:23                                                 ` Dmitry Gutov
  2022-08-06  6:07                                                   ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-05 20:23 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 56682, gregory, monnier

On 05.08.2022 22:14, Eli Zaretskii wrote:
>> Date: Fri, 5 Aug 2022 22:01:24 +0300
>> Cc: 56682@debbugs.gnu.org, gregory@heytings.org, monnier@iro.umontreal.ca
>> From: Dmitry Gutov <dgutov@yandex.ru>
>>
>>> First, we need to establish that indeed parse-partial-sexp is called
>>> in that manner in the relevant major modes (not just one of them), or
>>> by font-lock itself regardless of the mode.
>>
>> It is called by font-lock itself, which ends up calling syntax-ppss,
>> which does its job with parse-partial-sexp.
> 
> In all modes, regardless of what the mode wants to highlight?

Yes. As part of highlighting strings, for instance. It needs to know 
which intervals of characters are strings. And you can't really know 
that without scanning the buffer from the beginning.
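
A minimal sketch of that dependency, using a made-up helper name
(`my/string-state-at') and the standard syntax table, where the double
quote is a string delimiter:

```elisp
(defun my/string-state-at (pos)
  "Return non-nil if POS is inside a string, scanning from `point-min'.
`my/string-state-at' is a hypothetical helper, for illustration only."
  (save-excursion
    ;; Element 3 of the state returned by `parse-partial-sexp' is
    ;; non-nil when the scan ends inside a string.
    (nth 3 (parse-partial-sexp (point-min) pos))))
```

The answer for a given position depends on every quote character before
it: one extra or missing quote near the beginning of the buffer flips
the result for all later positions, which is why the scan cannot simply
start near the window.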

>>> Second, we need to establish that indeed this takes a large portion of
>>> the time in the slow operations.  Not just one particular operation,
>>> but most or all of them.
>>
>> To establish that, I have described the experiment in the grandparent
>> email (with scenarios 1,2a;1,2b;1,2a,2b), and performed it myself as well.
>>
>> But I'm talking about the slowdown observed when doing 'M->'. Not about
>> any operations one might try to perform. Having said that, after the
>> initial 'M->' most navigation operations look snappy to me. So that's
>> the slowdown I decided to investigate.
> 
> We need to look at more than just M->.  C-n/C-p, C-v/M-v, C-l are also
> important, as are the time it takes from typing M-x or M-: until you
> see the prompt in the minibuffer, and the time to update the display
> after inserting or deleting a single character.

I'm not seeing any particular sluggishness in these operations when 
visiting dictionary.json.

>> All major modes we can currently use for JSON (the built-in js-mode and
>> the two json-mode's in ELPA) inherit the value of
>> syntax-propertize-function from js-mode. But there's no need for it:
>> JSON doesn't have division, or regexps, or preprocessor directives, or
>> embedded JSX structures.
>>
>> Setting syntax-propertize-function to nil speeds up parse-partial-sexp
>> significantly.
> 
> Thanks, so I guess we may have a solution for JSON files, if disabling
> syntax-propertize-function doesn't have any downsides.  What about
> other modes that we see in files with long lines, like XML?

Someone will need to test it with some typical large file. xml-mode 
(alias to nxml-mode) does have a syntax-propertize-function, but it's 
probably faster than js-syntax-propertize.

> And how scalable is the solution you propose, i.e. how does it behave
> in JSON files with much longer lines?

parse-partial-sexp is O(length of text span)

Meaning, it scales linearly. You'll see a 10x delay in a JSON file that 
is 10x as large.
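
One rough way to check the scaling without a real file is to time the
parse on generated buffers of different sizes. This is a sketch only:
`my/parse-time-for-entries' is a made-up helper, and the generated text
is a crude stand-in for something like dictionary.json.

```elisp
(require 'benchmark)

(defun my/parse-time-for-entries (n)
  "Return seconds to `parse-partial-sexp' a buffer of N JSON-like entries.
Hypothetical helper; the generated text is a stand-in for a real file."
  (with-temp-buffer
    ;; Build one long line of key/value pairs.
    (dotimes (i n)
      (insert (format "\"word%d\": \"definition %d\", " i i)))
    ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-ELAPSED).
    (car (benchmark-run 1
           (parse-partial-sexp (point-min) (point-max))))))
```

Comparing, say, (my/parse-time-for-entries 100000) with
(my/parse-time-for-entries 1000000) gives an idea of how the delay grows
with buffer size on a given machine.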





^ permalink raw reply	[flat|nested] 416+ messages in thread

* bug#56682: Fix the long lines font locking related slowdowns
  2022-08-05 14:09                                                                   ` Gregory Heytings
@ 2022-08-05 22:38                                                                     ` Dmitry Gutov
  2022-08-06  7:28                                                                       ` Eli Zaretskii
  0 siblings, 1 reply; 416+ messages in thread
From: Dmitry Gutov @ 2022-08-05 22:38 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 56682, Eli Zaretskii, Stefan Monnier

On 05.08.2022 17:09, Gregory Heytings wrote:
> 
>>> But if you're so annoyed by mis-fontification, why don't you just 
>>> turn font-lock mode off?
>>
>> "If you're annoyed by Emacs's performance with large files, why don't 
>> you just never open them?"
>>
> 
> That's just wrong: in one case we're talking about your personal 
> feelings/preferences, in the other one about Emacs' capabilities for all 
> its users.

Being able to edit a medium-size file with correct syntax highlighting 
without redisplay-related stuttering is also a capability.

>> I like font-lock and the visual cues that come with it. Only 
>> font-locking the first 1 MB of a large file seems like a good 
>> compromise: show correct highlighting where we can with reasonable 
>> performance, and omit it in the rest of the file.
>>
> 
> So what you prefer IIUC would be to call fontification-functions with a 
> locked narrowing to 1 MB if point is before that threshold, and to not 
> call fontification-functions at all after that threshold?  That might be 
> another doable approach.

If we have to support huge files with max responsiveness, then that 
would be my preference, yes.

I don't see the point of using a "locked" narrowing for this, though. 
Maybe not even a narrowing at all: just avoid calling 
fontification-functions with START > value_of(large_file_fontification_max).

Or even implement that limitation in font-lock itself (in Lisp). Not 
sure which place is better. Depends on how we want 
'font-lock-fontify-region' to behave: would it still fontify a region 
when asked, or would it abort when BEG is too high?  I'd prefer the 
former approach since it gives more power to programmers, if we don't 
find any significant pitfalls with it.
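
In Lisp, the former approach could be sketched as below. All names
starting with `my/' are made up, the 1 MB default just mirrors the
figure discussed above, and the cutoff uses `<' rather than `<=' so that
an empty region is never passed on. `font-lock-fontify-region' itself is
left untouched, so explicit callers can still fontify any region.

```elisp
(defvar my/large-file-fontification-max 1000000
  "Hypothetical position beyond which nothing is fontified automatically.")

(defun my/clamp-region (beg end max)
  "Return (BEG . END) clipped to MAX, or nil if BEG is at or beyond MAX."
  (when (< beg max)
    (cons beg (min end max))))

(defun my/jit-fontify (beg end)
  "Fontify BEG..END via font-lock, but only below the limit.
Meant to be called from jit-lock; `font-lock-fontify-region' itself
stays unrestricted for explicit callers."
  (let ((region (my/clamp-region beg end my/large-file-fontification-max)))
    (when region
      (font-lock-fontify-region (car region) (cdr region)))))
```

Presumably such a function would be registered with jit-lock in place of
font-lock's own entry (e.g. via `jit-lock-register'), but that wiring is
an untested assumption here.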

>>> Also, why did you not protest vehemently when Stefan added 
>>> syntax-wholeline-max, which also causes occasional mis-fontification?
>>
>> I have replied to this exact question in an earlier email. We can 
>> continue this line of inquiry in that subthread.
>>
> 
> Sorry, I missed that part of your earlier post:
> 
>>
>> syntax-wholeline-max indeed can potentially cause problems too, but 
&g