all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
@ 2023-10-24 14:22 Dominik Honnef
  2023-10-24 23:15 ` Dmitry Gutov
  0 siblings, 1 reply; 58+ messages in thread
From: Dominik Honnef @ 2023-10-24 14:22 UTC (permalink / raw)
  To: 66732

Steps to reproduce:

1. Start emacs -q
2. Clear the scratch buffer
3. Switch to c-ts-mode
4. Type the following:

    /* foo
    bar
    baz
    */

Notice the following:

1. tree-sitter will not parse the buffer contents as a comment until the
closing comment marker is typed (not an issue per se.)

2. When you type the closing comment marker, only that line will be
fontified.

3. Moving to the previous line will refontify the whole comment.

Similarly, removing the closing comment marker will not instantly
refontify the buffer, either.

The expected behavior would be for the whole comment to be refontified
immediately after typing the closing comment marker.

Other buffer contents often lead to even worse refontifying, where even
motion will not fix things.





^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-10-24 14:22 bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably Dominik Honnef
@ 2023-10-24 23:15 ` Dmitry Gutov
  2023-10-29 12:22   ` Eli Zaretskii
  0 siblings, 1 reply; 58+ messages in thread
From: Dmitry Gutov @ 2023-10-24 23:15 UTC (permalink / raw)
  To: Dominik Honnef, 66732

Hi Dominik,

On 24/10/2023 17:22, Dominik Honnef wrote:
> Steps to reproduce:
> 
> 1. Start emacs -q
> 2. Clear the scratch buffer
> 3. Switch to c-ts-mode
> 4. Type the following:
> 
>      /* foo
>      bar
>      baz
>      */
> 
> Notice the following:
> 
> 1. tree-sitter will not parse the buffer contents as a comment until the
> closing comment marker is typed (not an issue per se.)
> 
> 2. When you type the closing comment marker, only that line will be
> fontified.
> 
> 3. Moving to the previous line will refontify the whole comment.
> 
> Similarly, removing the closing comment marker will not instantly
> refontify the buffer, either.
> 
> The expected behavior would be for the whole comment to be refontified
> immediately after typing the closing comment marker.
> 
> Other buffer contents often lead to even worse refontifying, where even
> motion will not fix things.

Can repro.

Only with 'emacs -Q', though (note to others who will try).





^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-10-24 23:15 ` Dmitry Gutov
@ 2023-10-29 12:22   ` Eli Zaretskii
  2023-11-18  8:37     ` Eli Zaretskii
  0 siblings, 1 reply; 58+ messages in thread
From: Eli Zaretskii @ 2023-10-29 12:22 UTC (permalink / raw)
  To: Dmitry Gutov, Yuan Fu; +Cc: 66732, dominik

> Date: Wed, 25 Oct 2023 02:15:30 +0300
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> Hi Dominik,
> 
> On 24/10/2023 17:22, Dominik Honnef wrote:
> > Steps to reproduce:
> > 
> > 1. Start emacs -q
> > 2. Clear the scratch buffer
> > 3. Switch to c-ts-mode
> > 4. Type the following:
> > 
> >      /* foo
> >      bar
> >      baz
> >      */
> > 
> > Notice the following:
> > 
> > 1. tree-sitter will not parse the buffer contents as a comment until the
> > closing comment marker is typed (not an issue per se.)
> > 
> > 2. When you type the closing comment marker, only that line will be
> > fontified.
> > 
> > 3. Moving to the previous line will refontify the whole comment.
> > 
> > Similarly, removing the closing comment marker will not instantly
> > refontify the buffer, either.
> > 
> > The expected behavior would be for the whole comment to be refontified
> > immediately after typing the closing comment marker.
> > 
> > Other buffer contents often lead to even worse refontifying, where even
> > motion will not fix things.
> 
> Can repro.
> 
> Only with 'emacs -Q', though (note to others who will try).

Yuan, any ideas or comments?





^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-10-29 12:22   ` Eli Zaretskii
@ 2023-11-18  8:37     ` Eli Zaretskii
  2023-12-11  4:16       ` Yuan Fu
  0 siblings, 1 reply; 58+ messages in thread
From: Eli Zaretskii @ 2023-11-18  8:37 UTC (permalink / raw)
  To: casouri; +Cc: dmitry, 66732, dominik

Ping!  Yuan, I'd appreciate if you chimed in on this issue.

> Cc: 66732@debbugs.gnu.org, dominik@honnef.co
> Date: Sun, 29 Oct 2023 14:22:30 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> 
> > Date: Wed, 25 Oct 2023 02:15:30 +0300
> > From: Dmitry Gutov <dmitry@gutov.dev>
> > 
> > Hi Dominik,
> > 
> > On 24/10/2023 17:22, Dominik Honnef wrote:
> > > Steps to reproduce:
> > > 
> > > 1. Start emacs -q
> > > 2. Clear the scratch buffer
> > > 3. Switch to c-ts-mode
> > > 4. Type the following:
> > > 
> > >      /* foo
> > >      bar
> > >      baz
> > >      */
> > > 
> > > Notice the following:
> > > 
> > > 1. tree-sitter will not parse the buffer contents as a comment until the
> > > closing comment marker is typed (not an issue per se.)
> > > 
> > > 2. When you type the closing comment marker, only that line will be
> > > fontified.
> > > 
> > > 3. Moving to the previous line will refontify the whole comment.
> > > 
> > > Similarly, removing the closing comment marker will not instantly
> > > refontify the buffer, either.
> > > 
> > > The expected behavior would be for the whole comment to be refontified
> > > immediately after typing the closing comment marker.
> > > 
> > > Other buffer contents often lead to even worse refontifying, where even
> > > motion will not fix things.
> > 
> > Can repro.
> > 
> > Only with 'emacs -Q', though (note to others who will try).
> 
> Yuan, any ideas or comments?
> 
> 
> 
> 





^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-11-18  8:37     ` Eli Zaretskii
@ 2023-12-11  4:16       ` Yuan Fu
  2023-12-11 12:05         ` Eli Zaretskii
  2023-12-11 15:53         ` Dmitry Gutov
  0 siblings, 2 replies; 58+ messages in thread
From: Yuan Fu @ 2023-12-11  4:16 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dmitry, 66732, dominik



On 11/18/23 12:37 AM, Eli Zaretskii wrote:
> Ping!  Yuan, I'd appreciate if you chimed in on this issue.
>
>> Cc: 66732@debbugs.gnu.org, dominik@honnef.co
>> Date: Sun, 29 Oct 2023 14:22:30 +0200
>> From: Eli Zaretskii <eliz@gnu.org>
>>
>>> Date: Wed, 25 Oct 2023 02:15:30 +0300
>>> From: Dmitry Gutov <dmitry@gutov.dev>
>>>
>>> Hi Dominik,
>>>
>>> On 24/10/2023 17:22, Dominik Honnef wrote:
>>>> Steps to reproduce:
>>>>
>>>> 1. Start emacs -q
>>>> 2. Clear the scratch buffer
>>>> 3. Switch to c-ts-mode
>>>> 4. Type the following:
>>>>
>>>>       /* foo
>>>>       bar
>>>>       baz
>>>>       */
>>>>
>>>> Notice the following:
>>>>
>>>> 1. tree-sitter will not parse the buffer contents as a comment until the
>>>> closing comment marker is typed (not an issue per se.)
>>>>
>>>> 2. When you type the closing comment marker, only that line will be
>>>> fontified.
>>>>
>>>> 3. Moving to the previous line will refontify the whole comment.
>>>>
>>>> Similarly, removing the closing comment marker will not instantly
>>>> refontify the buffer, either.
>>>>
>>>> The expected behavior would be for the whole comment to be refontified
>>>> immediately after typing the closing comment marker.
>>>>
>>>> Other buffer contents often lead to even worse refontifying, where even
>>>> motion will not fix things.
>>> Can repro.
>>>
>>> Only with 'emacs -Q', though (note to others who will try).
>> Yuan, any ideas or comments?
>>
Some background: When buffer content changes, it might invalidate 
already-fontified region, as shown in the example: Typing the "*/" 
completes the block comment and should cause the comment to be 
refontified in comment-face.

The way we achieves this is thought parser notifiers. When tree-sitter 
parser reparses, it notifies us of which part of the buffer was affected 
by the reparse. For font-lock, we have this font-lock notifier that 
marks the affected buffer region as "not fontified", so redisplay will 
refontify those areas.

(defun treesit--font-lock-notifier (ranges parser)
   "Ensures updated parts of the parse-tree are refontified.
RANGES is a list of (BEG . END) ranges, PARSER is the tree-sitter
parser notifying of the change."
   (with-current-buffer (treesit-parser-buffer parser)
     (dolist (range ranges)
       (when treesit--font-lock-verbose
         (message "Notifier received range: %s-%s"
                  (car range) (cdr range)))
       (with-silent-modifications
         (put-text-property (car range) (cdr range) 'fontified nil)))))

This notifier function will be called during redisplay [1]. I suspect 
that because of this timing, redisplay doesn't refontify the marked 
region immediately. So I added a timer, I think that ensures we mark the 
affected region in the next command loop?

(defun treesit--font-lock-notifier (ranges parser)
   "Ensures updated parts of the parse-tree are refontified.
RANGES is a list of (BEG . END) ranges, PARSER is the tree-sitter
parser notifying of the change."
   (with-current-buffer (treesit-parser-buffer parser)
     (dolist (range ranges)
       (when treesit--font-lock-verbose
         (message "Notifier received range: %s-%s"
                  (car range) (cdr range)))
       (run-with-timer
        0 nil
        (lambda ()
          (with-silent-modifications
            (put-text-property (car range) (cdr range)
                               'fontified nil)))))))

This seems to work. Eli, do you see any problem using run-with-timer 
this way? What's the correct way to mark some region unfontified?

[1] The chain of events if roughly: user types the last "/" -> redisplay 
-> fontify that character -> access parser -> parser reparses -> calls 
notifier.

Yuan





^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-11  4:16       ` Yuan Fu
@ 2023-12-11 12:05         ` Eli Zaretskii
  2023-12-11 14:35           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-12-11 15:53         ` Dmitry Gutov
  1 sibling, 1 reply; 58+ messages in thread
From: Eli Zaretskii @ 2023-12-11 12:05 UTC (permalink / raw)
  To: Yuan Fu, Stefan Monnier; +Cc: dmitry, 66732, dominik

> Date: Sun, 10 Dec 2023 20:16:37 -0800
> Cc: dmitry@gutov.dev, 66732@debbugs.gnu.org, dominik@honnef.co
> From: Yuan Fu <casouri@gmail.com>
> 
> (defun treesit--font-lock-notifier (ranges parser)
>    "Ensures updated parts of the parse-tree are refontified.
> RANGES is a list of (BEG . END) ranges, PARSER is the tree-sitter
> parser notifying of the change."
>    (with-current-buffer (treesit-parser-buffer parser)
>      (dolist (range ranges)
>        (when treesit--font-lock-verbose
>          (message "Notifier received range: %s-%s"
>                   (car range) (cdr range)))
>        (with-silent-modifications
>          (put-text-property (car range) (cdr range) 'fontified nil)))))
> 
> This notifier function will be called during redisplay [1]. I suspect 
> that because of this timing, redisplay doesn't refontify the marked 
> region immediately. So I added a timer, I think that ensures we mark the 
> affected region in the next command loop?
> 
> (defun treesit--font-lock-notifier (ranges parser)
>    "Ensures updated parts of the parse-tree are refontified.
> RANGES is a list of (BEG . END) ranges, PARSER is the tree-sitter
> parser notifying of the change."
>    (with-current-buffer (treesit-parser-buffer parser)
>      (dolist (range ranges)
>        (when treesit--font-lock-verbose
>          (message "Notifier received range: %s-%s"
>                   (car range) (cdr range)))
>        (run-with-timer
>         0 nil
>         (lambda ()
>           (with-silent-modifications
>             (put-text-property (car range) (cdr range)
>                                'fontified nil)))))))
> 
> This seems to work. Eli, do you see any problem using run-with-timer 
> this way?

Why do you ask? is there something in this function that bothers you?

What's the correct way to mark some region unfontified?

> [1] The chain of events if roughly: user types the last "/" -> redisplay 
> -> fontify that character -> access parser -> parser reparses -> calls 
> notifier.

The way to mark region of text as not fontified is to set its
'fontified' property nil.

Instead of a timer, did you try to use pre-redisplay-function?

I added Stefan, in case he has some suggestions and ideas.





^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-11 12:05         ` Eli Zaretskii
@ 2023-12-11 14:35           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 58+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-12-11 14:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dmitry, Yuan Fu, 66732, dominik

>> [1] The chain of events if roughly: user types the last "/" -> redisplay 
>> -> fontify that character -> access parser -> parser reparses -> calls 
>> notifier.
>
> The way to mark region of text as not fontified is to set its
> 'fontified' property nil.

Indeed, but if you do it from within jit-lock it will not work as
expected because jit-lock will mark the chunk just processed with
`fontified` == t.

> Instead of a timer, did you try to use pre-redisplay-function?

Anything that runs "later" (i.e. after we exit `jit-lock-fontify-now`)
will work, but `pre-redisplay-function` means the display will only be
updated at the next redisplay, whereas with a timer we also force
a redisplay, thus shortening the time during which we display an
incompletely fontified chunk.

Jit-lock itself uses a similar timer for similar purposes.

We could make it slightly cheaper by exporting the `pending_funcalls` to
ELisp and then using it instead of a 0s timer.



        Stefan






^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-11  4:16       ` Yuan Fu
  2023-12-11 12:05         ` Eli Zaretskii
@ 2023-12-11 15:53         ` Dmitry Gutov
  2023-12-12  7:50           ` Yuan Fu
  1 sibling, 1 reply; 58+ messages in thread
From: Dmitry Gutov @ 2023-12-11 15:53 UTC (permalink / raw)
  To: Yuan Fu, Eli Zaretskii, Stefan Monnier; +Cc: 66732, dominik

[-- Attachment #1: Type: text/plain, Size: 3458 bytes --]

On 11/12/2023 06:16, Yuan Fu wrote:
> The way we achieves this is thought parser notifiers. When tree-sitter 
> parser reparses, it notifies us of which part of the buffer was affected 
> by the reparse. For font-lock, we have this font-lock notifier that 
> marks the affected buffer region as "not fontified", so redisplay will 
> refontify those areas.
> 
> (defun treesit--font-lock-notifier (ranges parser)
>    "Ensures updated parts of the parse-tree are refontified.
> RANGES is a list of (BEG . END) ranges, PARSER is the tree-sitter
> parser notifying of the change."
>    (with-current-buffer (treesit-parser-buffer parser)
>      (dolist (range ranges)
>        (when treesit--font-lock-verbose
>          (message "Notifier received range: %s-%s"
>                   (car range) (cdr range)))
>        (with-silent-modifications
>          (put-text-property (car range) (cdr range) 'fontified nil)))))
> 
> This notifier function will be called during redisplay [1]. I suspect 
> that because of this timing, redisplay doesn't refontify the marked 
> region immediately. So I added a timer, I think that ensures we mark the 
> affected region in the next command loop?
> 
> (defun treesit--font-lock-notifier (ranges parser)
>    "Ensures updated parts of the parse-tree are refontified.
> RANGES is a list of (BEG . END) ranges, PARSER is the tree-sitter
> parser notifying of the change."
>    (with-current-buffer (treesit-parser-buffer parser)
>      (dolist (range ranges)
>        (when treesit--font-lock-verbose
>          (message "Notifier received range: %s-%s"
>                   (car range) (cdr range)))
>        (run-with-timer
>         0 nil
>         (lambda ()
>           (with-silent-modifications
>             (put-text-property (car range) (cdr range)
>                                'fontified nil)))))))
> 
> This seems to work. Eli, do you see any problem using run-with-timer 
> this way? What's the correct way to mark some region unfontified?
> 
> [1] The chain of events if roughly: user types the last "/" -> redisplay 
> -> fontify that character -> access parser -> parser reparses -> calls 
> notifier.

Note that it's not just font-lock. syntax-propertize has the same 
problem (I've described it in https://debbugs.gnu.org/67262#23).

And a timer wouldn't help because syntax-ppss needs to have up-to-date 
information whenever it's called, not later.

Here's a draft solution based on *-extend-region-functions, attached.

Alas, while it works fine in python-ts-mode (for both syntax and 
font-lock), making it behave better than python-mode, in c-ts-mode it 
doesn't quite have the same effect: when you backspace over the closing 
"/", the highlighting is properly updated only after you make the next 
edit (any edit), or select another window. I'm not sure, though, if it's 
due to my own problems with Emacs's failure to redisplay (reported 
elsewhere), so more testing is welcome.

But that might also be related to the use of 
c-ts-mode--emacs-set-ranges: printing a backtrace calls inside 
treesit--font-lock-notifier shows that the last notification comes also 
during font-lock but after treesit--font-lock-extend-region, inside 
c-ts-mode--emacs-set-ranges. I don't quite understand this design where 
the ranges are applied inside the font-lock code.

[-- Attachment #2: treesit--notifier-context.diff --]
[-- Type: text/x-patch, Size: 3485 bytes --]

diff --git a/lisp/treesit.el b/lisp/treesit.el
index 962a6fc3cf8..f5d30be52bb 100644
--- a/lisp/treesit.el
+++ b/lisp/treesit.el
@@ -1075,17 +1075,56 @@ treesit-font-lock-fontify-region
                                face (treesit-node-type node))))))))))))
   `(jit-lock-bounds ,start . ,end))
 
+(defvar treesit--notifier-ranges nil)
+(defvar treesit--notifier-context 'direct)
+
 (defun treesit--font-lock-notifier (ranges parser)
   "Ensures updated parts of the parse-tree are refontified.
+And re-syntax-propertized.
 RANGES is a list of (BEG . END) ranges, PARSER is the tree-sitter
 parser notifying of the change."
   (with-current-buffer (treesit-parser-buffer parser)
-    (dolist (range ranges)
-      (when treesit--font-lock-verbose
-        (message "Notifier received range: %s-%s"
-                 (car range) (cdr range)))
-      (with-silent-modifications
-        (put-text-property (car range) (cdr range) 'fontified nil)))))
+    (unless (eq treesit--notifier-context 'direct)
+      (setq treesit--notifier-ranges (append treesit--notifier-ranges ranges)))
+    (message "notifier context %s ranges %s" treesit--notifier-context ranges)
+    (when ranges
+      (unless (eq treesit--notifier-context 'font-lock)
+        (when treesit-font-lock-settings
+          (with-silent-modifications
+            (dolist (range ranges)
+              (when treesit--font-lock-verbose
+                (message "Notifier received range: %s-%s"
+                         (car range) (cdr range)))
+              ;; (backtrace)
+              (put-text-property (car range) (cdr range) 'fontified nil)))))
+      (unless (eq treesit--notifier-context 'syntax)
+        (when syntax-propertize-function
+          (syntax-ppss-flush-cache (cl-loop for r in ranges
+                                            minimize (car r))))))))
+
+(defun treesit--syntax-extend-region (beg end &optional font-lock-p)
+  (let (treesit--notifier-ranges
+        (treesit--notifier-context (if font-lock-p 'font-lock 'syntax)))
+    (treesit-buffer-root-node (treesit-language-at (point)))
+    (when treesit--notifier-ranges
+      ;; (message "some ranges received %S while beg end %s %s" treesit--notifier-ranges beg end)
+      ;; Some updates received from `treesit_ensure_parsed'.
+      (cl-loop for range in treesit--notifier-ranges
+               if (or (<= (car range) beg (cdr range))
+                      (<= (car range) end (cdr range)))
+               return (cons (min (car range) beg)
+                            (max (cdr range) end))))))
+
+(defun treesit--font-lock-extend-region ()
+  (defvar font-lock-beg)
+  (defvar font-lock-end)
+  ;; (message "called treesit--font-lock-extend-region")
+  (let ((new (treesit--syntax-extend-region font-lock-beg font-lock-end t)))
+    (when new
+      ;; (message "eeextended to %S" new)
+      (setq font-lock-beg (car new))
+      (setq font-lock-end (cdr new))
+      t)))
 
 ;;; Indent
 
@@ -2391,7 +2430,9 @@ treesit-major-mode-setup
     (treesit-font-lock-recompute-features)
     (dolist (parser (treesit-parser-list))
       (treesit-parser-add-notifier
-       parser #'treesit--font-lock-notifier)))
+       parser #'treesit--font-lock-notifier))
+    (push #'treesit--font-lock-extend-region font-lock-extend-region-functions)
+    (push #'treesit--syntax-extend-region syntax-propertize-extend-region-functions))
   ;; Indent.
   (when treesit-simple-indent-rules
     (setq-local treesit-simple-indent-rules

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-11 15:53         ` Dmitry Gutov
@ 2023-12-12  7:50           ` Yuan Fu
  2023-12-12 12:43             ` Dmitry Gutov
  2023-12-12 15:34             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 2 replies; 58+ messages in thread
From: Yuan Fu @ 2023-12-12  7:50 UTC (permalink / raw)
  To: Dmitry Gutov, Eli Zaretskii, Stefan Monnier; +Cc: 66732, dominik



On 12/11/23 7:53 AM, Dmitry Gutov wrote:
> On 11/12/2023 06:16, Yuan Fu wrote:
>> The way we achieves this is thought parser notifiers. When 
>> tree-sitter parser reparses, it notifies us of which part of the 
>> buffer was affected by the reparse. For font-lock, we have this 
>> font-lock notifier that marks the affected buffer region as "not 
>> fontified", so redisplay will refontify those areas.
>>
>> (defun treesit--font-lock-notifier (ranges parser)
>>    "Ensures updated parts of the parse-tree are refontified.
>> RANGES is a list of (BEG . END) ranges, PARSER is the tree-sitter
>> parser notifying of the change."
>>    (with-current-buffer (treesit-parser-buffer parser)
>>      (dolist (range ranges)
>>        (when treesit--font-lock-verbose
>>          (message "Notifier received range: %s-%s"
>>                   (car range) (cdr range)))
>>        (with-silent-modifications
>>          (put-text-property (car range) (cdr range) 'fontified nil)))))
>>
>> This notifier function will be called during redisplay [1]. I suspect 
>> that because of this timing, redisplay doesn't refontify the marked 
>> region immediately. So I added a timer, I think that ensures we mark 
>> the affected region in the next command loop?
>>
>> (defun treesit--font-lock-notifier (ranges parser)
>>    "Ensures updated parts of the parse-tree are refontified.
>> RANGES is a list of (BEG . END) ranges, PARSER is the tree-sitter
>> parser notifying of the change."
>>    (with-current-buffer (treesit-parser-buffer parser)
>>      (dolist (range ranges)
>>        (when treesit--font-lock-verbose
>>          (message "Notifier received range: %s-%s"
>>                   (car range) (cdr range)))
>>        (run-with-timer
>>         0 nil
>>         (lambda ()
>>           (with-silent-modifications
>>             (put-text-property (car range) (cdr range)
>>                                'fontified nil)))))))
>>
>> This seems to work. Eli, do you see any problem using run-with-timer 
>> this way? What's the correct way to mark some region unfontified?
>>
>> [1] The chain of events if roughly: user types the last "/" -> 
>> redisplay -> fontify that character -> access parser -> parser 
>> reparses -> calls notifier.
>
> Note that it's not just font-lock. syntax-propertize has the same 
> problem (I've described it in https://debbugs.gnu.org/67262#23).
>
> And a timer wouldn't help because syntax-ppss needs to have up-to-date 
> information whenever it's called, not later.
>

I feel that font-lock and syntax-ppss have different problems and 
requires different solutions. For font-lock, we need to mark affected 
range unfontified; we just need to make sure we do that after 
jit-lock-fontify-now. For syntax-ppss, we need to force a reparse before 
entering syntax-ppss so that syntax text properties are up-to-date when 
syntax-ppss do its work.
> Here's a draft solution based on *-extend-region-functions, attached.
>
> Alas, while it works fine in python-ts-mode (for both syntax and 
> font-lock), making it behave better than python-mode, in c-ts-mode it 
> doesn't quite have the same effect: when you backspace over the 
> closing "/", the highlighting is properly updated only after you make 
> the next edit (any edit), or select another window. I'm not sure, 
> though, if it's due to my own problems with Emacs's failure to 
> redisplay (reported elsewhere), so more testing is welcome.
>
> But that might also be related to the use of 
> c-ts-mode--emacs-set-ranges: printing a backtrace calls inside 
> treesit--font-lock-notifier shows that the last notification comes 
> also during font-lock but after treesit--font-lock-extend-region, 
> inside c-ts-mode--emacs-set-ranges. I don't quite understand this 
> design where the ranges are applied inside the font-lock code.
c-ts-mode--emacs-set-ranges is registered as a range rule, so many 
tree-sitter function calls it before doing anything to make sure range 
is up-to-date. treesit-font-lock-fontify-region calls 
treesit-update-ranges at the beginning of its body, and 
treesit-update-ranges calls c-ts-mode--emacs-set-ranges.

Yuan





^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-12  7:50           ` Yuan Fu
@ 2023-12-12 12:43             ` Dmitry Gutov
  2023-12-13  3:28               ` Yuan Fu
  2023-12-12 15:34             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 58+ messages in thread
From: Dmitry Gutov @ 2023-12-12 12:43 UTC (permalink / raw)
  To: Yuan Fu, Eli Zaretskii, Stefan Monnier; +Cc: 66732, dominik

On 12/12/2023 09:50, Yuan Fu wrote:

> I feel that font-lock and syntax-ppss have different problems and 
> requires different solutions. For font-lock, we need to mark affected 
> range unfontified; we just need to make sure we do that after 
> jit-lock-fontify-now. For syntax-ppss, we need to force a reparse before 
> entering syntax-ppss so that syntax text properties are up-to-date when 
> syntax-ppss do its work.

font-lock *could* have a different solution, but I don't see why any 
solution that works for syntax-ppss properly, shouldn't work just as 
well for font-lock. And if it's possible - why not do that, saving on 
variations in code. We also save on having to handle two differently 
callbacks from the parser.

The proposed alternative is not without problems. For example, that 
unfontifying a region inside a timer, you trigger a second fontification 
rather than having the job finished in one go. Possibly introducing a 
extra "blink" as well. That is a minor inconvenience by itself, but 
ideally we'd avoid it.

>> Here's a draft solution based on *-extend-region-functions, attached.
>>
>> Alas, while it works fine in python-ts-mode (for both syntax and 
>> font-lock), making it behave better than python-mode, in c-ts-mode it 
>> doesn't quite have the same effect: when you backspace over the 
>> closing "/", the highlighting is properly updated only after you make 
>> the next edit (any edit), or select another window. I'm not sure, 
>> though, if it's due to my own problems with Emacs's failure to 
>> redisplay (reported elsewhere), so more testing is welcome.
>>
>> But that might also be related to the use of 
>> c-ts-mode--emacs-set-ranges: printing a backtrace calls inside 
>> treesit--font-lock-notifier shows that the last notification comes 
>> also during font-lock but after treesit--font-lock-extend-region, 
>> inside c-ts-mode--emacs-set-ranges. I don't quite understand this 
>> design where the ranges are applied inside the font-lock code.
> c-ts-mode--emacs-set-ranges is registered as a range rule, so many 
> tree-sitter function calls it before doing anything to make sure range 
> is up-to-date. treesit-font-lock-fontify-region calls 
> treesit-update-ranges at the beginning of its body, and 
> treesit-update-ranges calls c-ts-mode--emacs-set-ranges.

That seems to mean that any feature accessing the parse tree should call 
treesit-update-ranges first. Including syntax-propertize-functions and 
*-extend-region-functions, which we currently don't do.

But which boundaries is it supposed to use? Should 
treesit--syntax-extend-region call treesit-update-ranges, when wait for 
the parser updates, then possibly call treesit-update-ranges again on 
the extended boundaries, and so on, until the parser stops sending 
notifications? Will it stop?

Anyway, could you try my patch? Like I said, I'm not sure if the 
insufficient fontification I'm observing in c-ts-mode is due to the 
problem with the solution, or due to the other redisplay-related 
problems on my system.





^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-12  7:50           ` Yuan Fu
  2023-12-12 12:43             ` Dmitry Gutov
@ 2023-12-12 15:34             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 0 replies; 58+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-12-12 15:34 UTC (permalink / raw)
  To: Yuan Fu; +Cc: Dmitry Gutov, Eli Zaretskii, 66732, dominik

> different solutions. For font-lock, we need to mark affected range
> unfontified; we just need to make sure we do that after
> jit-lock-fontify-now.

I think what Dmitry is saying is that it would be even better to do that do that
*before* `jit-lock-fontify-now`.


        Stefan






^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-12 12:43             ` Dmitry Gutov
@ 2023-12-13  3:28               ` Yuan Fu
  2023-12-13  3:45                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-12-14  1:43                 ` Dmitry Gutov
  0 siblings, 2 replies; 58+ messages in thread
From: Yuan Fu @ 2023-12-13  3:28 UTC (permalink / raw)
  To: Dmitry Gutov, Eli Zaretskii, Stefan Monnier; +Cc: 66732, dominik



On 12/12/23 4:43 AM, Dmitry Gutov wrote:
> On 12/12/2023 09:50, Yuan Fu wrote:
>
>> I feel that font-lock and syntax-ppss have different problems and 
>> requires different solutions. For font-lock, we need to mark affected 
>> range unfontified; we just need to make sure we do that after 
>> jit-lock-fontify-now. For syntax-ppss, we need to force a reparse 
>> before entering syntax-ppss so that syntax text properties are 
>> up-to-date when syntax-ppss do its work.
>
> font-lock *could* have a different solution, but I don't see why any 
> solution that works for syntax-ppss properly, shouldn't work just as 
> well for font-lock. And if it's possible - why not do that, saving on 
> variations in code. We also save on having to handle two differently 
> callbacks from the parser.
>
> The proposed alternative is not without problems. For example, that 
> unfontifying a region inside a timer, you trigger a second 
> fontification rather than having the job finished in one go. Possibly 
> introducing a extra "blink" as well. That is a minor inconvenience by 
> itself, but ideally we'd avoid it.

I've come around it, right, extending the to-be-fontified region before 
jit-lock runs is also a solution (and is probably better than fontifying 
twice).

>>> Here's a draft solution based on *-extend-region-functions, attached.
>>>
>>> Alas, while it works fine in python-ts-mode (for both syntax and 
>>> font-lock), making it behave better than python-mode, in c-ts-mode 
>>> it doesn't quite have the same effect: when you backspace over the 
>>> closing "/", the highlighting is properly updated only after you 
>>> make the next edit (any edit), or select another window. I'm not 
>>> sure, though, if it's due to my own problems with Emacs's failure to 
>>> redisplay (reported elsewhere), so more testing is welcome.
>>>
>>> But that might also be related to the use of 
>>> c-ts-mode--emacs-set-ranges: printing a backtrace calls inside 
>>> treesit--font-lock-notifier shows that the last notification comes 
>>> also during font-lock but after treesit--font-lock-extend-region, 
>>> inside c-ts-mode--emacs-set-ranges. I don't quite understand this 
>>> design where the ranges are applied inside the font-lock code.
>> c-ts-mode--emacs-set-ranges is registered as a range rule, so many 
>> tree-sitter function calls it before doing anything to make sure 
>> range is up-to-date. treesit-font-lock-fontify-region calls 
>> treesit-update-ranges at the beginning of its body, and 
>> treesit-update-ranges calls c-ts-mode--emacs-set-ranges.
>
> That seems to mean that any feature accessing the parse tree should 
> call treesit-update-ranges first. Including 
> syntax-propertize-functions and *-extend-region-functions, which we 
> currently don't do.
>
> But which boundaries is it supposed to use? Should 
> treesit--syntax-extend-region call treesit-update-ranges, when wait 
> for the parser updates, then possibly call treesit-update-ranges again 
> on the extended boundaries, and so on, until the parser stops sending 
> notifications? Will it stop?

It'll stop. If the ranges don't change, no reparse will happen. And only 
buffer content change can cause range change. Reparse itself can't. So 
at most you'll see reparse -> update ranges -> reparse.

> Anyway, could you try my patch? Like I said, I'm not sure if the 
> insufficient fontification I'm observing in c-ts-mode is due to the 
> problem with the solution, or due to the other redisplay-related 
> problems on my system.

Yeah, it doesn't solve the problem in c-ts-mode regarding block comments.

We might need to run (progn (force-parse) (update-ranges) (force-parse)) 
before jit-lock-fontify-now and sytax-ppss.

Yuan





^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-13  3:28               ` Yuan Fu
@ 2023-12-13  3:45                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-12-13  7:12                   ` Yuan Fu
  2023-12-14  1:43                 ` Dmitry Gutov
  1 sibling, 1 reply; 58+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-12-13  3:45 UTC (permalink / raw)
  To: Yuan Fu; +Cc: Dmitry Gutov, Eli Zaretskii, 66732, dominik

>> Anyway, could you try my patch? Like I said, I'm not sure if the
>> insufficient fontification I'm observing in c-ts-mode is due to the
>> problem with the solution, or due to the other redisplay-related problems
>> on my system.
>
> Yeah, it doesn't solve the problem in c-ts-mode regarding block comments.

I must admit I don't understand what the problem may be here.

> We might need to run (progn (force-parse) (update-ranges) (force-parse))
> before jit-lock-fontify-now and sytax-ppss.

Why would we need that?


        Stefan






^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-13  3:45                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-12-13  7:12                   ` Yuan Fu
  2023-12-13 14:30                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 58+ messages in thread
From: Yuan Fu @ 2023-12-13  7:12 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Dmitry Gutov, Eli Zaretskii, 66732, dominik

[-- Attachment #1: Type: text/plain, Size: 586 bytes --]



> On Dec 12, 2023, at 7:45 PM, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> 
>>> Anyway, could you try my patch? Like I said, I'm not sure if the
>>> insufficient fontification I'm observing in c-ts-mode is due to the
>>> problem with the solution, or due to the other redisplay-related problems
>>> on my system.
>> 
>> Yeah, it doesn't solve the problem in c-ts-mode regarding block comments.
> 
> I must admit I don't understand what the problem may be here.

The problem is, when you type /* RET foo RET */ the block comment is not fontified in comment face


[-- Attachment #2: Screenshot 2023-12-12 at 10.57.22 PM.png --]
[-- Type: image/png, Size: 9654 bytes --]

[-- Attachment #3: Type: text/plain, Size: 1231 bytes --]



That’s because before the user types the final “/“, there isn’t a complete comment node in the parse tree. When the user types the “/“, we need to mark the whole block comment for refontification.

We actually have that, when the parser reparses, it’ll also compute the affected region, the region that changed during the last reparse, and it’ll call the “notifiers” with that region. We install a font-lock-notifier, which simply sets fontified text property to nil in that region, so redisplay would call jit-lock to fontify that region.

In our example, the region would be the block comment.

>> We might need to run (progn (force-parse) (update-ranges) (force-parse))
>> before jit-lock-fontify-now and sytax-ppss.
> 
> Why would we need that?

The first force-parse is just to make the parser reparse the latest buffer content, then we need to update ranges of any embedded code, then we would want to make the parser for the embedded language to reparse, if their range has changed. After this process, all the parsers has the up-to-date parse tree, and more importantly have called their notifiers, those notifiers should apply fontified text prop, and apply any syntax text prop.

Yuan

^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-13  7:12                   ` Yuan Fu
@ 2023-12-13 14:30                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 58+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-12-13 14:30 UTC (permalink / raw)
  To: Yuan Fu; +Cc: Dmitry Gutov, Eli Zaretskii, 66732, dominik

> The problem is, when you type /* RET foo RET */ the block comment is not fontified in comment face
>
> x
>
>
>
> That’s because before the user types the final “/“, there isn’t a complete
> comment node in the parse tree. When the user types the “/“, we need to mark
> the whole block comment for refontification.
>
> We actually have that, when the parser reparses, it’ll also compute the
> affected region, the region that changed during the last reparse, and it’ll
> call the “notifiers” with that region. We install a font-lock-notifier,
> which simply sets fontified text property to nil in that region, so
> redisplay would call jit-lock to fontify that region.
>
> In our example, the region would be the block comment.

Thanks.

If the reparse is done during `jit-lock` then, depending on how we do
it, it may indeed require re-running jit-lock, tho not necessarily.
E.g. if the reparse+notification is used to extend-region, then
fontification will already have been done and doesn't need to
be repeated.
OTOH the rendering will have already been done with the pre-reparse
state of the buffer for the beginning of the buffer (i.e. before the
"*/"), so we need to re-render (i.e. re-run the redisplay) for that part
of the buffer.  But that should not need any re-parse nor re-fontification.

If your `font-lock-extend-region-function` returns correct bounds,
jit-lock should take care of re-rendering as needed.

If the reparse (and the corresponding setting of `fontified`) is done
before redisplay, then there should be no need to do any double work at
all.  So maybe reparsing from a `pre-redisplay-functions` is an even
better option.  The downside is that it doesn't save you from needing to
handle reparses from the `font-lock-extend-region-function`, in order to
handle the cases where the redisplay wasn't involved.


        Stefan






^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-13  3:28               ` Yuan Fu
  2023-12-13  3:45                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-12-14  1:43                 ` Dmitry Gutov
  2023-12-14  8:29                   ` Yuan Fu
  1 sibling, 1 reply; 58+ messages in thread
From: Dmitry Gutov @ 2023-12-14  1:43 UTC (permalink / raw)
  To: Yuan Fu, Eli Zaretskii, Stefan Monnier; +Cc: 66732, dominik

On 13/12/2023 05:28, Yuan Fu wrote:
>>> c-ts-mode--emacs-set-ranges is registered as a range rule, so many 
>>> tree-sitter function calls it before doing anything to make sure 
>>> range is up-to-date. treesit-font-lock-fontify-region calls 
>>> treesit-update-ranges at the beginning of its body, and 
>>> treesit-update-ranges calls c-ts-mode--emacs-set-ranges.
>>
>> That seems to mean that any feature accessing the parse tree should 
>> call treesit-update-ranges first. Including 
>> syntax-propertize-functions and *-extend-region-functions, which we 
>> currently don't do.
>>
>> But which boundaries is it supposed to use? Should 
>> treesit--syntax-extend-region call treesit-update-ranges, when wait 
>> for the parser updates, then possibly call treesit-update-ranges again 
>> on the extended boundaries, and so on, until the parser stops sending 
>> notifications? Will it stop?
> 
> It'll stop. If the ranges don't change, no reparse will happen. And only 
> buffer content change can cause range change. Reparse itself can't. So 
> at most you'll see reparse -> update ranges -> reparse.
> 
>> Anyway, could you try my patch? Like I said, I'm not sure if the 
>> insufficient fontification I'm observing in c-ts-mode is due to the 
>> problem with the solution, or due to the other redisplay-related 
>> problems on my system.
> 
> Yeah, it doesn't solve the problem in c-ts-mode regarding block comments.
> 
> We might need to run (progn (force-parse) (update-ranges) (force-parse)) 
> before jit-lock-fontify-now and sytax-ppss.

Well... I've replaced

   (treesit-buffer-root-node (treesit-language-at (point)))

in treesit--syntax-extend-region with

   (treesit-buffer-root-node (treesit-language-at (point)))
   (treesit-update-ranges beg end)
   (treesit-buffer-root-node (treesit-language-at (point)))

and even tried commenting out the call to 'treesit-update-ranges' inside 
treesit-font-lock-fontify-region, but the result looks unchanged. And 
the region does get extended, according to my print-debugging inside 
font-lock-default-fontify-region.

Could you check if you're seeing the same?





^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-14  1:43                 ` Dmitry Gutov
@ 2023-12-14  8:29                   ` Yuan Fu
  2023-12-15  1:01                     ` Dmitry Gutov
  0 siblings, 1 reply; 58+ messages in thread
From: Yuan Fu @ 2023-12-14  8:29 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Eli Zaretskii, 66732, Stefan Monnier, dominik



> On Dec 13, 2023, at 5:43 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
> 
> On 13/12/2023 05:28, Yuan Fu wrote:
>>>> c-ts-mode--emacs-set-ranges is registered as a range rule, so many tree-sitter function calls it before doing anything to make sure range is up-to-date. treesit-font-lock-fontify-region calls treesit-update-ranges at the beginning of its body, and treesit-update-ranges calls c-ts-mode--emacs-set-ranges.
>>> 
>>> That seems to mean that any feature accessing the parse tree should call treesit-update-ranges first. Including syntax-propertize-functions and *-extend-region-functions, which we currently don't do.
>>> 
>>> But which boundaries is it supposed to use? Should treesit--syntax-extend-region call treesit-update-ranges, when wait for the parser updates, then possibly call treesit-update-ranges again on the extended boundaries, and so on, until the parser stops sending notifications? Will it stop?
>> It'll stop. If the ranges don't change, no reparse will happen. And only buffer content change can cause range change. Reparse itself can't. So at most you'll see reparse -> update ranges -> reparse.
>>> Anyway, could you try my patch? Like I said, I'm not sure if the insufficient fontification I'm observing in c-ts-mode is due to the problem with the solution, or due to the other redisplay-related problems on my system.
>> Yeah, it doesn't solve the problem in c-ts-mode regarding block comments.
>> We might need to run (progn (force-parse) (update-ranges) (force-parse)) before jit-lock-fontify-now and sytax-ppss.
> 
> Well... I've replaced
> 
>  (treesit-buffer-root-node (treesit-language-at (point)))
> 
> in treesit--syntax-extend-region with
> 
>  (treesit-buffer-root-node (treesit-language-at (point)))
>  (treesit-update-ranges beg end)
>  (treesit-buffer-root-node (treesit-language-at (point)))
> 
> and even tried commenting out the call to 'treesit-update-ranges' inside treesit-font-lock-fontify-region, but the result looks unchanged. And the region does get extended, according to my print-debugging inside font-lock-default-fontify-region.
> 
> Could you check if you're seeing the same?

It should be

(treesit-buffer-root-node 'emacs-c)
(treesit-update-ranges)
(treesit-buffer-root-node ‘c)

I tried, and it doesn’t make a difference. But I must admit I’m not very clear on what your patch tries to do. Does it try to extend the to-be-fontified region before jit-lock runs?

Yuan






^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-14  8:29                   ` Yuan Fu
@ 2023-12-15  1:01                     ` Dmitry Gutov
  2023-12-15  7:12                       ` Yuan Fu
  0 siblings, 1 reply; 58+ messages in thread
From: Dmitry Gutov @ 2023-12-15  1:01 UTC (permalink / raw)
  To: Yuan Fu; +Cc: Eli Zaretskii, 66732, Stefan Monnier, dominik

On 14/12/2023 10:29, Yuan Fu wrote:
> 
> 
>> On Dec 13, 2023, at 5:43 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
>>
>> On 13/12/2023 05:28, Yuan Fu wrote:
>>>>> c-ts-mode--emacs-set-ranges is registered as a range rule, so many tree-sitter function calls it before doing anything to make sure range is up-to-date. treesit-font-lock-fontify-region calls treesit-update-ranges at the beginning of its body, and treesit-update-ranges calls c-ts-mode--emacs-set-ranges.
>>>>
>>>> That seems to mean that any feature accessing the parse tree should call treesit-update-ranges first. Including syntax-propertize-functions and *-extend-region-functions, which we currently don't do.
>>>>
>>>> But which boundaries is it supposed to use? Should treesit--syntax-extend-region call treesit-update-ranges, when wait for the parser updates, then possibly call treesit-update-ranges again on the extended boundaries, and so on, until the parser stops sending notifications? Will it stop?
>>> It'll stop. If the ranges don't change, no reparse will happen. And only buffer content change can cause range change. Reparse itself can't. So at most you'll see reparse -> update ranges -> reparse.
>>>> Anyway, could you try my patch? Like I said, I'm not sure if the insufficient fontification I'm observing in c-ts-mode is due to the problem with the solution, or due to the other redisplay-related problems on my system.
>>> Yeah, it doesn't solve the problem in c-ts-mode regarding block comments.
>>> We might need to run (progn (force-parse) (update-ranges) (force-parse)) before jit-lock-fontify-now and sytax-ppss.
>>
>> Well... I've replaced
>>
>>   (treesit-buffer-root-node (treesit-language-at (point)))
>>
>> in treesit--syntax-extend-region with
>>
>>   (treesit-buffer-root-node (treesit-language-at (point)))
>>   (treesit-update-ranges beg end)
>>   (treesit-buffer-root-node (treesit-language-at (point)))
>>
>> and even tried commenting out the call to 'treesit-update-ranges' inside treesit-font-lock-fontify-region, but the result looks unchanged. And the region does get extended, according to my print-debugging inside font-lock-default-fontify-region.
>>
>> Could you check if you're seeing the same?
> 
> It should be
> 
> (treesit-buffer-root-node 'emacs-c)
> (treesit-update-ranges)
> (treesit-buffer-root-node ‘c)

This might be difficult for treesit--syntax-extend-region to do, since 
it's supposed to work across modes. But I suppose it could iterate 
through (mapcar #'treesit-parser-language (treesit-parser-list)).

> I tried, and it doesn’t make a difference. But I must admit I’m not very clear on what your patch tries to do. Does it try to extend the to-be-fontified region before jit-lock runs?

Yes, it extends the region to-be-fontified. If you set 
treesit--font-lock-verbose to t, you should see the appropriate messages 
"Fontifying region: x-y" in the message log where x is the position 
before the start of the comment (printed by 
treesit-font-lock-fontify-region).





^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-15  1:01                     ` Dmitry Gutov
@ 2023-12-15  7:12                       ` Yuan Fu
  2023-12-16  5:56                         ` Yuan Fu
  0 siblings, 1 reply; 58+ messages in thread
From: Yuan Fu @ 2023-12-15  7:12 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Eli Zaretskii, 66732, Stefan Monnier, dominik



> On Dec 14, 2023, at 5:01 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
> 
> On 14/12/2023 10:29, Yuan Fu wrote:
>>> On Dec 13, 2023, at 5:43 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
>>> 
>>> On 13/12/2023 05:28, Yuan Fu wrote:
>>>>>> c-ts-mode--emacs-set-ranges is registered as a range rule, so many tree-sitter function calls it before doing anything to make sure range is up-to-date. treesit-font-lock-fontify-region calls treesit-update-ranges at the beginning of its body, and treesit-update-ranges calls c-ts-mode--emacs-set-ranges.
>>>>> 
>>>>> That seems to mean that any feature accessing the parse tree should call treesit-update-ranges first. Including syntax-propertize-functions and *-extend-region-functions, which we currently don't do.
>>>>> 
>>>>> But which boundaries is it supposed to use? Should treesit--syntax-extend-region call treesit-update-ranges, when wait for the parser updates, then possibly call treesit-update-ranges again on the extended boundaries, and so on, until the parser stops sending notifications? Will it stop?
>>>> It'll stop. If the ranges don't change, no reparse will happen. And only buffer content change can cause range change. Reparse itself can't. So at most you'll see reparse -> update ranges -> reparse.
>>>>> Anyway, could you try my patch? Like I said, I'm not sure if the insufficient fontification I'm observing in c-ts-mode is due to the problem with the solution, or due to the other redisplay-related problems on my system.
>>>> Yeah, it doesn't solve the problem in c-ts-mode regarding block comments.
>>>> We might need to run (progn (force-parse) (update-ranges) (force-parse)) before jit-lock-fontify-now and sytax-ppss.
>>> 
>>> Well... I've replaced
>>> 
>>>  (treesit-buffer-root-node (treesit-language-at (point)))
>>> 
>>> in treesit--syntax-extend-region with
>>> 
>>>  (treesit-buffer-root-node (treesit-language-at (point)))
>>>  (treesit-update-ranges beg end)
>>>  (treesit-buffer-root-node (treesit-language-at (point)))
>>> 
>>> and even tried commenting out the call to 'treesit-update-ranges' inside treesit-font-lock-fontify-region, but the result looks unchanged. And the region does get extended, according to my print-debugging inside font-lock-default-fontify-region.
>>> 
>>> Could you check if you're seeing the same?
>> It should be
>> (treesit-buffer-root-node 'emacs-c)
>> (treesit-update-ranges)
>> (treesit-buffer-root-node ‘c)
> 
> This might be difficult for treesit--syntax-extend-region to do, since it's supposed to work across modes. But I suppose it could iterate through (mapcar #'treesit-parser-language (treesit-parser-list)).

Actually, only calling (treesit-update-ranges) should be enough (and is generic), the range update function surely would access the parser, which would force the reparse, and the update function will access the parser in the correct order: host language first, then embedded languages.

> 
>> I tried, and it doesn’t make a difference. But I must admit I’m not very clear on what your patch tries to do. Does it try to extend the to-be-fontified region before jit-lock runs?
> 
> Yes, it extends the region to-be-fontified. If you set treesit--font-lock-verbose to t, you should see the appropriate messages "Fontifying region: x-y" in the message log where x is the position before the start of the comment (printed by treesit-font-lock-fontify-region).

Hmm, let me have a closer look tomorrow. It can’t be that difficult, we must have missed something.

Yuan






^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-15  7:12                       ` Yuan Fu
@ 2023-12-16  5:56                         ` Yuan Fu
  2023-12-16 15:22                           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-12-16 22:56                           ` Dmitry Gutov
  0 siblings, 2 replies; 58+ messages in thread
From: Yuan Fu @ 2023-12-16  5:56 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Eli Zaretskii, 66732, Stefan Monnier, dominik

[-- Attachment #1: Type: text/plain, Size: 4348 bytes --]



> On Dec 14, 2023, at 11:12 PM, Yuan Fu <casouri@gmail.com> wrote:
> 
> 
> 
>> On Dec 14, 2023, at 5:01 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
>> 
>> On 14/12/2023 10:29, Yuan Fu wrote:
>>>> On Dec 13, 2023, at 5:43 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
>>>> 
>>>> On 13/12/2023 05:28, Yuan Fu wrote:
>>>>>>> c-ts-mode--emacs-set-ranges is registered as a range rule, so many tree-sitter function calls it before doing anything to make sure range is up-to-date. treesit-font-lock-fontify-region calls treesit-update-ranges at the beginning of its body, and treesit-update-ranges calls c-ts-mode--emacs-set-ranges.
>>>>>> 
>>>>>> That seems to mean that any feature accessing the parse tree should call treesit-update-ranges first. Including syntax-propertize-functions and *-extend-region-functions, which we currently don't do.
>>>>>> 
>>>>>> But which boundaries is it supposed to use? Should treesit--syntax-extend-region call treesit-update-ranges, when wait for the parser updates, then possibly call treesit-update-ranges again on the extended boundaries, and so on, until the parser stops sending notifications? Will it stop?
>>>>> It'll stop. If the ranges don't change, no reparse will happen. And only buffer content change can cause range change. Reparse itself can't. So at most you'll see reparse -> update ranges -> reparse.
>>>>>> Anyway, could you try my patch? Like I said, I'm not sure if the insufficient fontification I'm observing in c-ts-mode is due to the problem with the solution, or due to the other redisplay-related problems on my system.
>>>>> Yeah, it doesn't solve the problem in c-ts-mode regarding block comments.
>>>>> We might need to run (progn (force-parse) (update-ranges) (force-parse)) before jit-lock-fontify-now and sytax-ppss.
>>>> 
>>>> Well... I've replaced
>>>> 
>>>> (treesit-buffer-root-node (treesit-language-at (point)))
>>>> 
>>>> in treesit--syntax-extend-region with
>>>> 
>>>> (treesit-buffer-root-node (treesit-language-at (point)))
>>>> (treesit-update-ranges beg end)
>>>> (treesit-buffer-root-node (treesit-language-at (point)))
>>>> 
>>>> and even tried commenting out the call to 'treesit-update-ranges' inside treesit-font-lock-fontify-region, but the result looks unchanged. And the region does get extended, according to my print-debugging inside font-lock-default-fontify-region.
>>>> 
>>>> Could you check if you're seeing the same?
>>> It should be
>>> (treesit-buffer-root-node 'emacs-c)
>>> (treesit-update-ranges)
>>> (treesit-buffer-root-node ‘c)
>> 
>> This might be difficult for treesit--syntax-extend-region to do, since it's supposed to work across modes. But I suppose it could iterate through (mapcar #'treesit-parser-language (treesit-parser-list)).
> 
> Actually, only calling (treesit-update-ranges) should be enough (and is generic), the range update function surely would access the parser, which would force the reparse, and the update function will access the parser in the correct order: host language first, then embedded languages.
> 
>> 
>>> I tried, and it doesn’t make a difference. But I must admit I’m not very clear on what your patch tries to do. Does it try to extend the to-be-fontified region before jit-lock runs?
>> 
>> Yes, it extends the region to-be-fontified. If you set treesit--font-lock-verbose to t, you should see the appropriate messages "Fontifying region: x-y" in the message log where x is the position before the start of the comment (printed by treesit-font-lock-fontify-region).
> 
> Hmm, let me have a closer look tomorrow. It can’t be that difficult, we must have missed something.

Please see this patch, which basically does what you did in your patch. I just started from scratch when experimenting. I didn’t do the context thing, that’s a bit confusing for me. I save updated ranges in a local hash table and access it in extend-region functions to extend the region. Ideally treesit should have a function that returns the updated ranges for a parser, we don’t have that yet, so I’m using the hash table to simulate it.

I found that if you don’t set text prop fontified to nil for the whole extended region, redisplay doesn’t seem to, well, redisplay the full region, even thought the new face has been correctly applied to them.

Yuan


[-- Attachment #2: extend-region.diff --]
[-- Type: application/octet-stream, Size: 2696 bytes --]

diff --git a/lisp/treesit.el b/lisp/treesit.el
index 8a07f5023a9..a6ca6334133 100644
--- a/lisp/treesit.el
+++ b/lisp/treesit.el
@@ -1080,13 +1080,39 @@ treesit--font-lock-notifier
   "Ensures updated parts of the parse-tree are refontified.
 RANGES is a list of (BEG . END) ranges, PARSER is the tree-sitter
 parser notifying of the change."
-  (with-current-buffer (treesit-parser-buffer parser)
-    (dolist (range ranges)
-      (when treesit--font-lock-verbose
-        (message "Notifier received range: %s-%s"
-                 (car range) (cdr range)))
-      (with-silent-modifications
-        (put-text-property (car range) (cdr range) 'fontified nil)))))
+  ;; (with-current-buffer (treesit-parser-buffer parser)
+  ;;   (dolist (range ranges)
+  ;;     (when treesit--font-lock-verbose
+  ;;       (message "Notifier received range: %s-%s"
+  ;;                (car range) (cdr range)))
+  ;;     (with-silent-modifications
+  ;;       (put-text-property (car range) (cdr range) 'fontified nil))))
+  )
+
+(defun treesit--font-lock-extend-region ()
+  (treesit-update-ranges)
+  (dolist (parser (treesit-parser-list nil nil t))
+    (treesit-parser-root-node parser))
+  (let ((old-beg font-lock-beg)
+        (old-end font-lock-end))
+    (dolist (parser (treesit-parser-list nil nil t))
+      (when-let ((table treesit--parser-updated-ranges)
+                 (ranges (gethash parser table)))
+        (dolist (range ranges)
+          (setq font-lock-beg (max (min font-lock-beg (car range)) (point-min)))
+          (setq font-lock-end (min (max font-lock-end (cdr range)) (point-max))))))
+    (put-text-property font-lock-beg font-lock-end 'fontified nil)
+    (message "extend region: %s-%s %s-%s this-command: %s" old-beg old-end
+             font-lock-beg font-lock-end this-command)))
+
+(defvar-local treesit--parser-updated-ranges nil)
+(defun treesit--general-notifier (ranges parser)
+  "Record that PARSER's updated ranges is RANGES."
+  (message "notifier range: %s %s" parser ranges)
+  (unless treesit--parser-updated-ranges
+    (setq treesit--parser-updated-ranges
+          (make-hash-table :weakness 'key)))
+  (puthash parser ranges treesit--parser-updated-ranges))
 
 ;;; Indent
 
@@ -2422,7 +2448,13 @@ treesit-major-mode-setup
   ;; Imenu.
   (when treesit-simple-imenu-settings
     (setq-local imenu-create-index-function
-                #'treesit-simple-imenu)))
+                #'treesit-simple-imenu))
+
+  (dolist (parser (treesit-parser-list))
+    (treesit-parser-add-notifier
+     parser #'treesit--general-notifier))
+  (add-hook 'font-lock-extend-region-functions
+            #'treesit--font-lock-extend-region nil t))
 
 ;;; Debugging
 

[-- Attachment #3: Type: text/plain, Size: 2 bytes --]




^ permalink raw reply related	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-16  5:56                         ` Yuan Fu
@ 2023-12-16 15:22                           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-12-16 17:11                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-12-16 22:56                           ` Dmitry Gutov
  1 sibling, 1 reply; 58+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-12-16 15:22 UTC (permalink / raw)
  To: Yuan Fu; +Cc: Dmitry Gutov, Eli Zaretskii, 66732, dominik

> diff --git a/lisp/treesit.el b/lisp/treesit.el
> index 8a07f5023a9..a6ca6334133 100644
> --- a/lisp/treesit.el
> +++ b/lisp/treesit.el
> @@ -1080,13 +1080,39 @@ treesit--font-lock-notifier
>    "Ensures updated parts of the parse-tree are refontified.
>  RANGES is a list of (BEG . END) ranges, PARSER is the tree-sitter
>  parser notifying of the change."
> -  (with-current-buffer (treesit-parser-buffer parser)
> -    (dolist (range ranges)
> -      (when treesit--font-lock-verbose
> -        (message "Notifier received range: %s-%s"
> -                 (car range) (cdr range)))
> -      (with-silent-modifications
> -        (put-text-property (car range) (cdr range) 'fontified nil)))))
> +  ;; (with-current-buffer (treesit-parser-buffer parser)
> +  ;;   (dolist (range ranges)
> +  ;;     (when treesit--font-lock-verbose
> +  ;;       (message "Notifier received range: %s-%s"
> +  ;;                (car range) (cdr range)))
> +  ;;     (with-silent-modifications
> +  ;;       (put-text-property (car range) (cdr range) 'fontified nil))))
> +  )
> +
> +(defun treesit--font-lock-extend-region ()
> +  (treesit-update-ranges)
> +  (dolist (parser (treesit-parser-list nil nil t))
> +    (treesit-parser-root-node parser))
> +  (let ((old-beg font-lock-beg)
> +        (old-end font-lock-end))
> +    (dolist (parser (treesit-parser-list nil nil t))
> +      (when-let ((table treesit--parser-updated-ranges)
> +                 (ranges (gethash parser table)))
> +        (dolist (range ranges)
> +          (setq font-lock-beg (max (min font-lock-beg (car range)) (point-min)))
> +          (setq font-lock-end (min (max font-lock-end (cdr range)) (point-max))))))
> +    (put-text-property font-lock-beg font-lock-end 'fontified nil)
> +    (message "extend region: %s-%s %s-%s this-command: %s" old-beg old-end
> +             font-lock-beg font-lock-end this-command)))

Hmm... This `put-text-property` can't be right.
Can you trace `font-lock-fontify-region` and
`font-lock-default-fontify-region` to check that they correctly return
the proper (extended) bounds of the region they (re)fontified?


        Stefan






^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-16 15:22                           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-12-16 17:11                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-12-16 17:23                               ` Dmitry Gutov
  2023-12-20  2:01                               ` Dmitry Gutov
  0 siblings, 2 replies; 58+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-12-16 17:11 UTC (permalink / raw)
  To: Yuan Fu; +Cc: Dmitry Gutov, Eli Zaretskii, 66732, dominik

> Hmm... This `put-text-property` can't be right.
> Can you trace `font-lock-fontify-region` and
> `font-lock-default-fontify-region` to check that they correctly return
> the proper (extended) bounds of the region they (re)fontified?

BTW, using the extend-region-function isn't ideal: this is meant for
things which can't be refontified separately, so if a change in the
buffer causes notification to make changes over a large portion of the
buffer we'll end up (re)fontified *right away* that whole large
portion even if only a small part (or even no part at all) is displayed.

IOW, it'd be better to mark the changed chunks with this
`put-text-property`, but to do it before jit&font-lock get triggered.

Basically, we'd like to call it from `after-change-functions`, but this
can be called *many* times within a single command, so we want to delay
it.  So we can probably just set a flag that says "there are unprocessed
changes" and then sprinkle calls to a "lazy update" function which
checks this flag before calling `treesit--font-lock-notifier` (or
something similar).

One of the places where we'd need the sprinkle would be
`pre-redisplay-functions`.  Another one might be `syntax-ppss`?


        Stefan






^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-16 17:11                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-12-16 17:23                               ` Dmitry Gutov
  2023-12-16 17:43                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-12-20  2:01                               ` Dmitry Gutov
  1 sibling, 1 reply; 58+ messages in thread
From: Dmitry Gutov @ 2023-12-16 17:23 UTC (permalink / raw)
  To: Stefan Monnier, Yuan Fu; +Cc: Eli Zaretskii, 66732, dominik

On 16/12/2023 19:11, Stefan Monnier wrote:
>> Hmm... This `put-text-property` can't be right.
>> Can you trace `font-lock-fontify-region` and
>> `font-lock-default-fontify-region` to check that they correctly return
>> the proper (extended) bounds of the region they (re)fontified?
> 
> BTW, using the extend-region-function isn't ideal: this is meant for
> things which can't be refontified separately, so if a change in the
> buffer causes notification to make changes over a large portion of the
> buffer we'll end up (re)fontified *right away* that whole large
> portion even if only a small part (or even no part at all) is displayed.

I think most of the time the parser notifier will send ranges of limited 
size (as one or other piece of text gets recognized as e.g. a comment, 
or a raw string -- the bounds of the respective node that changed).

> IOW, it'd be better to mark the changed chunks with this
> `put-text-property`, but to do it before jit&font-lock get triggered.
> 
> Basically, we'd like to call it from `after-change-functions`, but this
> can be called *many* times within a single command, so we want to delay
> it.  So we can probably just set a flag that says "there are unprocessed
> changes" and then sprinkle calls to a "lazy update" function which
> checks this flag before calling `treesit--font-lock-notifier` (or
> something similar).

I thought doing this lazily (inside font-lock-default-fontify-region), 
only when the region is set to be redisplayed, would be more economical 
than doing more work inside after-change-functions.

> One of the places where we'd need the sprinkle would be
> `pre-redisplay-functions`.  Another one might be `syntax-ppss`?

syntax-ppss has syntax-propertize-extend-region-functions. I haven't 
found any better suitable hooks than that.





^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-16 17:23                               ` Dmitry Gutov
@ 2023-12-16 17:43                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-12-16 19:18                                   ` Yuan Fu
  2023-12-16 23:02                                   ` Dmitry Gutov
  0 siblings, 2 replies; 58+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-12-16 17:43 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Yuan Fu, 66732, Eli Zaretskii, dominik

> I think most of the time the parser notifier will send ranges of limited
> size (as one or other piece of text gets recognized as e.g. a comment, or
> a raw string -- the bounds of the respective node that changed).

Yes, of course, in practice it will usually work OK.
I'm talking about what should happen ideally so that we can make an
informed decision about the code we decide to use in the end, knowing
its downsides.

>> IOW, it'd be better to mark the changed chunks with this
>> `put-text-property`, but to do it before jit&font-lock get triggered.
>> Basically, we'd like to call it from `after-change-functions`, but this
>> can be called *many* times within a single command, so we want to delay
>> it.  So we can probably just set a flag that says "there are unprocessed
>> changes" and then sprinkle calls to a "lazy update" function which
>> checks this flag before calling `treesit--font-lock-notifier` (or
>> something similar).
>
> I thought doing this lazily (inside font-lock-default-fontify-region), only
> when the region is set to be redisplayed, would be more economical than
> doing more work inside after-change-functions.

Indeed we don't want to do it directly in `after-change-functions`, but
doing it in `font-lock-default-fontify-region` is too late because the
redisplay and jit-lock have already (to some extent) decided what should
be refontified at that point.

The infrastructure does offer ways to make it work (to fix previous
incorrect assumptions about what needed to be redisplayed/refontified),
but it's best if we can avoid it.

>> One of the places where we'd need the sprinkle would be
>> `pre-redisplay-functions`.  Another one might be `syntax-ppss`?
> syntax-ppss has syntax-propertize-extend-region-functions. I haven't found
> any better suitable hooks than that.

Indeed, I don't think there's a good hook for that yet, tho maybe
`syntax-propertize-extend-region-functions` is good enough (basically
make it call `treesit--font-lock-notifier` but return nil), assuming we
have set a `syntax-propertize-function`.


        Stefan






^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-16 17:43                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-12-16 19:18                                   ` Yuan Fu
  2023-12-16 19:57                                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-12-16 23:09                                     ` Dmitry Gutov
  2023-12-16 23:02                                   ` Dmitry Gutov
  1 sibling, 2 replies; 58+ messages in thread
From: Yuan Fu @ 2023-12-16 19:18 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Dmitry Gutov, Eli Zaretskii, 66732, dominik

[-- Attachment #1: Type: text/plain, Size: 2490 bytes --]



> On Dec 16, 2023, at 9:43 AM, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> 
>> I think most of the time the parser notifier will send ranges of limited
>> size (as one or other piece of text gets recognized as e.g. a comment, or
>> a raw string -- the bounds of the respective node that changed).
> 
> Yes, of course, in practice it will usually work OK.
> I'm talking about what should happen ideally so that we can make an
> informed decision about the code we decide to use in the end, knowing
> its downsides.
> 
>>> IOW, it'd be better to mark the changed chunks with this
>>> `put-text-property`, but to do it before jit&font-lock get triggered.
>>> Basically, we'd like to call it from `after-change-functions`, but this
>>> can be called *many* times within a single command, so we want to delay
>>> it.  So we can probably just set a flag that says "there are unprocessed
>>> changes" and then sprinkle calls to a "lazy update" function which
>>> checks this flag before calling `treesit--font-lock-notifier` (or
>>> something similar).
>> 
>> I thought doing this lazily (inside font-lock-default-fontify-region), only
>> when the region is set to be redisplayed, would be more economical than
>> doing more work inside after-change-functions.
> 
> Indeed we don't want to do it directly in `after-change-functions`, but
> doing it in `font-lock-default-fontify-region` is too late because the
> redisplay and jit-lock have already (to some extent) decided what should
> be refontified at that point.
> 
> The infrastructure does offer ways to make it work (to fix previous
> incorrect assumptions about what needed to be redisplayed/refontified),
> but it's best if we can avoid it.

I tried forcing reparse in pre-redisplay-functions, and it seems to work fine! See attached diff. The delist part can be improved a bit, but this is just a POC.

> 
>>> One of the places where we'd need the sprinkle would be
>>> `pre-redisplay-functions`.  Another one might be `syntax-ppss`?
>> syntax-ppss has syntax-propertize-extend-region-functions. I haven't found
>> any better suitable hooks than that.
> 
> Indeed, I don't think there's a good hook for that yet, tho maybe
> `syntax-propertize-extend-region-functions` is good enough (basically
> make it call `treesit--font-lock-notifier` but return nil), assuming we
> have set a `syntax-propertize-function`.

That makes sense, I’ll give that a try on bug#67262.

Yuan


[-- Attachment #2: pre-redisplay.diff --]
[-- Type: application/octet-stream, Size: 935 bytes --]

diff --git a/lisp/treesit.el b/lisp/treesit.el
index 8a07f5023a9..c803866130a 100644
--- a/lisp/treesit.el
+++ b/lisp/treesit.el
@@ -1088,6 +1088,11 @@ treesit--font-lock-notifier
       (with-silent-modifications
         (put-text-property (car range) (cdr range) 'fontified nil)))))
 
+(defun treesit--pre-redisplay (&rest _)
+  (treesit-update-ranges)
+  (dolist (parser (treesit-parser-list nil nil t))
+    (treesit-parser-root-node parser)))
+
 ;;; Indent
 
 (define-error 'treesit-indent-error
@@ -2392,7 +2397,8 @@ treesit-major-mode-setup
     (treesit-font-lock-recompute-features)
     (dolist (parser (treesit-parser-list))
       (treesit-parser-add-notifier
-       parser #'treesit--font-lock-notifier)))
+       parser #'treesit--font-lock-notifier))
+    (add-hook 'pre-redisplay-functions #'treesit--pre-redisplay 0 t))
   ;; Indent.
   (when treesit-simple-indent-rules
     (setq-local treesit-simple-indent-rules

[-- Attachment #3: Type: text/plain, Size: 2 bytes --]




^ permalink raw reply related	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-16 19:18                                   ` Yuan Fu
@ 2023-12-16 19:57                                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-12-16 23:09                                     ` Dmitry Gutov
  1 sibling, 0 replies; 58+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-12-16 19:57 UTC (permalink / raw)
  To: Yuan Fu; +Cc: Dmitry Gutov, Eli Zaretskii, 66732, dominik

> +(defun treesit--pre-redisplay (&rest _)
> +  (treesit-update-ranges)
> +  (dolist (parser (treesit-parser-list nil nil t))
> +    (treesit-parser-root-node parser)))

Note that `pre-redisplay-functions` is called often and can be called
N times for a single redisplay (if the buffer is displayed in
N windows), so we should try and make sure we only do the hard work if
there's actually been a change since the last time.

[ And the code should probably come with a comment explaining why
  calling `treesit-parser-root-node` and ignoring its return value is
  useful here.  ]

        Stefan






^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-16  5:56                         ` Yuan Fu
  2023-12-16 15:22                           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-12-16 22:56                           ` Dmitry Gutov
  2023-12-18 18:27                             ` Dmitry Gutov
  1 sibling, 1 reply; 58+ messages in thread
From: Dmitry Gutov @ 2023-12-16 22:56 UTC (permalink / raw)
  To: Yuan Fu; +Cc: Eli Zaretskii, 66732, Stefan Monnier, dominik

On 16/12/2023 07:56, Yuan Fu wrote:

>> Hmm, let me have a closer look tomorrow. It can’t be that difficult, we must have missed something.
> 
> Please see this patch, which basically does what you did in your patch. I just started from scratch when experimenting. I didn’t do the context thing, that’s a bit confusing for me.

Yeah, it's more complex than I'd have liked but basically the "context" 
thing was to dispatch to the right invalidator: for font-lock and/or for 
syntax. Whether to return the range using the corresponding 
extend-function, or to invalidate push-style (when the notifier is 
called outside of any extend-function's).

I've tried your patch, and it seemed to have similar problems to mine. 
In particular, that the comment is not unfontified right away when the 
closing slash is removed.

Also, I think we're targeting Emacs 29 for this bug (the patch needed 
adjustments).

> I save updated ranges in a local hash table and access it in extend-region functions to extend the region. Ideally treesit should have a function that returns the updated ranges for a parser, we don’t have that yet, so I’m using the hash table to simulate it.

Okay. I wonder how other users (which care about specific parsers) would 
look.

> I found that if you don’t set text prop fontified to nil for the whole extended region, redisplay doesn’t seem to, well, redisplay the full region, even thought the new face has been correctly applied to them.

That sounds like a bug in font-lock? At the end of jit-lock-fontify-now, 
there is a call creating a timer with jit-lock-force-redisplay.

And that function ends with this:

     ;; Don't cause refontification (it's already been done), but just do
     ;; some random buffer change, so as to force redisplay.
     (put-text-property start end 'fontified t)))))

If I just change t to nil there (or to some other value, like 42), 
either of our patches starts behaving well. Perhaps Stefan could comment.





^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-16 17:43                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-12-16 19:18                                   ` Yuan Fu
@ 2023-12-16 23:02                                   ` Dmitry Gutov
  1 sibling, 0 replies; 58+ messages in thread
From: Dmitry Gutov @ 2023-12-16 23:02 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Yuan Fu, 66732, Eli Zaretskii, dominik

On 16/12/2023 19:43, Stefan Monnier wrote:
>> I think most of the time the parser notifier will send ranges of limited
>> size (as one or other piece of text gets recognized as e.g. a comment, or
>> a raw string -- the bounds of the respective node that changed).
> 
> Yes, of course, in practice it will usually work OK.
> I'm talking about what should happen ideally so that we can make an
> informed decision about the code we decide to use in the end, knowing
> its downsides.

I figured that it would be useful at least to start with the same 
solution for both font-lock and syntax, because bugs in the former are 
usually easier to notice earlier.

But you have a point that doing invalidation earlier will mean that 
font-lock won't have to refontify the whole big region (whenever that 
actually happens in practice), only the visible part.

>>> One of the places where we'd need the sprinkle would be
>>> `pre-redisplay-functions`.  Another one might be `syntax-ppss`?
>> syntax-ppss has syntax-propertize-extend-region-functions. I haven't found
>> any better suitable hooks than that.
> 
> Indeed, I don't think there's a good hook for that yet, tho maybe
> `syntax-propertize-extend-region-functions` is good enough (basically
> make it call `treesit--font-lock-notifier` but return nil), assuming we
> have set a `syntax-propertize-function`.

Solutions for syntax and font-lock will likely need to work independently:

- syntax-ppss can be called outside of font-lock (e.g. by electric-pair)
- font-lock can work without syntax-propertize (e.g. when 
syntax-propertize-function is nil), and even when it does get called, it 
can happen somewhere inside font-lock highlighting code. E.g. by the 
time we reach treesit-font-lock-fontify-region, the region to refontify 
has already been decided.





^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-16 19:18                                   ` Yuan Fu
  2023-12-16 19:57                                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-12-16 23:09                                     ` Dmitry Gutov
  2023-12-17  1:16                                       ` Yuan Fu
  1 sibling, 1 reply; 58+ messages in thread
From: Dmitry Gutov @ 2023-12-16 23:09 UTC (permalink / raw)
  To: Yuan Fu, Stefan Monnier; +Cc: Eli Zaretskii, 66732, dominik

On 16/12/2023 21:18, Yuan Fu wrote:
>> Indeed we don't want to do it directly in `after-change-functions`, but
>> doing it in `font-lock-default-fontify-region` is too late because the
>> redisplay and jit-lock have already (to some extent) decided what should
>> be refontified at that point.
>>
>> The infrastructure does offer ways to make it work (to fix previous
>> incorrect assumptions about what needed to be redisplayed/refontified),
>> but it's best if we can avoid it.
> I tried forcing reparse in pre-redisplay-functions, and it seems to work fine! See attached diff. The delist part can be improved a bit, but this is just a POC.

This also works.





^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-16 23:09                                     ` Dmitry Gutov
@ 2023-12-17  1:16                                       ` Yuan Fu
  2023-12-17 18:32                                         ` Dmitry Gutov
  0 siblings, 1 reply; 58+ messages in thread
From: Yuan Fu @ 2023-12-17  1:16 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Eli Zaretskii, 66732, Stefan Monnier, dominik

[-- Attachment #1: Type: text/plain, Size: 838 bytes --]



> On Dec 16, 2023, at 3:09 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
> 
> On 16/12/2023 21:18, Yuan Fu wrote:
>>> Indeed we don't want to do it directly in `after-change-functions`, but
>>> doing it in `font-lock-default-fontify-region` is too late because the
>>> redisplay and jit-lock have already (to some extent) decided what should
>>> be refontified at that point.
>>> 
>>> The infrastructure does offer ways to make it work (to fix previous
>>> incorrect assumptions about what needed to be redisplayed/refontified),
>>> but it's best if we can avoid it.
>> I tried forcing reparse in pre-redisplay-functions, and it seems to work fine! See attached diff. The delist part can be improved a bit, but this is just a POC.
> 
> This also works.

Ok, if there’s no objections, I’ll apply this patch.

Yuan


[-- Attachment #2: pre-redisplay.patch --]
[-- Type: application/octet-stream, Size: 3118 bytes --]

From eb2e9ca49e92465c5ef6ab82c59ccd1b395481d9 Mon Sep 17 00:00:00 2001
From: Yuan Fu <casouri@gmail.com>
Date: Sat, 16 Dec 2023 17:15:04 -0800
Subject: [PATCH] Correctly refontify changed region in tree-sitter modes
 (bug#66732)

We already have treesit--font-lock-notifier that should mark changed
regions to be refontified, but it's called too late in the redsiplay &
fontification pipeline.  Here we add treesit--pre-redisplay that
forces reparse and calls notifier functions in
pre-redisplay-functions, which is early enough for the marking to take
effect.

Similarly, we force reparse in
syntax-propertize-extend-region-functions so syntax-ppss will have the
up-to-date syntax information when it scans the buffer text.

* lisp/treesit.el (treesit--pre-redisplay): New function
(treesit--pre-syntax-ppss): New function.
(treesit-major-mode-setup): Add hooks.
---
 lisp/treesit.el | 33 ++++++++++++++++++++++++++++++++-
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/lisp/treesit.el b/lisp/treesit.el
index 8a07f5023a9..a9a526c0062 100644
--- a/lisp/treesit.el
+++ b/lisp/treesit.el
@@ -1088,6 +1088,33 @@ treesit--font-lock-notifier
       (with-silent-modifications
         (put-text-property (car range) (cdr range) 'fontified nil)))))
 
+(defun treesit--pre-redisplay (&rest _)
+  "Force reparse and consequently run all notifiers.
+
+One of the notifiers is `treesit--font-lock-notifier', which will
+mark the region whose syntax has changed to \"need to refontify\".
+
+For example, when the user types the final slash of a C block
+comment /* xxx */, not only do we need to fontify the slash, but
+also the whole block comment, which previously wasn't fontified
+as comment due to incomplete parse tree."
+  ;; `treesit-update-ranges' will force the host language's parser to
+  ;; reparse and set correct ranges for embedded parsers.  Then
+  ;; `treesit-parser-root-node' will force those parsers to reparse.
+  (treesit-update-ranges)
+  ;; Force repase on _all_ the parsers might not be necessary, but
+  ;; this is probably the most robust way.
+  (dolist (parser (treesit-parser-list nil nil t))
+    (treesit-parser-root-node parser)))
+
+(defun treesit--pre-syntax-ppss (&rest _)
+  "Force reparse and consequently run all notifiers.
+
+Similar to font-lock, we want to update the `syntax' text
+property before `syntax-ppss' starts working on the text."
+  (treesit--pre-redisplay)
+  nil)
+
 ;;; Indent
 
 (define-error 'treesit-indent-error
@@ -2392,7 +2419,11 @@ treesit-major-mode-setup
     (treesit-font-lock-recompute-features)
     (dolist (parser (treesit-parser-list))
       (treesit-parser-add-notifier
-       parser #'treesit--font-lock-notifier)))
+       parser #'treesit--font-lock-notifier))
+    (add-hook 'pre-redisplay-functions #'treesit--pre-redisplay 0 t))
+  ;; Syntax
+  (add-hook 'syntax-propertize-extend-region-functions
+            #'treesit--pre-syntax-ppss 0 t)
   ;; Indent.
   (when treesit-simple-indent-rules
     (setq-local treesit-simple-indent-rules
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-17  1:16                                       ` Yuan Fu
@ 2023-12-17 18:32                                         ` Dmitry Gutov
  2023-12-19  3:12                                           ` Yuan Fu
  0 siblings, 1 reply; 58+ messages in thread
From: Dmitry Gutov @ 2023-12-17 18:32 UTC (permalink / raw)
  To: Yuan Fu; +Cc: Eli Zaretskii, 66732, Stefan Monnier, dominik

On 17/12/2023 03:16, Yuan Fu wrote:
> +(defun treesit--pre-syntax-ppss (&rest _)
> +  "Force reparse and consequently run all notifiers.
> +
> +Similar to font-lock, we want to update the `syntax' text
> +property before `syntax-ppss' starts working on the text."
> +  (treesit--pre-redisplay)
> +  nil)

This callback might be the best place to inform syntax-propertize about 
the extended bounds which need to be re-parsed, so I'm not sure about 
Stefan's suggestion to return nil here.

Even calling syntax-ppss-flush-cache wouldn't be a full solution, 
because that wouldn't update the value of START in syntax-propertize.





^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-16 22:56                           ` Dmitry Gutov
@ 2023-12-18 18:27                             ` Dmitry Gutov
  2023-12-18 19:12                               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 58+ messages in thread
From: Dmitry Gutov @ 2023-12-18 18:27 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, 66732, Yuan Fu, dominik

Stefan, what do you think?

On 17/12/2023 00:56, Dmitry Gutov wrote:
>> I found that if you don’t set text prop fontified to nil for the whole 
>> extended region, redisplay doesn’t seem to, well, redisplay the full 
>> region, even thought the new face has been correctly applied to them.
> 
> That sounds like a bug in font-lock? At the end of jit-lock-fontify-now, 
> there is a call creating a timer with jit-lock-force-redisplay.
> 
> And that function ends with this:
> 
>      ;; Don't cause refontification (it's already been done), but just do
>      ;; some random buffer change, so as to force redisplay.
>      (put-text-property start end 'fontified t)))))
> 
> If I just change t to nil there (or to some other value, like 42), 
> either of our patches starts behaving well. Perhaps Stefan could comment.






^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-18 18:27                             ` Dmitry Gutov
@ 2023-12-18 19:12                               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-12-18 19:33                                 ` Eli Zaretskii
  2023-12-18 23:08                                 ` Dmitry Gutov
  0 siblings, 2 replies; 58+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-12-18 19:12 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Eli Zaretskii, 66732, Yuan Fu, dominik

>>> I found that if you don’t set text prop fontified to nil for the whole
>>> extended region, redisplay doesn’t seem to, well, redisplay the full
>>> region, even thought the new face has been correctly applied to them.
>> That sounds like a bug in font-lock? At the end of jit-lock-fontify-now,
>> there is a call creating a timer with jit-lock-force-redisplay.
>> And that function ends with this:
>>      ;; Don't cause refontification (it's already been done), but just do
>>      ;; some random buffer change, so as to force redisplay.
>>      (put-text-property start end 'fontified t)))))
>> If I just change t to nil there (or to some other value, like 42), either
>> of our patches starts behaving well. Perhaps Stefan could comment.

This chunk of code is present so as to cause a re-render (i.e. recompute
the glyph matrices) of the affected region, but not a re-fontification.
So a nil value would be wrong.
I'm surprised that 42 behaves differently from t.

What we really need here is to force the redisplay to consider that this
part of the buffer needs to be re-rendered.  In practice changing any
text-property on that chunk of text should do the trick, in my
experience, but from what you describe it seems that some optimisation
is sufficiently clever to notice that the old value was t and the new
value is identical so the region is not marked as modified?

Indeed, in `add_text_properties_1` I see we skip over intervals which
already have the right property values, so that might be what's happening.
I suggest we introduce a separate function with a name indicating what
we intend it to do (like `force-region-update`) so the code will
be clearer.
And its implementation could consist in adding and then removing some
dummy text property (tho a better implementation would go and modify
the underlying C-level variables in the buffer like BUF_*_UNCHANGED).


        Stefan






^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-18 19:12                               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-12-18 19:33                                 ` Eli Zaretskii
  2023-12-18 23:10                                   ` Dmitry Gutov
  2023-12-18 23:08                                 ` Dmitry Gutov
  1 sibling, 1 reply; 58+ messages in thread
From: Eli Zaretskii @ 2023-12-18 19:33 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: dmitry, casouri, 66732, dominik

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Eli Zaretskii <eliz@gnu.org>,  66732@debbugs.gnu.org,
>   dominik@honnef.co,  Yuan Fu <casouri@gmail.com>
> Date: Mon, 18 Dec 2023 14:12:02 -0500
> 
> I'm surprised that 42 behaves differently from t.

The display code in C tests only for nil or non-nil, but jit-lock.el
tests against a literal t in a few places.  (And don't forget about
the special value 'defer' we use.)

> What we really need here is to force the redisplay to consider that this
> part of the buffer needs to be re-rendered.

AFAIU, it will only re-render that part of some of the faces changed.





^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-18 19:12                               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-12-18 19:33                                 ` Eli Zaretskii
@ 2023-12-18 23:08                                 ` Dmitry Gutov
  2023-12-20 20:50                                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 58+ messages in thread
From: Dmitry Gutov @ 2023-12-18 23:08 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, 66732, Yuan Fu, dominik

On 18/12/2023 21:12, Stefan Monnier wrote:
>>>> I found that if you don’t set text prop fontified to nil for the whole
>>>> extended region, redisplay doesn’t seem to, well, redisplay the full
>>>> region, even thought the new face has been correctly applied to them.
>>> That sounds like a bug in font-lock? At the end of jit-lock-fontify-now,
>>> there is a call creating a timer with jit-lock-force-redisplay.
>>> And that function ends with this:
>>>       ;; Don't cause refontification (it's already been done), but just do
>>>       ;; some random buffer change, so as to force redisplay.
>>>       (put-text-property start end 'fontified t)))))
>>> If I just change t to nil there (or to some other value, like 42), either
>>> of our patches starts behaving well. Perhaps Stefan could comment.
> 
> This chunk of code is present so as to cause a re-render (i.e. recompute
> the glyph matrices) of the affected region, but not a re-fontification.
> So a nil value would be wrong.
> I'm surprised that 42 behaves differently from t.

This patch also changes the visible behavior, fixing the problem that 
I'm seeing:

diff --git a/lisp/jit-lock.el b/lisp/jit-lock.el
index 452cbd1ca51..43db2d31856 100644
--- a/lisp/jit-lock.el
+++ b/lisp/jit-lock.el
@@ -499,6 +499,7 @@ jit-lock-force-redisplay
           (setq start (point-min) end (max start end)))
         ;; Don't cause refontification (it's already been done), but 
just do
         ;; some random buffer change, so as to force redisplay.
+       (put-text-property start end 'fontified nil)
         (put-text-property start end 'fontified t)))))
  \f
  ;;; Stealth fontification.

So it must be not about the eventual value of the property, but about 
triggering some counter, like (buffer-modified-tick). Which does get 
incremented after the above sequence (by 2).

> What we really need here is to force the redisplay to consider that this
> part of the buffer needs to be re-rendered.  In practice changing any
> text-property on that chunk of text should do the trick, in my
> experience, but from what you describe it seems that some optimisation
> is sufficiently clever to notice that the old value was t and the new
> value is identical so the region is not marked as modified?
> 
> Indeed, in `add_text_properties_1` I see we skip over intervals which
> already have the right property values, so that might be what's happening.
> I suggest we introduce a separate function with a name indicating what
> we intend it to do (like `force-region-update`) so the code will
> be clearer.
> And its implementation could consist in adding and then removing some
> dummy text property (tho a better implementation would go and modify
> the underlying C-level variables in the buffer like BUF_*_UNCHANGED).

Adding an extra 'put-text-property' call to jit-lock-force-redisplay 
seems cheap enough, but something even faster could be good too.





^ permalink raw reply related	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-18 19:33                                 ` Eli Zaretskii
@ 2023-12-18 23:10                                   ` Dmitry Gutov
  2023-12-19  3:22                                     ` Eli Zaretskii
  0 siblings, 1 reply; 58+ messages in thread
From: Dmitry Gutov @ 2023-12-18 23:10 UTC (permalink / raw)
  To: Eli Zaretskii, Stefan Monnier; +Cc: casouri, 66732, dominik

On 18/12/2023 21:33, Eli Zaretskii wrote:
>> From: Stefan Monnier<monnier@iro.umontreal.ca>
>> Cc: Eli Zaretskii<eliz@gnu.org>,66732@debbugs.gnu.org,
>>    dominik@honnef.co,  Yuan Fu<casouri@gmail.com>
>> Date: Mon, 18 Dec 2023 14:12:02 -0500
>>
>> I'm surprised that 42 behaves differently from t.
> The display code in C tests only for nil or non-nil, but jit-lock.el
> tests against a literal t in a few places.  (And don't forget about
> the special value 'defer' we use.)
> 
>> What we really need here is to force the redisplay to consider that this
>> part of the buffer needs to be re-rendered.
> AFAIU, it will only re-render that part of some of the faces changed.

It seems like in our scenario the region does not get re-rendered even 
when some of the faces changed in there.





^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-17 18:32                                         ` Dmitry Gutov
@ 2023-12-19  3:12                                           ` Yuan Fu
  2023-12-20  1:52                                             ` Dmitry Gutov
  0 siblings, 1 reply; 58+ messages in thread
From: Yuan Fu @ 2023-12-19  3:12 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Eli Zaretskii, 66732, Stefan Monnier, dominik



> On Dec 17, 2023, at 10:32 AM, Dmitry Gutov <dmitry@gutov.dev> wrote:
> 
> On 17/12/2023 03:16, Yuan Fu wrote:
>> +(defun treesit--pre-syntax-ppss (&rest _)
>> +  "Force reparse and consequently run all notifiers.
>> +
>> +Similar to font-lock, we want to update the `syntax' text
>> +property before `syntax-ppss' starts working on the text."
>> +  (treesit--pre-redisplay)
>> +  nil)
> 
> This callback might be the best place to inform syntax-propertize about the extended bounds which need to be re-parsed, so I'm not sure about Stefan's suggestion to return nil here.
> 
> Even calling syntax-ppss-flush-cache wouldn't be a full solution, because that wouldn't update the value of START in syntax-propertize.

Makes sense. Then I think a treeist function that returns the latest affected regions is in order?

Yuan




^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-18 23:10                                   ` Dmitry Gutov
@ 2023-12-19  3:22                                     ` Eli Zaretskii
  2023-12-19  3:40                                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 58+ messages in thread
From: Eli Zaretskii @ 2023-12-19  3:22 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: casouri, 66732, monnier, dominik

> Date: Tue, 19 Dec 2023 01:10:39 +0200
> Cc: 66732@debbugs.gnu.org, dominik@honnef.co, casouri@gmail.com
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> On 18/12/2023 21:33, Eli Zaretskii wrote:
> >> From: Stefan Monnier<monnier@iro.umontreal.ca>
> >> Cc: Eli Zaretskii<eliz@gnu.org>,66732@debbugs.gnu.org,
> >>    dominik@honnef.co,  Yuan Fu<casouri@gmail.com>
> >> Date: Mon, 18 Dec 2023 14:12:02 -0500
> >>
> >> I'm surprised that 42 behaves differently from t.
> > The display code in C tests only for nil or non-nil, but jit-lock.el
> > tests against a literal t in a few places.  (And don't forget about
> > the special value 'defer' we use.)
> > 
> >> What we really need here is to force the redisplay to consider that this
> >> part of the buffer needs to be re-rendered.
> > AFAIU, it will only re-render that part of some of the faces changed.
> 
> It seems like in our scenario the region does not get re-rendered even 
> when some of the faces changed in there.

Can't happen.





^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-19  3:22                                     ` Eli Zaretskii
@ 2023-12-19  3:40                                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-12-19 12:41                                         ` Eli Zaretskii
  0 siblings, 1 reply; 58+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-12-19  3:40 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Dmitry Gutov, casouri, 66732, dominik

>> It seems like in our scenario the region does not get re-rendered even 
>> when some of the faces changed in there.
> Can't happen.

I think Dmitry is talking about the case where the faces were change
after the text was rendered (but before the end of the redisplay cycle).

I.e. the redisplay starts, renders up to position POS, then calls
jit-lock because `fonfitied` is nil, and jit-lock modifies faces of text
*before* POS.


        Stefan






^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-19  3:40                                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-12-19 12:41                                         ` Eli Zaretskii
  2023-12-19 12:44                                           ` Dmitry Gutov
  2023-12-20 20:50                                           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 2 replies; 58+ messages in thread
From: Eli Zaretskii @ 2023-12-19 12:41 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: dmitry, casouri, 66732, dominik

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Dmitry Gutov <dmitry@gutov.dev>,  66732@debbugs.gnu.org,
>   dominik@honnef.co,  casouri@gmail.com
> Date: Mon, 18 Dec 2023 22:40:59 -0500
> 
> >> It seems like in our scenario the region does not get re-rendered even 
> >> when some of the faces changed in there.
> > Can't happen.
> 
> I think Dmitry is talking about the case where the faces were change
> after the text was rendered (but before the end of the redisplay cycle).
> 
> I.e. the redisplay starts, renders up to position POS, then calls
> jit-lock because `fonfitied` is nil, and jit-lock modifies faces of text
> *before* POS.

This can only happen if a function called from jit-lock doesn't comply
to the protocol wrt such changes, right?  Because otherwise the
modified region would have been re-rendered on the next redisplay
cycle at the latest.





^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-19 12:41                                         ` Eli Zaretskii
@ 2023-12-19 12:44                                           ` Dmitry Gutov
  2023-12-20 20:50                                           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 0 replies; 58+ messages in thread
From: Dmitry Gutov @ 2023-12-19 12:44 UTC (permalink / raw)
  To: Eli Zaretskii, Stefan Monnier; +Cc: casouri, 66732, dominik

On 19/12/2023 14:41, Eli Zaretskii wrote:
>> From: Stefan Monnier<monnier@iro.umontreal.ca>
>> Cc: Dmitry Gutov<dmitry@gutov.dev>,66732@debbugs.gnu.org,
>>    dominik@honnef.co,casouri@gmail.com
>> Date: Mon, 18 Dec 2023 22:40:59 -0500
>>
>>>> It seems like in our scenario the region does not get re-rendered even
>>>> when some of the faces changed in there.
>>> Can't happen.
>> I think Dmitry is talking about the case where the faces were change
>> after the text was rendered (but before the end of the redisplay cycle).
>>
>> I.e. the redisplay starts, renders up to position POS, then calls
>> jit-lock because `fonfitied` is nil, and jit-lock modifies faces of text
>> *before*  POS.
> This can only happen if a function called from jit-lock doesn't comply
> to the protocol wrt such changes, right?

We have a protocol: font-lock-extend-region-functions, but it's not 
working well enough due to the aforementioned problem.

> Because otherwise the
> modified region would have been re-rendered on the next redisplay
> cycle at the latest.

That does indeed happen, but seems too late from the user's POV.





^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-19  3:12                                           ` Yuan Fu
@ 2023-12-20  1:52                                             ` Dmitry Gutov
  2023-12-20  5:43                                               ` Yuan Fu
  0 siblings, 1 reply; 58+ messages in thread
From: Dmitry Gutov @ 2023-12-20  1:52 UTC (permalink / raw)
  To: Yuan Fu; +Cc: Eli Zaretskii, 66732, Stefan Monnier, dominik

On 19/12/2023 05:12, Yuan Fu wrote:
> 
>> On Dec 17, 2023, at 10:32 AM, Dmitry Gutov<dmitry@gutov.dev>  wrote:
>>
>> On 17/12/2023 03:16, Yuan Fu wrote:
>>> +(defun treesit--pre-syntax-ppss (&rest _)
>>> +  "Force reparse and consequently run all notifiers.
>>> +
>>> +Similar to font-lock, we want to update the `syntax' text
>>> +property before `syntax-ppss' starts working on the text."
>>> +  (treesit--pre-redisplay)
>>> +  nil)
>> This callback might be the best place to inform syntax-propertize about the extended bounds which need to be re-parsed, so I'm not sure about Stefan's suggestion to return nil here.
>>
>> Even calling syntax-ppss-flush-cache wouldn't be a full solution, because that wouldn't update the value of START in syntax-propertize.
> Makes sense. Then I think a treeist function that returns the latest affected regions is in order?

But latest compared to what? To the previous invocation of the parser?

What if the current caller uses this function and expects to see changes 
since time X, but some other feature already instantiated the parser 
sometime between X and NOW? The callbacks which would run now would send 
notifications compared to that other previous (unknown) invocation. 
There could be some system which keeps checkpoints that the callers 
could reference, but I'm not sure what's the best shape. And how the 
older history, not needed by any callers anymore, would be evicted (some 
weak hash map could work, but any caller that's not careful could create 
leaks...).

And without the above, we'll need a separate callback which would call 
syntax-ppss-flush-cache when the parser sends notifcations while invoked 
outside of syntax-propertize-extend-region-functions. And then its code 
will have to care about context anyway (inside s-p-e-r-f or not).





^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-16 17:11                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-12-16 17:23                               ` Dmitry Gutov
@ 2023-12-20  2:01                               ` Dmitry Gutov
  2023-12-20  3:08                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 58+ messages in thread
From: Dmitry Gutov @ 2023-12-20  2:01 UTC (permalink / raw)
  To: Stefan Monnier, Yuan Fu; +Cc: Eli Zaretskii, 66732, dominik

On 16/12/2023 19:11, Stefan Monnier wrote:
>> Hmm... This `put-text-property` can't be right.
>> Can you trace `font-lock-fontify-region` and
>> `font-lock-default-fontify-region` to check that they correctly return
>> the proper (extended) bounds of the region they (re)fontified?
> BTW, using the extend-region-function isn't ideal: this is meant for
> things which can't be refontified separately, so if a change in the
> buffer causes notification to make changes over a large portion of the
> buffer we'll end up (re)fontified*right away*  that whole large
> portion even if only a small part (or even no part at all) is displayed.

BTW, do you have an example of a "thing which can't be refontified 
separately"?

Otherwise, I think a reorganization would be possible where when the 
region is extended to more than the visible area of the buffer, some 
parts of the buffer are marked as unfontified, but don't force their 
highlighting in the same cycle. Basically, what the 
pre-redisplay-functions patch does, but using the 
*-extend-region-functions protocol.

Might not be trivial code-wise. I guess it also depends on how much we 
consider the current approach of requesting an additional display 
iteration, to be a problem (e.g. does that create any real difference 
performance-wise).

The use of pre-redisplay-functions avoid that question, but asks the 
programmer to work on a lower level of abstraction (i.e. know about 
'fontified' => nil), as well as possibly apply this change to a bigger 
region than necessary (again, not sure about the performance impact).





^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-20  2:01                               ` Dmitry Gutov
@ 2023-12-20  3:08                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 58+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-12-20  3:08 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Yuan Fu, 66732, Eli Zaretskii, dominik

> BTW, do you have an example of a "thing which can't be refontified
>  separately"?

A multi-line element fontified by a rule that uses a regexp that matches
those multiple lines.


        Stefan






^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-20  1:52                                             ` Dmitry Gutov
@ 2023-12-20  5:43                                               ` Yuan Fu
  2023-12-20 11:31                                                 ` Dmitry Gutov
  0 siblings, 1 reply; 58+ messages in thread
From: Yuan Fu @ 2023-12-20  5:43 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Eli Zaretskii, 66732, Stefan Monnier, dominik

[-- Attachment #1: Type: text/plain, Size: 2360 bytes --]



> On Dec 19, 2023, at 5:52 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
> 
> On 19/12/2023 05:12, Yuan Fu wrote:
>>> On Dec 17, 2023, at 10:32 AM, Dmitry Gutov<dmitry@gutov.dev>  wrote:
>>> 
>>> On 17/12/2023 03:16, Yuan Fu wrote:
>>>> +(defun treesit--pre-syntax-ppss (&rest _)
>>>> +  "Force reparse and consequently run all notifiers.
>>>> +
>>>> +Similar to font-lock, we want to update the `syntax' text
>>>> +property before `syntax-ppss' starts working on the text."
>>>> +  (treesit--pre-redisplay)
>>>> +  nil)
>>> This callback might be the best place to inform syntax-propertize about the extended bounds which need to be re-parsed, so I'm not sure about Stefan's suggestion to return nil here.
>>> 
>>> Even calling syntax-ppss-flush-cache wouldn't be a full solution, because that wouldn't update the value of START in syntax-propertize.
>> Makes sense. Then I think a treeist function that returns the latest affected regions is in order?
> 
> But latest compared to what? To the previous invocation of the parser?
> 
> What if the current caller uses this function and expects to see changes since time X, but some other feature already instantiated the parser sometime between X and NOW? The callbacks which would run now would send notifications compared to that other previous (unknown) invocation. There could be some system which keeps checkpoints that the callers could reference, but I'm not sure what's the best shape. And how the older history, not needed by any callers anymore, would be evicted (some weak hash map could work, but any caller that's not careful could create leaks...).
> 
> And without the above, we'll need a separate callback which would call syntax-ppss-flush-cache when the parser sends notifcations while invoked outside of syntax-propertize-extend-region-functions. And then its code will have to care about context anyway (inside s-p-e-r-f or not).

Ok, then how about we set a variable (say, treesit--syntax-ppss-lower-bound) to the lower-bound (aka, start) of the affected region in notifiers (so multiple invocation of the notifier will result in a lowest lower-bound), and in syntax-propertize-extend-region-functions, we extend the region to start from treesit--syntax-ppss-lower-bound, if it’s non-nil; after syntax-ppss, we set treesit--syntax-ppss-lower-bound to nil?

Yuan

[-- Attachment #2: Type: text/html, Size: 6632 bytes --]

^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-20  5:43                                               ` Yuan Fu
@ 2023-12-20 11:31                                                 ` Dmitry Gutov
  0 siblings, 0 replies; 58+ messages in thread
From: Dmitry Gutov @ 2023-12-20 11:31 UTC (permalink / raw)
  To: Yuan Fu; +Cc: Eli Zaretskii, 66732, Stefan Monnier, dominik

On 20/12/2023 07:43, Yuan Fu wrote:
> Ok, then how about we set a variable (say, 
> treesit--syntax-ppss-lower-bound) to the lower-bound (aka, start) of the 
> affected region in notifiers (so multiple invocation of the notifier 
> will result in a lowest lower-bound), and in 
> syntax-propertize-extend-region-functions, we extend the region to start 
> from treesit--syntax-ppss-lower-bound, if it’s non-nil; after 
> syntax-ppss, we set treesit--syntax-ppss-lower-bound to nil?

Yep, I think that should work.





^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-18 23:08                                 ` Dmitry Gutov
@ 2023-12-20 20:50                                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 58+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-12-20 20:50 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Eli Zaretskii, 66732, Yuan Fu, dominik

> This patch also changes the visible behavior, fixing the problem that I'm
> seeing:
>
> diff --git a/lisp/jit-lock.el b/lisp/jit-lock.el
> index 452cbd1ca51..43db2d31856 100644
> --- a/lisp/jit-lock.el
> +++ b/lisp/jit-lock.el
> @@ -499,6 +499,7 @@ jit-lock-force-redisplay
>           (setq start (point-min) end (max start end)))
>         ;; Don't cause refontification (it's already been done), but just do
>         ;; some random buffer change, so as to force redisplay.
> +       (put-text-property start end 'fontified nil)
>         (put-text-property start end 'fontified t)))))
>  \f
>  ;;; Stealth fontification.

Looks good to me (and makes sense after reading the code of
`put-text-property`).


        Stefan






^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-19 12:41                                         ` Eli Zaretskii
  2023-12-19 12:44                                           ` Dmitry Gutov
@ 2023-12-20 20:50                                           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-12-23 10:17                                             ` Eli Zaretskii
  1 sibling, 1 reply; 58+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-12-20 20:50 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dmitry, casouri, 66732, dominik

>> I think Dmitry is talking about the case where the faces were change
>> after the text was rendered (but before the end of the redisplay cycle).
>> 
>> I.e. the redisplay starts, renders up to position POS, then calls
>> jit-lock because `fonfitied` is nil, and jit-lock modifies faces of text
>> *before* POS.
>
> This can only happen if a function called from jit-lock doesn't comply
> to the protocol wrt such changes, right?

I'm talking about a problem in the interaction between the redisplay
a jit-lock itself.

Jit-lock is aware of it and uses `jit-lock-force-redisplay` as needed to
account for it.

> Because otherwise the modified region would have been re-rendered on
> the next redisplay cycle at the latest.

Yes, if the jit-lock client (e.g. font-lock) follows the protocol, then
jit-lock should take care to re-render the regions when/where needed.

*Except* that the implementation of `jit-lock-force-redisplay` is
not effective and needs a change like the one suggested by Dmitry.


        Stefan






^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-20 20:50                                           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-12-23 10:17                                             ` Eli Zaretskii
  2023-12-23 18:02                                               ` Yuan Fu
  2023-12-23 20:55                                               ` Dmitry Gutov
  0 siblings, 2 replies; 58+ messages in thread
From: Eli Zaretskii @ 2023-12-23 10:17 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: dmitry, casouri, 66732, dominik

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: dmitry@gutov.dev,  66732@debbugs.gnu.org,  dominik@honnef.co,
>   casouri@gmail.com
> Date: Wed, 20 Dec 2023 15:50:21 -0500
> 
> >> I think Dmitry is talking about the case where the faces were change
> >> after the text was rendered (but before the end of the redisplay cycle).
> >> 
> >> I.e. the redisplay starts, renders up to position POS, then calls
> >> jit-lock because `fonfitied` is nil, and jit-lock modifies faces of text
> >> *before* POS.
> >
> > This can only happen if a function called from jit-lock doesn't comply
> > to the protocol wrt such changes, right?
> 
> I'm talking about a problem in the interaction between the redisplay
> a jit-lock itself.
> 
> Jit-lock is aware of it and uses `jit-lock-force-redisplay` as needed to
> account for it.
> 
> > Because otherwise the modified region would have been re-rendered on
> > the next redisplay cycle at the latest.
> 
> Yes, if the jit-lock client (e.g. font-lock) follows the protocol, then
> jit-lock should take care to re-render the regions when/where needed.
> 
> *Except* that the implementation of `jit-lock-force-redisplay` is
> not effective and needs a change like the one suggested by Dmitry.

So can we install that change and close this bug, then?





^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-23 10:17                                             ` Eli Zaretskii
@ 2023-12-23 18:02                                               ` Yuan Fu
  2023-12-23 20:46                                                 ` Dmitry Gutov
  2023-12-23 20:55                                               ` Dmitry Gutov
  1 sibling, 1 reply; 58+ messages in thread
From: Yuan Fu @ 2023-12-23 18:02 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dmitry, 66732, Stefan Monnier, dominik

[-- Attachment #1: Type: text/plain, Size: 1422 bytes --]



> On Dec 23, 2023, at 2:17 AM, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Stefan Monnier <monnier@iro.umontreal.ca>
>> Cc: dmitry@gutov.dev,  66732@debbugs.gnu.org,  dominik@honnef.co,
>>  casouri@gmail.com
>> Date: Wed, 20 Dec 2023 15:50:21 -0500
>> 
>>>> I think Dmitry is talking about the case where the faces were change
>>>> after the text was rendered (but before the end of the redisplay cycle).
>>>> 
>>>> I.e. the redisplay starts, renders up to position POS, then calls
>>>> jit-lock because `fonfitied` is nil, and jit-lock modifies faces of text
>>>> *before* POS.
>>> 
>>> This can only happen if a function called from jit-lock doesn't comply
>>> to the protocol wrt such changes, right?
>> 
>> I'm talking about a problem in the interaction between the redisplay
>> a jit-lock itself.
>> 
>> Jit-lock is aware of it and uses `jit-lock-force-redisplay` as needed to
>> account for it.
>> 
>>> Because otherwise the modified region would have been re-rendered on
>>> the next redisplay cycle at the latest.
>> 
>> Yes, if the jit-lock client (e.g. font-lock) follows the protocol, then
>> jit-lock should take care to re-render the regions when/where needed.
>> 
>> *Except* that the implementation of `jit-lock-force-redisplay` is
>> not effective and needs a change like the one suggested by Dmitry.
> 
> So can we install that change and close this bug, then?

I have a patch. Dmitry, WDYT?

Yuan


[-- Attachment #2: extend-region.patch --]
[-- Type: application/octet-stream, Size: 4764 bytes --]

From b95e66ae13ac147d00c581a76d3ec59cb4fafb78 Mon Sep 17 00:00:00 2001
From: Yuan Fu <casouri@gmail.com>
Date: Sat, 16 Dec 2023 17:15:04 -0800
Subject: [PATCH] Correctly refontify changed region in tree-sitter modes
 (bug#66732)

We already have treesit--font-lock-notifier that should mark changed
regions to be refontified, but it's called too late in the redsiplay &
fontification pipeline.  Here we add treesit--pre-redisplay that
forces reparse and calls notifier functions in
pre-redisplay-functions, which is early enough for the marking to take
effect.

Similarly, we force reparse in
syntax-propertize-extend-region-functions so syntax-ppss will have the
up-to-date syntax information when it scans the buffer text.  We also
record the lowest start position of the affected regions, and make
sure next syntex-propertize starts from that position.

* lisp/treesit.el (treesit--syntax-propertize-start): New variable.
(treesit--syntax-propertize-notifier):
(treesit--pre-redisplay):
(treesit--pre-syntax-ppss): New functions.
(treesit-major-mode-setup): Add hooks.
---
 lisp/treesit.el | 64 ++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 63 insertions(+), 1 deletion(-)

diff --git a/lisp/treesit.el b/lisp/treesit.el
index 8a07f5023a9..f81ef4c795b 100644
--- a/lisp/treesit.el
+++ b/lisp/treesit.el
@@ -1088,6 +1088,64 @@ treesit--font-lock-notifier
       (with-silent-modifications
         (put-text-property (car range) (cdr range) 'fontified nil)))))
 
+(defvar-local treesit--syntax-propertize-start nil
+  "If non-nil, next `syntax-propertize' should start at this position.
+
+When tree-sitter parser reparses, it calls
+`treesit--syntax-propertize-notifier' with the affected region,
+and that function sets this variable to the start of the affected
+region.")
+
+(defun treesit--syntax-propertize-notifier (ranges parser)
+  "Sets `treesit--syntax-propertize-start' to the smallest start.
+Specifically, the smallest start position among all the ranges in
+RANGES for PARSER."
+  (with-current-buffer (treesit-parser-buffer parser)
+    (when-let* ((range-starts (mapcar #'car ranges))
+                (min-range-start
+                 (seq-reduce
+                  #'min (cdr range-starts) (car range-starts))))
+      (if (null treesit--syntax-propertize-start)
+          (setq treesit--syntax-propertize-start min-range-start)
+        (setq treesit--syntax-propertize-start
+              (min treesit--syntax-propertize-start min-range-start))))))
+
+(defun treesit--pre-redisplay (&rest _)
+  "Force reparse and consequently run all notifiers.
+
+One of the notifiers is `treesit--font-lock-notifier', which will
+mark the region whose syntax has changed to \"need to refontify\".
+
+For example, when the user types the final slash of a C block
+comment /* xxx */, not only do we need to fontify the slash, but
+also the whole block comment, which previously wasn't fontified
+as comment due to incomplete parse tree."
+  ;; `treesit-update-ranges' will force the host language's parser to
+  ;; reparse and set correct ranges for embedded parsers.  Then
+  ;; `treesit-parser-root-node' will force those parsers to reparse.
+  (treesit-update-ranges)
+  ;; Force repase on _all_ the parsers might not be necessary, but
+  ;; this is probably the most robust way.
+  (dolist (parser (treesit-parser-list))
+    (treesit-parser-root-node parser)))
+
+(defun treesit--pre-syntax-ppss (start end)
+  "Force reparse and consequently run all notifiers.
+
+Similar to font-lock, we want to update the `syntax' text
+property before `syntax-ppss' starts working on the text.  We
+also want to extend the to-be-propertized region to include the
+whole region affected by the last reparse.
+
+START and END mark the current to-be-propertized region."
+  (treesit--pre-redisplay)
+  (let ((new-start treesit--syntax-propertize-start))
+    (if (and new-start (< new-start start))
+        (progn
+          (setq treesit--syntax-propertize-start nil)
+          (cons new-start end))
+      nil)))
+
 ;;; Indent
 
 (define-error 'treesit-indent-error
@@ -2392,7 +2450,11 @@ treesit-major-mode-setup
     (treesit-font-lock-recompute-features)
     (dolist (parser (treesit-parser-list))
       (treesit-parser-add-notifier
-       parser #'treesit--font-lock-notifier)))
+       parser #'treesit--font-lock-notifier))
+    (add-hook 'pre-redisplay-functions #'treesit--pre-redisplay 0 t))
+  ;; Syntax
+  (add-hook 'syntax-propertize-extend-region-functions
+            #'treesit--pre-syntax-ppss 0 t)
   ;; Indent.
   (when treesit-simple-indent-rules
     (setq-local treesit-simple-indent-rules
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-23 18:02                                               ` Yuan Fu
@ 2023-12-23 20:46                                                 ` Dmitry Gutov
  2023-12-23 20:51                                                   ` Dmitry Gutov
  0 siblings, 1 reply; 58+ messages in thread
From: Dmitry Gutov @ 2023-12-23 20:46 UTC (permalink / raw)
  To: Yuan Fu, Eli Zaretskii; +Cc: 66732, Stefan Monnier, dominik

On 23/12/2023 20:02, Yuan Fu wrote:
> I have a patch. Dmitry, WDYT?

Looks good! I haven't had a chance to test it yet, but it's probably not 
necessary to do before the checkin.

You can probably remove ruby-ts-mode hooking into the parser as well, in 
the same commit. Meaning, the function ruby-ts--parser-after-change and 
the corresponding treesit-parser-add-notifier call.





^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-23 20:46                                                 ` Dmitry Gutov
@ 2023-12-23 20:51                                                   ` Dmitry Gutov
  2023-12-23 23:07                                                     ` Yuan Fu
  0 siblings, 1 reply; 58+ messages in thread
From: Dmitry Gutov @ 2023-12-23 20:51 UTC (permalink / raw)
  To: Yuan Fu, Eli Zaretskii; +Cc: 66732, Stefan Monnier, dominik

On 23/12/2023 22:46, Dmitry Gutov wrote:
> On 23/12/2023 20:02, Yuan Fu wrote:
>> I have a patch. Dmitry, WDYT?
> 
> Looks good! I haven't had a chance to test it yet, but it's probably not 
> necessary to do before the checkin.

Speaking of pre-redisplay-functions, though, Stefan mentioned that it 
could be beneficial to do a debounce (based on 
buffer-chars-modified-tick?) because they can be called several times 
per user command.

Although if the parser short-circuits when the buffer is unchanged since 
last time, perhaps it's not necessary.





^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-23 10:17                                             ` Eli Zaretskii
  2023-12-23 18:02                                               ` Yuan Fu
@ 2023-12-23 20:55                                               ` Dmitry Gutov
  2023-12-24  6:03                                                 ` Eli Zaretskii
  1 sibling, 1 reply; 58+ messages in thread
From: Dmitry Gutov @ 2023-12-23 20:55 UTC (permalink / raw)
  To: Eli Zaretskii, Stefan Monnier; +Cc: casouri, 66732, dominik

On 23/12/2023 12:17, Eli Zaretskii wrote:
>> From: Stefan Monnier <monnier@iro.umontreal.ca>
>> Cc: dmitry@gutov.dev,  66732@debbugs.gnu.org,  dominik@honnef.co,
>>    casouri@gmail.com
>> Date: Wed, 20 Dec 2023 15:50:21 -0500
>>
>>>> I think Dmitry is talking about the case where the faces were change
>>>> after the text was rendered (but before the end of the redisplay cycle).
>>>>
>>>> I.e. the redisplay starts, renders up to position POS, then calls
>>>> jit-lock because `fonfitied` is nil, and jit-lock modifies faces of text
>>>> *before* POS.
>>>
>>> This can only happen if a function called from jit-lock doesn't comply
>>> to the protocol wrt such changes, right?
>>
>> I'm talking about a problem in the interaction between the redisplay
>> a jit-lock itself.
>>
>> Jit-lock is aware of it and uses `jit-lock-force-redisplay` as needed to
>> account for it.
>>
>>> Because otherwise the modified region would have been re-rendered on
>>> the next redisplay cycle at the latest.
>>
>> Yes, if the jit-lock client (e.g. font-lock) follows the protocol, then
>> jit-lock should take care to re-render the regions when/where needed.
>>
>> *Except* that the implementation of `jit-lock-force-redisplay` is
>> not effective and needs a change like the one suggested by Dmitry.
> 
> So can we install that change and close this bug, then?

That one-liner of mine was for a different feature which we ended up not 
using in the fix for this bug (font-lock-extend-region-functions).66732

I've pushed it to master now. Let me know if it should be backported.





^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-23 20:51                                                   ` Dmitry Gutov
@ 2023-12-23 23:07                                                     ` Yuan Fu
  2023-12-24  2:10                                                       ` Dmitry Gutov
  0 siblings, 1 reply; 58+ messages in thread
From: Yuan Fu @ 2023-12-23 23:07 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Eli Zaretskii, 66732, Stefan Monnier, dominik



> On Dec 23, 2023, at 12:51 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
> 
> On 23/12/2023 22:46, Dmitry Gutov wrote:
>> On 23/12/2023 20:02, Yuan Fu wrote:
>>> I have a patch. Dmitry, WDYT?
>> Looks good! I haven't had a chance to test it yet, but it's probably not necessary to do before the checkin.

I tested with C block comments and python triple quotes and they both work fine.

> 
> Speaking of pre-redisplay-functions, though, Stefan mentioned that it could be beneficial to do a debounce (based on buffer-chars-modified-tick?) because they can be called several times per user command.
> 
> Although if the parser short-circuits when the buffer is unchanged since last time, perhaps it's not necessary.

Cool. I added debounce and removed ruby-ts-mode code, and pushed to emacs-29.

Yuan




^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-23 23:07                                                     ` Yuan Fu
@ 2023-12-24  2:10                                                       ` Dmitry Gutov
  2023-12-24  3:02                                                         ` Yuan Fu
  0 siblings, 1 reply; 58+ messages in thread
From: Dmitry Gutov @ 2023-12-24  2:10 UTC (permalink / raw)
  To: Yuan Fu; +Cc: Eli Zaretskii, 66732, Stefan Monnier, dominik

On 24/12/2023 01:07, Yuan Fu wrote:
> 
> 
>> On Dec 23, 2023, at 12:51 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
>>
>> On 23/12/2023 22:46, Dmitry Gutov wrote:
>>> On 23/12/2023 20:02, Yuan Fu wrote:
>>>> I have a patch. Dmitry, WDYT?
>>> Looks good! I haven't had a chance to test it yet, but it's probably not necessary to do before the checkin.
> 
> I tested with C block comments and python triple quotes and they both work fine.

Thank you, see the follow-up in commit 9cfa498e0ab.

Also, looking at it more, treesit--pre-syntax-ppss should probably set 
treesit--syntax-propertize-start to nil in all cases. This shouldn't 
make much of a practical difference in most cases, though.





^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-24  2:10                                                       ` Dmitry Gutov
@ 2023-12-24  3:02                                                         ` Yuan Fu
  0 siblings, 0 replies; 58+ messages in thread
From: Yuan Fu @ 2023-12-24  3:02 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Eli Zaretskii, 66732, Stefan Monnier, dominik



> On Dec 23, 2023, at 6:10 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
> 
> On 24/12/2023 01:07, Yuan Fu wrote:
>>> On Dec 23, 2023, at 12:51 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
>>> 
>>> On 23/12/2023 22:46, Dmitry Gutov wrote:
>>>> On 23/12/2023 20:02, Yuan Fu wrote:
>>>>> I have a patch. Dmitry, WDYT?
>>>> Looks good! I haven't had a chance to test it yet, but it's probably not necessary to do before the checkin.
>> I tested with C block comments and python triple quotes and they both work fine.
> 
> Thank you, see the follow-up in commit 9cfa498e0ab.

Oops, thanks for the fix.

Yuan




^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-23 20:55                                               ` Dmitry Gutov
@ 2023-12-24  6:03                                                 ` Eli Zaretskii
  2024-02-08  1:40                                                   ` Yuan Fu
  0 siblings, 1 reply; 58+ messages in thread
From: Eli Zaretskii @ 2023-12-24  6:03 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: casouri, 66732, monnier, dominik

> Date: Sat, 23 Dec 2023 22:55:04 +0200
> Cc: 66732@debbugs.gnu.org, dominik@honnef.co, casouri@gmail.com
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> I've pushed it to master now. Let me know if it should be backported.

Thanks, I see no need to backport it for now.





^ permalink raw reply	[flat|nested] 58+ messages in thread

* bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably
  2023-12-24  6:03                                                 ` Eli Zaretskii
@ 2024-02-08  1:40                                                   ` Yuan Fu
  0 siblings, 0 replies; 58+ messages in thread
From: Yuan Fu @ 2024-02-08  1:40 UTC (permalink / raw)
  To: 66732-done


> On Dec 23, 2023, at 10:03 PM, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> Date: Sat, 23 Dec 2023 22:55:04 +0200
>> Cc: 66732@debbugs.gnu.org, dominik@honnef.co, casouri@gmail.com
>> From: Dmitry Gutov <dmitry@gutov.dev>
>> 
>> I've pushed it to master now. Let me know if it should be backported.
> 
> Thanks, I see no need to backport it for now.

Closing this as completed.

Yuan





^ permalink raw reply	[flat|nested] 58+ messages in thread

end of thread, other threads:[~2024-02-08  1:40 UTC | newest]

Thread overview: 58+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-24 14:22 bug#66732: tree-sitter fontification doesn't update multi-line syntax reliably Dominik Honnef
2023-10-24 23:15 ` Dmitry Gutov
2023-10-29 12:22   ` Eli Zaretskii
2023-11-18  8:37     ` Eli Zaretskii
2023-12-11  4:16       ` Yuan Fu
2023-12-11 12:05         ` Eli Zaretskii
2023-12-11 14:35           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-12-11 15:53         ` Dmitry Gutov
2023-12-12  7:50           ` Yuan Fu
2023-12-12 12:43             ` Dmitry Gutov
2023-12-13  3:28               ` Yuan Fu
2023-12-13  3:45                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-12-13  7:12                   ` Yuan Fu
2023-12-13 14:30                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-12-14  1:43                 ` Dmitry Gutov
2023-12-14  8:29                   ` Yuan Fu
2023-12-15  1:01                     ` Dmitry Gutov
2023-12-15  7:12                       ` Yuan Fu
2023-12-16  5:56                         ` Yuan Fu
2023-12-16 15:22                           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-12-16 17:11                             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-12-16 17:23                               ` Dmitry Gutov
2023-12-16 17:43                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-12-16 19:18                                   ` Yuan Fu
2023-12-16 19:57                                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-12-16 23:09                                     ` Dmitry Gutov
2023-12-17  1:16                                       ` Yuan Fu
2023-12-17 18:32                                         ` Dmitry Gutov
2023-12-19  3:12                                           ` Yuan Fu
2023-12-20  1:52                                             ` Dmitry Gutov
2023-12-20  5:43                                               ` Yuan Fu
2023-12-20 11:31                                                 ` Dmitry Gutov
2023-12-16 23:02                                   ` Dmitry Gutov
2023-12-20  2:01                               ` Dmitry Gutov
2023-12-20  3:08                                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-12-16 22:56                           ` Dmitry Gutov
2023-12-18 18:27                             ` Dmitry Gutov
2023-12-18 19:12                               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-12-18 19:33                                 ` Eli Zaretskii
2023-12-18 23:10                                   ` Dmitry Gutov
2023-12-19  3:22                                     ` Eli Zaretskii
2023-12-19  3:40                                       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-12-19 12:41                                         ` Eli Zaretskii
2023-12-19 12:44                                           ` Dmitry Gutov
2023-12-20 20:50                                           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-12-23 10:17                                             ` Eli Zaretskii
2023-12-23 18:02                                               ` Yuan Fu
2023-12-23 20:46                                                 ` Dmitry Gutov
2023-12-23 20:51                                                   ` Dmitry Gutov
2023-12-23 23:07                                                     ` Yuan Fu
2023-12-24  2:10                                                       ` Dmitry Gutov
2023-12-24  3:02                                                         ` Yuan Fu
2023-12-23 20:55                                               ` Dmitry Gutov
2023-12-24  6:03                                                 ` Eli Zaretskii
2024-02-08  1:40                                                   ` Yuan Fu
2023-12-18 23:08                                 ` Dmitry Gutov
2023-12-20 20:50                                   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-12-12 15:34             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors

Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.