bug#61369: Problem with keeping tree-sitter parse tree up-to-date


From: Yuan Fu <casouri@gmail.com>
To: Dmitry Gutov <dgutov@yandex.ru>
Cc: Theodor Thornhill <theo@thornhill.no>, 61369@debbugs.gnu.org
Subject: bug#61369: Problem with keeping tree-sitter parse tree up-to-date
Date: Sat, 18 Feb 2023 02:05:04 -0800
Message-ID: <6FF39BA3-247A-46D9-B3A1-ECFE17A09778@gmail.com>
In-Reply-To: <7ee28606-18cc-ce4f-e601-3954489c4f4c@yandex.ru>



> On Feb 17, 2023, at 5:25 PM, Dmitry Gutov <dgutov@yandex.ru> wrote:
> 
> On 18/02/2023 03:14, Yuan Fu wrote:
>>> On Feb 17, 2023, at 4:11 PM, Dmitry Gutov <dgutov@yandex.ru> wrote:
>>> 
>>> On 18/02/2023 00:32, Yuan Fu wrote:
>>>> Thank you very much! I thought that clipping the change to the fixed visible range and relying on treesit_sync_visible_region to add back the clipped “tail” when we extend the visible range would be equivalent to not clipping, but I guess clipping and re-adding affects how incremental parsing works inside tree-sitter.
>>> 
>>> It seems like the "repairing" sync used a different range, one that didn't include the character number 68 inserted from the beginning.
>>> 
>>> It just synced the 1 or 2 characters at the end of the buffer, the difference between the computed visible_end and the actual BUF_ZV_BYTE.
>> That should be enough, no? The other text didn’t change, it just moved, and tree-sitter should know that it moved. Or maybe I’m misunderstanding what you mean.
> 
> But the "unsynced" character is at position 68.
> 
> And we just tell tree-sitter to update positions 134-136. So it stays ignorant of the changed char in the middle of the buffer.
> 
> It's not just about not knowing about the change either (the character in question is a newline, so its absence wouldn't lead to a syntax error), but about wrong offsets in the old parse tree, based on which the new tree is generated. That probably creates a wrong picture of the source text in the parser.

OK, I made a visualization to understand it, and yes, you are right. I’ll need to modify the comment a bit.

|visible range|

updated range
-------------

|aaaaaa|
|bbbbbbbbaaaa|aa  start: 0, old_end: 0, new_end: 6
 ------          
|bbbbbbbbaaaaaa|  start: 12, old_end: 12, new_end: 14
             --
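
The picture above can be replayed as a small model. This is only a sketch, not Emacs or tree-sitter code: apply_edit is a hypothetical helper, offsets are 0-based, and the assumption (read off the diagram) is that eight bytes were inserted in front of six old ones. Each old byte keeps its original index, and None marks bytes the parser would treat as new text.

```python
def apply_edit(view, start, old_end, new_end):
    """Replace view[start:old_end] with (new_end - start) unknown bytes.

    `view` lists, for every position, which old byte the parser believes
    sits there; None means "new text, must be re-read".
    """
    return view[:start] + [None] * (new_end - start) + view[old_end:]


old = list(range(6))  # the six old bytes "aaaaaa", tagged 0..5

# What really happened: 8 bytes inserted at offset 0.
actual = apply_edit(old, 0, 0, 8)

# What was reported: new_end clipped to 6, then the 2-byte tail
# added back by a later sync as an insertion at offset 12.
reported = apply_edit(old, 0, 0, 6)
reported = apply_edit(reported, 12, 12, 14)

print("actual:  ", actual)
print("reported:", reported)
print("old byte 0 is at", actual.index(0), "but reported at", reported.index(0))
```

Both views have the same length, but in the reported one every old byte sits two positions too early, which matches the "wrong offsets in the old parse tree" concern quoted above.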


> 
>>>> I don’t think this change would have any adverse effect, because if you think about it, inserting text in a narrowed region always extends the region rather than pushing the text at the end out of it. So the right thing to do here is in fact to not clip new_end_offset.
>>> 
>>> I figured it could be a problem if both old_end_byte and new_end_byte extend past the current restriction.
>> That should be fine (i.e., technically correct), since when we widen, the clipped text is reparsed by tree-sitter as new text.
> 
> I guess the effect I was thinking of is that
> 
>  XTS_PARSER (lisp_parser)->visible_end
> 
> would end up with a higher value than BUF_ZV_BYTE. Not sure if it's a problem.

It shouldn’t be, since BUF_ZV_BYTE should automatically grow when the user inserts text. Even if visible_end does end up higher, we always call treesit_sync_visible_region to sync visible_beg/end with BUF_(Z)V_BYTE before parsing, so it shouldn’t be a problem.
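
To make that argument concrete, here is a rough toy model. It is not the code in treesit.c: ParserModel, record_change and sync_visible_end are made-up names, positions are 0-based, and only the end of the visible range is handled. It leaves new_end unclipped when recording a change and then reconciles visible_end with the buffer end before a parse.

```python
from dataclasses import dataclass, field


@dataclass
class ParserModel:
    """Toy stand-in for a parser's view of the buffer (not Emacs code)."""
    visible_beg: int
    visible_end: int
    edits: list = field(default_factory=list)  # (start, old_end, new_end)

    def record_change(self, start, old_end, new_end):
        size = self.visible_end - self.visible_beg
        # start and old_end describe old text, so clip them to the old range.
        start_off = max(0, min(start - self.visible_beg, size))
        old_end_off = max(0, min(old_end - self.visible_beg, size))
        # new_end describes the text after the change: do not clip it to
        # the old size, only keep it at or after the start of the edit.
        new_end_off = max(start_off, new_end - self.visible_beg)
        self.edits.append((start_off, old_end_off, new_end_off))
        self.visible_end += new_end_off - old_end_off

    def sync_visible_end(self, zv):
        size = self.visible_end - self.visible_beg
        new_size = zv - self.visible_beg
        if new_size > size:      # region grew: report the tail as inserted
            self.edits.append((size, size, new_size))
        elif new_size < size:    # region shrank: report the tail as deleted
            self.edits.append((new_size, size, new_size))
        self.visible_end = zv


p = ParserModel(visible_beg=0, visible_end=6)
p.record_change(0, 0, 8)   # insert 8 bytes at the start
p.sync_visible_end(14)     # buffer end is already 14: nothing to add
print(p.edits, p.visible_end)

p.sync_visible_end(10)     # pretend the end of the region moved back to 10
print(p.edits, p.visible_end)
```

In this model the insertion is reported as one edit, and a visible_end that is larger than the buffer end is corrected by the sync step as a deletion of the tail.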

Yuan
