From: Yuan Fu <casouri@gmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: 59574@debbugs.gnu.org
Subject: bug#59574: 29.0.50; Emacs crashes when using tree-sitter-based mode in an empty buffer
Date: Fri, 25 Nov 2022 19:18:09 -0800
Message-ID: <6350D0DE-63CD-410A-AA48-56D924ED67EA@gmail.com>
In-Reply-To: <837czjulc4.fsf@gnu.org>
> On Nov 25, 2022, at 7:04 AM, Eli Zaretskii <eliz@gnu.org> wrote:
>
> To reproduce:
>
> emacs -Q
> C-x C-f foo.c RET
> M-x c-ts-mode RET
> Type "in"
Thanks for finding this out!
>
> Make sure foo.c doesn't exist, so you start from an empty buffer. As soon
> as you type the second character of "in", there's an assertion violation:
>
> treesit.c:1383: Emacs fatal error: assertion failed: end_byte <= BUF_ZV_BYTE (buffer)
>
> Thread 1 hit Breakpoint 1, terminate_due_to_signal (sig=22, backtrace_limit=2147483647) at emacs.c:427
> 427 signal (sig, SIG_DFL);
> (gdb) up
> #1 0x01230802 in die (
> msg=0x18e6778 <DEFAULT_REHASH_SIZE+3288> "end_byte <= BUF_ZV_BYTE (buffer)", file=0x18e5fcc <DEFAULT_REHASH_SIZE+1324> "treesit.c", line=1383)
> at alloc.c:7697
> 7697 terminate_due_to_signal (SIGABRT, INT_MAX);
> (gdb)
> #2 0x01355636 in treesit_make_ranges (ranges=0x856a778, len=1,
> buffer=0x7fe94b0) at treesit.c:1383
> 1383 eassert (end_byte <= BUF_ZV_BYTE (buffer));
> (gdb) p end_byte
> $1 = 4
> (gdb) p BUF_ZV_BYTE(buffer)
> $2 = 3
>
> Interestingly, this only happens once, when the buffer includes exactly 1
> byte and an additional character is inserted. If you get past this
> assertion, further characters can be inserted without any problems, and
> end_byte always equals BUF_ZV_BYTE.
>
> The backtrace is below, if it is interesting.
>
> I couldn't figure out where did tree-sitter take the range it returns to us.
> Yuan, can you describe how does the parser get the range it needs to
> consider? If I put a breakpoint in treesit-parser-set-included-ranges, the
> breakpoint never breaks, so this doesn't seem to be how the range is set in
> this scenario.
After we parse the buffer (in treesit_ensure_parsed), we compute the ranges that have changed since the last parse by calling ts_tree_get_changed_ranges, and pass those ranges to the notifier functions (those added by treesit-parser-add-notifier). These ranges are different from the ranges within which a parser operates. Those are set by treesit-parser-set-included-ranges, and are not involved in the parsing, treesit_record_change, or visible_beg/end machinery.
Both features happen to use treesit_make_ranges as a helper function, but the similarity ends there.
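To illustrate the distinction, here is a minimal toy model in C. It is only a sketch, not the code in treesit.c: every `toy_*` name is hypothetical, and the prefix/suffix comparison is a crude stand-in for what ts_tree_get_changed_ranges computes from two syntax trees. It shows changed ranges being derived after a reparse and handed to notifiers, while the separately stored included range is never touched.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy byte range, loosely analogous to a tree-sitter range.  */
struct toy_range { size_t beg, end; };

typedef void (*toy_notifier) (const struct toy_range *ranges, size_t len);

#define TOY_MAX_NOTIFIERS 4

struct toy_parser
{
  char last_parsed[64];        /* Text seen by the previous parse.  */
  struct toy_range included;   /* Where the parser operates; set separately.  */
  toy_notifier notifiers[TOY_MAX_NOTIFIERS];
  size_t n_notifiers;
};

/* State recorded by the sample notifier below.  */
static struct toy_range toy_last_seen;
static int toy_seen_count;

static void
toy_record_notifier (const struct toy_range *ranges, size_t len)
{
  if (len > 0)
    toy_last_seen = ranges[0];
  toy_seen_count++;
}

static void
toy_add_notifier (struct toy_parser *p, toy_notifier fn)
{
  assert (fn != NULL && p->n_notifiers < TOY_MAX_NOTIFIERS);
  p->notifiers[p->n_notifiers++] = fn;
}

/* Crude stand-in for changed-range detection: the span of CUR between
   the common prefix and the common suffix shared with OLD.  Return 0
   if nothing changed.  */
static int
toy_changed_range (const char *old, const char *cur, struct toy_range *out)
{
  size_t old_len = strlen (old), cur_len = strlen (cur);
  size_t prefix = 0, suffix = 0;
  while (prefix < old_len && prefix < cur_len && old[prefix] == cur[prefix])
    prefix++;
  if (prefix == old_len && prefix == cur_len)
    return 0;
  while (suffix < old_len - prefix && suffix < cur_len - prefix
         && old[old_len - 1 - suffix] == cur[cur_len - 1 - suffix])
    suffix++;
  out->beg = prefix;
  out->end = cur_len - suffix;
  return 1;
}

/* "Reparse" TEXT, then report what changed to every notifier.
   TEXT longer than 63 bytes is truncated in this toy.  */
static void
toy_ensure_parsed (struct toy_parser *p, const char *text)
{
  struct toy_range changed;
  int has_change = toy_changed_range (p->last_parsed, text, &changed);
  strncpy (p->last_parsed, text, sizeof p->last_parsed - 1);
  p->last_parsed[sizeof p->last_parsed - 1] = '\0';
  if (has_change)
    for (size_t i = 0; i < p->n_notifiers; i++)
      p->notifiers[i] (&changed, 1);
}
```

In this model, `included` plays the role of what treesit-parser-set-included-ranges sets, and it is deliberately absent from `toy_ensure_parsed`: reparsing reports changes but does not alter where the parser is allowed to operate.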
> There's also something strange in treesit_record_change: when it is called
> for the first time in a buffer which was empty and you insert one character,
> we bypass the updating of visible_beg and visible_end fields of the Lisp
> parser object, because XTS_PARSER (lisp_parser)->tree is NULL. But it looks
> to me that we should still update these two fields regardless, no? Only the
> call to treesit_tree_edit_1 needs the tree. (I thought that maybe this lack
> of update explains the assertion, but even if I move the condition to guard
> only treesit_tree_edit_1, the assertion still happens, so I guess my
> hypothesis eats dust.)
We don’t need to update visible_beg/end in treesit_record_change if tree is NULL, because visible_beg/end represent the range of the buffer that the tree sees, so if there is no tree, visible_beg/end can be considered uninitialized. However, you are right that visible_beg/end need to be updated, just in treesit_ensure_position_synced (which I renamed to treesit_sync_visible_region): that’s where we ensure visible_beg/end equal BUF_BEGV_BYTE and friends.
The problem is that we don’t update visible_beg/end for the very first parse, when tree is NULL.
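The first-parse problem can be sketched with a small toy model in C. This is an illustration under stated assumptions, not the actual patch: the `vis_*` names and the `sync_on_first_parse` flag are hypothetical, and the model only tracks whether the parser's idea of the visible region matches the buffer. It does not attempt to reproduce the exact end_byte value from the backtrace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the buffer's accessible byte range and for the
   fields of the Lisp parser object discussed above.  */
struct vis_buffer { ptrdiff_t begv_byte, zv_byte; };
struct vis_parser
{
  bool has_tree;                       /* false until the first parse.  */
  ptrdiff_t visible_beg, visible_end;  /* Meaningful only with a tree.  */
};

/* Make the parser's visible region equal the buffer's accessible
   portion.  */
static void
vis_sync_visible_region (struct vis_parser *p, const struct vis_buffer *b)
{
  assert (b->begv_byte <= b->zv_byte);
  p->visible_beg = b->begv_byte;
  p->visible_end = b->zv_byte;
}

/* Toy parse step.  With SYNC_ON_FIRST_PARSE false, the sync is skipped
   while there is no tree yet, which models the behavior described as
   the problem; with it true, the very first parse also syncs.  */
static void
vis_parse (struct vis_parser *p, const struct vis_buffer *b,
           bool sync_on_first_parse)
{
  if (p->has_tree || sync_on_first_parse)
    vis_sync_visible_region (p, b);
  p->has_tree = true;
}

/* True if the parser's visible region agrees with the buffer.  */
static bool
vis_region_matches (const struct vis_parser *p, const struct vis_buffer *b)
{
  return p->visible_beg == b->begv_byte && p->visible_end == b->zv_byte;
}
```

With a one-byte buffer (`begv_byte` 1, `zv_byte` 2), a parser that starts with no tree ends up with a tree but stale `visible_beg`/`visible_end` when the first-parse sync is skipped, and with matching values when it is not. Any later byte position computed from the stale values can then disagree with the buffer.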
I also added some comments, hopefully they sufficiently explain everything.
Yuan