* bug#59637: 29.0.50; Should treesit-range-settings support the possibility of separate parser for each region?
@ 2022-11-27 17:12 miha--- via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-11-27 17:28 ` Stefan Kangas
2022-11-28 22:51 ` Yuan Fu
0 siblings, 2 replies; 3+ messages in thread
From: miha--- via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-11-27 17:12 UTC (permalink / raw)
To: 59637
As far as I understand, the current behaviour of
treesit-parser-set-included-ranges is that the concatenation of text
from different regions in the same range set is considered as one
program. This means that for this html program
<html>
<script>
/* comment start
</script>
<script>
alert('hello');
</script>
</html>
treesitter would consider "alert('hello');" to be inside a comment and
the second script tag would contain an error about missing comment
end.
However, testing this in Firefox, it seems that the first script tag is
the erroneous one here and the alert function call isn't inside a
comment. So I guess the correct way to parse this html document would be
to have two instances of javascript parser, one for each region. On the
other hand, we should consider if this is worth the added complexity and
performance degradation.
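To make the difference concrete, here is a minimal Python sketch of the two strategies. It is not tree-sitter or Emacs code: `scan`, `shared_parser` and `parser_per_region` are hypothetical names, and the "parser" is a toy scanner that only tracks `/* ... */` comments. Carrying the scanner state from one region to the next stands in for parsing the concatenated text (it ignores tokens that would straddle a region boundary).

```python
def scan(text, in_comment=False):
    """Toy scanner: return True if TEXT ends inside an open /* comment.

    IN_COMMENT is the state the scanner starts in. String literals and
    // comments are ignored; this is only an illustration.
    """
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if not in_comment and pair == "/*":
            in_comment = True
            i += 2
        elif in_comment and pair == "*/":
            in_comment = False
            i += 2
        else:
            i += 1
    return in_comment


def shared_parser(regions):
    """One parser for all regions: state leaks from one region to the next.

    Returns a (starts_in_comment, ends_in_comment) pair for each region.
    """
    results = []
    state = False
    for text in regions:
        starts_in_comment = state
        state = scan(text, state)
        results.append((starts_in_comment, state))
    return results


def parser_per_region(regions):
    """A fresh parser for each region: every region starts in a clean state."""
    return [(False, scan(text)) for text in regions]


# The contents of the two <script> elements from the example above.
regions = ["/* comment start\n", "alert('hello');\n"]

print(shared_parser(regions))      # second region starts inside the comment
print(parser_per_region(regions))  # only the first region is erroneous
```

With the shared state, the second region is reported as starting inside a comment; with one scanner per region, only the first region ends in an unterminated comment, which matches what Firefox does.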
Thanks and best regards.
* bug#59637: 29.0.50; Should treesit-range-settings support the possibility of separate parser for each region?
2022-11-27 17:12 bug#59637: 29.0.50; Should treesit-range-settings support the possibility of separate parser for each region? miha--- via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-11-27 17:28 ` Stefan Kangas
2022-11-28 22:51 ` Yuan Fu
1 sibling, 0 replies; 3+ messages in thread
From: Stefan Kangas @ 2022-11-27 17:28 UTC (permalink / raw)
To: miha, 59637; +Cc: Yuan Fu
miha--- via "Bug reports for GNU Emacs, the Swiss army knife of text
editors" <bug-gnu-emacs@gnu.org> writes:
> As far as I understand, the current behaviour of
> treesit-parser-set-included-ranges is that the concatenation of text
> from different regions in the same range set is considered as one
> program. This means that for this html program
>
> <html>
> <script>
> /* comment start
> </script>
> <script>
> alert('hello');
> </script>
> </html>
>
> treesitter would consider "alert('hello');" to be inside a comment and
> the second script tag would contain an error about missing comment
> end.
>
> However, testing this in Firefox, it seems that the first script tag is
> the erroneous one here and the alert function call isn't inside a
> comment. So I guess the correct way to parse this html document would be
> to have two instances of javascript parser, one for each region. On the
> other hand, we should consider if this is worth the added complexity and
> performance degradation.
>
> Thanks and best regards.
Copying in Yuan Fu.
* bug#59637: 29.0.50; Should treesit-range-settings support the possibility of separate parser for each region?
2022-11-27 17:12 bug#59637: 29.0.50; Should treesit-range-settings support the possibility of separate parser for each region? miha--- via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-11-27 17:28 ` Stefan Kangas
@ 2022-11-28 22:51 ` Yuan Fu
1 sibling, 0 replies; 3+ messages in thread
From: Yuan Fu @ 2022-11-28 22:51 UTC (permalink / raw)
To: Stefan Kangas; +Cc: 59637, miha
Stefan Kangas <stefankangas@gmail.com> writes:
> miha--- via "Bug reports for GNU Emacs, the Swiss army knife of text
> editors" <bug-gnu-emacs@gnu.org> writes:
>
>> As far as I understand, the current behaviour of
>> treesit-parser-set-included-ranges is that the concatenation of text
>> from different regions in the same range set is considered as one
>> program. This means that for this html program
>>
>> <html>
>> <script>
>> /* comment start
>> </script>
>> <script>
>> alert('hello');
>> </script>
>> </html>
>>
>> treesitter would consider "alert('hello');" to be inside a comment and
>> the second script tag would contain an error about missing comment
>> end.
>>
>> However, testing this in Firefox, it seems that the first script tag is
>> the erroneous one here and the alert function call isn't inside a
>> comment. So I guess the correct way to parse this html document would be
>> to have two instances of javascript parser, one for each region. On the
>> other hand, we should consider if this is worth the added complexity and
>> performance degradation.
>>
>> Thanks and best regards.
Yeah, it makes sense, but as you say, the isolation comes at a cost, and I
don’t know if it can be justified right now because of the complexity of
assigning a different parser to each range, given that ranges can
disappear/appear as the user edits the buffer. Plus, the current framework
kind of assumes one parser per language, so we would need some non-trivial
changes to make "one parser per range" work smoothly.
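As a rough illustration of that bookkeeping, here is a small Python sketch of re-matching parsers to ranges after an edit. Nothing here is existing Emacs or tree-sitter API: `reconcile` and `make_parser` are hypothetical, ranges are plain `(start, end)` tuples, parsers are opaque values, and matching old and new ranges by overlap is just one possible heuristic.

```python
def reconcile(parsers, new_ranges, make_parser):
    """Return a dict mapping each range in NEW_RANGES to a parser.

    PARSERS maps the old (start, end) ranges to their parsers. A new
    range reuses the parser of the first unclaimed old range that it
    overlaps; otherwise MAKE_PARSER is called to create a fresh one.
    Parsers whose ranges disappeared are simply left out of the result.
    """
    unused = dict(parsers)  # copy, so the caller's mapping is not mutated
    result = {}
    for start, end in new_ranges:
        match = next(
            (r for r in unused if r[0] < end and start < r[1]),
            None,
        )
        if match is not None:
            result[(start, end)] = unused.pop(match)
        else:
            result[(start, end)] = make_parser()
    return result


# Stand-in parser factory: hands out increasing integers as parser ids.
ids = iter(range(1, 100))
make = lambda: next(ids)

# Two script regions appear, each gets its own parser.
first = reconcile({}, [(10, 20), (30, 40)], make)

# The user grows the first region, deletes the second, and adds a new one.
second = reconcile(first, [(10, 25), (60, 70)], make)
print(first, second)
```

Even this toy version has to decide which parser survives an edit, which is created, and which is dropped, and a real implementation would have to do that on every buffer change.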
For now, I think it’s best to just turn off error highlighting and rely
on tree-sitter’s error recovery. I think that’s what everybody else
does.
In the future, if we make the framework more flexible and make "one
parser per range" easier to implement, we can try adding support for it.
>
> Copying in Yuan Fu.
Thanks :-)
Yuan