unofficial mirror of bug-gnu-emacs@gnu.org 
From: Yuan Fu <casouri@gmail.com>
To: "Mattias Engdegård" <mattiase@acm.org>
Cc: Po Lu <luangruo@yahoo.com>,
	59426@debbugs.gnu.org, Eli Zaretskii <eliz@gnu.org>,
	Stefan Kangas <stefankangas@gmail.com>
Subject: bug#59426: 29.0.50; [tree-sitter] Some functions exceed maximum recursion limit
Date: Thu, 24 Nov 2022 01:17:02 -0800
Message-ID: <B74EC69F-7ACC-4A1D-A4AA-F724E5C244FA@gmail.com>
In-Reply-To: <6822E77F-3094-4E73-A7E7-EF5C096FC08F@acm.org>



> On Nov 23, 2022, at 12:01 PM, Mattias Engdegård <mattiase@acm.org> wrote:
> 
> 23 nov. 2022 kl. 19.46 skrev Yuan Fu <casouri@gmail.com>:
> 
>> It shouldn’t, but tree-sitter thinks some closing brackets are erroneous and skips them when parsing (it skips erroneous tokens in the hope of parsing the rest of the file despite local errors). So a 10k wide tree becomes 10k tall.
>> 
>> We can submit a bug report to tree-sitter-c (“maybe don’t skip closing brackets even if there is an error, or something”), but that’s another story.
> 
> Thanks for the explanation. In this case it seems that it's the #line directive that throws a spanner in the works. You probably already discovered that, but for the record, here is a cut-down example:
> 
> static hf_register_info hf[] = {
> #line 1 "./asn1/rrc/packet-rrc-hfarr.c"
>    { &hf_rrc_DL_DCCH_Message_PDU,
>      { "DL-DCCH-Message", "rrc.DL_DCCH_Message_element",
>        FT_NONE, BASE_NONE, NULL, 0,
>        NULL, HFILL }},
>    { &hf_rrc_cellIdentity_c_id,
>       {"Cell Identifier", "rrc.cellIdentity.c_id",
>       FT_UINT32, BASE_DEC, NULL, 0,
>       "The Cell Identifier (C-Id) part of the Cell Identity", HFILL }}
>  };
> 
> Note how the warning colour of the curly brackets vanishes once the #line line is removed.
> Even if this snag is corrected, there will always be cases where preprocessor use causes trouble of this or a similar kind. It seems quite convincing that we should avoid C recursion in favour of explicit stacks where possible.
> 

Is it worth the complexity, though? We only need a stack if we want to support this scenario, in which tree-sitter has produced a wrong parse tree anyway. Instead of spending time and resources going down that deep tree, it’s better to fail early and let the user decide whether to give up on weird files or try some other approximation.
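
To illustrate what "fail early" could mean, here is a minimal sketch of a depth-limited recursive traversal. It is not the code in treesit.c: `struct toy_node`, `count_defuns_limited` and the `is_defun` flag are hypothetical names made up for this example, and the real tree-sitter node API looks different.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a parse-tree node (not the tree-sitter API). */
struct toy_node
{
  struct toy_node **children;
  size_t n_children;
  bool is_defun;
};

/* Count defun nodes under N, descending at most LIMIT levels below it.
   Return false as soon as the tree turns out to be deeper than that,
   so the caller can give up instead of exhausting the C stack.
   When false is returned, *COUNT holds only a partial result.  */
bool
count_defuns_limited (const struct toy_node *n, int limit, size_t *count)
{
  if (limit < 0)
    return false;		/* Too deep: fail early.  */
  if (n->is_defun)
    (*count)++;
  for (size_t i = 0; i < n->n_children; i++)
    if (!count_defuns_limited (n->children[i], limit - 1, count))
      return false;
  return true;
}
```

The recursion depth is bounded by the limit, so a 10k-tall tree produced by a bad parse is rejected after a fixed number of frames and the caller decides what to do next.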

It’s too early to tell if being able to go arbitrarily deep into a tree is useful. The only use of traversing the whole tree right now is to generate the imenu index, which doesn’t really need to go down more than 10 levels, since most defun nodes we are interested in are either top-level or near top-level.

So I’d prefer that we keep it simple and use a hard limit for now. If we later find that a stack is preferable, we can always add it.
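
For comparison, the explicit-stack variant might look roughly like the sketch below if it were added later. This is only an illustration under assumptions: `struct toy_node` and `count_defuns_iterative` are hypothetical names, the stack is a plain heap-allocated array, and nothing here reflects how Emacs would actually implement it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a parse-tree node (not the tree-sitter API). */
struct toy_node
{
  struct toy_node **children;
  size_t n_children;
  bool is_defun;
};

/* Count defun nodes in the tree rooted at ROOT without C recursion.
   Pending nodes live in a growable heap array, so the depth of the
   tree is limited by available memory rather than by the C stack.
   Return (size_t) -1 if memory allocation fails.  */
size_t
count_defuns_iterative (const struct toy_node *root)
{
  size_t cap = 16, len = 0, count = 0;
  const struct toy_node **stack = malloc (cap * sizeof *stack);
  if (!stack)
    return (size_t) -1;
  stack[len++] = root;

  while (len > 0)
    {
      const struct toy_node *n = stack[--len];
      if (n->is_defun)
	count++;
      for (size_t i = 0; i < n->n_children; i++)
	{
	  if (len == cap)
	    {
	      /* Double the stack when it is full.  */
	      const struct toy_node **grown
		= realloc (stack, 2 * cap * sizeof *stack);
	      if (!grown)
		{
		  free (stack);
		  return (size_t) -1;
		}
	      stack = grown;
	      cap *= 2;
	    }
	  stack[len++] = n->children[i];
	}
    }

  free (stack);
  return count;
}
```

The extra bookkeeping (growing the array, handling allocation failure) is the complexity being weighed against the hard limit above.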

Yuan



