From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Yuan Fu Newsgroups: gmane.emacs.bugs Subject: bug#60054: 29.0.60; Infinite loop when there are cyclic path in the parse tree Date: Wed, 14 Dec 2022 12:27:58 -0800 Message-ID: References: <0998189C-4A9E-4B27-A8A0-D208D11E9A39@gmail.com> <83cz8mnq6g.fsf@gnu.org> Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3696.120.41.1.1\)) Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="5652"; mail-complaints-to="usenet@ciao.gmane.io" Cc: 60054@debbugs.gnu.org To: Eli Zaretskii Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Wed Dec 14 21:29:24 2022 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1p5YNW-0001Bq-Tv for geb-bug-gnu-emacs@m.gmane-mx.org; Wed, 14 Dec 2022 21:29:23 +0100 Original-Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1p5YNH-00042W-4K; Wed, 14 Dec 2022 15:29:07 -0500 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1p5YNE-00042G-A9 for bug-gnu-emacs@gnu.org; Wed, 14 Dec 2022 15:29:04 -0500 Original-Received: from debbugs.gnu.org ([209.51.188.43]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1p5YND-0001BW-SK for bug-gnu-emacs@gnu.org; Wed, 14 Dec 2022 15:29:03 -0500 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1p5YNC-0001No-Ev for bug-gnu-emacs@gnu.org; Wed, 14 Dec 2022 15:29:02 -0500 X-Loop: help-debbugs@gnu.org Resent-From: Yuan Fu Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Wed, 14 Dec 2022 20:29:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 60054 X-GNU-PR-Package: emacs Original-Received: via spool by 60054-submit@debbugs.gnu.org id=B60054.16710496905309 (code B ref 60054); Wed, 14 Dec 2022 20:29:02 +0000 Original-Received: (at 60054) by debbugs.gnu.org; 14 Dec 2022 20:28:10 +0000 Original-Received: from localhost ([127.0.0.1]:41411 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1p5YML-0001NY-9d for submit@debbugs.gnu.org; Wed, 14 Dec 2022 15:28:10 -0500 Original-Received: from mail-pl1-f177.google.com ([209.85.214.177]:38777) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1p5YMH-0001N7-Mi for 60054@debbugs.gnu.org; Wed, 14 Dec 2022 15:28:07 -0500 Original-Received: by mail-pl1-f177.google.com with SMTP id s7so4593038plk.5 for <60054@debbugs.gnu.org>; Wed, 14 Dec 2022 12:28:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=to:references:message-id:content-transfer-encoding:cc:date :in-reply-to:from:subject:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=TybkDbOsOzP/CZtOip5Rklkkq/aZ8GbRE8aifEmb4qw=; b=Bm+/4KNscVMmaYM3fD9DTovpM+XmqaoTWFnmQTE735AQabMXf54gJc1sIDWQhyUAIL uPPUcGotNplBT/uJzNTQcfkciWeMslTD7f8b4dBTCz4FxSf01Y1ad7pc/1rFenCRYG8a g99J4ORtH7tkc+B0bCtvQp0CROjQLvdNf3ePVvL+Ga6bM3gLMjx/iBH/0M/VDlEyEtEl 42KBuENCt/16STXedeJ9+Farf2jPJrT5ukMB2w6OZ2NwoptmNzVkQeToZmHacS15zSIN WHE1kD3f/Y+boc/fDwvbyvRBIqLUBrnIBNrbBB0Oy6YBjuL7xnYA+wVQghAS1EELU72m UY+Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=to:references:message-id:content-transfer-encoding:cc:date :in-reply-to:from:subject:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=TybkDbOsOzP/CZtOip5Rklkkq/aZ8GbRE8aifEmb4qw=; b=mwse2gmqE53iAmPPM4umaEd8RBjl+o0dMto1FfEVBQdzkoOzrzKhtbFiBf+tkBeM/E Y0Li3Q5anLPErF8/gvylOtTq8dQyf19pobnLHlIG6iHXUbh9v8s9Uc4okSUGUpFSEdwp sZZnHr5LpWhRWZCRSK66fLJ6qBkL5gay9EgUInCDl5hT85uu7WXds4rEbzhvwbSF+Zyk p5wA72+ykrCZsR55N3Y6zVTJevsYBb7pKbowF7VmlrxAc+S35m0Ye02LkctiFB06a+zv EpH5p9aGGPt02yUy1ZpZukUFSve/8S8D36mZiwNHnsN+0h+vDgqB+ApYk22u3Hi9vtp1 r3Eg== X-Gm-Message-State: ANoB5pmemgD6+QychIsu6aRu0nWkBjl+yZBxLwD25Cmn4PdyLwznTNI0 DpDQf9bcOyufFlKLG2IAcY6GdatF7eazjw== X-Google-Smtp-Source: AA0mqf5aXLUd7XlNwtZpFP4WggY8T3SJ9t3f8IFtddZBA0RsghtQQllkroNIA+ubFUVSV+wuKNjUIw== X-Received: by 2002:a17:902:a38e:b0:189:76ef:a1b0 with SMTP id x14-20020a170902a38e00b0018976efa1b0mr26606122pla.57.1671049679753; Wed, 14 Dec 2022 12:27:59 -0800 (PST) Original-Received: from smtpclient.apple (cpe-172-117-161-177.socal.res.rr.com. [172.117.161.177]) by smtp.gmail.com with ESMTPSA id w20-20020a170902ca1400b0018930dbc560sm2280234pld.96.2022.12.14.12.27.59 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 14 Dec 2022 12:27:59 -0800 (PST) In-Reply-To: <83cz8mnq6g.fsf@gnu.org> X-Mailer: Apple Mail (2.3696.120.41.1.1) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Original-Sender: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Xref: news.gmane.io gmane.emacs.bugs:250995 Archived-At: > On Dec 14, 2022, at 4:08 AM, Eli Zaretskii wrote: >=20 >> From: Yuan Fu >> Date: Tue, 13 Dec 2022 16:11:01 -0800 >>=20 >>=20 >> This is not really an Emacs bug, but either tree-siter-c or >> tree-sitter=E2=80=99s. I=E2=80=99m putting it out here so that if = I=E2=80=99m hit by a bus >> tomorrow, and treesit-search-forward-goto and friends hang,=20 >> we (eh, you) know what=E2=80=99s going on. >>=20 >> I=E2=80=99ve submitted an issue here: >> https://github.com/tree-sitter/tree-sitter-c/issues/119 >>=20 >> So far, I=E2=80=99ve only observed this in that specific edge case. >=20 > We should have protection against that, which should be easy, right? Just to make sure, we want to use something like slow-fast pointers, = where we have two pointers, and one goes twice as fast, right? That=E2=80=99= s the one I was taught in school :-) The author advices to use cursors for traversing the tree, as cursors = doesn=E2=80=99t have this bug. He also advised against using = ts_node_parent and said that they could be removed in the future. I = didn=E2=80=99t use cursors in the first place because they can=E2=80=99t = traverse the tree backwards, ie, no equivalent of ts_node_prev_sibling, = and the performance difference is not significant in Emacs settings. But = I just went to look at the source, and it seems = ts_node_prev_siblings(node) is implemented by just iterating each = children from first to last, until it finds the child just before NODE, = LOL[1]. I can do similar things in treesit.c with cursors. By doing that = we can fix this problem and be future-proof. In summary, I=E2=80=99m proposing:=20 1. I add the slow-fast pointer checks in treesit.c and treesit.el 2. I replace ts_node_parent/sibling/child with using cursors in = tree-traversal functions in treesit.c. Yuan [1] static inline TSNode ts_node__prev_sibling(TSNode self, bool = include_anonymous) { Subtree self_subtree =3D ts_node__subtree(self); bool self_is_empty =3D ts_subtree_total_bytes(self_subtree) =3D=3D 0; uint32_t target_end_byte =3D ts_node_end_byte(self); TSNode node =3D ts_node_parent(self); TSNode earlier_node =3D ts_node__null(); bool earlier_node_is_relevant =3D false; while (!ts_node_is_null(node)) { TSNode earlier_child =3D ts_node__null(); bool earlier_child_is_relevant =3D false; bool found_child_containing_target =3D false; TSNode child; NodeChildIterator iterator =3D ts_node_iterate_children(&node); while (ts_node_child_iterator_next(&iterator, &child)) { if (child.id =3D=3D self.id) break; if (iterator.position.bytes > target_end_byte) { found_child_containing_target =3D true; break; } if (iterator.position.bytes =3D=3D target_end_byte && (!self_is_empty || = ts_subtree_has_trailing_empty_descendant(ts_node__subtree(child), = self_subtree))) { found_child_containing_target =3D true; break; } if (ts_node__is_relevant(child, include_anonymous)) { earlier_child =3D child; earlier_child_is_relevant =3D true; } else if (ts_node__relevant_child_count(child, include_anonymous) = > 0) { earlier_child =3D child; earlier_child_is_relevant =3D false; } } if (found_child_containing_target) { if (!ts_node_is_null(earlier_child)) { earlier_node =3D earlier_child; earlier_node_is_relevant =3D earlier_child_is_relevant; } node =3D child; } else if (earlier_child_is_relevant) { return earlier_child; } else if (!ts_node_is_null(earlier_child)) { node =3D earlier_child; } else if (earlier_node_is_relevant) { return earlier_node; } else { node =3D earlier_node; } } return ts_node__null(); }