From: Mickey Petersen
Newsgroups: gmane.emacs.bugs
Subject: bug#60237: 30.0.50; tree sitter core dumps when I edebug view a node
Date: Mon, 27 Feb 2023 14:29:52 +0000
Organization: Mastering Emacs
Message-ID: <871qmbw55n.fsf@masteringemacs.org>
In-reply-to: <8A0520AE-7C8C-43D2-BE93-E80D5CC8856C@gmail.com>
To: Yuan Fu
Cc: Po Lu, Eli Zaretskii, 60237@debbugs.gnu.org

Yuan Fu writes:

>> On Feb 27, 2023, at 12:22 AM, Mickey Petersen wrote:
>>
>> Yuan Fu writes:
>>
>>>> On Feb 26, 2023, at 1:41 AM, Mickey Petersen wrote:
>>>>
>>>> Yuan Fu writes:
>>>>
>>>>>> GC has historically never called xmalloc, so the profiler will
>>>>>> likely crash upon growing the mark stack as well. I guess another
>>>>>> important question is why ts_delete_parser is calling xmalloc.
>>>>>
>>>>>> As you see, when we call ts_tree_delete, it calls
>>>>>> ts_subtree_release, which in turn calls malloc (redirected into
>>>>>> our xmalloc). Is this expected? Can you look in the tree-sitter
>>>>>> sources and verify that this is OK?
>>>>>
>>>>> I had a look, and it seems legit. In tree-sitter, a TSTree (or more
>>>>> precisely, a Subtree) is just some inlined data plus a refcounted
>>>>> pointer to the complete data. This way multiple trees share common
>>>>> subtrees/nodes. Eg, when incrementally parsing, you pass in an old
>>>>> tree and get a new tree, these two trees will share the unchanged part
>>>>> of the tree.
>>>>
>>>> Would that mean we could possibly preserve node instances -- either
>>>> the real TS ones, or an Emacs-created facsimile -- between
>>>> incremental parsing? That would be useful for refactoring.
>>>
>>> What kind of exact interface (function) do you want?
>>> The treesit-node-outdated error is solely Emacs’s product; tree-sitter
>>> itself doesn’t mark a node outdated. It is possible for Emacs to not
>>> delete the old tree and give it to you, or allow you to access
>>> information of an outdated node.
>>
>> OK, so let me explain:
>>
>> Touching the buffer for any reason invalidates the whole tree; that's
>> not good. It's not good, because a lot of the information may still be
>> useful and viable. Outdating the node is not a bad idea as it avoids a
>> lot of 'traps' around accidental modifications that can corrupt things
>> without the developer's knowledge.
>>
>> I'd like to be able to access all the information possible; perhaps
>> behind a flag variable like `treesit-allow-outdated-node-access'. What
>> I'm really mostly interested in is:
>>
>> - How well the node references handle changes in byte positions in TS.
>
> They don’t handle position changes. If the buffer content changed, we
> need to reparse. Once we reparsed the buffer, a new tree is
> born. While it is true that the new tree shares some nodes with the old
> tree, tree-sitter does not expose any function or information that
> tells you which node in the new tree is “the same” as which node in
> the old tree; nor does it tell you whether a node in the old tree
> still “exists” in the new tree.
>
> Now, there does exist a function (in tree-sitter’s API) that allows
> you to “edit” a node with position changes. But a) I’m not sure how
> it handles the case where the node is deleted by the change and b)
> it is not very useful because once you reparse the buffer, the new
> tree is completely independent from the old tree (ignoring the
> implementation detail which is not exposed).
>
>> - Does changing something at X shift (like a `point-marker`) everything
>>   below it?
>>   Does an outdated node correctly reference its new location
>>   and state, such as changes to children or its position in the tree?
>
> Like I said above, any buffer change will create a new tree with no
> relation to the old tree, so there is no shifting.
>
> And there really isn’t a “new location”: we don’t know if the old node
> is still in the new tree. Mind you, even if the node is completely
> outside of the changed region, it can still disappear from the new
> tree because of a change in its surrounding context. For example, in the
> following C code:
>
>     /*
>     int c = 1;
>
> If I insert a closing comment delimiter, the buffer becomes
>
>     /*
>     int c = 1;
>     */
>
> Even though int c = 1; is not in the changed range, nor did its
> position move, all those nodes (int, c, =, etc) are not in the new
> tree anymore, because the whole thing becomes a comment.
>
> I made any access to outdated nodes an error because there really isn’t
> any good reason to use them, at least I didn’t think of any at the
> time. And making them error out should help people catch errors.
>
>> Right now, Combobulate can make a proxy node, which essentially
>> captures the basics of a live node and stores it in a defstruct. That
>> way I can at least retain the start/end, type, text, etc. of a node
>> and still do light refactoring without contorting myself to do things
>> in a particular order, which is not always possible (like delaying
>> editing to the very end.)
>
> IIUC, you want to do some very minor whitespace edit to the buffer
> which doesn’t really change the parse tree, so you don’t want the
> nodes to be invalidated for no good reason? Not erroring on outdated
> nodes is easy. As you said, we can add a
> treesit-inhibit-error-outdated variable. But it’s not so easy to
> automatically update outdated nodes’ positions (with the aforementioned
> tree-sitter function).
> However, if you are making those changes, you
> must know how to adjust your nodes’ positions, right? So maybe it isn’t
> a must-have for your purpose.

It's a good point, but it's also easy to create a scenario where you at
least want to keep the position and esp. the type and text (for
reporting information to the user, or similar.)

My main interest is now refactoring and how to best do it. If TS can do
some of it, then all the better. I realise it was never meant to, but
if we can continue accessing the information contained in a node even
if it is outdated, then that could be useful, however niche. Currently
I use overlays and point markers, but they are not infallible.
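
To make the "proxy node" idea concrete, here is a minimal sketch of the bookkeeping involved. It is written in Python rather than Emacs Lisp purely for illustration, and it is not Combobulate's code nor anything in treesit: the names `ProxyNode` and `adjust_for_edit` are hypothetical, and the node type string is only an example. The snapshot keeps the type, span and text of a node, and the helper shifts the span for an edit that lies wholly outside it, much as a marker would.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ProxyNode:
    """Snapshot of a parse-tree node taken while it was still live."""
    node_type: str
    start: int  # inclusive offset
    end: int    # exclusive offset
    text: str


def adjust_for_edit(node: ProxyNode, edit_start: int,
                    old_end: int, new_end: int) -> Optional[ProxyNode]:
    """Adjust a snapshot for an edit that replaced [edit_start, old_end)
    with new text ending at new_end.

    Returns None when the edit touches the node's own span, because the
    stored type and text can then no longer be trusted.
    """
    delta = new_end - old_end
    if old_end <= node.start:
        # Edit lies entirely before the node: shift both ends.
        return replace(node, start=node.start + delta, end=node.end + delta)
    if edit_start >= node.end:
        # Edit lies entirely after the node: nothing moves.
        return node
    return None


# "int" in the buffer "/*\nint c = 1;\n" occupies offsets 3..6.
int_node = ProxyNode("primitive_type", 3, 6, "int")

# Appending "*/\n" at offset 14 leaves the snapshot's span untouched.
after_append = adjust_for_edit(int_node, 14, 14, 17)
```

Note the limit this shares with markers and overlays: `after_append` still describes `int` at 3..6, yet after the closing delimiter is appended the reparsed tree has no such node, only a comment. Position tracking alone cannot tell you whether the node survived the reparse.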