From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yuan Fu
Newsgroups: gmane.emacs.bugs
Subject: bug#61369: Problem with keeping tree-sitter parse tree up-to-date
Date: Sat, 18 Feb 2023 02:05:04 -0800
Message-ID: <6FF39BA3-247A-46D9-B3A1-ECFE17A09778@gmail.com>
References: <1AC63591-F4EF-411F-B554-7CD38B4B4888@gmail.com>
 <9c4e551b-42b3-8202-ccff-fb8170b616a6@yandex.ru>
 <7751EE35-F5FF-418B-AF28-F1FF5ECEF3AE@gmail.com>
 <52d15d7e-82e9-ca7b-be16-0ccf89d5053c@yandex.ru>
 <7ee28606-18cc-ce4f-e601-3954489c4f4c@yandex.ru>
To: Dmitry Gutov
Cc: Theodor Thornhill, 61369@debbugs.gnu.org
In-Reply-To: <7ee28606-18cc-ce4f-e601-3954489c4f4c@yandex.ru>

> On Feb 17, 2023, at 5:25 PM, Dmitry Gutov wrote:
>
> On 18/02/2023 03:14, Yuan Fu wrote:
>>> On Feb 17, 2023, at 4:11 PM, Dmitry Gutov wrote:
>>>
>>> On 18/02/2023 00:32, Yuan Fu wrote:
>>>> Thank you very much! I thought that clipping the change into the
>>>> fixed visible range, and relying on treesit_sync_visible_region to
>>>> add back the clipped “tail” when we extend the visible range, would
>>>> be equivalent to not clipping, but I guess clipping and re-adding
>>>> affects how incremental parsing works inside tree-sitter.
>>>
>>> It seems like the "repairing" sync used a different range, one that
>>> didn't include the character number 68 inserted from the beginning.
>>>
>>> It just synced the 1 or 2 characters at the end of the buffer, the
>>> difference between the computed visible_end and the actual
>>> BUF_ZV_BYTE.
>> That should be enough, no? Because the other text didn’t change, it
>> just moved. And tree-sitter should know that it moved. Or maybe I’m
>> misunderstanding what you mean.
>
> But the "unsynced" character is at position 68.
>
> And we just tell tree-sitter to update positions 134-136. So it stays
> ignorant of the changed char in the middle of the buffer.
>
> It's not just about not knowing about the change either (the character
> in question is a newline, so its absence wouldn't lead to a syntax
> error), but about wrong offsets in the old parse tree, based on which
> the new tree is generated. That probably creates a wrong picture of the
> source text in the parser.

Ok, I made a visualization to understand it, and yeah, you are right.
I’ll need to modify the comment a bit.

|visible range|
 updated range
 -------------

|aaaaaa|
|bbbbbbbbaaaa|aa    start: 0, old_end: 0, new_end: 6
 ------
|bbbbbbbbaaaaaa|    start: 12, old_end: 12, new_end: 14
             --

>
>>>> I don’t think this change would have any adverse effect, because if
>>>> you think of it, inserting text in a narrowed region always extends
>>>> the region, rather than pushing text at the end out of the narrowed
>>>> region. So the right thing to do here is in fact not clipping
>>>> new_end_offset.
>>>
>>> I figured it could be a problem if both old_end_byte and new_end_byte
>>> extend past the current restriction.
>> That should be fine (i.e., technically correct), since when we widen,
>> the clipped text is reparsed by tree-sitter as new text.
>
> I guess the effect I was thinking of is that
>
> XTS_PARSER (lisp_parser)->visible_end
>
> would end up with a higher value than BUF_ZV_BYTE. Not sure if it's a
> problem.

It shouldn’t be, since BUF_ZV_BYTE should automatically grow when the
user inserts text. Even if it does, we always call
treesit_sync_visible_region to sync up visible_beg/end with
BUF_(Z)V_BYTE before parsing, so it shouldn’t be a problem.

Yuan