From: Yuan Fu
Newsgroups: gmane.emacs.bugs
Subject: bug#61369: Problem with keeping tree-sitter parse tree up-to-date
Date: Mon, 13 Feb 2023 01:10:05 -0800
Message-ID: <83A8EC13-E0E3-4280-8377-516BC23A59C5@gmail.com>
To: Dmitry Gutov
Cc: theo@thornhill.no, 61369@debbugs.gnu.org

Dmitry Gutov writes:

> On 10/02/2023 03:22, Yuan Fu wrote:
>>> I just want to confirm that I can reproduce this, and that if you skip
>>> the trailing newline from the use-statement, I don't get this behavior.
>>> So it seems like the newline is the crucial point, right?
>>>
>>> Yes, same.
>>>
>>> The trailing newline is necessary.
>>>
>>> The empty lines at the beginning of the buffer (being copied to) are
>>> necessary to reproduce this as well.
>> Hmmm, might it be related to how tree-sitter does incremental
>> parsing? If the newline is necessary, then I guess it’s not because
>> Emacs missed characters when reporting edits to tree-sitter.
>
> The newline is somewhat necessary: the scenario doesn't work, for
> example, if the pasted text doesn't include the newline but the buffer
> had an additional (third) one at the top.
>
> But the scenario also doesn't work if some other (any) character is
> removed from the yanked line before pasting: it could be even one
> after the comment instruction (//).
>
> OTOH, if I add an extra char to the yanked line, anywhere, I can skip
> the newline. E.g. I can paste
>
> use std::path::{self, Path, PathBuf}; // good: std is a crate namee
>
> without a newline and still see the exact same syntax error.
>
> So it looks more like an off-by-one error somewhere. Maybe in our
> code, but maybe in tree-sitter somewhere.

A progress report: I added a function that reads the buffer the way a
parser would, like this:

DEFUN ("treesit--parser-view",
       Ftreesit__parser_view,
       Streesit__parser_view,
       1, 1, 0,
       doc: /* Return the view of PARSER.
Read buffer like PARSER would into a string and return it.  */)
  (Lisp_Object parser)
{
  const ptrdiff_t visible_beg = XTS_PARSER (parser)->visible_beg;
  const ptrdiff_t visible_end = XTS_PARSER (parser)->visible_end;
  const ptrdiff_t view_len = visible_end - visible_beg;

  char *str_buf = xzalloc (view_len + 1);
  uint32_t read = 0;
  TSPoint pos = { 0 };
  /* Ask the read function for each offset in the visible region.  */
  for (int idx = 0; idx < view_len; idx++)
    {
      const char *ch = treesit_read_buffer (XTS_PARSER (parser),
                                            idx, pos, &read);
      /* A zero-length read inside the view is unexpected: signal the
         offending offset.  */
      if (read == 0)
        {
          xfree (str_buf);
          xsignal1 (Qtreesit_error, make_fixnum (idx));
        }
      else
        str_buf[idx] = *ch;
    }
  Lisp_Object ret_str = make_string (str_buf, view_len);
  xfree (str_buf);
  return ret_str;
}

After I followed the steps and got the error node, I ran this function
on the parser, and the returned string looks good.
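
For anyone who wants to try the same idea outside of Emacs, here is a
minimal sketch in plain C. It is not the treesit.c code: toy_parser,
toy_read and toy_parser_view are hypothetical names made up for this
illustration, and the read function drops the TSPoint argument that the
real reader takes. It only shows the technique, which is to call the
read function once per offset of the visible region and compare the
rebuilt string with what you expect the parser to see.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a parser that sees only part of a buffer.  */
struct toy_parser
{
  const char *text;      /* whole buffer contents */
  uint32_t visible_beg;  /* byte offset where the parser's view starts */
  uint32_t visible_end;  /* byte offset where the parser's view ends */
};

/* Read function: return a pointer to the byte at BYTE_INDEX, counted
   from the start of the visible region, and store in *BYTES_READ how
   many bytes are available there.  Store 0 past the end of the view.  */
const char *
toy_read (void *payload, uint32_t byte_index, uint32_t *bytes_read)
{
  const struct toy_parser *p = payload;
  uint32_t view_len = p->visible_end - p->visible_beg;
  if (byte_index >= view_len)
    {
      *bytes_read = 0;
      return "";
    }
  *bytes_read = 1;
  return p->text + p->visible_beg + byte_index;
}

/* Rebuild the parser's view by calling the read function once per
   byte.  Return a malloc'd NUL-terminated string.  Return NULL on
   allocation failure, or if a read comes back empty before the view
   is exhausted; in the latter case *FAILED_AT holds that offset.  */
char *
toy_parser_view (struct toy_parser *p, uint32_t *failed_at)
{
  assert (p->visible_beg <= p->visible_end);
  uint32_t view_len = p->visible_end - p->visible_beg;
  char *out = calloc ((size_t) view_len + 1, 1);
  if (!out)
    return NULL;
  for (uint32_t idx = 0; idx < view_len; idx++)
    {
      uint32_t nread = 0;
      const char *ch = toy_read (p, idx, &nread);
      if (nread == 0)
        {
          *failed_at = idx;
          free (out);
          return NULL;
        }
      out[idx] = *ch;
    }
  return out;
}
```

With a buffer that starts with two empty lines and a view that begins
right after them, the rebuilt string should be exactly the pasted line
including its trailing newline. A check like this only tells you that
the read function is consistent with the visible region; it says
nothing about which offsets the parser actually asks for.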

Next I’ll try to log every character actually read by the parser and
see if anything seems fishy.

Yuan