From: Dmitry Gutov
Newsgroups: gmane.emacs.bugs
Subject: bug#61369: Problem with keeping tree-sitter parse tree up-to-date
Date: Wed, 15 Feb 2023 04:17:29 +0200
Message-ID: <9c4e551b-42b3-8202-ccff-fb8170b616a6@yandex.ru>
References: <1AC63591-F4EF-411F-B554-7CD38B4B4888@gmail.com>
To: Yuan Fu
Cc: theo@thornhill.no, 61369@debbugs.gnu.org

On 14/02/2023 01:59, Yuan Fu wrote:
> There are two surprises here: 1) there isn’t an off-by-one bug, 2) the
> parser actually read the whole buffer, rather than reading only the new
> content. Then there are even less reason for it to create that error
> node.

The parser reads the whole buffer, but if it tries to reparse based on the previous parse tree with incorrect positions, it might get into an invalid state as a result.

I've tried gdb-ing treesit_tree_edit_1 (after dropping the 'inline' qualifier), and here's what I see:

- If I paste the test line without the trailing newline, the value of new_end_byte is 67.

- If I paste the test line with the trailing newline, the value of new_end_byte is still 67. But then it is followed by this call right away:

Thread 1 "emacs" hit Breakpoint 3, treesit_tree_edit_1 (tree=tree@entry=0x5555574139b0, start_byte=start_byte@entry=134, old_end_byte=old_end_byte@entry=134, new_end_byte=135) at treesit.c:739

- If I 'undo' after that, the call is as expected:

Thread 1 "emacs" hit Breakpoint 3, treesit_tree_edit_1 (tree=0x555557435cd0, start_byte=start_byte@entry=0, old_end_byte=old_end_byte@entry=68, new_end_byte=new_end_byte@entry=0) at treesit.c:739
739 {

So I tried again to figure out the odd call, with the backtrace:

Thread 1 "emacs" hit Breakpoint 3, treesit_tree_edit_1 (tree=tree@entry=0x5555575b64f0, start_byte=start_byte@entry=134, old_end_byte=old_end_byte@entry=134, new_end_byte=269) at treesit.c:739
739 {
(gdb) backtrace
#0 treesit_tree_edit_1 (tree=tree@entry=0x5555575b64f0, start_byte=start_byte@entry=134, old_end_byte=old_end_byte@entry=134, new_end_byte=269) at treesit.c:739
#1 0x00005555557cb085 in treesit_sync_visible_region (parser=parser@entry=XIL(0x555556fc329d)) at treesit.c:931
#2 0x00005555557ccf28 in treesit_ensure_parsed (parser=XIL(0x555556fc329d)) at treesit.c:1025
#3 Ftreesit_parser_root_node (parser=XIL(0x555556fc329d)) at treesit.c:1507

treesit.c:931 points to a treesit_tree_edit_1 call which is predicated on this condition:

if (visible_end < BUF_ZV_BYTE (buffer))

...which shouldn't be the case since the buffer is small enough to fit in the default window. It might already be the consequence of passing the wrong value of new_end_byte to ts_tree_edit, though.
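
To make the shape of that odd call easier to see, here is a simplified model of the branch guarded by the condition above. It is a sketch and not the code from treesit.c: the struct, the function name sync_visible_end, and the value visible_beg = 1 are assumptions made for illustration. It only shows how a cached visible_end that is one byte short of the real buffer end would produce an edit with start_byte equal to old_end_byte, like 134/134/135 in the first gdb hit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the three offsets passed to
   treesit_tree_edit_1.  */
struct edit
{
  ptrdiff_t start_byte;
  ptrdiff_t old_end_byte;
  ptrdiff_t new_end_byte;
};

/* Simplified model: if the cached end of the visible region lags
   behind the real end of the buffer (zv_byte), describe the missing
   tail as a pure insertion at the old end and advance the cache.
   Returns 1 and fills *out when an edit is needed, 0 otherwise.  */
static int
sync_visible_end (ptrdiff_t visible_beg, ptrdiff_t *visible_end,
                  ptrdiff_t zv_byte, struct edit *out)
{
  assert (visible_beg <= *visible_end);
  if (*visible_end >= zv_byte)
    return 0;
  out->start_byte = *visible_end - visible_beg;
  out->old_end_byte = *visible_end - visible_beg;
  out->new_end_byte = zv_byte - visible_beg;
  *visible_end = zv_byte;
  return 1;
}
```

Under these assumptions, a cached visible_end of 135 against a real buffer end of 136 yields offsets 134, 134 and 135, and a second call right after reports nothing to do.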

Going back to the first call, the backtrace looks like this:

Thread 1 "emacs" hit Breakpoint 3, treesit_tree_edit_1 (tree=0x5555574f0ff0, start_byte=start_byte@entry=0, old_end_byte=old_end_byte@entry=0, new_end_byte=new_end_byte@entry=67) at treesit.c:739
739 {
(gdb) backtrace
#0 treesit_tree_edit_1 (tree=0x5555574f0ff0, start_byte=start_byte@entry=0, old_end_byte=old_end_byte@entry=0, new_end_byte=new_end_byte@entry=67) at treesit.c:739
#1 0x00005555557cc991 in treesit_record_change (start_byte=1, old_end_byte=1, new_end_byte=69) at treesit.c:806
#2 0x00005555556f8bb7 in insert_from_string_1 (string=XIL(0x55555744c4f4), pos=0, pos_byte=0, nchars=68, nbytes=68, inherit=, before_markers=false) at insdel.c:1084

Seems like treesit_record_change turns new_end_byte=69 into new_end_byte=67 by the time it calls treesit_tree_edit_1.

It seems to fail in this calculation:

ptrdiff_t new_end_offset = (min (visible_end, max (visible_beg, new_end_byte)) - visible_beg);

because visible_end is still 68 there, so the new end gets clamped to it. Its value gets updated later, closer to the end of this function.
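
The arithmetic can be checked in isolation. The snippet below is a minimal sketch, not the treesit.c code itself: the helper names are made up, the lower bound of the clamp is assumed to be visible_beg, and visible_beg = 1 is inferred from start_byte=1 becoming 0 in the backtrace.

```c
#include <assert.h>
#include <stddef.h>

static ptrdiff_t
min_pd (ptrdiff_t a, ptrdiff_t b)
{
  return a < b ? a : b;
}

static ptrdiff_t
max_pd (ptrdiff_t a, ptrdiff_t b)
{
  return a > b ? a : b;
}

/* Clamp the buffer byte position POS into [visible_beg, visible_end]
   and convert it to an offset relative to visible_beg.  This mirrors
   the shape of the new_end_offset calculation quoted above.  */
static ptrdiff_t
clamp_to_visible (ptrdiff_t pos, ptrdiff_t visible_beg,
                  ptrdiff_t visible_end)
{
  assert (visible_beg <= visible_end);
  return min_pd (visible_end, max_pd (visible_beg, pos)) - visible_beg;
}
```

With the stale visible_end of 68, clamp_to_visible (69, 1, 68) gives 67, which matches the value seen in treesit_tree_edit_1. If visible_end had already been moved to 69 before the clamp, the same call would give 68, covering the trailing newline.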