From: Dmitry Gutov
Newsgroups: gmane.emacs.devel
Subject: Re: parser error recovery algorithm vs treesit indentation "blinking"
Date: Tue, 4 Apr 2023 16:40:10 +0300
Message-ID: <234fa18d-8b99-65c0-f4d0-161954888831@yandex.ru>
To: John Yates, Stephen Leake
Cc: Alan Mackenzie, Yuan Fu, Eli Zaretskii, theodor thornhill,
 geza.herman@gmail.com, Daniel Colascione, emacs-devel@gnu.org

On 04/04/2023 15:01, John Yates wrote:
> On Mon, Apr 3, 2023 at 5:49 PM Stephen Leake wrote:
>> That's because the tree-sitter
>> algorithm does not insert symbols, it only skips them.
> Is this a fundamental architectural limitation of tree-sitter's parsing
> scheme? Was it a design decision that trying insertions would be
> too costly? Or is it an improvement that should be explored?
>
> Has this been discussed in the wider tree-sitter community? I would
> be surprised if emacs is the first to encounter this weakness.

I think the answer is "it varies". E.g.
this unfinished snippet actually parses without errors:

   int foo() {

The parser even creates a "virtual" closing brace node, 0 characters in
length. But insert this snippet twice (maybe with a different function
name), and you get errors in the parse tree.

So there is some mechanism for virtual insertion.

One could say that our indentation mechanism is (a little) at fault
here, given that its first step is to find the node enclosing the
current position of point. But the "virtual" closer is positioned at
the end of the previous line.

It's a matter of perspective on which side the fault lies: maybe the
virtual closer should be positioned at EOB, and then our code would work
fine with this example. This would make a lot of sense from my POV.

But maybe we should adjust the logic in this particular case: when there
is only whitespace between point and EOB, it could look for nodes at the
end of the last non-whitespace character, and consider that the current
node. This will need a fair amount of testing, I think, to make sure we
don't get false positives this way.

Here's a relevant discussion, but there's nothing about positions in
there: https://github.com/tree-sitter/tree-sitter/issues/224
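
To make the whitespace-only adjustment suggested above concrete, here is a minimal sketch of just the position calculation. It is written in Python so that it stands alone; it is not treesit.el code and makes no tree-sitter calls. The function name lookup_position is hypothetical, and positions are 0-based string offsets (an assumption for the sketch; Emacs buffer positions are 1-based).

```python
def lookup_position(text: str, point: int) -> int:
    """Return the offset at which to look up the enclosing node.

    `text` stands in for the buffer contents and `point` for the
    cursor position, both using 0-based offsets (hypothetical model).
    """
    if text[point:].strip() == "":
        # Only whitespace (or nothing) lies between point and the end
        # of the buffer: fall back to the end of the last
        # non-whitespace character, where the zero-width "virtual"
        # closer is reported to sit.
        return len(text[:point].rstrip())
    # Otherwise keep the lookup at point, as before.
    return point


# Point is on the blank, indented line after the unfinished function.
print(lookup_position("int foo() {\n  ", 14))  # 11
```

With the unfinished snippet, the lookup moves from the blank line back to just after the opening brace, which is where a node lookup would have a chance of finding the virtual closer. Whether that is the right node to treat as current in every case is exactly what the testing mentioned above would have to establish.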