From: Stephen Leake
Newsgroups: gmane.emacs.devel
Subject: Re: [SPAM UNSURE] Re: Tree Sitter (was Re: cc-mode fontification feels random)
Date: Sat, 24 Jul 2021 13:05:01 -0700
Message-ID: <86im0z8olu.fsf@stephe-leake.org>
In-Reply-To: <2e5ead63-624e-57bf-feaa-996f078fc782@dancol.org> (Daniel Colascione's message of "Wed, 21 Jul 2021 18:16:05 -0700")
To: Daniel Colascione
Cc: emacs-devel@gnu.org, Stefan Monnier, "Perry E. Metzger"

Daniel Colascione writes:

> On 7/21/21 12:15 PM, Perry E. Metzger wrote:
>> On 7/21/21 12:21, Daniel Colascione wrote:
>>> On 7/21/21 7:43 AM, Perry E. Metzger wrote:
>>>> Thought I would note that there's a substantial literature now on
>>>> incremental parsing, especially the sort that is needed for editor
>>>> tools. One doesn't need to reinvent the algorithms, they're out
>>>> there waiting to be used. The Tree Sitter project is based on
>>>> previous published work.
>>>
>>> There is indeed a big literature! I wish there were a bigger
>>> literature on *composable* incremental parsers though. IMHO, what
>>> we need is an incremental GLR system (yes, GLR is bad worst-case,
>>> but it's not a practical concern) that spits out a parse *forest*
>>> which we then pare down to a parse tree with ad-hoc syntactic
>>> consistency rules. Something like this naturally supports
>>> multi-language modes and incorporation of out-of-band semantic
>>> information.
>>>
>> Tree sitter handles GLR.
>>
>
> Cool. How does it prune the parse forest?

wisi also uses GLR. It prunes trees during parse when the parse stacks
contained in the trees are identical; it uses error recover cost and
length to decide which tree to delete, or picks one at random.

It's an error if more than one tree is alive at the end of parse. That's
because programming languages must be unambiguous.

It would be possible to adapt the wisi parser to use some other pruning
strategy.

-- 
-- Stephe
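
The pruning rule described above can be sketched roughly as follows. This is only an illustration in Python, not wisi's actual code: the names `ParseTree`, `prune_identical`, `finish` and `AmbiguousParseError` are hypothetical, and both the preference for lower cost and shorter length and the way "length" is measured are assumptions.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class ParseTree:
    """One live GLR parse (hypothetical structure, not wisi's real types)."""
    name: str
    stack: Tuple[int, ...]   # parser states currently on the parse stack
    recover_cost: int = 0    # accumulated error-recovery cost
    length: int = 0          # tie-breaker; what it measures is an assumption


class AmbiguousParseError(Exception):
    """Raised when more than one tree is still alive at the end of parse."""


def prune_identical(trees: List[ParseTree],
                    rng: Optional[random.Random] = None) -> List[ParseTree]:
    """Keep one tree per distinct parse stack.

    Among trees with identical stacks, keep the one with the lowest
    (recover_cost, length); if several tie, pick one at random.
    """
    rng = rng or random.Random()
    groups: Dict[Tuple[int, ...], List[ParseTree]] = {}
    for tree in trees:
        groups.setdefault(tree.stack, []).append(tree)

    survivors = []
    for group in groups.values():
        best_key = min((t.recover_cost, t.length) for t in group)
        best = [t for t in group if (t.recover_cost, t.length) == best_key]
        survivors.append(rng.choice(best))
    return survivors


def finish(trees: List[ParseTree]) -> ParseTree:
    """At end of parse exactly one tree may remain; otherwise it is an error."""
    if len(trees) != 1:
        raise AmbiguousParseError(f"{len(trees)} trees alive at end of parse")
    return trees[0]
```

Swapping in some other pruning strategy would amount to replacing the key used inside `prune_identical`; the end-of-parse check in `finish` would stay the same.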