From: Stephen Leake
Newsgroups: gmane.emacs.devel
Subject: Re: Handling extensions of programming languages
Date: Tue, 30 Mar 2021 11:41:11 -0700
Message-ID: <86ft0c4h88.fsf@stephe-leake.org>
In-Reply-To: <87blbc33tm.fsf@hajtower> ("Harald Jörg"'s message of "Sun, 21 Mar 2021 16:48:53 +0100")
References: <87o8ff560t.fsf@hajtower> <87im5lhi6i.fsf@rfc20.org> <87r1k94cnx.fsf@hajtower> <20a4ef1c-beaf-1d63-b984-12be9a856c86@gmail.com> <87h7l43fa1.fsf@hajtower> <87blbc33tm.fsf@hajtower>
To: haj@posteo.de (Harald Jörg)
Cc: Stefan Monnier, emacs-devel@gnu.org

haj@posteo.de (Harald Jörg) writes:

>> For indentation, it's fundamentally harder (for the same reason that
>> combining two LALR grammars doesn't necessarily give you an LALR
>> grammar), so it will have to be done in a somewhat ad-hoc way.
>
> Indeed. Indentation needs more "context".

The Gnu ELPA package 'wisi' provides a way to declare indentation in the grammar as actions; that provides all the context needed.

The wisi parsers also have excellent error correction, so the grammar actions operate on a complete syntax tree (or fail utterly when the input is really bad).

I have not tried to use wisi for Perl; it works for Ada and Java.

This does not address your issue of extending a language with new syntax; as far as wisi is concerned, that is a new language, and needs an entirely new grammar file. This is true for any LR parser. It may not be true for a packrat parser, although the base parser would have to provide hooks in each nonterminal parsing routine.
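To make the packrat remark concrete, here is a minimal sketch in Python of what "hooks in each nonterminal parsing routine" could look like. It is not wisi or WisiToken code; the `Grammar` class, its `add` and `extended` methods, and the toy `atom`/`expr` language are all hypothetical names invented for this illustration. The idea shown is only that nonterminals are looked up by name at parse time, so an extension can append an alternative without regenerating anything.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

Result = Optional[Tuple[int, int]]          # (value, next position), or None on failure
Apply = Callable[[str, int], Result]        # apply(nonterminal name, position)
Rule = Callable[[Apply, str, int], Result]  # one alternative of a nonterminal

_DIGITS = re.compile(r"\d+")


class Grammar:
    """Hypothetical toy grammar: nonterminals are resolved by name at parse time."""

    def __init__(self) -> None:
        self.rules: Dict[str, List[Rule]] = {}

    def add(self, name: str, rule: Rule) -> None:
        """The hook: append another ordered alternative to nonterminal `name`."""
        self.rules.setdefault(name, []).append(rule)

    def extended(self) -> "Grammar":
        """Copy the rule table so an extension leaves the base grammar untouched."""
        g = Grammar()
        g.rules = {name: list(alts) for name, alts in self.rules.items()}
        return g

    def parse(self, start: str, text: str) -> Optional[int]:
        memo: Dict[Tuple[str, int], Result] = {}

        def apply(name: str, pos: int) -> Result:
            key = (name, pos)
            if key not in memo:      # packrat: each (nonterminal, position) is tried once
                memo[key] = None     # also makes left-recursive re-entry fail, not loop
                for alt in self.rules.get(name, []):
                    result = alt(apply, text, pos)
                    if result is not None:
                        memo[key] = result
                        break
            return memo[key]

        result = apply(start, 0)
        if result is None or result[1] != len(text):
            return None              # no match, or trailing input left over
        return result[0]


def number(apply: Apply, text: str, pos: int) -> Result:
    m = _DIGITS.match(text, pos)
    return (int(m.group()), m.end()) if m else None


def sum_expr(apply: Apply, text: str, pos: int) -> Result:
    first = apply("atom", pos)
    if first is None:
        return None
    value, pos = first
    while text.startswith("+", pos):
        nxt = apply("atom", pos + 1)
        if nxt is None:
            break
        value, pos = value + nxt[0], nxt[1]
    return value, pos


def parenthesized(apply: Apply, text: str, pos: int) -> Result:
    if not text.startswith("(", pos):
        return None
    inner = apply("expr", pos + 1)
    if inner is None or not text.startswith(")", inner[1]):
        return None
    return inner[0], inner[1] + 1


base = Grammar()
base.add("atom", number)
base.add("expr", sum_expr)

ext = base.extended()
ext.add("atom", parenthesized)        # new syntax, added through the "atom" hook

print(base.parse("expr", "1+2"))      # 3
print(base.parse("expr", "1+(2+3)"))  # None: the base language has no parentheses
print(ext.parse("expr", "1+(2+3)"))   # 6
```

The extension reuses `sum_expr` unchanged; only the rule table differs. A generated LR parser has no comparable late-binding point, since its tables are computed from the whole grammar at once.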
In wisi, it might be possible to extend the grammar file syntax with something like:

#base_grammar

but it would still generate separate parsers for the base and extended languages.

As long as the extended language is a superset of the base language, it mostly doesn't hurt to always use the extended language parser. The ada-mode parser implements a language that is an extension of standard Ada 2012; that reduces conflicts and simplifies specifying indentation.

One downside of using an extended parser: it will not report syntax errors for extended syntax in a file that is not supposed to contain any. For ada-mode this is not a significant problem; the extensions allow things that no Ada programmer would write even by mistake, and the real compiler catches them soon enough.

> And as for indentation... I'd say the code in both modes needs to catch
> up with current perl before we consider extensions. Maybe they could
> share functions or regular expressions how to find the beginning of a
> function, or how to identify closing braces which terminate a statement:
> The specification for this logic comes from Perl and should be the same
> for both modes.

The reason I started the wisi package and WisiToken parser generator was to migrate ada-mode away from ad-hoc code to grammar-based code, to support Ada 2012.

To work well, the parser needs to be error correcting. SMIE is inherently more error tolerant than an LR parser without error correction, but I doubt it's good enough for indent.

-- 
-- Stephe