From: Lynn Winebarger
Newsgroups: gmane.emacs.devel
Subject: Re: Questions about tree-sitter
Date: Thu, 7 Sep 2023 20:11:06 -0400
To: Yuan Fu
Cc: Augustin Chéneau (BTuin), emacs-devel@gnu.org

On Thu, Sep 7, 2023 at 7:42 PM Yuan Fu wrote:
>
> On Sep 6, 2023, at 9:11 AM, Lynn Winebarger wrote:
> >
> > On Wed, Aug 30, 2023 at 3:03 AM Yuan Fu wrote:
> >>> Is it possible to reload a grammar after modifying it?
> >>
> >> No, and it’s probably not easy to implement either, since
> >> unloading the grammar would require Emacs to purge/invalidate all
> >> the node/query/parsers using that grammar.
> >
> > [ ... ]
> > Therefore, given functionality to translate elisp data into the raw C
> > structures, we should be able to dynamically create language data
> > structures to pass to the tree-sitter library to create a library.
> > We would also need a table driven lexer framework in place of the
> > generated lexer in the C file to completely avoid going through a C
> > compiler.
> > The other novel features of tree-sitter parsers appear to be
> > implemented in the parser runtime, not in the table calculation.
> >
> > I've implemented LALR(1) parser generators two or three times in the
> > last couple of decades, this might be a fun project for me while I am
> > unambiguously able to contribute to GNU Emacs.
> That’ll be great. But note that the parser structure has escape
> hatches: certain things can be implemented by arbitrary C functions.
> Also tree-sitter allows grammars to use custom scanners [1].
>

My primary interest is in using the tree-sitter parser framework with
the grammars and lexers constructed for Semantic in elisp.  That's the
strongest use-case.  That can be done by a single library implementing
a generic table-driven scanner function.

For other cases, it's a mixed bag.  If only the grammar changes, and
all C code is fixed, then modifications to the grammar could be
reloaded.  If this feature were really important to the user, they
could probably implement the C code to call Elisp functions that could
be updated dynamically, at least during development.

But you are correct that this will not solve the problem for arbitrary
tree-sitter language definitions with embedded C code.  For use in
Emacs, the user might implement any required functions in a dynamic
module that could be loaded and unloaded separately from the
tree-sitter language library.  But that will not happen with the
parser.c produced by the tree-sitter CLI tool.

Lynn
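
To make the "generic table-driven scanner function" idea above concrete, here is a minimal sketch in C.  It is not the tree-sitter lexer interface and not the Semantic lexer format; the ScanTable layout, the token kinds, and the demo_* tables are all hypothetical names invented for illustration.  The point is only that the scanning function is fixed code, while everything language-specific is data that could in principle be built at run time (for example from Elisp) and swapped without recompiling.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds; 0 means "no token recognized". */
enum { TOK_NONE = 0, TOK_IDENT = 1, TOK_NUMBER = 2 };

/* Hypothetical table layout: all language-specific knowledge is data. */
typedef struct {
    int n_classes;                   /* number of character classes */
    const unsigned char *char_class; /* 256 entries: byte -> class */
    const short *next;               /* n_states * n_classes entries;
                                        -1 means no transition */
    const short *accept;             /* n_states entries: token kind
                                        accepted in that state, or 0 */
} ScanTable;

/* Generic longest-match scanner.  Starts in state 0, follows the
   table, and remembers the last accepting position.  Returns the
   token kind and stores the token length in *len. */
int scan(const ScanTable *t, const char *input, size_t n, size_t *len)
{
    int state = 0;
    int kind = TOK_NONE;

    assert(t != NULL && input != NULL && len != NULL);
    *len = 0;
    for (size_t i = 0; i < n; i++) {
        int cls = t->char_class[(unsigned char)input[i]];
        int nxt = t->next[state * t->n_classes + cls];
        if (nxt < 0)
            break;                   /* dead end: stop scanning */
        state = nxt;
        if (t->accept[state] != 0) { /* remember longest match so far */
            kind = t->accept[state];
            *len = i + 1;
        }
    }
    return kind;
}

/* Demo tables: classes are 0 = other, 1 = letter, 2 = digit;
   states are 0 = start, 1 = in identifier, 2 = in number. */
unsigned char demo_classes[256];
const short demo_next[3 * 3] = {
    /* start:  other letter digit */ -1,  1,  2,
    /* ident:  */                    -1,  1,  1,
    /* number: */                    -1, -1,  2,
};
const short demo_accept[3] = { TOK_NONE, TOK_IDENT, TOK_NUMBER };

/* Fill in the character classes and return the demo table. */
ScanTable demo_table(void)
{
    ScanTable t;

    memset(demo_classes, 0, sizeof demo_classes);
    for (int c = 'a'; c <= 'z'; c++) demo_classes[c] = 1;
    for (int c = 'A'; c <= 'Z'; c++) demo_classes[c] = 1;
    for (int c = '0'; c <= '9'; c++) demo_classes[c] = 2;
    t.n_classes = 3;
    t.char_class = demo_classes;
    t.next = demo_next;
    t.accept = demo_accept;
    return t;
}
```

With the demo table, calling scan on "foo42+1" should report an identifier covering the first five characters.  Replacing the three arrays with tables for a different language changes what is recognized without touching scan itself, which is the property that would make reloading possible when only the tables change.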