From: Yuan Fu
Newsgroups: gmane.emacs.devel
Subject: Re: Questions about tree-sitter
Date: Tue, 5 Sep 2023 21:07:58 -0700
Message-ID: <2B46C452-DC8B-4BD0-A64B-8773235C1FA8@gmail.com>
In-Reply-To: <52f09345-85c8-4049-b12d-bf8b84b08f75@mailo.com>
References: <12fe5895-7d34-4f3e-b1cf-aa133b718c24@mailo.com> <52f09345-85c8-4049-b12d-bf8b84b08f75@mailo.com>
To: "Augustin Chéneau (BTuin)"
Cc: emacs-devel@gnu.org
List-Id: "Emacs development discussions."
Xref: news.gmane.io gmane.emacs.devel:310170

> On Aug 30, 2023, at 4:28 AM, Augustin Chéneau (BTuin) wrote:
>
> On 30/08/2023 at 09:03, Yuan Fu wrote:
>>> On Aug 29, 2023, at 2:26 PM, Augustin Chéneau (BTuin) wrote:
>>>
>>> Hello,
>>>
>>> I have a few questions about tree-sitter.
>>>
>>> I'm currently developing a grammar for GNU Bison alongside a tree-sitter
>>> major mode, it's a work in progress. The grammar is here:
>>> , still incomplete but so
>>> far able to parse simple files, and the major mode prototype is
>>> attached to this message.
>>>
>>> So, the questions:
>>>
>>> 1. Is there a way to reload a grammar?
>>>
>>> Emacs is pretty nice as a playground for testing grammars, but once a
>>> grammar is loaded, it won't be loaded again until Emacs restarts (as far
>>> as I know).
>>> Is it possible to reload a grammar after modifying it?
>> No, and it’s probably not easy to implement either, since unloading the grammar would require Emacs to purge/invalidate all the nodes/queries/parsers using that grammar.
>>> 2. How to mix multiple languages?
>>>
>>> It would be very useful for Bison since it's mixed with C or other languages.
>>> According to the documentation I need to use the function
>>> `treesit-range-rules` to set the variable `treesit-range-settings`, but
>>> it seems to have no effect. The language in the selected nodes doesn't
>>> change (as attested by `(treesit-language-at (point))`).
>>>
>>> I did it that way (extracted from the attachment):
>>>
>>> (setq-local treesit-range-settings
>>>             (treesit-range-rules
>>>              :embed 'c
>>>              :host 'bison
>>>              '((undelimited_code_block) @capture)))
>>>
>>> Am I missing something?
>> The ranges are set correctly, actually. But the C parser sees all those blocks stitched together as a whole, rather than as individual blocks, and the code it sees is obviously not syntactically correct.
>> We should really work on supporting isolated ranges; there have been multiple requests for it. I’ll try to work on that.
>>> 3. Is it possible to trigger a hook when a node is modified?
>>>
>>> Since Bison supports multiple languages (C, C++, Java and D), I'd like
>>> to watch the declaration "%language LANGUAGE" to change the embedded
>>> language when needed.
>>> Is there a way to do that?
>> treesit-parser-add-notifier might be what you want.
>> Yuan
>
> I see. Thank you for your answers and for your great work on tree-sitter!

I added local parser support to master. If everything goes right, you just need to add a :local t flag in treesit-range-rules.
Check out the modified bison-ts-mode.el that I hacked up for an example. BTW, it’s vital that you define treesit-language-at-point-function for a multi-language mode.

Yuan

[Attachment: bison-ts-mode.el]

;;; bison-ts-mode.el --- Tree-sitter mode for Bison  -*- lexical-binding: t; -*-

;;; Commentary:

;;; Code:

(require 'treesit)
(require 'c-ts-mode)

(declare-function treesit-parser-create "treesit.c")
(declare-function treesit-induce-sparse-tree "treesit.c")
(declare-function treesit-node-child-by-field-name "treesit.c")
(declare-function treesit-search-subtree "treesit.c")
(declare-function treesit-node-parent "treesit.c")
(declare-function treesit-node-next-sibling "treesit.c")
(declare-function treesit-node-type "treesit.c")
(declare-function treesit-node-child "treesit.c")
(declare-function treesit-node-end "treesit.c")
(declare-function treesit-node-start "treesit.c")
(declare-function treesit-node-string "treesit.c")
(declare-function treesit-query-compile "treesit.c")
(declare-function treesit-query-capture "treesit.c")
(declare-function treesit-parser-add-notifier "treesit.c")
(declare-function treesit-parser-buffer "treesit.c")
(declare-function treesit-parser-list "treesit.c")

(defun bison-ts--font-lock-settings (language)
  "Return the font-lock settings for Bison, using grammar LANGUAGE."
  (treesit-font-lock-rules
   :language language
   :feature 'comment
   '((comment) @font-lock-comment-face)

   :language language
   :feature 'declaration
   '((declaration (declaration_name) @font-lock-keyword-face))))

(define-derived-mode bison-ts-mode prog-mode "Bison"
  "A mode for Bison."
  (when (treesit-ready-p 'bison)
    ;; Combine the Bison rules with the C rules for the embedded code.
    (setq-local treesit-font-lock-settings
                (append (bison-ts--font-lock-settings 'bison)
                        (c-ts-mode--font-lock-settings 'c)))
    (setq-local treesit-font-lock-feature-list
                '((comment
                   ;; c-ts-mode
                   definition)
                  (declaration
                   ;; c-ts-mode
                   keyword preprocessor string type)
                  (
                   ;; c-ts-mode
                   assignment constant escape-sequence label literal)))
    ;; :local t gives each captured block its own C parser.
    (setq-local treesit-range-settings
                (treesit-range-rules
                 :embed 'c
                 :host 'bison
                 :local t
                 '((undelimited_code_block) @capture)))
    (treesit-major-mode-setup)))

(provide 'bison-ts-mode)
;;; bison-ts-mode.el ends here
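
The message calls defining treesit-language-at-point-function vital for a multi-language mode, but the attached file does not set it. What follows is only a rough sketch of one possible definition, not the author's code. It assumes that the Bison grammar exposes embedded code as nodes of type undelimited_code_block (the type captured by the range rule in the attachment) and that everything else is Bison. The name bison-ts--language-at-point is hypothetical.

```elisp
;;; -*- lexical-binding: t; -*-
(require 'treesit)

;; Hypothetical helper; it is not part of the attached bison-ts-mode.el.
(defun bison-ts--language-at-point (pos)
  "Return the language at POS: `c' inside embedded code, else `bison'."
  (let ((node (treesit-node-at pos 'bison))
        (in-code nil))
    ;; Walk up from the smallest Bison node at POS, looking for the
    ;; node type that the range rule captures (an assumption about
    ;; the grammar, taken from the attachment).
    (while (and node (not in-code))
      (if (equal (treesit-node-type node) "undelimited_code_block")
          (setq in-code t)
        (setq node (treesit-node-parent node))))
    (if in-code 'c 'bison)))
```

To try it, one could add (setq-local treesit-language-at-point-function #'bison-ts--language-at-point) to the body of bison-ts-mode, before the call to treesit-major-mode-setup. If the grammar later gains other embedded-code node types, or the embedded language follows a "%language" declaration, the type check and the returned symbol would need to change accordingly.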