From: Yuan Fu
Newsgroups: gmane.emacs.devel
Subject: Re: Tree sitter: issue embedding HTML, CSS, JavaScript within a new php-ts-mode
Date: Thu, 9 Feb 2023 21:45:50 -0800
Message-ID: <494661C8-02BC-4A22-9217-B2110A8D0668@gmail.com>
References: <87o7q3ngu8.fsf@polaris64.net>
In-Reply-To: <87o7q3ngu8.fsf@polaris64.net>
To: Simon Pugnet
Cc: Emacs developers

Hey Simon,

Thanks for trying this out! Feedback like this is very welcome.

> On Feb 9, 2023, at 4:45 AM, Simon Pugnet wrote:
>
> Dear Emacs maintainers,
>
> I have recently started work on a PHP tree sitter major mode. Things are going well so far, however I'm having trouble with embedding multiple languages in the PHP buffer.
>
> In case you're not familiar with PHP, here's a quick example (I'm using org-mode mark-up in this message which hopefully will help): -
>
> #+begin_src php
>
> $a = [1, 2, "3", 4.5];
> if (is_array($a)) {
>     echo "$a is an array";
> } else {
>     echo "$a is not an array";
> }
> ?>
>
> This is a test
>
> #+end_src
>
> As you can see, PHP code is usually encapsulated within a HTML document, with PHP code enclosed within ~<?php ... ?>~ blocks.
>
> The first block of HTML from the beginning of the buffer to the first ~<?php~ and before the second ~
>
> #+begin_src emacs-lisp
> (setq-local treesit-range-settings
>             (treesit-range-rules
>              :embed 'html
>              :host 'php
>              '((program (text) @capture)
>                (text_interpolation (text) @capture))))
> #+end_src
>
> This seems to work however when I evaluate ~(treesit-language-at (point))~ anywhere in this buffer I get ='html= in response. This is of course expected within a HTML region, but not within a PHP region. Despite this, the font-locking I have defined for PHP appears to work correctly. I have also defined a custom face and applied it via font-locking to the above two nodes to confirm that those regions are indeed enclosed as expected and they are.
>
> My hope eventually is to use the following ~treesit-range-settings~: -
>
> #+begin_src emacs-lisp
> (setq-local treesit-range-settings
>             (treesit-range-rules
>              :embed 'html
>              :host 'php
>              '((program (text) @capture)
>                (text_interpolation (text) @capture))
>
>              :embed 'css
>              :host 'html
>              '((style_element (raw_text) @capture))
>
>              :embed 'typescript
>              :host 'html
>              '((script_element (raw_text) @capture))))
> #+end_src
>
> As well as defining these rules, I require =css-mode= and =typescript-ts-mode= and append their own font-locking rules to my own. My hope is that this will allow CSS and JavaScript embedded within HTML regions to be font-locked according to those separate major modes too. This appears to work for simple files but does not work reliably for more complex files. Also when using the above I get ='typescript= whenever I evaluate ~(treesit-language-at (point))~.
> I'm not sure if this is just a bug with the language grammars that I'm using or if perhaps because I'm not using the treesit library correctly. Because of the issue with ~treesit-language-at~ above I'm concerned that it's the latter.
>
> So my questions are: -
>
> 1. Based on my rules for embedding ='html= within ='php= above, should I expect ~(treesit-language-at (point))~ to return ='php= when the point is within a PHP region?

Because we don’t have much experience with tree-sitter and its interfaces, I made treesit-language-at simply delegate the work to treesit-language-at-point-function, which can be an arbitrary function, giving developers maximum flexibility. You need to set that variable to a function; otherwise treesit-language-at simply returns the language of the first parser in the parser list, which is likely why you get 'html (or 'typescript with the longer rule set) everywhere in the buffer.

> 2. Is my goal of embedding HTML within PHP, then embedding CSS and JavaScript/TypeScript within HTML feasible and if so am I going about this in the right way?

It should be. Although I hadn’t thought of having multiple layers of embedded languages (in this case PHP embedding HTML embedding CSS/JavaScript), if you order the entries in treesit-range-rules like you do now (outermost host language, then the embedded language, then the language embedded in that), it should work. Try setting treesit-language-at-point-function and it should work right. If not… then we need to look into it.

Yuan
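
P.S. For illustration, here is a rough sketch of what a treesit-language-at-point-function for this setup might look like. It is only one possible approach, not something the treesit library prescribes: it assumes the embedded parsers already have their ranges set by treesit-range-settings, and it uses the language symbols from the rules quoted above. The names my-php--pos-in-ranges-p and my-php--language-at-point are hypothetical and do not exist in Emacs.

```emacs-lisp
;; Sketch only: the `my-php--' names are hypothetical, not part of Emacs.
(require 'treesit)
(require 'seq)

(defun my-php--pos-in-ranges-p (pos ranges)
  "Return non-nil if POS falls inside one of RANGES.
RANGES is a list of (BEG . END) cons cells, the shape returned by
`treesit-parser-included-ranges'.  END is treated as exclusive."
  (seq-some (lambda (range)
              (and (<= (car range) pos)
                   (< pos (cdr range))))
            ranges))

(defun my-php--language-at-point (pos)
  "Return the language symbol at POS.
Check the embedded languages innermost first and fall back to the
host language `php' when no embedded range covers POS."
  (or (seq-some
       (lambda (lang)
         (seq-some
          (lambda (parser)
            (and (eq (treesit-parser-language parser) lang)
                 (my-php--pos-in-ranges-p
                  pos (treesit-parser-included-ranges parser))
                 lang))
          (treesit-parser-list)))
       ;; Innermost languages first, so CSS/TypeScript win over HTML.
       '(css typescript html))
      'php))

;; In the major mode body, next to `treesit-range-settings':
;; (setq-local treesit-language-at-point-function
;;             #'my-php--language-at-point)
```

Checking the innermost languages first matters because a CSS or TypeScript range sits inside an HTML range, so a position inside a style or script element would otherwise be reported as HTML. A parser whose ranges are unset returns nil from treesit-parser-included-ranges, and the sketch then falls through to 'php.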