From: João Paulo Labegalini de Carvalho
Newsgroups: gmane.emacs.devel
Subject: Re: Call for volunteers: add tree-sitter support to major modes
Date: Fri, 21 Oct 2022 17:45:41 -0600
References: <83sfjtd2bg.fsf@gnu.org> <83o7uhawb9.fsf@gnu.org>
To: Eli Zaretskii
Cc: emacs-devel@gnu.org

With the work I did today I was able to answer my own questions.

I managed to use the `sh-feature' function to obtain the lists of
keywords and builtins for fontification.

I plan to submit a patch for review soon, mainly to get feedback on the
style of the bits I have implemented.

On Fri., Oct. 21, 2022, 10:47 a.m. João Paulo Labegalini de Carvalho,
<jaopaulolc@gmail.com> wrote:

> I have finally got some time to work on this.
>
> My initial understanding of `sh-script-mode' is that it supports many
> shell scripting languages via elegant, "inheritance"-structured code.
>
> As part of the tree-sitter project, I only found a repo to generate the
> parser for bash (https://github.com/tree-sitter/tree-sitter-bash), so it
> is my impression that other shell languages might not be parsed
> correctly by tree-sitter-bash.
>
> My idea for incrementally adding support for shell languages is to
> start with bash and set up three `treesit-font-lock-rules' entries for
> it, like so:
>
> (defvar sh-script--treesit-settings
>   (treesit-font-lock-rules
>    :language 'bash
>    :feature 'basic
>    ;; queries for 'basic feature here
>    :language 'bash
>    :feature 'moderate
>    ;; queries for 'moderate feature here
>    :language 'bash
>    :feature 'full
>    ;; queries for 'full feature here
>    ))
>
> Would that be acceptable? Or should I use a function in the `:language'
> field that returns a shell language symbol?
>
> On Wed, Oct 12, 2022 at 9:36 AM Eli Zaretskii <eliz@gnu.org> wrote:
>
>> > From: João Paulo Labegalini de Carvalho <jaopaulolc@gmail.com>
>> > Date: Wed, 12 Oct 2022 09:09:26 -0600
>> > Cc: emacs-devel@gnu.org
>> >
>> > On Tue, Oct 11, 2022 at 11:43 PM Eli Zaretskii <eliz@gnu.org> wrote:
>> >
>> >  What and how we should handle the C and derived modes is currently
>> >  under discussion.  So if you could start with shell-script-mode,
>> >  that'd be ideal, I think.
>> >
>> > For sure. I will start working on shell-script-mode.
>>
>> Thanks!
>>
>
>
> --
> João Paulo L. de Carvalho
> Ph.D Computer Science | IC-UNICAMP | Campinas, SP - Brazil
> Postdoctoral Research Fellow | University of Alberta | Edmonton, AB -
> Canada
> joao.carvalho@ic.unicamp.br
> joao.carvalho@ualberta.ca
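
As a rough illustration of the two ideas in the message above (looking up builtins with `sh-feature' and feeding them into `treesit-font-lock-rules'), the skeleton could be filled in along these lines. This is only a sketch, not the patch that was submitted: it assumes Emacs 29 or later built with tree-sitter support and an installed tree-sitter-bash grammar, the `my-sh-' names are hypothetical, the node names (comment, string, raw_string, command_name, word) are taken from the tree-sitter-bash grammar as I understand it, and the features are named by what they highlight instead of the basic/moderate/full levels proposed in the message.

```elisp
;; Sketch only.  Assumes Emacs 29+ with tree-sitter support and the
;; tree-sitter-bash grammar.  Everything prefixed `my-sh-' is a
;; hypothetical name and not part of sh-script.el.
(require 'sh-script)
(require 'treesit)

(defun my-sh--bash-builtins ()
  "Return the builtin names that sh-script.el records for bash."
  ;; `sh-feature' indexes its alist by the dynamic value of `sh-shell'
  ;; and follows the shell inheritance chain, so bind `sh-shell' to
  ;; `bash' for the duration of the lookup.
  (let ((sh-shell 'bash))
    (sh-feature sh-builtins)))

(defvar my-sh--treesit-settings
  (treesit-font-lock-rules
   :language 'bash
   :feature 'comment
   '((comment) @font-lock-comment-face)

   :language 'bash
   :feature 'string
   '([(string) (raw_string)] @font-lock-string-face)

   :language 'bash
   :feature 'builtin
   ;; Highlight a command name only when it is one of the builtins
   ;; returned by `my-sh--bash-builtins'.
   `((command_name
      ((word) @font-lock-builtin-face
       (:match ,(rx-to-string
                 `(seq bol (or ,@(my-sh--bash-builtins)) eol))
               @font-lock-builtin-face)))))
  "Illustrative tree-sitter font-lock settings for bash.")
```

In released Emacs 29, a major mode would typically assign such a value to `treesit-font-lock-settings', group the feature names into decoration levels through `treesit-font-lock-feature-list', and then call `treesit-major-mode-setup'. That grouping is probably the closest counterpart to the basic/moderate/full split asked about above.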