From: Yuan Fu
Newsgroups: gmane.emacs.devel
Subject: Re: Initial fontification in sh-mode with tree-sittter
Date: Sat, 12 Nov 2022 14:28:52 -0800
To: João Paulo Labegalini de Carvalho
Cc: Eli Zaretskii, emacs-devel@gnu.org

> On Nov 12, 2022, at 2:04 PM, João Paulo Labegalini de Carvalho wrote:
>
> I see. This is tree-sitter-bash’s problem. When there are only
> newlines between two EOF’s, the parser erroneously marks everything
> that follows as heredoc_body. I tried tree-sitter’s online demo and
> it gives the same result[1]. We should report this to
> tree-sitter-bash’s author.
>
> Sorry for the delay. I confirmed the problem was in the
> tree-sitter-bash side and submitted a PR to fix it:
> https://github.com/tree-sitter/tree-sitter-bash/pull/137
> Once my fixes are pulled in, there is no change required to my patch.
>
> Also, when defining sh-mode--treesit-settings, instead of using the
> value sh-shell as the language, it’s better to just use 'bash.
> Here is what happened to me: my default value for sh-shell is fish,
> so sh-mode--treesit-settings was defined with language = fish. When I
> open heredoc-issue.sh, sh-mode parses the shebang and sets sh-shell to
> bash. Since bash does have a parser,
> (treesit-ready-p 'sh-mode sh-shell) returns t, and tree-sitter is
> activated. However when font-lock tries to use the query, it errors
> because query tries to load a parser for fish.
>
> I see. I thought that because sh-mode--treesit-settings is executed
> after the local variable sh-shell is defined, it would always be equal
> to the detected/file shell type. I am still getting my head around
> scope in elisp.

When the defvar is evaluated at load time, the value of sh-shell is the
value set by the user’s configuration, not the detected/file shell
type. Only when the major-mode initialization runs (when we open a
file) does sh-shell’s value become the detected/file shell type.

Because the tree-sitter language definition only works with bash, it
doesn’t make sense to define those queries with anything other than
bash in sh-mode--treesit-settings.

> I did the change and I think it is good to go, unless there is
> anything else to improve for now.
>
> I hope to soon get time to work on imenu, navigation, and indentation
> for sh-mode & bash with tree-sitter.
>
> Please find the corrected patch attached.

Thanks, some comments:

+(defun sh-mode--treesit-fontify-decl-command (node override _start _end)
+  "Fontifies only the name of declaration_command nodes.
+
+This is used instead of `font-lock-builtion-face' directly because
+otherwise the whole command, including the variable assignment part,
+is fontified with with `font-lock-builtin-face'.  An alternative to
+this would be to declaration_command nodes to have a `name:' field."

I guess you meant “...for declaration_command node to have…”?
(Disclaimer: I am not a native speaker.)

+  (let* ((maybe-decl-cmd (treesit-node-parent node))
+         (node-type (treesit-node-type maybe-decl-cmd)))
+    (when (string= node-type "declaration_command")
+      (let* ((name-node (car (treesit-node-children maybe-decl-cmd)))
+             (name-beg (treesit-node-start name-node))
+             (name-end (treesit-node-end name-node)))
+        (put-text-property name-beg
+                           name-end
+                           'face
+                           font-lock-builtin-face)))))
+
+  (cond
+   ;; Tree-sitter
+   ((treesit-ready-p 'sh-mode sh-shell)
+    (setq-local font-lock-keywords-only t)

This line is not necessary anymore due to recent changes.

+    (setq-local treesit-font-lock-feature-list
+                '((comments functions strings heredocs)
+                  (variables keywords commands decl-commands)
+                  (constants operators builtin-variables)))
+    (setq-local treesit-font-lock-settings
+                sh-mode--treesit-settings)
+    (treesit-major-mode-setup))
+   ;; Elisp.
+   (t
+    (setq font-lock-defaults
+          `((sh-font-lock-keywords
+             sh-font-lock-keywords-1 sh-font-lock-keywords-2)
+            nil nil
+            ((?/ . "w") (?~ . "w") (?. . "w") (?- . "w") (?_ . "w")) nil
+            (font-lock-syntactic-face-function
+             . ,#'sh-font-lock-syntactic-face-function))))))

+(defvar sh-mode--treesit-settings
+  (treesit-font-lock-rules
+   :feature 'comments
+   :language sh-shell
+   '((comment) @font-lock-comment-face)
+   :feature 'functions
+   :language sh-shell
+   '((function_definition name: (word) @font-lock-function-name-face))
+   :feature 'strings
+   :language sh-shell
+   '([(string) (raw_string)] @font-lock-string-face)
+   :feature 'heredocs
+   :language sh-shell
+   '([(heredoc_start) (heredoc_body)] @sh-heredoc)
+   :feature 'variables
+   :language sh-shell
+   '((variable_name) @font-lock-variable-name-face)

For the reasons I mentioned earlier, we should use 'bash instead of
sh-shell here.

Once those are changed I think we can push to feature/tree-sitter;
other features/fixes can come later.

Yuan
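
The load-time behaviour described in this message can be reproduced in
plain Emacs Lisp without tree-sitter. The following is only a minimal
sketch: `my-shell', `my-settings' and `my-demo' are hypothetical
stand-ins (for sh-shell, sh-mode--treesit-settings, and major-mode
initialization respectively) and are not names from the patch.

```elisp
;; Hypothetical stand-in for `sh-shell': the global default mimics a
;; user's configuration that prefers fish.
(defvar my-shell 'fish
  "Shell type; buffer-local values may override this default.")

;; Hypothetical stand-in for `sh-mode--treesit-settings'.  The
;; initial-value form runs once, when this defvar is evaluated (that
;; is, at load time), so it captures the global default `fish'.
(defvar my-settings (list :language my-shell)
  "Settings computed once from the load-time value of `my-shell'.")

(defun my-demo ()
  "Return (LANGUAGE-IN-SETTINGS . BUFFER-LOCAL-SHELL) for a temp buffer."
  (with-temp-buffer
    ;; Mimic major-mode initialization detecting bash from a shebang.
    (setq-local my-shell 'bash)
    ;; `my-settings' was computed earlier and does not follow the
    ;; buffer-local change made above.
    (cons (plist-get my-settings :language) my-shell)))

(my-demo) ; => (fish . bash)
```

The mismatch between the two values in the result is the same kind of
mismatch that makes the query look for a fish parser while the buffer
itself is detected as bash; hard-coding the language in the settings
avoids depending on the load-time value at all.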