From: Yuan Fu
Newsgroups: gmane.emacs.devel
Subject: Re: Initial fontification in sh-mode with tree-sittter
Date: Wed, 2 Nov 2022 18:25:13 -0700
Message-ID: <39ECD413-BD10-4BF3-90AC-36F02276607E@gmail.com>
In-Reply-To: <8335b19ndr.fsf@gnu.org>
To: Eli Zaretskii
Cc: João Paulo Labegalini de Carvalho, emacs-devel@gnu.org

> On Nov 2, 2022, at 12:17 PM, Eli Zaretskii wrote:
>
>> From: João Paulo Labegalini de Carvalho
>> Date: Wed, 2 Nov 2022 12:34:56 -0600
>>
>> Yes, the fontification in the buffer is still not updated correctly.
>>
>> With the step below, the definition of function foo is not re-fontified and remains with string-face until C-x x f is
>> executed.
>>
>> emacs -Q from top of the feature/tree-sitter branch
>> M-: (require 'treesit)
>> M-x customize-variable treesit-settings RET
>> Set "Activate" to "Yes" and apply the change.
>> C-x b sample.py RET
>> M-x python-mode
>> Write the following program:
>>
>> def main():
>>     return 0
>>
>> M-<
>> """
>> M->
>> """ (at this point everything is in string-face, as expected)
>> C-a
>> C-k (everything is still fontified as string)
>> C-x x f (leading """ is not fontified and definition of foo is correctly fontified)
>>
>> I am interested in understanding what is causing this as a similar thing happens with heredocs in sh-mode &
>> tree-sitter.
>
> Yuan, can you look into this?  It sounds like some general problem
> with integration of tree-sitter with jit-lock.

Yeah, this is what I’ve been working on for the past two days. I just pushed a change, f331be1f074d68e7e5cdbac324419e07c186492a, which should fix it. [ I just saw that I missed a function in the commit message, I guess I can’t fix it now :-( ]

If you are interested in what’s going on:

The problem is as I described earlier: when the user inserts the starting quote ("""), the parse tree is incomplete and there is no string node. Only when the user inserts the ending quote is there a string node for the fontification rules to capture.

For example, in this snippet there are two regions, A and B:

"""             ---+
def main():        |  Region A
    return 0    ---+
                ---+
"""                |  Region B
                ---+

When the user inserts the """ in B and jit-lock fontifies region B, it captures the string node, and the part of the string in region A needs to be updated. If we fontify the whole string in string face, redisplay does not immediately reflect the change in region A (maybe due to some optimization? jit-lock is definitely aware of this, see jit-lock-force-redisplay).

Redisplay not reflecting the change in face is just the surface problem, and can be fixed by setting fontified to t on region A, which seems to trigger redisplay to reflect changes immediately (this is what jit-lock-force-redisplay does).
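To make the region A / region B situation concrete, here is a minimal sketch in Python. It uses Python's own ast module as a stand-in for tree-sitter, which is an assumption made purely for illustration: tree-sitter would still return a tree with an ERROR node for the incomplete buffer, whereas ast simply refuses to parse. The helper string_spans and the variable names are hypothetical and are not part of Emacs or treesit.

```python
import ast


def string_spans(source):
    """Return (first_line, last_line) for each string literal in SOURCE.

    Returns None when SOURCE does not parse.  This is where the analogy
    is loose: tree-sitter would hand back a tree containing an ERROR
    node instead of failing outright.
    """
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return None
    return [(node.lineno, node.end_lineno)
            for node in ast.walk(tree)
            if isinstance(node, ast.Constant) and isinstance(node.value, str)]


region_a = '"""\ndef main():\n    return 0\n'  # buffer lines 1-3
region_b = '\n"""\n'                            # buffer lines 4-5
region_b_start = 4

# Only the opening quote has been typed: there is no string node yet.
before = string_spans(region_a)

# The closing quote arrives in region B: one string node now exists,
# and it starts back in region A.
after = string_spans(region_a + region_b)

# Lines that belong to the string but lie before the region that was
# just fontified; these are the ones whose faces are now out of date.
first_line, last_line = after[0]
stale_lines = list(range(first_line, region_b_start))

print(before)       # None
print(after)        # [(1, 5)]
print(stale_lines)  # [1, 2, 3]
```

The point of the sketch is only that the node discovered while fontifying region B begins before region B, so whatever was drawn for lines 1 to 3 earlier has to be revisited.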
The deeper problem is that if there is some regex-based font-lock face in region A (applied when Emacs fontified region A), e.g., highlighted TODO keywords in a docstring, it will be overwritten by the string face if we just apply string face to the whole string and trigger redisplay.

Maybe we can apply string face and re-run regex-based font-lock on the whole string, but that works against jit-lock. If the string is long, regexp font-lock might take a long time.

What I ended up doing is setting jit-lock-context-unfontify-pos to the beginning of the string node (aka the beginning of region A). Then, in a timer, jit-lock-context will refontify everything after that position. And I have some measures to break possible infinite recursion (fontify region -> set jit-lock-context-unfontify-pos -> cause refontification -> fontify region -> …).

The alternative, where we put everything in string face when we see an opening """, is not really feasible. It’s kind of feasible in Python, where an opening """ alone looks like (ERROR """), aka the quote node inside an error node. But if you insert /* in JavaScript, you get (ERROR "/") and (ERROR (regexp_pattern)); tree-sitter can’t tell that they are an opening comment (which is fair). Plus, this approach requires non-trivial involvement from each major mode: each major mode needs to somehow tell Emacs what is a “potential opening comment/string”.

> Or maybe something is
> missing from the way tree-sitter nodes are mapped into face text
> properties.  Are the faces actually being put on the relevant text,
> for starters?

They are. Fortunately this is not the problem.

Yuan