unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* Tree-sitter integration on feature/tree-sitter (severe performance issues together with linum-mode)
@ 2022-08-15 12:32 Jostein Kjønigsen
  2022-08-15 12:50 ` Dmitry Gutov
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Jostein Kjønigsen @ 2022-08-15 12:32 UTC (permalink / raw)
  To: Yuan Fu, Ergus via Emacs development discussions.,
	Tuấn-Anh Nguyễn, Markus Triska
  Cc: Theodor Thornhill

[-- Attachment #1: Type: text/plain, Size: 5106 bytes --]

Hey everyone.

Sorry for writing one of those long emails.

For those who wants to cut to the brief, the executive summary seems to 
be that using tree-sitter for fontification can give much lower 
performance than expected, and used together with linum-mode for line 
numbering will cause severe performance-degradation.

How bad this issue is ofcourse depends on the major-mode involved, and 
how complex the tree-sitter grammars used by that mode is. These 
findings are found in major-modes not part of core Emacs, but I still 
think it could provide valuable feedback wrt the current state of the 
feature/tree-sitter code and what can be improved.

As a reference I have 2 major-mode demonstrates this:

  * csharp-mode [1]
  * typescript-mode [2]

Both of these major modes are either implemented (or in the process of 
being implemented) for the following scenarios:

  * plain elisp (or cc-mode)
  * using the emacs-tree-sitter package/library written by Tuan-Anh
    Nhuyen[3], compatible with "any" Emacs.
  * using native Emacs tree-sitter based on the git feature/tree-sitter
    branch by Yuan Fu, requires Emacs-build from git.

For csharp-mode we've been successfully been able to pivot from 
elisp/cc-mode to emacs-tree-sitter with great success. The code is 
simpler, and performance is perfectly acceptable, and long-standing bugs 
where fixed in the process.

For typescript-mode we tried to do the same[4], but learnt about Yuan 
Fu's work before completing. The result instead was a new major-mode 
depending on native Emacs tree-sitter support[5]. This has also worked 
out well enough for me to use it as my "daily driver".

Motivated by that success, I've tried to rewrite csharp-mode to also use 
native Emacs tree-sitter support[6]. And while porting the code seems to 
work, performance for this mode has been VERY far from acceptable.

Even on a modern, fast Intel CPU, keystrokes are lagging several seconds 
behind and it's not really usable. You just have to stop typing and wait 
for your input to suddenly appear many, many seconds later.

This is in great contrast to the csharp-mode implementation which uses 
Tuan-Anh's library, and quite opposite of what I would expect. While 
perhaps somewhat naive, I honestly expected "native support" would 
perform better. Could there be optimizations in Tuan-Anh's library we 
need to add treesit.el in Emacs?

Another thing which made me really notice this issue is that by default 
I have linum-mode enabled for all prog-mode buffers.

And linum-mode -easily- reduces input-performance in tree-sitter mode 
buffers by a factor of 4 (this has been measured using profile-start, 
profile-stop and profile-report).

The following profiling-report stems from enabling csharp-mode based on 
native Emacs tree-sitter support, linum-mode and then proceeding to 
writing a long line with random letters (no need to be valid code).

     382,605,711  71% - linum-update-current
     382,605,711  71%  - linum-update
     382,605,711  71%   - mapc
     382,601,487  71%    - linum-update-window
     382,176,351  71%     - window-end
     382,176,351  71%      - jit-lock-function
     382,176,351  71%       - jit-lock-fontify-now
     382,176,351  71%        - jit-lock--run-functions
     382,176,351  71%         - run-hook-wrapped
     382,176,351  71%          - #<compiled -0x156ee8ca7e527443>
     382,176,351  71%           - font-lock-fontify-region
     382,127,055  71%            + treesit-font-lock-fontify-region
          49,296   0%            + font-lock-default-fontify-region
          30,616   0%       linum--face-width
     137,009,221  25% - command-execute

I realize linum-mode has been controversial wrt to performance in the 
past, but this kind of slow-down had me quite surprised. Disabling 
linum-mode makes the major-mode borderline usable, but it's still much 
slower than I know it -can- be (based on Thuan-Anh's library).

Can something be done to Yuan's code to make it perform equally to 
Thuan-Anh's? Are there improvements which can be done to linum-mode to 
avoid these kinds of issues?

I know for sure I'm not qualified to answer those questions, but I think 
it's definitely something which needs to be looked into and if anyone 
has anything they want me to provide feedback on though, I will be more 
than happy test those changes and report back.

[1] https://github.com/emacs-csharp/csharp-mode
[2] https://github.com/emacs-typescript/typescript.el
[3] https://github.com/emacs-tree-sitter/elisp-tree-sitter
[4] 
https://github.com/emacs-typescript/typescript.el/blob/feature/tsx-support/typescript-tree-sitter.el
[5] 
https://git.sr.ht/~theo/tree-sitter-modes/tree/master/item/typescript-mode.el
[6] 
https://git.sr.ht/~jostein/tree-sitter-modes/tree/feature/csharp/item/csharp-mode.el


-- 
Kind regards
*Jostein Kjønigsen*

jostein@kjonigsen.net 🍵 jostein@gmail.com
https://jostein.kjønigsen.no <https://jostein.kjønigsen.no>

[-- Attachment #2: Type: text/html, Size: 7050 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Tree-sitter integration on feature/tree-sitter (severe performance issues together with linum-mode)
  2022-08-15 12:32 Tree-sitter integration on feature/tree-sitter (severe performance issues together with linum-mode) Jostein Kjønigsen
@ 2022-08-15 12:50 ` Dmitry Gutov
  2022-08-15 13:53 ` Stefan Monnier
  2022-08-18  9:44 ` Yuan Fu
  2 siblings, 0 replies; 19+ messages in thread
From: Dmitry Gutov @ 2022-08-15 12:50 UTC (permalink / raw)
  To: jostein, Yuan Fu, Ergus via Emacs development discussions.,
	Tuấn-Anh Nguyễn, Markus Triska
  Cc: Theodor Thornhill

On 15.08.2022 15:32, Jostein Kjønigsen wrote:
> Are there improvements which can be done to linum-mode to avoid these 
> kinds of issues?

You can try nlinum (from ELPA) or the built-in 
display-line-numbers-mode. Either should be faster than the original 
linum-mode.



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Tree-sitter integration on feature/tree-sitter (severe performance issues together with linum-mode)
  2022-08-15 12:32 Tree-sitter integration on feature/tree-sitter (severe performance issues together with linum-mode) Jostein Kjønigsen
  2022-08-15 12:50 ` Dmitry Gutov
@ 2022-08-15 13:53 ` Stefan Monnier
  2022-08-18  7:50   ` Yuan Fu
  2022-08-18  9:44 ` Yuan Fu
  2 siblings, 1 reply; 19+ messages in thread
From: Stefan Monnier @ 2022-08-15 13:53 UTC (permalink / raw)
  To: Jostein Kjønigsen
  Cc: Yuan Fu, Ergus via Emacs development discussions.,
	Tuấn-Anh Nguyễn, Markus Triska, jostein,
	Theodor Thornhill

> Motivated by that success, I've tried to rewrite csharp-mode to also use
> native Emacs tree-sitter support[6].  And while porting the code seems to
> work, performance for this mode has been VERY far from acceptable.

I'd report this as a (performance) bug in the native Emacs tree-sitter
code (maybe the actual Emacs<->tree-sitter interface but more likely the
font-lock code in treesit.el).


        Stefan




^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Tree-sitter integration on feature/tree-sitter (severe performance issues together with linum-mode)
  2022-08-15 13:53 ` Stefan Monnier
@ 2022-08-18  7:50   ` Yuan Fu
  0 siblings, 0 replies; 19+ messages in thread
From: Yuan Fu @ 2022-08-18  7:50 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: Jostein Kjønigsen, Ergus via Emacs development discussions.,
	Tuấn-Anh Nguyễn, Markus Triska, jostein,
	Theodor Thornhill



> but more likely the
> font-lock code in treesit.el).
> 

That would be my guess too. Maybe something in the highlighting patterns caused the slowness in csharp-mode that are not in typescript-mode. Let me see what’s going on.

Yuan


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Tree-sitter integration on feature/tree-sitter (severe performance issues together with linum-mode)
  2022-08-15 12:32 Tree-sitter integration on feature/tree-sitter (severe performance issues together with linum-mode) Jostein Kjønigsen
  2022-08-15 12:50 ` Dmitry Gutov
  2022-08-15 13:53 ` Stefan Monnier
@ 2022-08-18  9:44 ` Yuan Fu
  2022-08-19  6:01   ` Jostein Kjønigsen
  2 siblings, 1 reply; 19+ messages in thread
From: Yuan Fu @ 2022-08-18  9:44 UTC (permalink / raw)
  To: jostein
  Cc: Ergus via Emacs development discussions.,
	Tuấn-Anh Nguyễn, Markus Triska, Theodor Thornhill

> 
> Can something be done to Yuan's code to make it perform equally to Thuan-Anh's? Are there improvements which can be done to linum-mode to avoid these kinds of issues?
> 
> I know for sure I'm not qualified to answer those questions, but I think it's definitely something which needs to be looked into and if anyone has anything they want me to provide feedback on though, I will be more than happy test those changes and report back.

Good news, the slowness can be easily resolved by compiling the query pattern in csharp-mode-font-lock-settings-1 (this is a recent addition to treesit). I guess typescript-mode’s query pattern is short enough to not bog down font-locking, but C#’s relatively large set of patterns makes query captures very slow.

I changed

(setq csharp-mode-font-lock-settings-1
      `((c-sharp
         (
           ;; Various constructs
           (comment)  @font-lock-comment-face
           (modifier) @font-lock-keyword-face
           (this_expression) @font-lock-keyword-face
           (escape_sequence) @font-lock-keyword-face
           …

to

(setq csharp-mode-font-lock-settings-1
      `((c-sharp
         ,(treesit-query-compile
           'c-sharp
           '(
             ;; Various constructs
             (comment)  @font-lock-comment-face
             (modifier) @font-lock-keyword-face
             (this_expression) @font-lock-keyword-face
             (escape_sequence) @font-lock-keyword-face
             …

And now csharp-mode is pretty fast!

Yuan




^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Tree-sitter integration on feature/tree-sitter (severe performance issues together with linum-mode)
  2022-08-18  9:44 ` Yuan Fu
@ 2022-08-19  6:01   ` Jostein Kjønigsen
  2022-08-19 21:58     ` Yuan Fu
  0 siblings, 1 reply; 19+ messages in thread
From: Jostein Kjønigsen @ 2022-08-19  6:01 UTC (permalink / raw)
  To: Yuan Fu, jostein
  Cc: Ergus via Emacs development discussions.,
	Tuấn-Anh Nguyễn, Markus Triska, Theodor Thornhill,
	dgutov

[-- Attachment #1: Type: text/plain, Size: 859 bytes --]

On 18.08.2022 11:44, Yuan Fu wrote:
>
> Good news, the slowness can be easily resolved by compiling the query pattern in csharp-mode-font-lock-settings-1 (this is a recent addition to treesit).
>
> Yuan

Thanks for the reply and thanks for looking into this.

I can confirm that by compiling the query like you suggested, and 
replacing linum-mode with nlinum-mode, I'm not experiencing any 
performance issues any more!

To avoid issues like this... Should perhaps the function 
treesit-query-capture (in treesit.c) emit a warning/message when 
encountering non-compiled queries?

That way writing more performant major-modes would be more 
self-explanatory, resulting in a better, faster Emacs for everyone.


-- 
Vennlig hilsen
*Jostein Kjønigsen*

jostein@kjonigsen.net 🍵 jostein@gmail.com
https://jostein.kjønigsen.no <https://jostein.kjønigsen.no>

[-- Attachment #2: Type: text/html, Size: 1581 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Tree-sitter integration on feature/tree-sitter (severe performance issues together with linum-mode)
  2022-08-19  6:01   ` Jostein Kjønigsen
@ 2022-08-19 21:58     ` Yuan Fu
  2022-08-20 14:14       ` Stefan Monnier
  0 siblings, 1 reply; 19+ messages in thread
From: Yuan Fu @ 2022-08-19 21:58 UTC (permalink / raw)
  To: jostein
  Cc: Ergus via Emacs development discussions.,
	Tuấn-Anh Nguyễn, Markus Triska, Theodor Thornhill,
	dgutov



> On Aug 18, 2022, at 11:01 PM, Jostein Kjønigsen <jostein@secure.kjonigsen.net> wrote:
> 
> On 18.08.2022 11:44, Yuan Fu wrote:
>> 
>> Good news, the slowness can be easily resolved by compiling the query pattern in csharp-mode-font-lock-settings-1 (this is a recent addition to treesit).
>> 
>> Yuan
>> 
> Thanks for the reply and thanks for looking into this.
> 
> I can confirm that by compiling the query like you suggested, and replacing linum-mode with nlinum-mode, I'm not experiencing any performance issues any more!
> 
> To avoid issues like this... Should perhaps the function treesit-query-capture (in treesit.c) emit a warning/message when encountering non-compiled queries?
> 
> That way writing more performant major-modes would be more self-explanatory, resulting in a better, faster Emacs for everyone.

Warning/message seems a bit drastic. There are valid use-cases where one want to use an uncompiled query. For now I have words in the docstring that advices using compiled queries (albeit not in all caps :-)

Yuan


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Tree-sitter integration on feature/tree-sitter (severe performance issues together with linum-mode)
  2022-08-19 21:58     ` Yuan Fu
@ 2022-08-20 14:14       ` Stefan Monnier
  2022-08-23  1:53         ` Fu Yuan
  2022-09-06  2:53         ` Clément Pit-Claudel
  0 siblings, 2 replies; 19+ messages in thread
From: Stefan Monnier @ 2022-08-20 14:14 UTC (permalink / raw)
  To: Yuan Fu
  Cc: jostein, Ergus via Emacs development discussions.,
	Tuấn-Anh Nguyễn, Markus Triska, Theodor Thornhill,
	dgutov

Yuan Fu [2022-08-19 14:58:49] wrote:

>> On Aug 18, 2022, at 11:01 PM, Jostein Kjønigsen <jostein@secure.kjonigsen.net> wrote:
>> 
>> On 18.08.2022 11:44, Yuan Fu wrote:
>>> 
>>> Good news, the slowness can be easily resolved by compiling the query
>>> pattern in csharp-mode-font-lock-settings-1 (this is a recent addition to
>>> treesit).
>>> 
>>> Yuan
>>> 
>> Thanks for the reply and thanks for looking into this.
>> 
>> I can confirm that by compiling the query like you suggested, and
>> replacing linum-mode with nlinum-mode, I'm not experiencing any
>> performance issues any more!
>> 
>> To avoid issues like this... Should perhaps the function
>> treesit-query-capture (in treesit.c) emit a warning/message when
>> encountering non-compiled queries?
>> 
>> That way writing more performant major-modes would be more
>> self-explanatory, resulting in a better, faster Emacs for everyone.
>
> Warning/message seems a bit drastic. There are valid use-cases where one
> want to use an uncompiled query. For now I have words in the docstring that
> advices using compiled queries (albeit not in all caps :-)

FWIW, I think specifying the highlighting rules with something akin to:

    (defvar <foo> '<rules>)

is a mistake.  It should go through some kind of macro, such as (maybe):

    (defvar <foo> (tree-sitter-rules <rules>))

which can thus do any preprocessing we may want, such as pre-compiling
queries.  It also helps evolve the syntax since we can more easily warn
about obsolete uses, etc...

I've had a "rewrite font-lock.el so the rules go through a macro" in my
todo list for ages.


        Stefan




^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Tree-sitter integration on feature/tree-sitter (severe performance issues together with linum-mode)
  2022-08-20 14:14       ` Stefan Monnier
@ 2022-08-23  1:53         ` Fu Yuan
  2022-08-23 14:30           ` Stefan Monnier
  2022-09-06  2:53         ` Clément Pit-Claudel
  1 sibling, 1 reply; 19+ messages in thread
From: Fu Yuan @ 2022-08-23  1:53 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: jostein, Ergus via Emacs development discussions.,
	Tuấn-Anh Nguyễn, Markus Triska, Theodor Thornhill,
	dgutov


> 
> Yuan Fu [2022-08-19 14:58:49] wrote:
> 
>>>> On Aug 18, 2022, at 11:01 PM, Jostein Kjønigsen <jostein@secure.kjonigsen.net> wrote:
>>> 
>>> On 18.08.2022 11:44, Yuan Fu wrote:
>>>> 
>>>> Good news, the slowness can be easily resolved by compiling the query
>>>> pattern in csharp-mode-font-lock-settings-1 (this is a recent addition to
>>>> treesit).
>>>> 
>>>> Yuan
>>>> 
>>> Thanks for the reply and thanks for looking into this.
>>> 
>>> I can confirm that by compiling the query like you suggested, and
>>> replacing linum-mode with nlinum-mode, I'm not experiencing any
>>> performance issues any more!
>>> 
>>> To avoid issues like this... Should perhaps the function
>>> treesit-query-capture (in treesit.c) emit a warning/message when
>>> encountering non-compiled queries?
>>> 
>>> That way writing more performant major-modes would be more
>>> self-explanatory, resulting in a better, faster Emacs for everyone.
>> 
>> Warning/message seems a bit drastic. There are valid use-cases where one
>> want to use an uncompiled query. For now I have words in the docstring that
>> advices using compiled queries (albeit not in all caps :-)
> 
> FWIW, I think specifying the highlighting rules with something akin to:
> 
>    (defvar <foo> '<rules>)
> 
> is a mistake.  It should go through some kind of macro, such as (maybe):
> 
>    (defvar <foo> (tree-sitter-rules <rules>))
> 
> which can thus do any preprocessing we may want, such as pre-compiling
> queries.  It also helps evolve the syntax since we can more easily warn
> about obsolete uses, etc...
> 
> I've had a "rewrite font-lock.el so the rules go through a macro" in my
> todo list for ages.

Yeah! I can do that. Do you have some ideas on the syntax you would use for font-lock? If possible I’d like treesit stuff to be similar to font-lock. For example I made treesit to use font-lock’s decoration level system.

Yuan


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Tree-sitter integration on feature/tree-sitter (severe performance issues together with linum-mode)
  2022-08-23  1:53         ` Fu Yuan
@ 2022-08-23 14:30           ` Stefan Monnier
  2022-08-24  1:18             ` John Yates
  0 siblings, 1 reply; 19+ messages in thread
From: Stefan Monnier @ 2022-08-23 14:30 UTC (permalink / raw)
  To: Fu Yuan
  Cc: jostein, Ergus via Emacs development discussions.,
	Tuấn-Anh Nguyễn, Markus Triska, Theodor Thornhill,
	dgutov

> Yeah! I can do that. Do you have some ideas on the syntax you would use for
> font-lock?

Not really, which is why it hasn't happened yet I guess :-)

`syntax-propertize-rules` could be an inspiration (tho, compared to
font-lock, this one had the added constraint that it had to be applied
in a single scan of an N-element regexp whereas font-lock can use
N scans of simpler regexps).

But maybe it's better if you try to come up with your own plan without
suffering from my biases.  Then I can look at it and complain :-)
The result will probably be better.

> If possible I’d like treesit stuff to be similar to font-lock.

Since font-lock's hasn't happened yet, this should presumably happen by
font-lock mimicking treesit rather than the reverse.

> For example I made treesit to use font-lock’s decoration level system.

That is the kind of bias that would have better been avoided (I think
a better approach is to let the major mode specify features that can be
included/excluded: practice seems to have shown that a "level" is too
meaningless and coarse).


        Stefan




^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Tree-sitter integration on feature/tree-sitter (severe performance issues together with linum-mode)
  2022-08-23 14:30           ` Stefan Monnier
@ 2022-08-24  1:18             ` John Yates
  0 siblings, 0 replies; 19+ messages in thread
From: John Yates @ 2022-08-24  1:18 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: Fu Yuan, jostein, Ergus via Emacs development discussions.,
	Tuấn-Anh Nguyễn, Markus Triska, Theodor Thornhill,
	dgutov

On Tue, Aug 23, 2022 at 10:34 AM Stefan Monnier
<monnier@iro.umontreal.ca> wrote:
>
> practice seems to have shown that a "level" is too
> meaningless and coarse).

My favorite example is the *nix "run levels".  "Level" seems
to suggest a total ordering.  So one might reasonably expect
higher levels to exhibit all of the functionality of lower levels
and then some.  Sadly that is not the case.  A "run level" is
just a random collection of features and capabilities unrelated
to all other run levels.

Obligatory quote:

“When I use a word,” Humpty Dumpty said, in rather a scornful
tone, “it means just what I choose it to mean—neither more nor
less.” “The question is,” said Alice, “whether you can make
words mean so many different things.” “The question is,” said
Humpty Dumpty, “which is to be master—that’s all.”



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Tree-sitter integration on feature/tree-sitter (severe performance issues together with linum-mode)
  2022-08-20 14:14       ` Stefan Monnier
  2022-08-23  1:53         ` Fu Yuan
@ 2022-09-06  2:53         ` Clément Pit-Claudel
  2022-09-06  4:44           ` Macros considered harmful (was: Tree-sitter integration on feature/tree-sitter (severe performance issues together with linum-mode)) Stefan Monnier
  2022-09-07  0:41           ` Tree-sitter integration on feature/tree-sitter (severe performance issues together with linum-mode) Yuan Fu
  1 sibling, 2 replies; 19+ messages in thread
From: Clément Pit-Claudel @ 2022-09-06  2:53 UTC (permalink / raw)
  To: emacs-devel

On 8/20/22 07:14, Stefan Monnier wrote:
> FWIW, I think specifying the highlighting rules with something akin to:
> 
>      (defvar <foo> '<rules>)
> 
> is a mistake.  It should go through some kind of macro, such as (maybe):
> 
>      (defvar <foo> (tree-sitter-rules <rules>))
> 
> which can thus do any preprocessing we may want, such as pre-compiling
> queries.  It also helps evolve the syntax since we can more easily warn
> about obsolete uses, etc...
> 
> I've had a "rewrite font-lock.el so the rules go through a macro" in my
> todo list for ages.

We do things this way in Flycheck, but we've been bitten a few times by the way this pattern interacts with `with-eval-after-load`, so I would be careful about adopting this pattern in tree-sitter (unless we expect it to be preloaded).

In fact, I think your suggestion back then was to *not* use a macro?  I can't find the exact thread, but this is close enough: https://lists.gnu.org/archive/html/emacs-devel/2018-02/msg00820.html :)



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Macros considered harmful (was: Tree-sitter integration on feature/tree-sitter (severe performance issues together with linum-mode))
  2022-09-06  2:53         ` Clément Pit-Claudel
@ 2022-09-06  4:44           ` Stefan Monnier
  2022-09-06 15:33             ` Macros considered harmful Basil L. Contovounesios
  2022-09-07  0:41           ` Tree-sitter integration on feature/tree-sitter (severe performance issues together with linum-mode) Yuan Fu
  1 sibling, 1 reply; 19+ messages in thread
From: Stefan Monnier @ 2022-09-06  4:44 UTC (permalink / raw)
  To: Clément Pit-Claudel; +Cc: emacs-devel

Clément Pit-Claudel [2022-09-05 19:53:26] wrote:
> On 8/20/22 07:14, Stefan Monnier wrote:
>> FWIW, I think specifying the highlighting rules with something akin to:
>>      (defvar <foo> '<rules>)
>> is a mistake.  It should go through some kind of macro, such as (maybe):
>>      (defvar <foo> (tree-sitter-rules <rules>))
>> which can thus do any preprocessing we may want, such as pre-compiling
>> queries.  It also helps evolve the syntax since we can more easily warn
>> about obsolete uses, etc...
>> I've had a "rewrite font-lock.el so the rules go through a macro" in my
>> todo list for ages.
>
> We do things this way in Flycheck, but we've been bitten a few times by the
> way this pattern interacts with `with-eval-after-load`, so I would be
> careful about adopting this pattern in tree-sitter (unless we expect it to
> be preloaded).

Very good point.  It would introduce a strong compile-time dependency on
the tree-sitter package, which could be a serious annoyance for random
`foo-mode` packages which want to keep working in the absence of
tree-sitter.

Hmm...

[ I guess we'd have to make that macro lightweight and independent from
  tree-sitter itself and put it into a separate package distributed via
  GNU ELPA, so packages can feel free to depend on it.
  Hmm...  Another problem.  ]

> In fact, I think your suggestion back then was to *not* use a macro?

Indeed,

[ Lua eshews macros altogether for those kinds of reasons, AFAIU.  ]

Admittedly, another way around these kinds of problems is to teach the
compiler how to deal with an unknown macro.  I.e. something like
(declare-macro my-foo ...) so that if the compiler see (my-foo ...) but
`my-foo` can't be macroexpanded (because the macro is not yet defined),
it doesn't incorrectly compile it into a function call, but instead
residualizes it into something like a call to `eval`.  Making it
interact correctly with lexical scoping could be tricky (I guess the
simplest solution would be to residualize the whole toplevel expression
in which the macro call was found).


        Stefan




^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Macros considered harmful
  2022-09-06  4:44           ` Macros considered harmful (was: Tree-sitter integration on feature/tree-sitter (severe performance issues together with linum-mode)) Stefan Monnier
@ 2022-09-06 15:33             ` Basil L. Contovounesios
  2022-09-06 16:01               ` Stefan Monnier
                                 ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Basil L. Contovounesios @ 2022-09-06 15:33 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Clément Pit-Claudel, emacs-devel

Stefan Monnier [2022-09-06 00:44 -0400] wrote:

> Clément Pit-Claudel [2022-09-05 19:53:26] wrote:
>> On 8/20/22 07:14, Stefan Monnier wrote:
>>> FWIW, I think specifying the highlighting rules with something akin to:
>>>      (defvar <foo> '<rules>)
>>> is a mistake.  It should go through some kind of macro, such as (maybe):
>>>      (defvar <foo> (tree-sitter-rules <rules>))
>>> which can thus do any preprocessing we may want, such as pre-compiling
>>> queries.  It also helps evolve the syntax since we can more easily warn
>>> about obsolete uses, etc...
>>> I've had a "rewrite font-lock.el so the rules go through a macro" in my
>>> todo list for ages.
>>
>> We do things this way in Flycheck, but we've been bitten a few times by the
>> way this pattern interacts with `with-eval-after-load`, so I would be
>> careful about adopting this pattern in tree-sitter (unless we expect it to
>> be preloaded).
>
> Very good point.  It would introduce a strong compile-time dependency on
> the tree-sitter package, which could be a serious annoyance for random
> `foo-mode` packages which want to keep working in the absence of
> tree-sitter.
>
> Hmm...
>
> [ I guess we'd have to make that macro lightweight and independent from
>   tree-sitter itself and put it into a separate package distributed via
>   GNU ELPA, so packages can feel free to depend on it.
>   Hmm...  Another problem.  ]
>
>> In fact, I think your suggestion back then was to *not* use a macro?
>
> Indeed,
>
> [ Lua eshews macros altogether for those kinds of reasons, AFAIU.  ]
>
> Admittedly, another way around these kinds of problems is to teach the
> compiler how to deal with an unknown macro.  I.e. something like
> (declare-macro my-foo ...) so that if the compiler see (my-foo ...) but
> `my-foo` can't be macroexpanded (because the macro is not yet defined),
> it doesn't incorrectly compile it into a function call, but instead
> residualizes it into something like a call to `eval`.  Making it
> interact correctly with lexical scoping could be tricky (I guess the
> simplest solution would be to residualize the whole toplevel expression
> in which the macro call was found).

Another downside of macros not directly addressed by this approach is
that packages using them may have the outrageous desire to both support
older Emacsen and build cleanly, at the same time!  Recall, for example,
this unresolved shortdoc thread:
https://lists.gnu.org/r/emacs-devel/2021-09/msg01719.html

I suppose in the general case this can be alleviated only through
'compat'-like ELPA dependencies (or by expecting each package to write
its own compatibility shims, of course).

-- 
Basil



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Macros considered harmful
  2022-09-06 15:33             ` Macros considered harmful Basil L. Contovounesios
@ 2022-09-06 16:01               ` Stefan Monnier
  2022-11-03 18:27                 ` Basil L. Contovounesios
  2022-09-06 16:13               ` Philip Kaludercic
  2022-09-06 16:54               ` T.V Raman
  2 siblings, 1 reply; 19+ messages in thread
From: Stefan Monnier @ 2022-09-06 16:01 UTC (permalink / raw)
  To: Basil L. Contovounesios; +Cc: Clément Pit-Claudel, emacs-devel

>> Admittedly, another way around these kinds of problems is to teach the
>> compiler how to deal with an unknown macro.  I.e. something like
>> (declare-macro my-foo ...) so that if the compiler see (my-foo ...) but
>> `my-foo` can't be macroexpanded (because the macro is not yet defined),
>> it doesn't incorrectly compile it into a function call, but instead
>> residualizes it into something like a call to `eval`.  Making it
>> interact correctly with lexical scoping could be tricky (I guess the
>> simplest solution would be to residualize the whole toplevel expression
>> in which the macro call was found).

A low-tech way to do it is to let the programmer do it by hand, e.g.:

    (defmacro smalltalk--when-fboundp (sym exp)
      (declare (indent 1) (debug (symbolp form)))
      (if (fboundp sym)
          exp
        ;; `sym' is not defined during compilation, but keep the test at run-time,
        ;; in case we use the compiled file on a newer Emacs.
        `(eval '(if (fboundp ',sym) ,exp))))

It can still break if you use in `exp` lexically scoped vars declared in
the context, but that's considered a "programmer's problem" :-(

> Another downside of macros not directly addressed by this approach is
> that packages using them may have the outrageous desire to both support
> older Emacsen and build cleanly, at the same time!  Recall, for example,
> this unresolved shortdoc thread:
> https://lists.gnu.org/r/emacs-devel/2021-09/msg01719.html

Would this kind of `<foo>--when-fboundp` help there?


        Stefan




^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Macros considered harmful
  2022-09-06 15:33             ` Macros considered harmful Basil L. Contovounesios
  2022-09-06 16:01               ` Stefan Monnier
@ 2022-09-06 16:13               ` Philip Kaludercic
  2022-09-06 16:54               ` T.V Raman
  2 siblings, 0 replies; 19+ messages in thread
From: Philip Kaludercic @ 2022-09-06 16:13 UTC (permalink / raw)
  To: Basil L. Contovounesios
  Cc: Stefan Monnier, Clément Pit-Claudel, emacs-devel

"Basil L. Contovounesios" <contovob@tcd.ie> writes:

> Stefan Monnier [2022-09-06 00:44 -0400] wrote:
>
>> Clément Pit-Claudel [2022-09-05 19:53:26] wrote:
>>> On 8/20/22 07:14, Stefan Monnier wrote:
>>>> FWIW, I think specifying the highlighting rules with something akin to:
>>>>      (defvar <foo> '<rules>)
>>>> is a mistake.  It should go through some kind of macro, such as (maybe):
>>>>      (defvar <foo> (tree-sitter-rules <rules>))
>>>> which can thus do any preprocessing we may want, such as pre-compiling
>>>> queries.  It also helps evolve the syntax since we can more easily warn
>>>> about obsolete uses, etc...
>>>> I've had a "rewrite font-lock.el so the rules go through a macro" in my
>>>> todo list for ages.
>>>
>>> We do things this way in Flycheck, but we've been bitten a few times by the
>>> way this pattern interacts with `with-eval-after-load`, so I would be
>>> careful about adopting this pattern in tree-sitter (unless we expect it to
>>> be preloaded).
>>
>> Very good point.  It would introduce a strong compile-time dependency on
>> the tree-sitter package, which could be a serious annoyance for random
>> `foo-mode` packages which want to keep working in the absence of
>> tree-sitter.
>>
>> Hmm...
>>
>> [ I guess we'd have to make that macro lightweight and independent from
>>   tree-sitter itself and put it into a separate package distributed via
>>   GNU ELPA, so packages can feel free to depend on it.
>>   Hmm...  Another problem.  ]
>>
>>> In fact, I think your suggestion back then was to *not* use a macro?
>>
>> Indeed,
>>
>> [ Lua eshews macros altogether for those kinds of reasons, AFAIU.  ]
>>
>> Admittedly, another way around these kinds of problems is to teach the
>> compiler how to deal with an unknown macro.  I.e. something like
>> (declare-macro my-foo ...) so that if the compiler see (my-foo ...) but
>> `my-foo` can't be macroexpanded (because the macro is not yet defined),
>> it doesn't incorrectly compile it into a function call, but instead
>> residualizes it into something like a call to `eval`.  Making it
>> interact correctly with lexical scoping could be tricky (I guess the
>> simplest solution would be to residualize the whole toplevel expression
>> in which the macro call was found).
>
> Another downside of macros not directly addressed by this approach is
> that packages using them may have the outrageous desire to both support
> older Emacsen and build cleanly, at the same time!  Recall, for example,
> this unresolved shortdoc thread:
> https://lists.gnu.org/r/emacs-devel/2021-09/msg01719.html
>
> I suppose in the general case this can be alleviated only through
> 'compat'-like ELPA dependencies (or by expecting each package to write
> its own compatibility shims, of course).

As the maintainer of the package "compat", I'd be more than happy to
help out with these issues.



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Macros considered harmful
  2022-09-06 15:33             ` Macros considered harmful Basil L. Contovounesios
  2022-09-06 16:01               ` Stefan Monnier
  2022-09-06 16:13               ` Philip Kaludercic
@ 2022-09-06 16:54               ` T.V Raman
  2 siblings, 0 replies; 19+ messages in thread
From: T.V Raman @ 2022-09-06 16:54 UTC (permalink / raw)
  To: Basil L. Contovounesios
  Cc: Stefan Monnier, Clément Pit-Claudel, emacs-devel

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=gb18030, Size: 3101 bytes --]

"Basil L. Contovounesios" <contovob@tcd.ie> writes:


Note that all this gets significantly more complex in the world of
native-comp as long as the .eln files are generated from the .el, rather
than the .elc files.

> Stefan Monnier [2022-09-06 00:44 -0400] wrote:
>
>> Cl¨¦ment Pit-Claudel [2022-09-05 19:53:26] wrote:
>>> On 8/20/22 07:14, Stefan Monnier wrote:
>>>> FWIW, I think specifying the highlighting rules with something akin to:
>>>>      (defvar <foo> '<rules>)
>>>> is a mistake.  It should go through some kind of macro, such as (maybe):
>>>>      (defvar <foo> (tree-sitter-rules <rules>))
>>>> which can thus do any preprocessing we may want, such as pre-compiling
>>>> queries.  It also helps evolve the syntax since we can more easily warn
>>>> about obsolete uses, etc...
>>>> I've had a "rewrite font-lock.el so the rules go through a macro" in my
>>>> todo list for ages.
>>>
>>> We do things this way in Flycheck, but we've been bitten a few times by the
>>> way this pattern interacts with `with-eval-after-load`, so I would be
>>> careful about adopting this pattern in tree-sitter (unless we expect it to
>>> be preloaded).
>>
>> Very good point.  It would introduce a strong compile-time dependency on
>> the tree-sitter package, which could be a serious annoyance for random
>> `foo-mode` packages which want to keep working in the absence of
>> tree-sitter.
>>
>> Hmm...
>>
>> [ I guess we'd have to make that macro lightweight and independent from
>>   tree-sitter itself and put it into a separate package distributed via
>>   GNU ELPA, so packages can feel free to depend on it.
>>   Hmm...  Another problem.  ]
>>
>>> In fact, I think your suggestion back then was to *not* use a macro?
>>
>> Indeed,
>>
>> [ Lua eshews macros altogether for those kinds of reasons, AFAIU.  ]
>>
>> Admittedly, another way around these kinds of problems is to teach the
>> compiler how to deal with an unknown macro.  I.e. something like
>> (declare-macro my-foo ...) so that if the compiler see (my-foo ...) but
>> `my-foo` can't be macroexpanded (because the macro is not yet defined),
>> it doesn't incorrectly compile it into a function call, but instead
>> residualizes it into something like a call to `eval`.  Making it
>> interact correctly with lexical scoping could be tricky (I guess the
>> simplest solution would be to residualize the whole toplevel expression
>> in which the macro call was found).
>
> Another downside of macros not directly addressed by this approach is
> that packages using them may have the outrageous desire to both support
> older Emacsen and build cleanly, at the same time!  Recall, for example,
> this unresolved shortdoc thread:
> https://lists.gnu.org/r/emacs-devel/2021-09/msg01719.html
>
> I suppose in the general case this can be alleviated only through
> 'compat'-like ELPA dependencies (or by expecting each package to write
> its own compatibility shims, of course).

-- 

Thanks,

--Raman(I Search, I Find, I Misplace, I Research)
7©4 Id: kg:/m/0285kf1  •0Ü8



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Tree-sitter integration on feature/tree-sitter (severe performance issues together with linum-mode)
  2022-09-06  2:53         ` Clément Pit-Claudel
  2022-09-06  4:44           ` Macros considered harmful (was: Tree-sitter integration on feature/tree-sitter (severe performance issues together with linum-mode)) Stefan Monnier
@ 2022-09-07  0:41           ` Yuan Fu
  1 sibling, 0 replies; 19+ messages in thread
From: Yuan Fu @ 2022-09-07  0:41 UTC (permalink / raw)
  To: Clément Pit-Claudel; +Cc: emacs-devel



> On Sep 5, 2022, at 7:53 PM, Clément Pit-Claudel <cpitclaudel@gmail.com> wrote:
> 
> On 8/20/22 07:14, Stefan Monnier wrote:
>> FWIW, I think specifying the highlighting rules with something akin to:
>>     (defvar <foo> '<rules>)
>> is a mistake.  It should go through some kind of macro, such as (maybe):
>>     (defvar <foo> (tree-sitter-rules <rules>))
>> which can thus do any preprocessing we may want, such as pre-compiling
>> queries.  It also helps evolve the syntax since we can more easily warn
>> about obsolete uses, etc...
>> I've had a "rewrite font-lock.el so the rules go through a macro" in my
>> todo list for ages.
> 
> We do things this way in Flycheck, but we've been bitten a few times by the way this pattern interacts with `with-eval-after-load`, so I would be careful about adopting this pattern in tree-sitter (unless we expect it to be preloaded).
> 
> In fact, I think your suggestion back then was to *not* use a macro?  I can't find the exact thread, but this is close enough: https://lists.gnu.org/archive/html/emacs-devel/2018-02/msg00820.html :)
> 

I see Stefan’s follow-up in the new thread, but in terms of tree-sitter, I actually defined it as a function rather than a macro (because the habit of always preferring functions if a function will do). IIUC function doesn’t have the same problem as macros (when byte-compiling)? It is used like this:

(treesit-font-lock-rules
 :language 'c
 '((false) @font-lock-constant-face))

Definition:

(defun treesit-font-lock-rules (&rest args)
  "Return a value suitable for `treesit-font-lock-settings'.

Take a series of QUERIES in either string, s-expression or
compiled form.  Same as in `treesit-font-lock-settings', for each
query, captured nodes are highlighted with the capture name as
its face.

Before each QUERY there could be :KEYWORD VALUE pairs that
configure the query.  For example,

    (treesit-font-lock-rules
     :language javascript
     '((true) @font-lock-constant-face
       (false) @font-lock-constant-face
     :language html
     \"(script_element) @font-lock-builtin-face\")

For each QUERY, a :language keyword is required.  Currently the
only recognized keyword is :language.

\(fn :KEYWORD VALUE QUERY...)"
  (let (;; Tracks the current language that following queries will
        ;; apply to.
        (current-language nil)
        ;; The list this function returns.
        (result nil))
    (while args
      (let ((token (pop args)))
        (pcase token
          (:language
           (let ((lang (pop args)))
             (when (or (not (symbolp lang)) (null lang))
               (signal 'wrong-type-argument `(symbolp ,lang)))
             (setq current-language lang)))
          ((pred treesit-query-p)
           (when (null current-language)
             (signal 'treesit-error
                     `("Language unspecified, use :language keyword to specify a language for this query" ,token)))
           (if (treesit-compiled-query-p token)
               (push `(,current-language token) result)
             (push `(,current-language
                     ,(treesit-query-compile current-language token))
                   result))
           ;; Clears any configurations set for this query.
           (setq current-language nil))
          (_ (signal 'treesit-error
                     `("Unexpected value" token))))))
    (nreverse result)))

Yuan


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Macros considered harmful
  2022-09-06 16:01               ` Stefan Monnier
@ 2022-11-03 18:27                 ` Basil L. Contovounesios
  0 siblings, 0 replies; 19+ messages in thread
From: Basil L. Contovounesios @ 2022-11-03 18:27 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Clément Pit-Claudel, emacs-devel

Stefan Monnier [2022-09-06 12:01 -0400] wrote:

>>> Admittedly, another way around these kinds of problems is to teach the
>>> compiler how to deal with an unknown macro.  I.e. something like
>>> (declare-macro my-foo ...) so that if the compiler see (my-foo ...) but
>>> `my-foo` can't be macroexpanded (because the macro is not yet defined),
>>> it doesn't incorrectly compile it into a function call, but instead
>>> residualizes it into something like a call to `eval`.  Making it
>>> interact correctly with lexical scoping could be tricky (I guess the
>>> simplest solution would be to residualize the whole toplevel expression
>>> in which the macro call was found).
>
> A low-tech way to do it is to let the programmer do it by hand, e.g.:
>
>     (defmacro smalltalk--when-fboundp (sym exp)
>       (declare (indent 1) (debug (symbolp form)))
>       (if (fboundp sym)
>           exp
>         ;; `sym' is not defined during compilation, but keep the test at run-time,
>         ;; in case we use the compiled file on a newer Emacs.
>         `(eval '(if (fboundp ',sym) ,exp))))
>
> It can still break if you use in `exp` lexically scoped vars declared in
> the context, but that's considered a "programmer's problem" :-(
>
>> Another downside of macros not directly addressed by this approach is
>> that packages using them may have the outrageous desire to both support
>> older Emacsen and build cleanly, at the same time!  Recall, for example,
>> this unresolved shortdoc thread:
>> https://lists.gnu.org/r/emacs-devel/2021-09/msg01719.html
>
> Would this kind of `<foo>--when-fboundp` help there?

Yes, thanks, it allows the macro to be used within with-eval-after-load
across all Emacs versions.  Sadly eval-after-load is still needed to
account for Emacs 28 not autoloading the shortdoc entrypoint macro, but
that's down to the package's API/hooks, not the use of macros.

-- 
Basil



^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2022-11-03 18:27 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-15 12:32 Tree-sitter integration on feature/tree-sitter (severe performance issues together with linum-mode) Jostein Kjønigsen
2022-08-15 12:50 ` Dmitry Gutov
2022-08-15 13:53 ` Stefan Monnier
2022-08-18  7:50   ` Yuan Fu
2022-08-18  9:44 ` Yuan Fu
2022-08-19  6:01   ` Jostein Kjønigsen
2022-08-19 21:58     ` Yuan Fu
2022-08-20 14:14       ` Stefan Monnier
2022-08-23  1:53         ` Fu Yuan
2022-08-23 14:30           ` Stefan Monnier
2022-08-24  1:18             ` John Yates
2022-09-06  2:53         ` Clément Pit-Claudel
2022-09-06  4:44           ` Macros considered harmful (was: Tree-sitter integration on feature/tree-sitter (severe performance issues together with linum-mode)) Stefan Monnier
2022-09-06 15:33             ` Macros considered harmful Basil L. Contovounesios
2022-09-06 16:01               ` Stefan Monnier
2022-11-03 18:27                 ` Basil L. Contovounesios
2022-09-06 16:13               ` Philip Kaludercic
2022-09-06 16:54               ` T.V Raman
2022-09-07  0:41           ` Tree-sitter integration on feature/tree-sitter (severe performance issues together with linum-mode) Yuan Fu

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).