Hey everyone.
Sorry for writing one of those long emails.
For those who want to cut to the chase, the executive summary
seems to be that using tree-sitter for fontification can give much
lower performance than expected, and that using it together with
linum-mode for line numbering causes severe performance degradation.
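As a reference, I have 2 major modes which demonstrate this: csharp-mode[1] and typescript-mode[2].
Both of these major modes are either implemented (or in the process of being implemented) for the following scenarios: Tuan-Anh's emacs-tree-sitter library[3] and native Emacs tree-sitter support.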
For csharp-mode we've been able to pivot from elisp/cc-mode to emacs-tree-sitter with great success. The code is simpler, performance is perfectly acceptable, and long-standing bugs were fixed in the process.
For typescript-mode we tried to do the same[4], but learnt about Yuan Fu's work before completing it. The result was instead a new major-mode depending on native Emacs tree-sitter support[5]. This has also worked out well enough for me to use it as my "daily driver".
Motivated by that success, I've tried to rewrite csharp-mode to also use native Emacs tree-sitter support[6]. And while porting the code seems to work, performance for this mode has been VERY far from acceptable.
Even on a modern, fast Intel CPU, keystrokes are lagging several
seconds behind and it's not really usable. You just have to stop
typing and wait for your input to suddenly appear many, many
seconds later.
This is in great contrast to the csharp-mode implementation which
uses Tuan-Anh's library, and quite the opposite of what I would
expect. While perhaps somewhat naive, I honestly expected "native
support" would perform better. Could there be optimizations in
Tuan-Anh's library we need to add to treesit.el in Emacs?
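To put a number on the comparison independently of typing latency, fontification of a whole buffer can be timed directly. The following is only a minimal sketch: the name my/time-fontification is hypothetical, and it assumes that the major mode under test does its highlighting through font-lock's normal region fontification (so that font-lock-ensure actually exercises it).

```elisp
(require 'benchmark)

(defun my/time-fontification (&optional repetitions)
  "Return the average time in seconds to refontify the current buffer.
REPETITIONS defaults to 5.  Hypothetical helper for illustration;
assumes the active major mode highlights through font-lock."
  (let* ((n (or repetitions 5))
         ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-ELAPSED).
         (result (benchmark-run n
                   ;; Mark everything as needing refontification...
                   (font-lock-flush)
                   ;; ...then force the whole buffer to be fontified.
                   (font-lock-ensure))))
    (/ (car result) n)))
```

Evaluating (my/time-fontification) in the same C# file under each implementation should give roughly comparable figures, though garbage collection and buffer size will affect them.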
Another thing which made me really notice this issue is that by default I have linum-mode enabled for all prog-mode buffers.
And linum-mode -easily- reduces input-performance in tree-sitter mode buffers by a factor of 4 (this has been measured using profiler-start, profiler-stop and profiler-report).
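For anyone who wants to take the same kind of measurement, a minimal sketch using Emacs's built-in profiler could look like this. The two command names are hypothetical and only wrap the standard profiler functions; the cpu+mem mode is an assumption chosen so that both kinds of data are collected.

```elisp
(require 'profiler)

(defun my/typing-profile-start ()
  "Start profiling before typing in the buffer under test.
Hypothetical convenience command for illustration."
  (interactive)
  ;; The mode can be `cpu', `mem' or `cpu+mem'.
  (profiler-start 'cpu+mem))

(defun my/typing-profile-stop ()
  "Stop profiling and show the collected report."
  (interactive)
  (profiler-stop)
  (profiler-report))
```

Run M-x my/typing-profile-start, type for a while in the buffer, then run M-x my/typing-profile-stop and expand the entries in the report buffer.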
The following profiling-report stems from enabling csharp-mode
based on native Emacs tree-sitter support, enabling linum-mode, and
then proceeding to write a long line with random letters (no need
for it to be valid code).
  382,605,711  71% - linum-update-current
  382,605,711  71%  - linum-update
  382,605,711  71%   - mapc
  382,601,487  71%    - linum-update-window
  382,176,351  71%     - window-end
  382,176,351  71%      - jit-lock-function
  382,176,351  71%       - jit-lock-fontify-now
  382,176,351  71%        - jit-lock--run-functions
  382,176,351  71%         - run-hook-wrapped
  382,176,351  71%          - #<compiled -0x156ee8ca7e527443>
  382,176,351  71%           - font-lock-fontify-region
  382,127,055  71%            + treesit-font-lock-fontify-region
       49,296   0%            + font-lock-default-fontify-region
       30,616   0%       linum--face-width
  137,009,221  25% - command-execute
I realize linum-mode has been controversial with respect to performance in the past, but this kind of slow-down had me quite surprised. Disabling linum-mode makes the major-mode borderline usable, but it's still much slower than I know it -can- be (based on Tuan-Anh's library).
Can something be done to Yuan's code to make it perform as well as Tuan-Anh's? Are there improvements which can be made to linum-mode to avoid these kinds of issues?
I know for sure I'm not qualified to answer those questions, but
I think it's definitely something which needs to be looked into.
If anyone has changes they want me to provide feedback on, I will
be more than happy to test them and report back.
[1] https://github.com/emacs-csharp/csharp-mode
[2] https://github.com/emacs-typescript/typescript.el
[3] https://github.com/emacs-tree-sitter/elisp-tree-sitter
[4] https://github.com/emacs-typescript/typescript.el/blob/feature/tsx-support/typescript-tree-sitter.el
[5] https://git.sr.ht/~theo/tree-sitter-modes/tree/master/item/typescript-mode.el
[6] https://git.sr.ht/~jostein/tree-sitter-modes/tree/feature/csharp/item/csharp-mode.el