unofficial mirror of emacs-devel@gnu.org 
From: Arthur Miller <arthur.miller@live.com>
To: Andrei Kuznetsov <r12451428287@163.com>
Cc: Eli Zaretskii <eliz@gnu.org>,
	Stephen Leake <stephen_leake@stephe-leake.org>,
	manuel@ledu-giraud.fr, emacs-devel@gnu.org
Subject: Re: [SPAM UNSURE] Maybe we're taking a wrong approach towards tree-sitter
Date: Fri, 30 Jul 2021 14:06:00 +0200
Message-ID: <AM9PR09MB49775E10862C9E89DF85AD8796EC9@AM9PR09MB4977.eurprd09.prod.outlook.com>
In-Reply-To: <87o8akmy4p.fsf@163.com> (Andrei Kuznetsov's message of "Fri, 30 Jul 2021 08:41:26 +0800")

Andrei Kuznetsov <r12451428287@163.com> writes:

> Stephen Leake <stephen_leake@stephe-leake.org> writes:
>
>> That's true for the common TS runtime, which implements the parser and
>> error recovery, but the code for each language, that builds the LR parse
>> table and some other data structures, is generated in C from a grammar
>> file written in javascript, and must be linked into Emacs somehow. In
>> addition, some languages require an "external scanner", which is more
>> code in C that is specific to the language.
>
> Interesting.  I assume it would be possible to reuse the source grammar
> files?

It probably is, and looking at neovim's gh repo, there are some
instructions on how to create a grammar for a new language:

https://github.com/nvim-treesitter/nvim-treesitter

The process could probably be automated from Lisp somehow.

I do have a sincere question, though, about this entire tree-sitter
venture. Is it really worth the trouble in Emacs's case? As I understand
TS, it is a specialized regex matcher, and looking at some language specs
leaves me with that feeling (for example, the grammar for bash):

https://github.com/tree-sitter/tree-sitter-bash/blob/master/src/grammar.json

I understand that a specialized regex matcher is more efficient than
the generalized regex matcher that current font-locking in Emacs relies
upon, but is it *that* much more efficient, to be worth the extra trouble?
TS seems to keep state (a node) for each character typed, which would
consume a lot of memory in big files. If the syntax tree it keeps to do
its work can be reused for something else, then it could be very useful,
but just for syntax highlighting and indentation? Some years ago, when
opening some of the 10k-line files found in the Emacs src dir, I noticed
some slowdown in font lock. But nowadays I don't experience any hiccups
with syntax highlighting or indentation.
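
To put rough numbers on the memory concern above, here is a
back-of-envelope sketch in Python. Every constant in it (node size,
line length, token length) is an assumption chosen for illustration,
not a measured property of TS, and the function name is hypothetical.
It only compares the "one node per character" reading with a "one node
per token" reading for a 10k-line file.

```python
# Back-of-envelope estimate only. Every constant below is an assumption
# made for illustration, not a measured property of tree-sitter.
BYTES_PER_NODE = 64     # assumed size of one syntax-tree node
AVG_LINE_LENGTH = 40    # assumed average number of characters per line
AVG_TOKEN_LENGTH = 5    # assumed average number of characters per token


def tree_bytes(lines: int, chars_per_node: int) -> int:
    """Estimate tree size if each node covers `chars_per_node` characters."""
    total_chars = lines * AVG_LINE_LENGTH
    nodes = total_chars // chars_per_node
    return nodes * BYTES_PER_NODE


per_char = tree_bytes(10_000, 1)
per_token = tree_bytes(10_000, AVG_TOKEN_LENGTH)

print(f"one node per character: {per_char / 1e6:.1f} MB")
print(f"one node per token:     {per_token / 1e6:.1f} MB")
```

Under these assumed figures, the estimate scales linearly with file
size and inversely with how many characters each node covers, so the
answer depends mostly on how coarse the tree really is.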

Anyway, it is very educational to see TS get merged into Emacs and to
read Eli's tips and guidance about Emacs internals.
