unofficial mirror of emacs-devel@gnu.org 
* Tree-sitter introduction documentation
@ 2022-12-16 14:47 Perry Smith
  2022-12-16 15:06 ` Eli Zaretskii
  0 siblings, 1 reply; 138+ messages in thread
From: Perry Smith @ 2022-12-16 14:47 UTC (permalink / raw)
  To: emacs-devel


There are (I believe) four pieces to get Tree Sitter major modes to work.

Emacs needs to be compiled with tree-sitter enabled
The tree sitter binary needs to be installed
The tree sitter language specific parser needs to be installed
The appropriate major mode needs to be loaded and enabled
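A minimal sketch of checking the first and third pieces from the command line. It assumes an `emacs` binary on PATH (override with the hypothetical `EMACS` variable) and uses `cpp` purely as an example language; `treesit-available-p` and `treesit-language-available-p` are the Emacs 29 functions for these checks.

```shell
#!/bin/sh
# Sketch: ask Emacs in batch mode whether it has tree-sitter support
# and whether it can load the grammar library for a given language.
# EMACS is a hypothetical override variable, not an Emacs convention.

check_treesit() {
    lang=$1
    # (quote X) is used instead of 'X to keep shell quoting simple.
    expr="(let ((ok (and (fboundp (quote treesit-available-p))
                         (treesit-available-p))))
  (princ (format \"tree-sitter support: %s\\n\" (if ok \"yes\" \"no\")))
  (princ (format \"grammar for $lang: %s\\n\"
                 (if (and ok (treesit-language-available-p (quote $lang)))
                     \"yes\" \"no\"))))"
    "${EMACS:-emacs}" --batch --eval "$expr"
}
```

Usage would be something like `check_treesit cpp`; a "no" on the first line points at the Emacs build, a "no" on the second at the missing grammar library.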

Is there a page either in Info or on the web that contains all these steps?

If not, and others agree, how can I help create one?  I am thinking the entire page should be small and brief with references to more elaborate details on each of the four steps if needed.

Perry



* Re: Tree-sitter introduction documentation
  2022-12-16 14:47 Tree-sitter introduction documentation Perry Smith
@ 2022-12-16 15:06 ` Eli Zaretskii
  2022-12-16 15:24   ` João Távora
  2022-12-16 17:23   ` Perry Smith
  0 siblings, 2 replies; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-16 15:06 UTC (permalink / raw)
  To: Perry Smith; +Cc: emacs-devel

> From: Perry Smith <pedz@easesoftware.com>
> Date: Fri, 16 Dec 2022 08:47:18 -0600
> 
> There are (I believe) four pieces to get Tree Sitter major modes to work.
> 
> Emacs needs to be compiled with tree-sitter enabled
> The tree sitter binary needs to be installed
> The tree sitter language specific parser needs to be installed
> The appropriate major mode needs to be loaded and enabled

That's correct.

> Is there a page either in Info or on the web that contains all these steps?

Not that I know of, no.

And I'm not sure we have in our manuals places to describe these
setups.

> If not, and others agree, how can I help create one?  I am thinking the entire page should be small and brief with references to more elaborate details on each of the four steps if needed.

My main problem is where to put this stuff.




* Re: Tree-sitter introduction documentation
  2022-12-16 15:06 ` Eli Zaretskii
@ 2022-12-16 15:24   ` João Távora
  2022-12-16 15:36     ` Perry Smith
                       ` (2 more replies)
  2022-12-16 17:23   ` Perry Smith
  1 sibling, 3 replies; 138+ messages in thread
From: João Távora @ 2022-12-16 15:24 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Perry Smith, emacs-devel


On Fri, Dec 16, 2022 at 3:06 PM Eli Zaretskii <eliz@gnu.org> wrote:
>
> > From: Perry Smith <pedz@easesoftware.com>
> > Date: Fri, 16 Dec 2022 08:47:18 -0600
> >
> > There are (I believe) four pieces to get Tree Sitter major modes to work.
> >
> > Emacs needs to be compiled with tree-sitter enabled
> > The tree sitter binary needs to be installed
> > The tree sitter language specific parser needs to be installed
> > The appropriate major mode needs to be loaded and enabled
>
> That's correct.

I agree this is a problem, especially the language specific parser
bits.  Yesterday I tried out tree sitter Emacs on my Arch system.
Finding the tree-sitter system lib was easy enough, but finding the C++
definition object wasn't so easy.  Eventually I made it,  but it needed
compilation from source and a NodeJS toolchain that I didn't know
I needed for that.

> > If not, and others agree, how can I help create one?  I am thinking the entire page should be small and brief with references to more elaborate details on each of the four steps if needed.
>
> My main problem is where to put this stuff.

Another problem is how to keep this information up-to-date.


* Re: Tree-sitter introduction documentation
  2022-12-16 15:24   ` João Távora
@ 2022-12-16 15:36     ` Perry Smith
  2022-12-16 15:43       ` João Távora
  2022-12-16 17:56       ` Philip Kaludercic
  2022-12-16 15:38     ` Eli Zaretskii
  2022-12-16 15:53     ` Manuel Giraud
  2 siblings, 2 replies; 138+ messages in thread
From: Perry Smith @ 2022-12-16 15:36 UTC (permalink / raw)
  To: João Távora; +Cc: Eli Zaretskii, emacs-devel




> On Dec 16, 2022, at 09:24, João Távora <joaotavora@gmail.com> wrote:
> 
> On Fri, Dec 16, 2022 at 3:06 PM Eli Zaretskii <eliz@gnu.org> wrote:
> >
> > > From: Perry Smith <pedz@easesoftware.com>
> > > Date: Fri, 16 Dec 2022 08:47:18 -0600
> > >
> > > There are (I believe) four pieces to get Tree Sitter major modes to work.
> > >
> > > Emacs needs to be compiled with tree-sitter enabled
> > > The tree sitter binary needs to be installed
> > > The tree sitter language specific parser needs to be installed
> > > The appropriate major mode needs to be loaded and enabled
> >
> > That's correct.
> 
> I agree this is a problem, especially the language specific parser
> bits.  Yesterday I tried out tree sitter Emacs on my Arch system.
> Finding the tree-sitter system lib was easy enough, but finding the C++
> definition object wasn't so easy.  Eventually I made it,  but it needed
> compilation from source and a NodeJS toolchain that I didn't know
> I needed for that.

This is new news to me.  I downloaded this repository:
https://github.com/casouri/tree-sitter-module

And did: ./build.sh <lang>

The same scripts are in the Emacs source down this path
admin/notes/tree-sitter/build-module
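The two steps above can be sketched as a small wrapper. This is only an illustration: the `run` helper and the `DRY_RUN` variable are hypothetical names, `cpp` is an example language, and where `build.sh` leaves the resulting library is not stated in this thread, so the sketch stops after the build.

```shell
#!/bin/sh
# Sketch: fetch the build scripts and build one grammar.
# Set DRY_RUN=1 to print the commands instead of running them.

run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

build_with_script() {
    lang=$1
    # Clone only if there is no checkout in the current directory yet.
    [ -d tree-sitter-module ] ||
        run git clone https://github.com/casouri/tree-sitter-module
    # Run the script from inside the checkout, as described above.
    run sh -c "cd tree-sitter-module && ./build.sh $lang"
}
```

For example, `DRY_RUN=1; build_with_script cpp` prints the clone and build commands without touching the network.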

Perry



* Re: Tree-sitter introduction documentation
  2022-12-16 15:24   ` João Távora
  2022-12-16 15:36     ` Perry Smith
@ 2022-12-16 15:38     ` Eli Zaretskii
  2022-12-16 15:48       ` João Távora
  2022-12-16 16:01       ` Manuel Giraud
  2022-12-16 15:53     ` Manuel Giraud
  2 siblings, 2 replies; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-16 15:38 UTC (permalink / raw)
  To: João Távora; +Cc: pedz, emacs-devel

> From: João Távora <joaotavora@gmail.com>
> Date: Fri, 16 Dec 2022 15:24:59 +0000
> Cc: Perry Smith <pedz@easesoftware.com>, emacs-devel@gnu.org
> 
> I agree this is a problem, especially the language specific parser 
> bits.  Yesterday I tried out tree sitter Emacs on my Arch system.
> Finding the tree-sitter system lib was easy enough, but finding the C++
> definition object wasn't so easy.  Eventually I made it,  but it needed 
> compilation from source and a NodeJS toolchain that I didn't know 
> I needed for that.

No, you don't need a NodeJS toolchain to compile a grammar.  You only
need to compile the C/C++ source files that are part of the grammar,
and then link them into a shared library.  I use a simple Makefile to
build all of them, as the structure of the files and the way to
compile and link them are identical and boilerplate.  And I definitely
don't have NodeJS installed here.
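A rough sketch of that compile-and-link step as a shell function, not the Makefile mentioned above. It assumes the usual grammar layout (`src/parser.c`, plus an optional `src/scanner.c` or `src/scanner.cc`) and a `cc`/`c++` toolchain on PATH; the `run` helper, `DRY_RUN`, and the compiler flags are illustrative choices.

```shell
#!/bin/sh
# Sketch: compile a grammar's C/C++ sources into a shared library,
# with no NodeJS involved.
# Set DRY_RUN=1 to print the commands instead of running them.

run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

build_grammar() {
    lang=$1
    dir=$2
    out="libtree-sitter-$lang.so"   # .dylib on macOS, .dll on Windows
    objs="parser.o"
    linker=cc

    run cc -fPIC -O2 -I"$dir/src" -c "$dir/src/parser.c" -o parser.o
    if [ -f "$dir/src/scanner.c" ]; then
        run cc -fPIC -O2 -I"$dir/src" -c "$dir/src/scanner.c" -o scanner.o
        objs="$objs scanner.o"
    elif [ -f "$dir/src/scanner.cc" ]; then
        run c++ -fPIC -O2 -I"$dir/src" -c "$dir/src/scanner.cc" -o scanner.o
        objs="$objs scanner.o"
        linker=c++                  # link with the C++ driver in this case
    fi
    # $objs is left unquoted on purpose so it splits into arguments.
    run "$linker" -shared $objs -o "$out"
}
```

With a checkout of a grammar in `./tree-sitter-cpp`, `build_grammar cpp ./tree-sitter-cpp` would leave `libtree-sitter-cpp.so` in the current directory.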




* Re: Tree-sitter introduction documentation
  2022-12-16 15:36     ` Perry Smith
@ 2022-12-16 15:43       ` João Távora
  2022-12-16 17:56       ` Philip Kaludercic
  1 sibling, 0 replies; 138+ messages in thread
From: João Távora @ 2022-12-16 15:43 UTC (permalink / raw)
  To: Perry Smith; +Cc: Eli Zaretskii, emacs-devel


On Fri, Dec 16, 2022 at 3:36 PM Perry Smith <pedz@easesoftware.com> wrote:

>
> This is new news to me.  I downloaded this repository:
> https://github.com/casouri/tree-sitter-module
>
> And did: ./build.sh <lang>
>
> The same scripts are in the Emacs source down this path
> admin/notes/tree-sitter/build-module
>

_This_ is news to me :-D .  I was reading doc/parsing.texi and
it mentions the need for the language definitions, where it looks
for them, etc.  But it didn't bring me to this magic ./build.sh
script! Maybe it should?  That would have been helpful (haven't
actually tried it yet)

João


* Re: Tree-sitter introduction documentation
  2022-12-16 15:38     ` Eli Zaretskii
@ 2022-12-16 15:48       ` João Távora
  2022-12-16 15:53         ` Perry Smith
  2022-12-16 16:34         ` Eli Zaretskii
  2022-12-16 16:01       ` Manuel Giraud
  1 sibling, 2 replies; 138+ messages in thread
From: João Távora @ 2022-12-16 15:48 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: pedz, emacs-devel


On Fri, Dec 16, 2022 at 3:38 PM Eli Zaretskii <eliz@gnu.org> wrote:

> > From: João Távora <joaotavora@gmail.com>
> > Date: Fri, 16 Dec 2022 15:24:59 +0000
> > Cc: Perry Smith <pedz@easesoftware.com>, emacs-devel@gnu.org
> >
> > I agree this is a problem, especially the language specific parser
> > bits.  Yesterday I tried out tree sitter Emacs on my Arch system.
> > Finding the tree-sitter system lib was easy enough, but finding the C++
> > definition object wasn't so easy.  Eventually I made it,  but it needed
> > compilation from source and a NodeJS toolchain that I didn't know
> > I needed for that.
>
> No, you don't need a NodeJS toolchain to compile a grammar.  You only
> need to compile the C/C++ source files that are part of the grammar,
> and then link them into a shared library.  I use a simple Makefile to
> build all of them, as the structure of the files and the way to
> compile and link them are identical and boilerplate.  And I definitely
> don't have NodeJS installed here.
>

I used:

   https://aur.archlinux.org/packages/tree-sitter-cpp-git

which builds with the makepkg tool, and I am pretty sure
it used NodeJS somewhere down the line.  The language
definition it seems to use is https://github.com/tree-sitter/tree-sitter-cpp
which also contains a lot of JS stuff.

Is that where you get your C++ grammar from? Or am I
looking at an alternate outlet for a slightly different grammar?
If so where do you get your grammars from and can we
bundle some version of them with Emacs?

João


* Re: Tree-sitter introduction documentation
  2022-12-16 15:48       ` João Távora
@ 2022-12-16 15:53         ` Perry Smith
  2022-12-16 16:02           ` João Távora
  2022-12-16 16:34         ` Eli Zaretskii
  1 sibling, 1 reply; 138+ messages in thread
From: Perry Smith @ 2022-12-16 15:53 UTC (permalink / raw)
  To: João Távora; +Cc: Eli Zaretskii, emacs-devel




> On Dec 16, 2022, at 09:48, João Távora <joaotavora@gmail.com> wrote:
> 
> On Fri, Dec 16, 2022 at 3:38 PM Eli Zaretskii <eliz@gnu.org> wrote:
> > From: João Távora <joaotavora@gmail.com>
> > Date: Fri, 16 Dec 2022 15:24:59 +0000
> > Cc: Perry Smith <pedz@easesoftware.com>, emacs-devel@gnu.org
> >
> > I agree this is a problem, especially the language specific parser
> > bits.  Yesterday I tried out tree sitter Emacs on my Arch system.
> > Finding the tree-sitter system lib was easy enough, but finding the C++
> > definition object wasn't so easy.  Eventually I made it,  but it needed
> > compilation from source and a NodeJS toolchain that I didn't know
> > I needed for that.
> 
> No, you don't need a NodeJS toolchain to compile a grammar.  You only
> need to compile the C/C++ source files that are part of the grammar,
> and then link them into a shared library.  I use a simple Makefile to
> build all of them, as the structure of the files and the way to
> compile and link them are identical and boilerplate.  And I definitely
> don't have NodeJS installed here.
> 
> I used:
> 
>    https://aur.archlinux.org/packages/tree-sitter-cpp-git
> 
> which builds with the makepkg tool, and am pretty sure
> it used NodeJS somewhere down the line.  The language
> definition it seems to use is https://github.com/tree-sitter/tree-sitter-cpp
> which also contains a lot of JS stuff.
> 
> Is that where you get your C++ grammar from? Or am I
> looking at an alternate outlet for slightly different grammar?
> If so where do you get your grammars from and can we
> bundle some version of them with Emacs?

This file is in the Emacs source and outlines a lot of good stuff
https://git.savannah.gnu.org/cgit/emacs.git/tree/admin/notes/tree-sitter/starter-guide




* Re: Tree-sitter introduction documentation
  2022-12-16 15:24   ` João Távora
  2022-12-16 15:36     ` Perry Smith
  2022-12-16 15:38     ` Eli Zaretskii
@ 2022-12-16 15:53     ` Manuel Giraud
  2022-12-16 15:56       ` João Távora
  2022-12-16 16:39       ` Eli Zaretskii
  2 siblings, 2 replies; 138+ messages in thread
From: Manuel Giraud @ 2022-12-16 15:53 UTC (permalink / raw)
  To: João Távora; +Cc: Eli Zaretskii, Perry Smith, emacs-devel

João Távora <joaotavora@gmail.com> writes:

[...]

> I agree this is a problem, especially the language specific parser
> bits.  Yesterday I tried out tree sitter Emacs on my Arch system.
> Finding the tree-sitter system lib was easy enough, but finding the C++
> definition object wasn't so easy.  Eventually I made it,  but it needed
> compilation from source and a NodeJS toolchain that I didn't know
> I needed for that.

Wait, what?  I thought that emacs would come with such tree-sitter
language definitions (at least for languages supported by
emacs/treesitter).  I'm on openbsd myself and those language specific
parsers don't seem to be packaged here: "pkg_info -Q sitter" returns
only "tree-sitter-0.20.6p2".  I'm not sure I want to play with NodeJS.

IMHO, should tree-sitter become the new official way to have programming
modes, it has to be easier for the end user, no?
-- 
Manuel Giraud




* Re: Tree-sitter introduction documentation
  2022-12-16 15:53     ` Manuel Giraud
@ 2022-12-16 15:56       ` João Távora
  2022-12-16 16:39       ` Eli Zaretskii
  1 sibling, 0 replies; 138+ messages in thread
From: João Távora @ 2022-12-16 15:56 UTC (permalink / raw)
  To: Manuel Giraud; +Cc: Eli Zaretskii, Perry Smith, emacs-devel


On Fri, Dec 16, 2022 at 3:53 PM Manuel Giraud <manuel@ledu-giraud.fr> wrote:

> João Távora <joaotavora@gmail.com> writes:
>
> [...]
>
> > I agree this is a problem, especially the language specific parser
> > bits.  Yesterday I tried out tree sitter Emacs on my Arch system.
> > Finding the tree-sitter system lib was easy enough, but finding the C++
> > definition object wasn't so easy.  Eventually I made it,  but it needed
> > compilation from source and a NodeJS toolchain that I didn't know
> > I needed for that.
>
> Wait, what?  I thought that emacs would come with such tree-sitter
> language definitions (at least for languages supported by
> emacs/treesitter).  I'm on openbsd myself and those language specific
> parsers don't seem to be packaged here: "pkg_info -Q sitter" returns
> only "tree-sitter-0.20.6p2".  I'm not sure I want to play with NodeJS.
>
> IMHO, should tree-sitter become the new official way to have programming
> modes, it has to be easier for the end user, no?
>

I agree completely, but wait, maybe not.  Maybe I was seriously
overcomplicating, see Eli's response.


* Re: Tree-sitter introduction documentation
  2022-12-16 15:38     ` Eli Zaretskii
  2022-12-16 15:48       ` João Távora
@ 2022-12-16 16:01       ` Manuel Giraud
  2022-12-16 16:40         ` Eli Zaretskii
  1 sibling, 1 reply; 138+ messages in thread
From: Manuel Giraud @ 2022-12-16 16:01 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: João Távora, pedz, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

[...]

> No, you don't need a NodeJS toolchain to compile a grammar.  You only
> need to compile the C/C++ source files that are part of the grammar,
> and then link them into a shared library.  I use a simple Makefile to
> build all of them, as the structure of the files and the way to
> compile and link them are identical and boilerplate.  And I definitely
> don't have NodeJS installed here.

And will this become part of the emacs build process?
-- 
Manuel Giraud




* Re: Tree-sitter introduction documentation
  2022-12-16 15:53         ` Perry Smith
@ 2022-12-16 16:02           ` João Távora
  2022-12-18  9:59             ` Eli Zaretskii
  0 siblings, 1 reply; 138+ messages in thread
From: João Távora @ 2022-12-16 16:02 UTC (permalink / raw)
  To: Perry Smith, Manuel Giraud, Yuan Fu; +Cc: Eli Zaretskii, emacs-devel


On Fri, Dec 16, 2022 at 3:53 PM Perry Smith <pedz@easesoftware.com> wrote:

> This file is in the Emacs source and outlines a lot of good stuff
>
https://git.savannah.gnu.org/cgit/emacs.git/tree/admin/notes/tree-sitter/starter-guide

The most important bit for end users and not mode authors is
"* Install language definitions".  I think that section should be in the
manual even if it means some links to the internet/github for
grabbing Yuan Fu's tool and the grammars.

João


* Re: Tree-sitter introduction documentation
  2022-12-16 15:48       ` João Távora
  2022-12-16 15:53         ` Perry Smith
@ 2022-12-16 16:34         ` Eli Zaretskii
  2022-12-17  0:03           ` Tim Cross
  1 sibling, 1 reply; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-16 16:34 UTC (permalink / raw)
  To: João Távora; +Cc: pedz, emacs-devel

> From: João Távora <joaotavora@gmail.com>
> Date: Fri, 16 Dec 2022 15:48:03 +0000
> Cc: pedz@easesoftware.com, emacs-devel@gnu.org
> 
>  No, you don't need a NodeJS toolchain to compile a grammar.  You only
>  need to compile the C/C++ source files that are part of the grammar,
>  and then link them into a shared library.  I use a simple Makefile to
>  build all of them, as the structure of the files and the way to
>  compile and link them are identical and boilerplate.  And I definitely
>  don't have NodeJS installed here.
> 
> I used:
> 
>    https://aur.archlinux.org/packages/tree-sitter-cpp-git
> 
> which builds with the makepkg tool, and am pretty sure
> it used NodeJS somewhere down the line.  The language
> definition it seems to use is https://github.com/tree-sitter/tree-sitter-cpp
> which also contains a lot of JS stuff.

It might contain JS stuff, but you only need to compile and link the
C/C++ files in the src subdirectory.  You don't need to even look at
the rest.

> Is that where you get your C++ grammar from?

Yes.

> If so where do you get your grammars from and can we 
> bundle some version of them with Emacs?

No, we won't bundle grammar libraries with Emacs.  It is not in the
scope of the Emacs project to provide external libraries; that's for
distros to arrange and for the individual users to install by
themselves.

There are limits to what Emacs as a project can do about using
external libraries and tools.




* Re: Tree-sitter introduction documentation
  2022-12-16 15:53     ` Manuel Giraud
  2022-12-16 15:56       ` João Távora
@ 2022-12-16 16:39       ` Eli Zaretskii
  2022-12-16 17:15         ` Manuel Giraud
  1 sibling, 1 reply; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-16 16:39 UTC (permalink / raw)
  To: Manuel Giraud; +Cc: joaotavora, pedz, emacs-devel

> From: Manuel Giraud <manuel@ledu-giraud.fr>
> Cc: Eli Zaretskii <eliz@gnu.org>,  Perry Smith <pedz@easesoftware.com>,
>   emacs-devel@gnu.org
> Date: Fri, 16 Dec 2022 16:53:45 +0100
> 
> Wait, what?  I thought that emacs would come with such tree-sitter
> language definitions (at least for languages supported by
> emacs/treesitter).

No, we won't provide language grammar libraries with Emacs.  You will
have to download and install them yourself.  They are shared libraries
written in C/C++, and maintained and distributed by people who have no
affiliation with the Emacs project.

> I'm on openbsd myself and those language specific
> parsers don't seem to be packaged here: "pkg_info -Q sitter" returns
> only "tree-sitter-0.20.6p2".

Then please take this up with your distro folks.  Or build the
libraries yourself, it isn't hard.

> I'm not sure I want to play with NodeJS.

You don't need to.

> IMHO, should tree-sitter become the new official way to have programming
> modes, it has to be easier for the end user, no?

Easier, yes.  But up to a point.  We cannot be expected to distribute
third-party libraries instead of the distro.




* Re: Tree-sitter introduction documentation
  2022-12-16 16:01       ` Manuel Giraud
@ 2022-12-16 16:40         ` Eli Zaretskii
  2022-12-16 16:47           ` Perry Smith
  0 siblings, 1 reply; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-16 16:40 UTC (permalink / raw)
  To: Manuel Giraud; +Cc: joaotavora, pedz, emacs-devel

> From: Manuel Giraud <manuel@ledu-giraud.fr>
> Cc: João Távora <joaotavora@gmail.com>,
>   pedz@easesoftware.com,
>   emacs-devel@gnu.org
> Date: Fri, 16 Dec 2022 17:01:49 +0100
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> [...]
> 
> > No, you don't need a NodeJS toolchain to compile a grammar.  You only
> > need to compile the C/C++ source files that are part of the grammar,
> > and then link them into a shared library.  I use a simple Makefile to
> > build all of them, as the structure of the files and the way to
> > compile and link them are identical and boilerplate.  And I definitely
> > don't have NodeJS installed here.
> 
> And will this become part of the emacs build process?

No.  Just like building librsvg or GnuTLS isn't part of the Emacs
build process.  They are external libraries that you need to install
separately.




* Re: Tree-sitter introduction documentation
  2022-12-16 16:40         ` Eli Zaretskii
@ 2022-12-16 16:47           ` Perry Smith
  2022-12-16 17:21             ` Eli Zaretskii
  0 siblings, 1 reply; 138+ messages in thread
From: Perry Smith @ 2022-12-16 16:47 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Manuel Giraud, João Távora, emacs-devel



> On Dec 16, 2022, at 10:40, Eli Zaretskii <eliz@gnu.org> wrote:
> 
> No.  Just like building librsvg or GnuTLS isn't part of the Emacs
> build process.  They are external libraries that you need to install
> separately.

Am I correct in assuming that the libtree-sitter library is conditionally
loaded so Emacs could be built with tree-sitter and distributed but if
the end user’s machine does not have libtree-sitter, Emacs will still
load but (treesit-available-p) will return false?



* Re: Tree-sitter introduction documentation
  2022-12-16 16:39       ` Eli Zaretskii
@ 2022-12-16 17:15         ` Manuel Giraud
  2022-12-16 17:23           ` Eli Zaretskii
  0 siblings, 1 reply; 138+ messages in thread
From: Manuel Giraud @ 2022-12-16 17:15 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: joaotavora, pedz, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

[...]

> Then please take this up with your distro folks.  Or build the
> libraries yourself, it isn't hard.

It might not be hard but, as always with those things, it is another
burden on the user.

[...]

> Easier, yes.  But up to a point.  We cannot be expected to distribute
> third-party libraries instead of the distro.

Ok but with tree-sitter, it feels like there are two levels of
third-party:
        - the tree-sitter library
        - the «language specific part» library

The former seems to be easily available in distro packages (as are
librsvg, libcairo, etc.) but the latter does not seem to be as
accessible.  I hope it will change with time and adoption of tree-sitter
otherwise tree-sitter usage (at least in emacs) will end up being
an «expert» matter.
-- 
Manuel Giraud




* Re: Tree-sitter introduction documentation
  2022-12-16 16:47           ` Perry Smith
@ 2022-12-16 17:21             ` Eli Zaretskii
  0 siblings, 0 replies; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-16 17:21 UTC (permalink / raw)
  To: Perry Smith; +Cc: manuel, joaotavora, emacs-devel

> From: Perry Smith <pedz@easesoftware.com>
> Date: Fri, 16 Dec 2022 10:47:32 -0600
> Cc: Manuel Giraud <manuel@ledu-giraud.fr>,
>  João Távora <joaotavora@gmail.com>,
>  emacs-devel <emacs-devel@gnu.org>
> 
> Am I correct in assuming that the libtree-sitter library is conditionally
> loaded so Emacs could be built with tree-sitter and distributed but if
> the end user’s machine does not have libtree-sitter, Emacs will still
> load but (treesit-available-p) will return false?

That is correct for the language grammar libraries, but not for the
tree-sitter library itself.  Emacs can be built either with or without
the tree-sitter library, just like we do with libpng and other
optional image libraries.

If Emacs was built with the tree-sitter library, then it will attempt
to load the language grammar libraries (which are separate libraries)
dynamically when the corresponding language mode is activated.
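A small sketch for checking whether such a grammar library is present on disk. It assumes the `libtree-sitter-LANG` file naming that Emacs 29 looks for; the directories passed in are only examples, and the real search also honors `treesit-extra-load-path` and the system's dynamic-library locations.

```shell
#!/bin/sh
# Sketch: look for libtree-sitter-LANG.{so,dylib,dll} in the given
# directories and print the first match.  Returns non-zero if none is found.

find_grammar() {
    lang=$1
    shift
    for dir in "$@"; do
        for ext in so dylib dll; do
            f="$dir/libtree-sitter-$lang.$ext"
            if [ -f "$f" ]; then
                echo "$f"
                return 0
            fi
        done
    done
    return 1
}
```

For example: `find_grammar cpp "$HOME/.emacs.d/tree-sitter" /usr/local/lib /usr/lib`.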




* Re: Tree-sitter introduction documentation
  2022-12-16 17:15         ` Manuel Giraud
@ 2022-12-16 17:23           ` Eli Zaretskii
  2022-12-16 20:22             ` Ken Brown
  0 siblings, 1 reply; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-16 17:23 UTC (permalink / raw)
  To: Manuel Giraud; +Cc: joaotavora, pedz, emacs-devel

> From: Manuel Giraud <manuel@ledu-giraud.fr>
> Cc: joaotavora@gmail.com,  pedz@easesoftware.com,  emacs-devel@gnu.org
> Date: Fri, 16 Dec 2022 18:15:38 +0100
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > Easier, yes.  But up to a point.  We cannot be expected to distribute
> > third-party libraries instead of the distro.
> 
> Ok but with tree-sitter, it feels like there are two levels of
> third-party:
>         - the tree-sitter library
>         - the «language specific part» library
> 
> The former seems to be easily available in distro packages (as are
> librsvg, libcairo, etc.) but the latter does not seem to be as
> accessible.  I hope it will change with time and adoption of tree-sitter
> otherwise tree-sitter usage (at least in emacs) will end up being
> an «expert» matter.

I think it indeed will change very soon, as soon as the distros
realize that Emacs 29 needs that to be able to use the -ts- modes.




* Re: Tree-sitter introduction documentation
  2022-12-16 15:06 ` Eli Zaretskii
  2022-12-16 15:24   ` João Távora
@ 2022-12-16 17:23   ` Perry Smith
  2022-12-16 17:31     ` Eli Zaretskii
  1 sibling, 1 reply; 138+ messages in thread
From: Perry Smith @ 2022-12-16 17:23 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel




> On Dec 16, 2022, at 09:06, Eli Zaretskii <eliz@gnu.org> wrote:
> 
> 
>> If not, and others agree, how can I help create one?  I am thinking the entire page should be small and brief with references to more elaborate details on each of the four steps if needed.
> 
> My main problem is where to put this stuff.

There are tree sitter Info pages in the ELisp topic but I would add some to Emacs topic as well.  Perhaps under
  Advanced Features
    Tree Sitter
      Setup and Getting Started
      Font lock — describe the “features” concept, treesit-font-lock-level and treesit-font-lock-recompute-features

The other topics seem more appropriate in ELisp which already exist.



* Re: Tree-sitter introduction documentation
  2022-12-16 17:23   ` Perry Smith
@ 2022-12-16 17:31     ` Eli Zaretskii
  2022-12-16 19:08       ` Perry Smith
  0 siblings, 1 reply; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-16 17:31 UTC (permalink / raw)
  To: Perry Smith; +Cc: emacs-devel

> From: Perry Smith <pedz@easesoftware.com>
> Date: Fri, 16 Dec 2022 11:23:05 -0600
> Cc: emacs-devel@gnu.org
> 
> > My main problem is where to put this stuff.
> 
> There are tree sitter Info pages in the ELisp topic but I would add some to Emacs topic as well.  Perhaps under
>   Advanced Features
>     Tree Sitter
>       Setup and Getting Started
>       Font lock — describe the “features” concept, treesit-font-lock-level and treesit-font-lock-recompute-features
> 
> The other topics seem more appropriate in ELisp which already exist.

This is not relevant to ELisp, so the ELisp Reference manual is not an
appropriate place for this stuff.




* Re: Tree-sitter introduction documentation
  2022-12-16 15:36     ` Perry Smith
  2022-12-16 15:43       ` João Távora
@ 2022-12-16 17:56       ` Philip Kaludercic
  1 sibling, 0 replies; 138+ messages in thread
From: Philip Kaludercic @ 2022-12-16 17:56 UTC (permalink / raw)
  To: Perry Smith; +Cc: João Távora, Eli Zaretskii, emacs-devel

Perry Smith <pedz@easesoftware.com> writes:

>> On Dec 16, 2022, at 09:24, João Távora <joaotavora@gmail.com> wrote:
>> 
>> On Fri, Dec 16, 2022 at 3:06 PM Eli Zaretskii <eliz@gnu.org> wrote:
>> >
>> > > From: Perry Smith <pedz@easesoftware.com>
>> > > Date: Fri, 16 Dec 2022 08:47:18 -0600
>> > >
>> > > There are (I believe) four pieces to get Tree Sitter major modes to work.
>> > >
>> > > Emacs needs to be compiled with tree-sitter enabled
>> > > The tree sitter binary needs to be installed
>> > > The tree sitter language specific parser needs to be installed
>> > > The appropriate major mode needs to be loaded and enabled
>> >
>> > That's correct.
>> 
>> I agree this is a problem, especially the language specific parser
>> bits.  Yesterday I tried out tree sitter Emacs on my Arch system.
>> Finding the tree-sitter system lib was easy enough, but finding the C++
>> definition object wasn't so easy.  Eventually I made it,  but it needed
>> compilation from source and a NodeJS toolchain that I didn't know
>> I needed for that.
>
> This is new news to me.  I downloaded this repository:
> https://github.com/casouri/tree-sitter-module
>
> And did: ./build.sh <lang>

Certainly this is not the intended way to get tree-sitter working, at
least in the long term?  I am not up-to-date with tree-sitter
development in general, but is it planned for distributions to package
the grammars and have them installed on a language-level?  I don't
suppose it would make sense for them to be distributed along with Emacs,
or does it?

> The same scripts are in the Emacs source down this path
> admin/notes/tree-sitter/build-module
>
> Perry




* Re: Tree-sitter introduction documentation
  2022-12-16 17:31     ` Eli Zaretskii
@ 2022-12-16 19:08       ` Perry Smith
  2022-12-16 19:37         ` Eli Zaretskii
  0 siblings, 1 reply; 138+ messages in thread
From: Perry Smith @ 2022-12-16 19:08 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1283 bytes --]

On Dec 16, 2022, at 11:31, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Perry Smith <pedz@easesoftware.com>
>> Date: Fri, 16 Dec 2022 11:23:05 -0600
>> Cc: emacs-devel@gnu.org
>> 
>>> My main problem is where to put this stuff.
>> 
>> There are tree sitter Info pages in the ELisp topic but I would add some to Emacs topic as well.  Perhaps under
>>  Advanced Features
>>    Tree Sitter
>>      Setup and Getting Started
>>      Font lock — describe the “features” concept, treesit-font-lock-level and treesit-font-lock-recompute-features
>> 
>> The other topics seem more appropriate in ELisp which already exist.
> 
> This is not relevant to ELisp, so the ELisp Reference manual is not an
> appropriate place for this stuff.

We might be talking about two different things.  In
admin/notes/tree-sitter/html-manual are 10 html documents.  For
example one is titled "Accessing Node Information".  All of those come
from nodes within the ELisp Info doc. e.g.

https://git.savannah.gnu.org/cgit/emacs.git/tree/doc/lispref/parsing.texi#n875

That seems like the right place for those nodes since they are rather
low level.

The higher level user interface concepts I would think would need to
be in the Emacs Info tree as I suggested.

Perry


[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-16 19:08       ` Perry Smith
@ 2022-12-16 19:37         ` Eli Zaretskii
  2022-12-16 20:05           ` Perry Smith
  0 siblings, 1 reply; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-16 19:37 UTC (permalink / raw)
  To: Perry Smith; +Cc: emacs-devel

> From: Perry Smith <pedz@easesoftware.com>
> Date: Fri, 16 Dec 2022 13:08:45 -0600
> Cc: emacs-devel@gnu.org
> 
> >>> My main problem is where to put this stuff.
> >> 
> >> There are tree sitter Info pages in the ELisp topic but I would add some to Emacs topic as well.  Perhaps under
> >>  Advanced Features
> >>    Tree Sitter
> >>      Setup and Getting Started
> >>      Font lock — describe the “features” concept, treesit-font-lock-level and treesit-font-lock-recompute-features
> >> 
> >> The other topics seem more appropriate in ELisp which already exist.
> > 
> > This is not relevant to ELisp, so the ELisp Reference manual is not an
> > appropriate place for this stuff.
> 
> We might be talking about two different things.  In
> admin/notes/tree-sitter/html-manual are 10 html documents.  For
> example one is titled "Accessing Node Information".  All of those come
> from nodes within the ELisp Info doc. e.g.
> 
> https://git.savannah.gnu.org/cgit/emacs.git/tree/doc/lispref/parsing.texi#n875
> 
> That seems like the right place for those nodes since they are rather
> low level.
> 
> The higher level user interface concepts I would think would need to
> be in the Emacs Info tree as I suggested.

This all started (for me, anyway), when you wrote:

> Emacs needs to be compiled with tree-sitter enabled
> The tree sitter binary needs to be installed
> The tree sitter language specific parser needs to be installed
> The appropriate major mode needs to be loaded and enabled
> 
> Is there a page either in Info or on the web that contains all these steps?

I interpreted "all these steps" as meaning the description of how to
do all of the above: compile Emacs with tree-sitter enabled, install
the tree-sitter library, install the language grammar libraries.  By
contrast, the information to which you point is about writing code
that accesses parsing results provided by the tree-sitter library, a
very different kind of topic.  That topic _is_ described in the ELisp
manual, see the file doc/lispref/parsing.texi.



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-16 19:37         ` Eli Zaretskii
@ 2022-12-16 20:05           ` Perry Smith
  0 siblings, 0 replies; 138+ messages in thread
From: Perry Smith @ 2022-12-16 20:05 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 2317 bytes --]


> On Dec 16, 2022, at 13:37, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Perry Smith <pedz@easesoftware.com>
>> Date: Fri, 16 Dec 2022 13:08:45 -0600
>> Cc: emacs-devel@gnu.org
>> 
>>>>> My main problem is where to put this stuff.
>>>> 
>>>> There are tree sitter Info pages in the ELisp topic but I would add some to Emacs topic as well.  Perhaps under
>>>> Advanced Features
>>>>   Tree Sitter
>>>>     Setup and Getting Started
>>>>     Font lock — describe the “features” concept, treesit-font-lock-level and treesit-font-lock-recompute-features
>>>> 
>>>> The other topics seem more appropriate in ELisp which already exist.
>>> 
>>> This is not relevant to ELisp, so the ELisp Reference manual is not an
>>> appropriate place for this stuff.
>> 
>> We might be talking about two different things.  In
>> admin/notes/tree-sitter/html-manual are 10 html documents.  For
>> example one is titled "Accessing Node Information".  All of those come
>> from nodes within the ELisp Info doc. e.g.
>> 
>> https://git.savannah.gnu.org/cgit/emacs.git/tree/doc/lispref/parsing.texi#n875
>> 
>> That seems like the right place for those nodes since they are rather
>> low level.
>> 
>> The higher level user interface concepts I would think would need to
>> be in the Emacs Info tree as I suggested.
> 
> This all started (for me, anyway), when you wrote:
> 
>> Emacs needs to be compiled with tree-sitter enabled
>> The tree sitter binary needs to be installed
>> The tree sitter language specific parser needs to be installed
>> The appropriate major mode needs to be loaded and enabled
>> 
>> Is there a page either in Info or on the web that contains all these steps?
> 
> I interpreted "all these steps" as meaning the description of how to
> do all of the above: compile Emacs with tree-sitter enabled, install
> the tree-sitter library, install the language grammar libraries.  By
> contrast, the information to which you point is about writing code
> that accesses parsing results provided by the tree-sitter library, a
> very different kind of topic.  That topic _is_ described in the ELisp
> manual, see the file doc/lispref/parsing.texi.


Ah… I see.  I took the “This” of “This is not relevant …” to be about “The other topics …”


[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-16 17:23           ` Eli Zaretskii
@ 2022-12-16 20:22             ` Ken Brown
  2022-12-17  4:06               ` Tim Cross
  0 siblings, 1 reply; 138+ messages in thread
From: Ken Brown @ 2022-12-16 20:22 UTC (permalink / raw)
  To: Eli Zaretskii, Manuel Giraud; +Cc: joaotavora, pedz, emacs-devel

On 12/16/2022 12:23 PM, Eli Zaretskii wrote:
>> From: Manuel Giraud <manuel@ledu-giraud.fr>
>> Ok but with tree-sitter, it feels like there are two levels of
>> third-party:
>>          - the tree-sitter library
>>          - the «language specific part» library
>>
>> The former seems to be easily available in distro's packages (as is
>> librsvg, libcairo, etc.) but the latter does not seem to be as
>> accessible.  I hope it will change with time and adoption of tree-sitter
>> otherwise tree-sitter usage (at least in emacs) will end up being
>> «expert» matter.
> 
> I think it indeed will change very soon, as soon as the distros
> realize that Emacs 29 needs that to be able to use the -ts- modes.

I wonder how well known this is among distro Emacs maintainers.  I did a quick 
internet search and didn't find any indication that any distros have done it 
yet.  Can anyone point me to an example?

Ken



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-16 16:34         ` Eli Zaretskii
@ 2022-12-17  0:03           ` Tim Cross
  2022-12-17  8:42             ` Eli Zaretskii
  0 siblings, 1 reply; 138+ messages in thread
From: Tim Cross @ 2022-12-17  0:03 UTC (permalink / raw)
  To: emacs-devel


Eli Zaretskii <eliz@gnu.org> writes:

>> From: João Távora <joaotavora@gmail.com>
>> Date: Fri, 16 Dec 2022 15:48:03 +0000
>> Cc: pedz@easesoftware.com, emacs-devel@gnu.org
>> 
>>  No, you don't need a NodeJS toolchain to compile a grammar.  You only
>>  need to compile the C/C++ source files that are part of the grammar,
>>  and then link them into a shared library.  I use a simple Makefile to
>>  build all of them, as the structure of the files and the way to
>>  compile and link them are identical and boilerplate.  And I definitely
>>  don't have NodeJS installed here.
>> 
>> I used:
>> 
>>    https://aur.archlinux.org/packages/tree-sitter-cpp-git
>> 
>> which builds with the makepkg tool, and am pretty sure
>> it used NodeJS somewhere down the line.  The language
>> definition it seems to use is https://github.com/tree-sitter/tree-sitter-cpp
>> which also contains a lot of JS stuff.
>
> It might contain JS stuff, but you only need to compile and link the
> C/C++ files in the src subdirectory.  You don't need to even look at
> the rest.
>
>> Is that where you get your C++ grammar from?
>
> Yes.
>
>> If so where do you get your grammars from and can we 
>> bundle some version of them with Emacs?
>
> No, we won't bundle grammar libraries with Emacs.  It is not in the
> scope of the Emacs project to provide external libraries; that's for
> distros to arrange and for the individual users to install by
> themselves.
>
> There are limits to what Emacs as a project can do about using
> external libraries and tools.

Given that the installation of language grammars is reasonably
straightforward (from what you posted earlier), what about adding a package to
GNU ELPA which could facilitate/do the installation of a set of
language grammars (similar to what the
admin/notes/tree-sitter/build-module/batch.sh script does)?

In addition to making it easier to install required dependencies and
avoid the long tail that will likely exist for many distributions in
implementing packages to install these grammars, it would also help
ensure people install grammars with an acceptable license and
avoid/reduce unwittingly installing non-free licensed code.
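
As a rough illustration of the manual build step discussed above
(compile the C/C++ files under the grammar's src directory, then link
them into a shared library), a minimal shell sketch might look like
the following.  The function names, the use of cc and c++ as compiler
commands, and the temporary object directory are assumptions made for
illustration; the scripts shipped in the Emacs source tree may well do
this differently.  The libtree-sitter-LANG file name follows the
naming Emacs searches for, as far as I know.

```shell
#!/bin/sh
# Hypothetical helpers; names and layout are assumptions, not Emacs APIs.

# Print the shared-library file name for a grammar,
# e.g. libtree-sitter-cpp.so (a Windows build would use .dll instead).
grammar_libname() {
    case "$(uname -s)" in
        Darwin) ext=dylib ;;
        *)      ext=so ;;
    esac
    printf 'libtree-sitter-%s.%s\n' "$1" "$ext"
}

# build_grammar LANG SRCDIR OUTDIR
# Compile SRCDIR/parser.c (plus scanner.c or scanner.cc when present)
# and link the objects into OUTDIR/libtree-sitter-LANG.<ext>.
build_grammar() {
    lang=$1
    src=$2
    out=$3
    linker=cc
    objdir=$(mktemp -d) || return 1

    cc -fPIC -I"$src" -c "$src/parser.c" -o "$objdir/parser.o" || return 1
    if [ -f "$src/scanner.c" ]; then
        cc -fPIC -I"$src" -c "$src/scanner.c" -o "$objdir/scanner.o" || return 1
    elif [ -f "$src/scanner.cc" ]; then
        # A C++ scanner needs the C++ driver for compiling and linking.
        c++ -fPIC -I"$src" -c "$src/scanner.cc" -o "$objdir/scanner.o" || return 1
        linker=c++
    fi

    mkdir -p "$out" &&
        "$linker" -fPIC -shared "$objdir"/*.o -o "$out/$(grammar_libname "$lang")"
    status=$?
    rm -rf "$objdir"
    return "$status"
}
```

With a checkout of a grammar repository in place, a call such as
"build_grammar cpp tree-sitter-cpp/src dist" would leave the library
in dist/.  Nothing here needs NodeJS, since only the already generated
C/C++ sources are compiled.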



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-16 20:22             ` Ken Brown
@ 2022-12-17  4:06               ` Tim Cross
  2022-12-17 15:42                 ` Stefan Monnier
  0 siblings, 1 reply; 138+ messages in thread
From: Tim Cross @ 2022-12-17  4:06 UTC (permalink / raw)
  To: emacs-devel


Ken Brown <kbrown@cornell.edu> writes:

> On 12/16/2022 12:23 PM, Eli Zaretskii wrote:
>>> From: Manuel Giraud <manuel@ledu-giraud.fr>
>>> Ok but with tree-sitter, it feels like there are two levels of
>>> third-party:
>>>          - the tree-sitter library
>>>          - the «language specific part» library
>>>
>>> The former seems to be easily available in distro's packages (as is
>>> librsvg, libcairo, etc.) but the latter does not seem to be as
>>> accessible.  I hope it will change with time and adoption of tree-sitter
>>> otherwise tree-sitter usage (at least in emacs) will end up being
>>> «expert» matter.
>> I think it indeed will change very soon, as soon as the distros
>> realize that Emacs 29 needs that to be able to use the -ts- modes.
>
> I wonder how well known this is among distro Emacs maintainers.  I did a quick internet
> search and didn't find any indication that any distros have done it yet.  Can anyone point
> me to an example?
>
> Ken

From what I've been able to find out, some distributions have the tree
sitter libraries as packages, but none have the grammar definition libs
yet.

Given the lag for some distros in adding new packages, there will likely
be a lag of at least 12 months before distros have packaged versions of
language definition libs. For distros with LTS releases, it could mean 2
or 3 years before users see them in their package repos.

Distributions with a rolling release model, like Arch, will likely see
the language definition libs as packages much sooner than the other
distros.



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
@ 2022-12-17  4:50 Payas Relekar
  0 siblings, 0 replies; 138+ messages in thread
From: Payas Relekar @ 2022-12-17  4:50 UTC (permalink / raw)
  To: Ken Brown; +Cc: Eli Zaretskii, Manuel Giraud, joaotavora, pedz, emacs-devel

Ken Brown <kbrown@cornell.edu> writes:

> On 12/16/2022 12:23 PM, Eli Zaretskii wrote:
>> I think it indeed will change very soon, as soon as the distros
>> realize that Emacs 29 needs that to be able to use the -ts- modes.
>
> I wonder how well known this is among distro Emacs maintainers.  I did a quick
> internet search and didn't find any indication that any distros have done it
> yet.  Can anyone point me to an example?

There is a very good chance NixOS will include tree-sitter grammars (at
least for the modes supported by Emacs at time of release) with Emacs
build, with user-configurable way to add more grammars as needed.

Current maintainers for building Emacs master for Nix/NixOS are doing a
good job keeping track of upstream here :
https://github.com/nix-community/emacs-overlay

Thanks,
Payas

--



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-17  0:03           ` Tim Cross
@ 2022-12-17  8:42             ` Eli Zaretskii
  2022-12-17 10:40               ` João Távora
  2022-12-18  0:40               ` Tim Cross
  0 siblings, 2 replies; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-17  8:42 UTC (permalink / raw)
  To: Tim Cross; +Cc: emacs-devel

> From: Tim Cross <theophilusx@gmail.com>
> Date: Sat, 17 Dec 2022 11:03:50 +1100
> 
> Given that the installation of language grammars is reasonably
> straightforward (from what you posted earlier), what about adding a package to
> GNU ELPA which could facilitate/do the installation of a set of
> language grammars (similar to what the
> admin/notes/tree-sitter/build-module/batch.sh script does)?

I don't mind, but then I don't take care of ELPA, so I'm not the guy
to talk to about this.

Up front, it would be a strange kind of "package" for what ELPA is
supposed to hold, and I also don't understand what would package.el do
with such a "package".  But that's me.



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-17  8:42             ` Eli Zaretskii
@ 2022-12-17 10:40               ` João Távora
  2022-12-17 11:00                 ` Eli Zaretskii
  2022-12-18  0:40               ` Tim Cross
  1 sibling, 1 reply; 138+ messages in thread
From: João Távora @ 2022-12-17 10:40 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Tim Cross, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1242 bytes --]

I've just run admin/notes/tree-sitter/build-module/batch.sh and
it worked flawlessly and built many .so files.

Could probably have used Make instead of Bash, but worked fine
regardless, and reasonably quickly.

There's one thing that I think could be added which is that the
admin/notes/tree-sitter/build-module/dist path relative to the
Emacs source directory be pre-set in the variable
treesit-extra-load-path, or something to that effect. Then this could
be even smoother.

But could it be smoother yet?  After all, all that build.sh does
is download some C/C++ code from the internet and compile it.
Why can't we bundle this code with the Emacs
source distribution and build the shared objects as part of the
normal build process?

Bundling is also one way to help us pin the grammar version,
a dependency of our major mode source code.  The way it is
right now, it seems that if the upstream repository introduces
an incompatible change in the Foo grammar, our foo-ts-mode
will break.

The grammar repositories seem to be MIT licensed, and we
already promote its download anyway. I'm not an expert, but
it seems MIT is compatible with GPL, so we could carry some
MIT code in our repo.

João

[-- Attachment #2: Type: text/html, Size: 1656 bytes --]

^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-17 10:40               ` João Távora
@ 2022-12-17 11:00                 ` Eli Zaretskii
  0 siblings, 0 replies; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-17 11:00 UTC (permalink / raw)
  To: João Távora; +Cc: theophilusx, emacs-devel

> From: João Távora <joaotavora@gmail.com>
> Date: Sat, 17 Dec 2022 10:40:35 +0000
> Cc: Tim Cross <theophilusx@gmail.com>, emacs-devel@gnu.org
> 
> Why can't we bundle this code with the Emacs
> source distribution and build the shared objects as part of the 
> normal build process?  

For the same reason we don't do that for any other optional library
that Emacs can be built with.  And no other GNU project I'm aware of
does something like that.

So no, let's not go there.  Once the distros realize Emacs uses these
grammar libraries, they will get their act together and provide those
libraries as packages for users to download and install.  So there's
no significant problem here I see that we need to solve.  I'm not
interested in adding this burden to what we as a project need to do.

> Bundling is also one way to help us pin the grammar version,
> a dependency of our major mode source code.  The way it is
> right now, it seems that if the upstream repository introduces
> an incompatible change in the Foo grammar, our foo-ts-mode
> will break.

I don't think this is a real concern.  People who maintain these
grammar libraries are aware of the dependencies, and should not be
expected to make incompatible changes.



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-17  4:06               ` Tim Cross
@ 2022-12-17 15:42                 ` Stefan Monnier
  2022-12-17 17:41                   ` T.V Raman
  2022-12-26 22:42                   ` Dmitry Gutov
  0 siblings, 2 replies; 138+ messages in thread
From: Stefan Monnier @ 2022-12-17 15:42 UTC (permalink / raw)
  To: Tim Cross; +Cc: emacs-devel

> Distributions with a rolling release model, like Arch, will likely see
> the language definition libs as packages much sooner than the other
> distros.

Why would we think so?  Those grammars have been in use for a while now
(by other editors), AFAIK, so why would the situation change just because
Emacs starts to use them as well?

I do want the situation to change, mind you, so my question here is
how can we encourage the distributions to change: I don't think we can
expect them to change "just because".


        Stefan




^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-17 15:42                 ` Stefan Monnier
@ 2022-12-17 17:41                   ` T.V Raman
  2022-12-26 22:42                   ` Dmitry Gutov
  1 sibling, 0 replies; 138+ messages in thread
From: T.V Raman @ 2022-12-17 17:41 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Tim Cross, emacs-devel

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=gb18030, Size: 570 bytes --]

I think step 1 is to set up the Emacs codebase so that
people packaging Emacs for the various Linux distros build an Emacs
out-of-the-box with Tree Sitter bits included. That necessarily means
bundling some of the grammars in the interim. If the Emacs that comes
with the various distros doesn't support Tree Sitter in a usable form,
the functionality will remain limited to hackers who build Emacs from
source and in the bigger picture --- that is small.

-- 

Thanks,

--Raman(I Search, I Find, I Misplace, I Research)



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-17  8:42             ` Eli Zaretskii
  2022-12-17 10:40               ` João Távora
@ 2022-12-18  0:40               ` Tim Cross
  1 sibling, 0 replies; 138+ messages in thread
From: Tim Cross @ 2022-12-18  0:40 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel


Eli Zaretskii <eliz@gnu.org> writes:

>> From: Tim Cross <theophilusx@gmail.com>
>> Date: Sat, 17 Dec 2022 11:03:50 +1100
>> 
>> Given that the installation of language grammars is reasonably
>> straightforward (from what you posted earlier), what about adding a package to
>> GNU ELPA which could facilitate/do the installation of a set of
>> language grammars (similar to what the
>> admin/notes/tree-sitter/build-module/batch.sh script does)?
>
> I don't mind, but then I don't take care of ELPA, so I'm not the guy
> to talk to about this.
>
> Up front, it would be a strange kind of "package" for what ELPA is
> supposed to hold, and I also don't understand what would package.el do
> with such a "package".  But that's me.

I get your points. My thoughts are that this could be an easy temporary
solution that would provide a smoother initial transition until
distributions do bundle the language grammars.

However, given other editors apart from Emacs use tree sitter, I wonder
why none of the distributions I've looked at are bundling language
grammars? Most of them seem to have packages for the main tree sitter
libs, but none have the language grammars (with the possible exception of
Arch, where I think they are available via the AUR, which, technically,
isn't part of the distribution).

Wasn't there a MELPA package which provided these language grammars? If
so, I guess we know the idea can work. I also note there are other
packages which will automatically install some non-elisp dependencies,
e.g. lsp-mode, ps-tools, all-the-icons etc.

I'd like to stress my point that I would see this as a temporary and
easy to deprecate approach that would help in the adoption of tree
sitter. There are some options on how it would work - it could do a
direct install of pre-compiled binaries, clone and build from git
repositories, provide an install script the user then has to run or some
other combination.

Once the major distributions have packaged the grammars, it would be
expected they would be automatic or possibly optional dependencies for
the main emacs package. At this point, we might decide to deprecate the
package in favour of using distribution based packages (or possibly just
keep it for platforms which don't provide them?).



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-18  6:32 Pedro Andres Aranda Gutierrez
@ 2022-12-18  8:07 ` Eli Zaretskii
  2022-12-18 10:39   ` Pedro Andres Aranda Gutierrez
  0 siblings, 1 reply; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-18  8:07 UTC (permalink / raw)
  To: Pedro Andres Aranda Gutierrez; +Cc: emacs-devel

> From: Pedro Andres Aranda Gutierrez <paaguti@gmail.com>
> Date: Sun, 18 Dec 2022 07:32:50 +0100
> 
> 1.- where will they sit at the end of the day? 

It is IMO best to put them in the same place where you have the rest
of the shared libraries loaded by Emacs.  There's also the special
path treesit-extra-load-path, but my recommendation is to use that
only if you cannot place the grammar libraries together with the rest
of your shared libraries.

> 2.- should I include *all* plug-ins or just the plug-ins I use (mainly Python to get used to it)

Each grammar library is only loaded when the corresponding Emacs mode
is activated.  So if you never turn on some treesit-based mode, you
don't need to have the corresponding grammar library available.
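
To make the two answers above concrete, here is a small shell sketch,
under stated assumptions, that copies only the grammar libraries for
the languages you actually use into a directory of your choice.  The
function name install_grammars and the directory arguments are
hypothetical.  The destination would be either a directory your
system's dynamic loader already searches, or one that you list in
treesit-extra-load-path in your init file.

```shell
#!/bin/sh
# Hypothetical helper; not part of Emacs or of any tree-sitter tooling.

# install_grammars DIST DEST LANG...
# Copy DIST/libtree-sitter-LANG.* into DEST for each LANG given.
# Fails if a requested language has no library in DIST.
install_grammars() {
    dist=$1
    dest=$2
    shift 2
    mkdir -p "$dest" || return 1
    for lang in "$@"; do
        found=no
        # The "." right after the language name keeps "c" from
        # also matching libtree-sitter-cpp.*.
        for lib in "$dist/libtree-sitter-$lang".*; do
            [ -f "$lib" ] || continue
            cp "$lib" "$dest/" || return 1
            found=yes
        done
        if [ "$found" != yes ]; then
            echo "no grammar library for $lang in $dist" >&2
            return 1
        fi
    done
}
```

For example, "install_grammars dist ~/lib/tree-sitter python" would
copy just the Python grammar.  Since a grammar is only loaded when its
mode is activated, leaving the others out costs nothing.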



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-16 16:02           ` João Távora
@ 2022-12-18  9:59             ` Eli Zaretskii
  2022-12-18 14:07               ` Perry Smith
  0 siblings, 1 reply; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-18  9:59 UTC (permalink / raw)
  To: João Távora; +Cc: pedz, manuel, casouri, emacs-devel

> From: João Távora <joaotavora@gmail.com>
> Date: Fri, 16 Dec 2022 16:02:30 +0000
> Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org
> 
> On Fri, Dec 16, 2022 at 3:53 PM Perry Smith <pedz@easesoftware.com> wrote:
>  
> > This file is in the Emacs source and outlines a lot of good stuff
> > https://git.savannah.gnu.org/cgit/emacs.git/tree/admin/notes/tree-sitter/starter-guide
> 
> The most important bit for end users and not mode authors is 
> "* Install language definitions".  I think that section should be in the 
> manual even if it means some links to the internet/github for
> grabbing Yuan Fu's tool and the grammars.

The manual doesn't explain installation, so it isn't the right place.

I will add the necessary information to NEWS, though.



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-18  8:07 ` Eli Zaretskii
@ 2022-12-18 10:39   ` Pedro Andres Aranda Gutierrez
  2022-12-18 11:44     ` Eli Zaretskii
  0 siblings, 1 reply; 138+ messages in thread
From: Pedro Andres Aranda Gutierrez @ 2022-12-18 10:39 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1456 bytes --]

Hi Eli,

thanks a lot for confirming :-)

I'm keeping them in treesit-extra-load-path for my experiments and will
follow whatever comes out of this experiment in the sense of
1) hopefully seeing the parser compilation integrated in the general
compilation process
2) same for packaging.

BTW, your answer may be good for a FAQ ;-)

Thanks again, /PA

On Sun, 18 Dec 2022 at 09:07, Eli Zaretskii <eliz@gnu.org> wrote:

> > From: Pedro Andres Aranda Gutierrez <paaguti@gmail.com>
> > Date: Sun, 18 Dec 2022 07:32:50 +0100
> >
> > 1.- where will they sit at the end of the day?
>
> It is IMO best to put them in the same place where you have the rest
> of the shared libraries loaded by Emacs.  There's also the special
> path treesit-extra-load-path, but my recommendation is to use that
> only if you cannot place the grammar libraries together with the rest
> of your shared libraries.
>
> > 2.- should I include *all* plug-ins or just the plug-ins I use (mainly
> Python to get used to it)
>
> Each grammar library is only loaded when the corresponding Emacs mode
> is activated.  So if you never turn on some treesit-based mode, you
> don't need to have the corresponding grammar library available.
>


-- 
Fragen sind nicht da um beantwortet zu werden,
Fragen sind da um gestellt zu werden
Georg Kreisler

Headaches with a Juju log:
unit-basic-16: 09:17:36 WARNING juju.worker.uniter.operation we should run
a leader-deposed hook here, but we can't yet

[-- Attachment #2: Type: text/html, Size: 2209 bytes --]

^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-18 10:39   ` Pedro Andres Aranda Gutierrez
@ 2022-12-18 11:44     ` Eli Zaretskii
  0 siblings, 0 replies; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-18 11:44 UTC (permalink / raw)
  To: Pedro Andres Aranda Gutierrez; +Cc: emacs-devel

> From: Pedro Andres Aranda Gutierrez <paaguti@gmail.com>
> Date: Sun, 18 Dec 2022 11:39:10 +0100
> Cc: emacs-devel@gnu.org
> 
> I'm keeping them in treesit-extra-load-path for my experiments and will follow whatever comes out of this
> experiment in the sense of 
> 1) hopefully seeing the parser compilation integrated in the general compilation process

I don't think we should compile these libraries as part of building
Emacs, no.

> BTW, your answer may be good for a FAQ ;-)

IME, no one reads the Emacs FAQ.

But I have added this to NEWS.



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-18  9:59             ` Eli Zaretskii
@ 2022-12-18 14:07               ` Perry Smith
  2022-12-18 17:18                 ` Eli Zaretskii
  0 siblings, 1 reply; 138+ messages in thread
From: Perry Smith @ 2022-12-18 14:07 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: João Távora, Manuel Giraud, Yuan Fu, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1036 bytes --]



> On Dec 18, 2022, at 03:59, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: João Távora <joaotavora@gmail.com>
>> Date: Fri, 16 Dec 2022 16:02:30 +0000
>> Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org
>> 
>> On Fri, Dec 16, 2022 at 3:53 PM Perry Smith <pedz@easesoftware.com> wrote:
>> 
>>> This file is in the Emacs source and outlines a lot of good stuff
>>> https://git.savannah.gnu.org/cgit/emacs.git/tree/admin/notes/tree-sitter/starter-guide
>> 
>> The most important bit for end users and not mode authors is
>> "* Install language definitions".  I think that section should be in the
>> manual even if it means some links to the internet/github for
>> grabbing Yuan Fu's tool and the grammars.
> 
> The manual doesn't explain installation, so it isn't the right place.
> 
> I will add the necessary information to NEWS, though.

Temporarily, I put a lot of it in the ruby-ts-mode.el file.  It might help you
a bit:

https://github.com/pedz/ruby-ts-mode/blob/master/ruby-ts-mode.el#L32



[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-18 14:07               ` Perry Smith
@ 2022-12-18 17:18                 ` Eli Zaretskii
  0 siblings, 0 replies; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-18 17:18 UTC (permalink / raw)
  To: Perry Smith; +Cc: joaotavora, manuel, casouri, emacs-devel

> From: Perry Smith <pedz@easesoftware.com>
> Date: Sun, 18 Dec 2022 08:07:02 -0600
> Cc: João Távora <joaotavora@gmail.com>,
>  Manuel Giraud <manuel@ledu-giraud.fr>,
>  Yuan Fu <casouri@gmail.com>,
>  emacs-devel@gnu.org
> 
> > On Dec 18, 2022, at 03:59, Eli Zaretskii <eliz@gnu.org> wrote:
> > 
> >> From: João Távora <joaotavora@gmail.com>
> >> Date: Fri, 16 Dec 2022 16:02:30 +0000
> >> Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org
> >> 
> >> On Fri, Dec 16, 2022 at 3:53 PM Perry Smith <pedz@easesoftware.com> wrote:
> >> 
> >>> This file is in the Emacs source and outlines a lot of good stuff
> >>> https://git.savannah.gnu.org/cgit/emacs.git/tree/admin/notes/tree-sitter/starter-guide
> >> 
> >> The most important bit for end users and not mode authors is
> >> "* Install language definitions".  I think that section should be in the
> >> manual even if it means some links to the internet/github for
> >> grabbing Yuan Fu's tool and the grammars.
> > 
> > The manual doesn't explain installation, so it isn't the right place.
> > 
> > I will add the necessary information to NEWS, though.
> 
> Temporarily, I put a lot of it in the ruby-ts-mode.el file.  It might help you
> a bit:
> 
> https://github.com/pedz/ruby-ts-mode/blob/master/ruby-ts-mode.el#L32

Thanks.  The text in NEWS is already written and committed, please
take a look.



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-17 15:42                 ` Stefan Monnier
  2022-12-17 17:41                   ` T.V Raman
@ 2022-12-26 22:42                   ` Dmitry Gutov
  2022-12-27 12:11                     ` Eli Zaretskii
  1 sibling, 1 reply; 138+ messages in thread
From: Dmitry Gutov @ 2022-12-26 22:42 UTC (permalink / raw)
  To: Stefan Monnier, Tim Cross; +Cc: emacs-devel

On 17/12/2022 17:42, Stefan Monnier wrote:
> Those grammars have been in use for a while now
> (by other editors), AFAIK, so why would the situation change just because
> Emacs starts to use them as well?

I guess "other editors" that use tree-sitter do bundle the grammars?

Or include them in the optional language packages. Or provide built-in 
recipes to install them.

E.g. nvim-treesitter seems to be doing the latter: 
https://github.com/nvim-treesitter/nvim-treesitter#language-parsers

The corresponding recipes look like:

list.ruby = {
   install_info = {
     url = "https://github.com/tree-sitter/tree-sitter-ruby",
     files = { "src/parser.c", "src/scanner.cc" },
   },
   maintainers = { "@TravonteD" },
}




* Re: Tree-sitter introduction documentation
  2022-12-26 22:42                   ` Dmitry Gutov
@ 2022-12-27 12:11                     ` Eli Zaretskii
  2022-12-27 12:43                       ` Dmitry Gutov
  0 siblings, 1 reply; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-27 12:11 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: monnier, theophilusx, emacs-devel

> Date: Tue, 27 Dec 2022 00:42:55 +0200
> Cc: emacs-devel@gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> On 17/12/2022 17:42, Stefan Monnier wrote:
> > Those grammars have been in use for a while now
> > (by other editors), AFAIK, so why would the situation change just because
> > Emacs starts to use them as well?
> 
> I guess "other editors" that use tree-sitter do bundle the grammars?
> 
> Or include them in the optional language packages. Or provide built-in 
> recipes to install them.

WDYT about what we have in NEWS about this?

> E.g. nvim-treesitter seems to be doing the latter: 
> https://github.com/nvim-treesitter/nvim-treesitter#language-parsers
> 
> The corresponding recipes look like:
> 
> list.ruby = {
>    install_info = {
>      url = "https://github.com/tree-sitter/tree-sitter-ruby",
>      files = { "src/parser.c", "src/scanner.cc" },
>    },
>    maintainers = { "@TravonteD" },
> }

It sounds like a non-trivial maintenance burden to keep this kind of
DB up-to-date.  So I'm not sure we should do this in the upstream
project.

But if Someone(TM) wants to provide Emacs commands to download,
compile, and install a grammar library, I see no reason not to add
that to Emacs.  This could be part of treesit.el, for example.

One condition, though: please implement the commands in Emacs Lisp,
without invoking any shell scripting features (which shouldn't be
needed to begin with), just by using compiler, Emacs commands and
functions that deal with files, and (possibly) Git.

Also, if we provide some list of grammar libraries we support
officially and their respective sites, that list should include only
libraries with a Free license.




* Re: Tree-sitter introduction documentation
  2022-12-27 12:11                     ` Eli Zaretskii
@ 2022-12-27 12:43                       ` Dmitry Gutov
  2022-12-27 13:38                         ` Eli Zaretskii
                                           ` (2 more replies)
  0 siblings, 3 replies; 138+ messages in thread
From: Dmitry Gutov @ 2022-12-27 12:43 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, theophilusx, emacs-devel

On 27/12/2022 14:11, Eli Zaretskii wrote:
>> Date: Tue, 27 Dec 2022 00:42:55 +0200
>> Cc: emacs-devel@gnu.org
>> From: Dmitry Gutov <dgutov@yandex.ru>
>>
>> On 17/12/2022 17:42, Stefan Monnier wrote:
>>> Those grammars have been in use for a while now
>>> (by other editors), AFAIK, so why would the situation change just because
>>> Emacs starts to use them as well?
>>
>> I guess "other editors" that use tree-sitter do bundle the grammars?
>>
>> Or include them in the optional language packages. Or provide built-in
>> recipes to install them.
> 
> WDYT about what we have in NEWS about this?

Those instructions seem to be written foremost with distro maintainers 
in mind. Definitely better to have them than not, but I'd hate to 
present them to the average user.

Do we expect all (most?) distros to compile all the popular grammars?

That would still leave out the users of the less popular languages whose 
grammars were not included. Or grammars which saw updates since the 
distro-distributed version (so it's useful to install the newer version).

>> E.g. nvim-treesitter seems to be doing the latter:
>> https://github.com/nvim-treesitter/nvim-treesitter#language-parsers
>>
>> The corresponding recipes look like:
>>
>> list.ruby = {
>>     install_info = {
>>       url = "https://github.com/tree-sitter/tree-sitter-ruby",
>>       files = { "src/parser.c", "src/scanner.cc" },
>>     },
>>     maintainers = { "@TravonteD" },
>> }
> 
> It sounds like a non-trivial maintenance burden to keep this kind of
> DB up-to-date.  So I'm not sure we should do this in the upstream
> project.
> 
> But if Someone(TM) wants to provide Emacs commands to download,
> compile, and install a grammar library, I see no reason not to add
> that to Emacs.  This could be part of treesit.el, for example.

I wouldn't worry too much about the maintenance burden (keeping the list 
of urls up-to-date?), especially since we could refer to such lists by 
other projects.

I think ELPA is a better place for this feature, though. Because we 
always want the user to get the latest version of the recipes.

Or if we put this into treesit.el, it would be better to keep the set of 
recipes as a separate package, so the updates to it can be put into ELPA 
(as a :core package).

> One condition, though: please implement the commands in Emacs Lisp,
> without invoking any shell scripting features (which shouldn't be
> needed to begin with), just by using compiler, Emacs commands and
> functions that deal with files, and (possibly) Git.
> 
> Also, if we provide some list of grammar libraries we support
> officially and their respective sites, that list should include only
> libraries with a Free license.

Makes sense.




* Re: Tree-sitter introduction documentation
  2022-12-27 12:43                       ` Dmitry Gutov
@ 2022-12-27 13:38                         ` Eli Zaretskii
  2022-12-27 14:11                           ` Dmitry Gutov
  2022-12-27 13:51                         ` tomas
  2022-12-27 15:58                         ` Stefan Monnier
  2 siblings, 1 reply; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-27 13:38 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: monnier, theophilusx, emacs-devel

> Date: Tue, 27 Dec 2022 14:43:06 +0200
> Cc: monnier@iro.umontreal.ca, theophilusx@gmail.com, emacs-devel@gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> > WDYT about what we have in NEWS about this?
> 
> Those instructions seem to be written foremost with distro maintainers 
> in mind.

Or users who build and install Emacs by themselves.

Frankly, I don't see why we, the upstream project, need to worry about
anyone else.  It isn't our job.  That's what distros are there for.

> Definitely better to have them than not, but I'd hate to present
> them to the average user.

"Hate"?  That's a strong word.  The questions that the NEWS entries
answer were asked here and elsewhere several times, so presumably that
information has some non-trivial value.

> Do we expect all (most?) distros to compile all the popular grammars?

I honestly don't know.  On the one hand, there aren't many Emacs modes
which use tree-sitter, but OTOH they could start growing like
mushrooms once Emacs 29 hits the streets.  I do expect them to offer
the ones they consider useful/needed, for some value of that.  I
really don't see any significant difference in this regard between
grammar libraries and, say, librsvg.  Both are used in Emacs, and the
lack of either disables useful Emacs features.  So it's a no-brainer
for me.  But then I'm not a distro maintainer, and never have been.

> That would still leave out the users of the less popular languages whose 
> grammars were not included. Or grammars which saw updates since the 
> distro-distributed version (so it's useful to install the newer version).

What's the solution?  All the "solutions" I saw until now require a
working and well-configured C/C++ compiler (sometimes both C and C++),
linker, and C/C++ runtimes.  A user who has them already installed can
easily build a grammar library with two simple commands.  A user who
doesn't have a C/C++ development environment will not find those
"solutions" useful at all.  And asking us to distribute binaries for
half a dozen popular systems is IMNSHO unreasonable.

> > It sounds like a non-trivial maintenance burden to keep this kind of
> > DB up-to-date.  So I'm not sure we should do this in the upstream
> > project.
> > 
> > But if Someone(TM) wants to provide Emacs commands to download,
> > compile, and install a grammar library, I see no reason not to add
> > that to Emacs.  This could be part of treesit.el, for example.
> 
> I wouldn't worry too much about the maintenance burden (keeping the list 
> of urls up-to-date?), especially since we could refer to such lists by 
> other projects.

I cannot disagree more.  Look at this from my POV: once the list
becomes even semi-official, people will expect it to be of the same
high quality as all the rest of Emacs, and they _will_ complain and
report inaccuracies.  It's a nuisance, especially for such a "hot"
feature set.

And which "other projects"? who can track those and know which ones
have the most accurate, up-to-date, and comprehensive list?  I'm a bit
interested in this (and have several dozens of grammar libraries built
locally), and I discover another project with a useful list of
grammars almost every day.  These things are highly dynamic: I see
some of the grammars get updates every couple of days.  Some languages
have more than one grammar library maintained by different people --
who will figure out which one is better for us and keep that
information up-to-date?

> I think ELPA is a better place for this feature, though. Because we 
> always want the user to get the latest version of the recipes.

That solves only part of the problem.  (And not an important part: our
Git repository is public, so people can track it and download updates
for files as easily as they track ELPA.)  The hard part -- keeping the
information accurate and up-to-date -- still needs a motivated
volunteer.  And we hardly have resources to work on our code and docs,
let alone help people install external software.

(Of course, if such a motivated volunteer steps forward, he or she
will be most welcome.)




* Re: Tree-sitter introduction documentation
  2022-12-27 12:43                       ` Dmitry Gutov
  2022-12-27 13:38                         ` Eli Zaretskii
@ 2022-12-27 13:51                         ` tomas
  2022-12-27 15:58                         ` Stefan Monnier
  2 siblings, 0 replies; 138+ messages in thread
From: tomas @ 2022-12-27 13:51 UTC (permalink / raw)
  To: emacs-devel


On Tue, Dec 27, 2022 at 02:43:06PM +0200, Dmitry Gutov wrote:

[...]

> Those instructions seem to be written foremost with distro maintainers in
> mind. Definitely better to have them than not, but I'd hate to present them
> to the average user.

(1) Don't underestimate the "average user"
(2) Give them the chance to grow
(2a) (the strong version) Tickle their curiosity

Cheers
-- 
t



* Re: Tree-sitter introduction documentation
  2022-12-27 13:38                         ` Eli Zaretskii
@ 2022-12-27 14:11                           ` Dmitry Gutov
  2022-12-27 14:32                             ` Eli Zaretskii
  0 siblings, 1 reply; 138+ messages in thread
From: Dmitry Gutov @ 2022-12-27 14:11 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, theophilusx, emacs-devel

On 27/12/2022 15:38, Eli Zaretskii wrote:
>> Date: Tue, 27 Dec 2022 14:43:06 +0200
>> Cc: monnier@iro.umontreal.ca, theophilusx@gmail.com, emacs-devel@gnu.org
>> From: Dmitry Gutov <dgutov@yandex.ru>
>>
>>> WDYT about what we have in NEWS about this?
>>
>> Those instructions seem to be written foremost with distro maintainers
>> in mind.
> 
> Or users who build and install Emacs by themselves.
> 
> Frankly, I don't see why we, the upstream project, need to worry about
> anyone else.  It isn't our job.  That's what distros are there for.

Previously all one needed for a language support mode was to download and 
load an .el file. As we drift to the idea of using externally-maintained 
grammars, and the "native" modes become less useful (possibly deprecated 
in 5-10 years), it seems like it will become more of our responsibility 
to streamline the installation.

>> Definitely better to have them than not, but I'd hate to present
>> them to the average user.
> 
> "Hate"?  That's a strong word.

Also a figure of speech.

> The questions that the NEWS entries
> answer were asked here and elsewhere several times, so presumably that
> information has some non-trivial value.

Of course.

>> Do we expect all (most?) distros to compile all the popular grammars?
> 
> I honestly don't know.  On the one hand, there aren't many Emacs modes
> which use tree-sitter, but OTOH they could start growing like
> mushrooms once Emacs 29 hits the streets.  I do expect them to offer
> the ones they consider useful/needed, for some value of that.  I
> really don't see any significant difference in this regard between
> grammar libraries and, say, librsvg.  Both are used in Emacs, and the
> lack of either disables useful Emacs features.  So it's a no-brainer
> for me.  But then I'm not a distro maintainer, and never have been.

On the flip side, the third-party modes can provide their own 
download-compile-install instructions, which will make it easier for end 
users.

The barrier to creating such a mode, though, is now higher.

>> That would still leave out the users of the less popular languages whose
>> grammars were not included. Or grammars which saw updates since the
>> distro-distributed version (so it's useful to install the newer version).
> 
> What's the solution?  All the "solutions" I saw until now require a
> working and well-configured C/C++ compiler (sometimes both C and C++),
> linker, and C/C++ runtimes.  A user who has them already installed can
> easily build a grammar library with two simple commands.  A user who
> doesn't have a C/C++ development environment will not find those
> "solutions" useful at all.  And asking us to distribute binaries for
> half a dozen popular systems is IMNSHO unreasonable.

I think it's common enough for a user to have build tools installed, but 
not know well enough how to set up a C project. Think junior-middle 
developers in a number of languages which are not C.

Or just first-year students.

>> I wouldn't worry too much about the maintenance burden (keeping the list
>> of urls up-to-date?), especially since we could refer to such lists by
>> other projects.
> 
> I cannot disagree more.  Look at this from my POV: once the list
> becomes even semi-official, people will expect it to be of the same
> high quality as all the rest of Emacs, and they _will_ complain and
> report inaccuracies.  It's a nuisance, especially for such a "hot"
> feature set.

They will report inaccuracies, which will be helpful to fixing them. 
That is certainly a workload, but still small compared to the current 
flow of bug reports, I think. Or the many hours one would spend fixing a 
font- or redisplay-related problem.

Anyway, my point was not to put this burden on you specifically. If you 
might recall, I've always advocated toward "smaller core with many 
plugins" as a model of Emacs development.

> And which "other projects"? who can track those and know which ones
> have the most accurate, up-to-date, and comprehensive list?  I'm a bit
> interested in this (and have several dozens of grammar libraries built
> locally), and I discover another project with a useful list of
> grammars almost every day.  These things are highly dynamic: I see
> some of the grammars get updates every couple of days.  Some languages
> have more than one grammar library maintained by different people --
> who will figure out which one is better for us and keep that
> information up-to-date?

The Neovim repo will likely be a good resource for this in the near 
future. This file in particular: 
https://github.com/nvim-treesitter/nvim-treesitter/blob/f2b1d727e6ad46238baa84c4d1f968a297e415ab/lua/nvim-treesitter/parsers.lua

But it brings me to another concern, showcased by this commit: 
https://github.com/nvim-treesitter/nvim-treesitter/commit/0cb637ca9f4389172933e5aba36387ab8430b6fb

The AST for one version of a grammar might be incompatible enough with a 
newer one, making the TS queries, font-lock and indentation rules 
obsolete or at least slightly broken. nvim-treesitter works around this 
by locking the repository version of a grammar corresponding to the 
current language support code.

How much of a problem will this be for us in practice? I'm not sure. 
Perhaps most popular grammars have had enough time to mature by now.

>> I think ELPA is a better place for this feature, though. Because we
>> always want the user to get the latest version of the recipes.
> 
> That solves only part of the problem.  (And not an important part: our
> Git repository is public, so people can track it and download updates
> for files as easily as they track ELPA.)

One is not exactly like the other from an end user's POV.

> The hard part -- keeping the
> information accurate and up-to-date -- still needs a motivated
> volunteer.  And we hardly have resources to work on our code and docs,
> let alone help people install external software.
> 
> (Of course, if such a motivated volunteer steps forward, he or she
> will be most welcome.)

My guess is we have a few people here already who might be interested.




* Re: Tree-sitter introduction documentation
  2022-12-27 14:11                           ` Dmitry Gutov
@ 2022-12-27 14:32                             ` Eli Zaretskii
  2022-12-27 16:36                               ` Stefan Monnier
  0 siblings, 1 reply; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-27 14:32 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: monnier, theophilusx, emacs-devel

> Date: Tue, 27 Dec 2022 16:11:22 +0200
> Cc: monnier@iro.umontreal.ca, theophilusx@gmail.com, emacs-devel@gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> > Frankly, I don't see why we, the upstream project, need to worry about
> > anyone else.  It isn't our job.  That's what distros are there for.
> 
> Previously all one needed for a language support mode was to download and 
> load an .el file. As we drift to the idea of using externally-maintained 
> grammars, and the "native" modes become less useful (possibly deprecated 
> in 5-10 years), it seems like it will become more of our responsibility 
> to streamline the installation.

Please hold your horses, and let's keep things in their proper
perspective.  We've just introduced this feature, and have a
relatively small number of very "young" new modes using it.  We
haven't yet released even a single Emacs version with that feature,
and don't know whether migration to it will be slow and take years, or
fast as a bush-fire.  We don't know whether the fact we depend on
grammar libraries is a good or a bad thing for Emacs (we are not like
other projects, so it matters).  We don't know whether users will like
these new modes.

In this situation, rushing to some decisions that will have
long-standing maintenance consequences is premature at best.

> >> Do we expect all (most?) distros to compile all the popular grammars?
> > 
> > I honestly don't know.  On the one hand, there aren't many Emacs modes
> > which use tree-sitter, but OTOH they could start growing like
> > mushrooms once Emacs 29 hits the streets.  I do expect them to offer
> > the ones they consider useful/needed, for some value of that.  I
> > really don't see any significant difference in this regard between
> > grammar libraries and, say, librsvg.  Both are used in Emacs, and the
> > lack of either disables useful Emacs features.  So it's a no-brainer
> > for me.  But then I'm not a distro maintainer, and never have been.
> 
> On the flip side, the third-party modes can provide their own 
> download-compile-install instructions, which will make it easier for end 
> users.
> 
> The barrier to creating such a mode, though, is now higher.

Not to creating it, to using it.  People who develop such modes are few
and far between, and generally won't have any trouble finding and
building a grammar library.

The barrier is also somewhat higher when we introduce support for a
new image format, but we don't hesitate then, do we?

> > What's the solution?  All the "solutions" I saw until now require a
> > working and well-configured C/C++ compiler (sometimes both C and C++),
> > linker, and C/C++ runtimes.  A user who has them already installed can
> > easily build a grammar library with two simple commands.  A user who
> > doesn't have a C/C++ development environment will not find those
> > "solutions" useful at all.  And asking us to distribute binaries for
> > half a dozen popular systems is IMNSHO unreasonable.
> 
> I think it's common enough for a user to have build tools installed, but 
> not know well enough how to set up a C project. Think junior-middle 
> developers in a number of languages which are not C.

It doesn't need any project, it is literally two command lines.
Here's an example:

  gcc -O2 -I.   -c -o parser.o parser.c
  gcc  -shared parser.o scanner.o  -ltree-sitter -o libtree-sitter-c-sharp.dll
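The two commands above assume scanner.o already exists; a grammar
with an external scanner needs one more compile step, and a C++
scanner needs g++ for both that step and the link.  A rough sketch of
the full sequence follows, written as a dry run that only prints the
commands.  The function name grammar_build_cmds, the -fPIC flag (for
ELF systems), and the default "so" extension are my assumptions, not
something any grammar repository prescribes.

```shell
# Print (not run) the commands that build one grammar library.
# Usage: grammar_build_cmds LANG SRCDIR [EXT]
# Assumes SRCDIR holds parser.c plus an optional scanner.c or
# scanner.cc; EXT defaults to "so" (pass "dll" or "dylib" elsewhere).
grammar_build_cmds () {
  lang=$1 src=$2 ext=${3:-so}
  objs="parser.o"
  linker=gcc
  echo "gcc -O2 -fPIC -I$src -c -o parser.o $src/parser.c"
  if [ -f "$src/scanner.cc" ]; then
    # A C++ scanner must be compiled and linked with g++.
    echo "g++ -O2 -fPIC -I$src -c -o scanner.o $src/scanner.cc"
    objs="$objs scanner.o"
    linker=g++
  elif [ -f "$src/scanner.c" ]; then
    echo "gcc -O2 -fPIC -I$src -c -o scanner.o $src/scanner.c"
    objs="$objs scanner.o"
  fi
  # -ltree-sitter mirrors the link line in the example above.
  echo "$linker -shared $objs -ltree-sitter -o libtree-sitter-$lang.$ext"
}
```

Calling it as "grammar_build_cmds c-sharp src dll" inside a checkout
prints the command lines to review before running them by hand.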

> > I cannot disagree more.  Look at this from my POV: once the list
> > becomes even semi-official, people will expect it to be of the same
> > high quality as all the rest of Emacs, and they _will_ complain and
> > report inaccuracies.  It's a nuisance, especially for such a "hot"
> > feature set.
> 
> They will report inaccuracies, which will be helpful to fixing them. 
> That is certainly a workload, but still small compared to the current 
> flow of bug reports, I think. Or the many hours one would spend fixing a 
> font- or redisplay-related problem.

Small, but not small enough.  Wading through sites, comparing
information, looking up licenses, tracking new grammar libraries --
this doesn't take hours, but it doesn't take seconds, either.  And
it's something you must do frequently enough to stay on top of
things.  It's a significant job.

> Anyway, my point was not to put this burden on you specifically. If you 
> might recall, I've always advocated toward "smaller core with many 
> plugins" as a model of Emacs development.

That's a different disagreement between us.

> The Neovim repo will likely be a good resource for this in the near 
> future.

What about "far future"?  Today it's neovim, tomorrow it's something
else.  Again, it's a burden we are better without.  Unless Someone
steps forward, of course.

> The AST for one version of a grammar might be incompatible enough with a 
> newer one, making the TS queries, font-lock and indentation rules 
> obsolete or at least slightly broken. nvim-treesitter works around this 
> by locking the repository version of a grammar corresponding to the 
> current language support code.
> 
> How much of a problem will this be for us in practice? I'm not sure. 

Neither am I.  We'll have to wait and see.

> > (Of course, if such a motivated volunteer steps forward, he or she
> > will be most welcome.)
> 
> My guess is we have a few people here already who might be interested.

They will be welcome.




* Re: Tree-sitter introduction documentation
  2022-12-27 12:43                       ` Dmitry Gutov
  2022-12-27 13:38                         ` Eli Zaretskii
  2022-12-27 13:51                         ` tomas
@ 2022-12-27 15:58                         ` Stefan Monnier
  2 siblings, 0 replies; 138+ messages in thread
From: Stefan Monnier @ 2022-12-27 15:58 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Eli Zaretskii, theophilusx, emacs-devel

> I think ELPA is a better place for this feature, though. Because we always
> want the user to get the latest version of the recipes.

A GNU ELPA package that helps install those grammars would make
sense, indeed.  At the same time there are some issues:
- This one package probably can't know about all the grammars needed by
  all the major modes out there.
- As you noted, the major mode is actually tied (typically via node
  names in the queries) to a particular grammar, so keeping that info
  separately from the major mode doesn't sound ideal.

I suspect we do need some `treesit-grammars` package that helps install
those grammars, and I think it makes a lot of sense for it to be a :core
package (or to be bundled in the Emacs tarball) so users don't need to
install that package before they get to install the grammars.

Maybe that package should also hold the locations of the grammars used
by the major modes that are in core.

But for major modes that aren't in core, it makes more sense to keep the
grammar location info with the major mode (and use the
`treesit-grammars` package to take care of fetching, compiling, and
storing the result at the appropriate place).


        Stefan





* Re: Tree-sitter introduction documentation
  2022-12-27 14:32                             ` Eli Zaretskii
@ 2022-12-27 16:36                               ` Stefan Monnier
  2022-12-27 16:44                                 ` Philip Kaludercic
  2022-12-27 17:10                                 ` Eli Zaretskii
  0 siblings, 2 replies; 138+ messages in thread
From: Stefan Monnier @ 2022-12-27 16:36 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Dmitry Gutov, theophilusx, emacs-devel

> It doesn't need any project, it is literally two command lines.
> Here's an example:
>
>   gcc -O2 -I.   -c -o parser.o parser.c
>   gcc  -shared parser.o scanner.o  -ltree-sitter -o libtree-sitter-c-sharp.dll

AFAIK `parser.c` is a file generated from the actual grammar's source,
itself written in Javascript.

So the above instructions are akin to downloading a precompiled binary
and installing it.  While it is the most convenient path for the
end-users, it's important w.r.t Freedom to make sure that grammars can
also be regenerated from source by the end users.
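For reference, regenerating the parser from the grammar source can be
sketched as below.  This assumes the tree-sitter command-line tool is
installed (as far as I know it needs Node.js to evaluate grammar.js)
and that the checkout has the usual layout, with grammar.js at the top
and the generated file in src/parser.c.  The function name
regenerate_parser and its dry-run argument are made up for
illustration.

```shell
# Regenerate src/parser.c from grammar.js when it is missing or older.
# Usage: regenerate_parser DIR [echo]
# Passing "echo" as the second argument prints the command instead of
# running it.
regenerate_parser () {
  dir=$1 run=${2:-}
  if [ ! -f "$dir/grammar.js" ]; then
    echo "no grammar.js in $dir" >&2
    return 1
  fi
  if [ ! -f "$dir/src/parser.c" ] ||
     [ "$dir/grammar.js" -nt "$dir/src/parser.c" ]; then
    # "tree-sitter generate" reads grammar.js from the current
    # directory and writes the generated sources under src/.
    (cd "$dir" && $run tree-sitter generate)
  else
    echo "src/parser.c is up to date"
  fi
}
```

After that, src/parser.c is compiled exactly as in the two gcc
commands quoted above.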


        Stefan





* Re: Tree-sitter introduction documentation
  2022-12-27 16:36                               ` Stefan Monnier
@ 2022-12-27 16:44                                 ` Philip Kaludercic
  2022-12-27 17:16                                   ` Eli Zaretskii
  2022-12-30 11:06                                   ` Yuan Fu
  2022-12-27 17:10                                 ` Eli Zaretskii
  1 sibling, 2 replies; 138+ messages in thread
From: Philip Kaludercic @ 2022-12-27 16:44 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, Dmitry Gutov, theophilusx, emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> It doesn't need any project, it is literally two command lines.
>> Here's an example:
>>
>>   gcc -O2 -I.   -c -o parser.o parser.c
>>   gcc  -shared parser.o scanner.o  -ltree-sitter -o libtree-sitter-c-sharp.dll
>
> AFAIK `parser.c` is a file generated from the actual grammar's source,
> itself written in Javascript.
>
> So the above instructions are akin to downloading a precompiled binary
> and installing it.  While it is the most convenient path for the
> end-users, it's important w.r.t Freedom to make sure that grammars can
> also be regenerated from source by the end users.

I have asked the question before, but freedom or not, the above is a
nuisance to run for every language.  If the process is as automatic as
the above example demonstrates, shouldn't Emacs have a command to take a
grammar and compile+install it?  I guess this could be more complicated
if the grammar is generated using a custom tool-chain for each language
(or is it always Javascript?), but nothing impossible.

What would be even better is if the grammars were to be distributed
along with Emacs (either in a tarball or as a dependency that a package
manager would install).




* Re: Tree-sitter introduction documentation
  2022-12-27 16:36                               ` Stefan Monnier
  2022-12-27 16:44                                 ` Philip Kaludercic
@ 2022-12-27 17:10                                 ` Eli Zaretskii
  2022-12-27 17:31                                   ` Stefan Monnier
  1 sibling, 1 reply; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-27 17:10 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: dgutov, theophilusx, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Dmitry Gutov <dgutov@yandex.ru>,  theophilusx@gmail.com,
>   emacs-devel@gnu.org
> Date: Tue, 27 Dec 2022 11:36:35 -0500
> 
> > It doesn't need any project, it is literally two command lines.
> > Here's an example:
> >
> >   gcc -O2 -I.   -c -o parser.o parser.c
> >   gcc  -shared parser.o scanner.o  -ltree-sitter -o libtree-sitter-c-sharp.dll
> 
> AFAIK `parser.c` is a file generated from the actual grammar's source,
> itself written in Javascript.
> 
> So the above instructions are akin to downloading a precompiled binary
> and installing it.  While it is the most convenient path for the
> end-users, it's important w.r.t Freedom to make sure that grammars can
> also be regenerated from source by the end users.

When you clone the Git repository of those grammar libraries (which
AFAIK is the only way to get their sources), you get all the source
files, including the Javascript sources of the grammar, the corpus of
text that they used, the test files, etc.  You also get the C/C++
sources of the parser and the scanner (produced from the grammar
files), which you then need to compile and link into a library.

So in the above you are barking up the wrong tree, and you should know
me better than lecture me on software freedom and what it means.




* Re: Tree-sitter introduction documentation
  2022-12-27 16:44                                 ` Philip Kaludercic
@ 2022-12-27 17:16                                   ` Eli Zaretskii
  2022-12-27 17:20                                     ` Philip Kaludercic
  2022-12-27 17:33                                     ` Stefan Monnier
  2022-12-30 11:06                                   ` Yuan Fu
  1 sibling, 2 replies; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-27 17:16 UTC (permalink / raw)
  To: Philip Kaludercic; +Cc: monnier, dgutov, theophilusx, emacs-devel

> From: Philip Kaludercic <philipk@posteo.net>
> Cc: Eli Zaretskii <eliz@gnu.org>,  Dmitry Gutov <dgutov@yandex.ru>,
>   theophilusx@gmail.com,  emacs-devel@gnu.org
> Date: Tue, 27 Dec 2022 16:44:02 +0000
> 
> I have asked the question before, but freedom or not, the above is a
> nuisance to run for every language.

Which is why distros should make them available, like they do with
other dependencies.

> If the process is as automatic as the above example demonstrates,
> shouldn't Emacs have a command to take a grammar and compile+install
> it?

It _is_ as simple as above (modulo the need to detect C++ sources and
use g++ in that case).  And I already said that such a command will
have its place in Emacs (and Stefan agrees, AFAIU).  So there should
be no argument anymore about the possibility; it's just a matter of
Someone sitting down and coding the darn thing.  Like everything else
in Emacs.

> What would be even better is if the grammars were to be distributed
> along with Emacs (either in a tarball or as a dependency a package
> manager would install).

Not going to happen, for the reasons I already explained several
times.  In a nutshell: we cannot and will not distribute binaries, and
distributing sources is not useful enough to justify the maintenance
headaches of distributing someone else's code.




* Re: Tree-sitter introduction documentation
  2022-12-27 17:16                                   ` Eli Zaretskii
@ 2022-12-27 17:20                                     ` Philip Kaludercic
  2022-12-27 18:06                                       ` Eli Zaretskii
  2022-12-27 17:33                                     ` Stefan Monnier
  1 sibling, 1 reply; 138+ messages in thread
From: Philip Kaludercic @ 2022-12-27 17:20 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, dgutov, theophilusx, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Philip Kaludercic <philipk@posteo.net>
>> Cc: Eli Zaretskii <eliz@gnu.org>,  Dmitry Gutov <dgutov@yandex.ru>,
>>   theophilusx@gmail.com,  emacs-devel@gnu.org
>> Date: Tue, 27 Dec 2022 16:44:02 +0000
>> 
>> I have asked the question before, but freedom or not, the above is a
>> nuisance to run for every language.
>
> Which is why distros should make them available, like they do with
> other dependencies.

OK, then I hope this will improve soon, because it seems that at least
under Debian (testing) no such packages exist right now.

>> If the process is as automatic as the above example demonstrates,
>> shouldn't Emacs have a command to take a grammar and compile+install
>> it?
>
> It _is_ as simple as above (modulo the need to detect C++ sources and
> use g++ in that case).  And I already said that such a command will
> have its place in Emacs (and Stefan agrees, AFAIU).  So there should
> be no argument anymore about the possibility; it's just a matter of
> Someone sitting down and coding the darn thing.  Like everything else
> in Emacs.

I'd be glad to help, if I knew what the interface/input/output should
be.  But I'm guessing that is half the work.

>> What would be even better is if the grammars were to be distributed
>> along with Emacs (either in a tarball or as a dependency a package
>> manager would install).
>
> Not going to happen, for the reasons I already explained several
> times.  In a nutshell: we cannot and will not distribute binaries, and
> distributing sources is not useful enough to justify the maintenance
> headaches of distributing someone else's code.

Great, thanks for the explanation.




* Re: Tree-sitter introduction documentation
  2022-12-27 17:10                                 ` Eli Zaretskii
@ 2022-12-27 17:31                                   ` Stefan Monnier
  2022-12-27 18:08                                     ` Eli Zaretskii
  0 siblings, 1 reply; 138+ messages in thread
From: Stefan Monnier @ 2022-12-27 17:31 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dgutov, theophilusx, emacs-devel

> When you clone the Git repository of those grammar libraries (which
> AFAIK is the only way to get their sources), you get all the source
> files, including the Javascript sources of the grammar, the corpus of
> text that they used, the test files, etc.  You also get the C/C++
> sources of the parser and the scanner (produced from the grammar
> files), which you then need to compile and link into a library.
>
> So in the above you are barking up the wrong tree, and you should know
> me better than to lecture me on software freedom and what it means.

I did not intend to lecture you at all (I do know better than that :-)
I was pointing out that our helper functions should support installing
not just from the pregenerated `.c` file but also from the
actual source.

Whether they're stored at the same place or not isn't directly relevant.

[ The paranoid among us might point out that there's no guarantee the
  `.c` file actually matches the accompanying sources, and that maybe
  *we* should generate the `.c` files and distribute them from our own
  repository.  ]


        Stefan





* Re: Tree-sitter introduction documentation
  2022-12-27 17:16                                   ` Eli Zaretskii
  2022-12-27 17:20                                     ` Philip Kaludercic
@ 2022-12-27 17:33                                     ` Stefan Monnier
  1 sibling, 0 replies; 138+ messages in thread
From: Stefan Monnier @ 2022-12-27 17:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Philip Kaludercic, dgutov, theophilusx, emacs-devel

> Which is why distros should make them available, like they do with
> other dependencies.
[...]
> It _is_ as simple as above (modulo the need to detect C++ sources and
> use g++ in that case).  And I already said that such a command will
> have its place in Emacs (and Stefan agrees, AFAIU).  So there should

Indeed I agree, although it will further reduce the pressure on
distros to step up to the plate.


        Stefan





* Re: Tree-sitter introduction documentation
  2022-12-27 17:20                                     ` Philip Kaludercic
@ 2022-12-27 18:06                                       ` Eli Zaretskii
  0 siblings, 0 replies; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-27 18:06 UTC (permalink / raw)
  To: Philip Kaludercic; +Cc: monnier, dgutov, theophilusx, emacs-devel

> From: Philip Kaludercic <philipk@posteo.net>
> Cc: monnier@iro.umontreal.ca,  dgutov@yandex.ru,  theophilusx@gmail.com,
>   emacs-devel@gnu.org
> Date: Tue, 27 Dec 2022 17:20:29 +0000
> 
> > It _is_ as simple as above (modulo the need to detect C++ sources and
> > use g++ in that case).  And I already said that such a command will
> > have its place in Emacs (and Stefan agrees, AFAIU).  So there should
> > be no argument anymore about the possibility; it's just a matter of
> > Someone sitting down and coding the darn thing.  Like everything else
> > in Emacs.
> 
> I'd be glad to help, if I knew what the interface/input/output should
> be.  But I'm guessing that is half the work.

I already did that half ;-)  Here's a simple Makefile I use to build
(almost) all the grammar libraries:

  PARSER_NAME = $(notdir $(abspath $(CURDIR)/..))
  DLL = $(addsuffix .dll,$(addprefix lib,$(PARSER_NAME)))

  CPPSRC := $(filter-out schema.generated.cc,$(filter-out binding.cc,$(wildcard *.cc)))
  SRC := $(wildcard *.c)
  SRC += $(CPPSRC)
  OBJ := $(addsuffix .o,$(basename $(SRC)))

  CC = gcc
  ifeq (, $(CPPSRC))
	  CCLD = gcc
  else
	  CCLD = g++
  endif
  CFLAGS = -O2 -I.
  CXXFLAGS = -O2 -I.

  all: $(DLL)

  $(DLL): $(OBJ)
	  $(CCLD) $(LDFLAGS) -shared $^ $(LDLIBS) -ltree-sitter -o $@

This is for Windows, thus "DLL" and stuff, not .so.  Also, Posix
platforms need -fPIC compiler switch.
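
For Posix systems, the same logic could be sketched in shell roughly as
below.  This is only an illustration, not tested against real grammars:
the function name grammar_build_commands is made up, it prints the
commands instead of running them, and it assumes that cc and c++ are the
compiler drivers and that the library should be named
libtree-sitter-LANG.so.  The -ltree-sitter from the Makefile is left out
on the assumption that it is not needed on these platforms; add it back
if the link step complains.

```shell
# Hypothetical helper: print the compile and link commands for the
# sources in a grammar's src/ directory ($2), for language name $1.
# Nothing is executed; pipe the output to sh once it looks right.
grammar_build_commands() {
    lang=$1
    srcdir=$2
    linker=cc          # switched to c++ as soon as a C++ source is seen
    objs=

    for f in "$srcdir"/*.c; do
        [ -e "$f" ] || continue            # glob matched nothing
        echo "cc -O2 -fPIC -I$srcdir -c $f -o ${f%.c}.o"
        objs="$objs ${f%.c}.o"
    done

    for f in "$srcdir"/*.cc; do
        [ -e "$f" ] || continue
        # Same exclusions as the filter-out calls in the Makefile.
        case ${f##*/} in
            binding.cc|schema.generated.cc) continue ;;
        esac
        echo "c++ -O2 -fPIC -I$srcdir -c $f -o ${f%.cc}.o"
        objs="$objs ${f%.cc}.o"
        linker=c++
    done

    echo "$linker -shared$objs -o libtree-sitter-$lang.so"
}
```

Something like "grammar_build_commands c src | sh" would then run the
printed commands.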

There are a couple of grammars that need a slightly more complex
approach, because their repositories produce more than one shared
library.  One notable example is tree-sitter-typescript (which Emacs
uses).

HTH




* Re: Tree-sitter introduction documentation
  2022-12-27 17:31                                   ` Stefan Monnier
@ 2022-12-27 18:08                                     ` Eli Zaretskii
  2022-12-27 18:44                                       ` Stefan Monnier
  2022-12-27 19:53                                       ` Dmitry Gutov
  0 siblings, 2 replies; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-27 18:08 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: dgutov, theophilusx, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: dgutov@yandex.ru,  theophilusx@gmail.com,  emacs-devel@gnu.org
> Date: Tue, 27 Dec 2022 12:31:24 -0500
> 
> [ The paranoid among us might point out that there's no guarantee the
>   `.c` file actually matches the accompanying sources, and that maybe
>   *we* should generate the `.c` files and distribute them from our own
>   repository.  ]

Good luck with that!  It requires installing Node.js and other stuff,
which I personally won't touch with a 3-mile pole, and I'm unsure (but
never bothered to find out) whether free replacements even exist.




* Re: Tree-sitter introduction documentation
  2022-12-27 18:08                                     ` Eli Zaretskii
@ 2022-12-27 18:44                                       ` Stefan Monnier
  2022-12-27 20:06                                         ` Philip Kaludercic
  2022-12-28 12:56                                         ` Gregory Heytings
  2022-12-27 19:53                                       ` Dmitry Gutov
  1 sibling, 2 replies; 138+ messages in thread
From: Stefan Monnier @ 2022-12-27 18:44 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dgutov, theophilusx, emacs-devel

Eli Zaretskii [2022-12-27 20:08:32] wrote:
> Good luck with that!  It requires installing Node.js and other stuff,
> which I personally won't touch with a 3-mile pole, and I'm unsure (but
> never bothered to find out) whether free replacements even exist.

That's the part of Tree-sitter which makes me a bit uneasy.

In Emacs, we usually don't just stick to the letter of the Free Software
principles (e.g. release our code under the GPL) but we try to go the
extra mile so that our users can actually exercise their rights as
easily as possible (e.g. with things like `C-h k` that lets them jump
straight to the source, ...).

But with Tree-sitter, our users are currently a bit stuck.  The grammars
are Free, yes, but it takes a fair bit of extra work (and potentially
even some code with unclear licensing along the way) if you want to
modify their source code and use the result :-(

Maybe building our own `.c` grammars from source would be a way for us
to face up to this reality :-(


        Stefan





* Re: Tree-sitter introduction documentation
  2022-12-27 18:08                                     ` Eli Zaretskii
  2022-12-27 18:44                                       ` Stefan Monnier
@ 2022-12-27 19:53                                       ` Dmitry Gutov
  2023-01-01  3:03                                         ` Richard Stallman
  1 sibling, 1 reply; 138+ messages in thread
From: Dmitry Gutov @ 2022-12-27 19:53 UTC (permalink / raw)
  To: Eli Zaretskii, Stefan Monnier; +Cc: theophilusx, emacs-devel

On 27/12/2022 20:08, Eli Zaretskii wrote:
> and I'm unsure (but
> never bothered to find out) whether free replacements even exist

Not 100% sure what you mean by that: Node.js uses the MIT/Expat license. 
That's Free software.

A specific grammar might use some proprietary modules, but that's just 
something for us to verify.




* Re: Tree-sitter introduction documentation
  2022-12-27 18:44                                       ` Stefan Monnier
@ 2022-12-27 20:06                                         ` Philip Kaludercic
  2022-12-27 21:13                                           ` Stefan Monnier
  2022-12-28 12:56                                         ` Gregory Heytings
  1 sibling, 1 reply; 138+ messages in thread
From: Philip Kaludercic @ 2022-12-27 20:06 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, dgutov, theophilusx, emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> Eli Zaretskii [2022-12-27 20:08:32] wrote:
>> Good luck with that!  It requires installing Node.js and other stuff,
>> which I personally won't touch with a 3-mile pole, and I'm unsure (but
>> never bothered to find out) whether free replacements even exist.
>
> That's the part of Tree-sitter which makes me a bit uneasy.

I took a look at the C grammar, and it doesn't appear to use any
fancy Javascript stuff:

  https://github.com/tree-sitter/tree-sitter-c/blob/0720f9c2af2a97dcd0e9ed90324d1baba68b2849/grammar.js

Depending on who or what loads these files, it might be possible to use
something nice like quickjs (https://bellard.org/quickjs/) to generate
the intermediate C files.

The documentation at

  https://tree-sitter.github.io/tree-sitter/creating-parsers

indicates that the tree-sitter toolchain is written in Rust (yet another
build-time dependency), but unless I am mistaken, it just invokes "node"

  https://github.com/tree-sitter/tree-sitter/blob/9866674cf87fcd1cd7e424eecdbf260f8947a784/cli/src/generate/mod.rs#L169

and pipes this file

  https://github.com/tree-sitter/tree-sitter/blob/master/cli/src/generate/dsl.js

into the process.  The output of this script is JSON, and it seems like
pretty vanilla Javascript.  If I manage to build QuickJS, I'll try and
see if this can be used to process a grammar.

The remaining part that generates the C code is written in Rust, and
translates the JSON output into C:

  https://github.com/tree-sitter/tree-sitter/blob/9866674cf87fcd1cd7e424eecdbf260f8947a784/cli/src/generate/mod.rs#L116

and

  https://github.com/tree-sitter/tree-sitter/blob/9866674cf87fcd1cd7e424eecdbf260f8947a784/cli/src/generate/render.rs#L95

It should be possible to port this, but I question if it is worth the
effort.  As Stefan said, it really looks like something that
distributions should take care of -- though considering that Emacs isn't
the first editor with TreeSitter support, I wonder why this hasn't
happened yet. 




* Re: Tree-sitter introduction documentation
  2022-12-27 20:06                                         ` Philip Kaludercic
@ 2022-12-27 21:13                                           ` Stefan Monnier
  2022-12-28  2:52                                             ` Yuan Fu
  0 siblings, 1 reply; 138+ messages in thread
From: Stefan Monnier @ 2022-12-27 21:13 UTC (permalink / raw)
  To: Philip Kaludercic; +Cc: Eli Zaretskii, dgutov, theophilusx, emacs-devel

> It should be possible to port this, but I question if it is worth
> the effort.

I think it's worth the effort in order to help empower our users to make
changes to their grammars.  Otherwise we're back to grammars whose
source is legally-speaking Free but that most of our users wouldn't
know how to change.

> though considering that Emacs isn't the first editor with TreeSitter
> support, I wonder why this hasn't happened yet. 

My guess is lack of motivation on one side (most editors using
Tree-sitter already provide built-in support to automatically install
relevant grammars, which is even simpler (but not empowering) for the
end users since they don't need administrator access to install the
relevant grammars).

On the other side is probably the difficulty of packaging Rust and JS
libraries which tend to be horribly misbehaved w.r.t what distributions
expect (with things like vendoring or dependencies on very specific
versions of libraries).


        Stefan





* Re: Tree-sitter introduction documentation
  2022-12-27 21:13                                           ` Stefan Monnier
@ 2022-12-28  2:52                                             ` Yuan Fu
  2022-12-28 13:10                                               ` Gregory Heytings
                                                                 ` (2 more replies)
  0 siblings, 3 replies; 138+ messages in thread
From: Yuan Fu @ 2022-12-28  2:52 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: Philip Kaludercic, Eli Zaretskii, Dmitry Gutov, Tim Cross,
	emacs-devel



> On Dec 27, 2022, at 1:13 PM, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> 
>> It should be possible to port this, but I question if it is worth
>> the effort.
> 
> I think it's worth the effort in order to help empower our users to make
> changes to their grammars.  Otherwise we're back to grammars whose
> source is legally-speaking Free but that most of our users wouldn't
> know how to change.

The “DSL” used to describe language grammar is reasonably straightforward, and our manual explains it to some degree (see the end of section 37.1 Tree-sitter Language Definitions). Though one probably needs some additional knowledge on writing parsers to work on the grammar.

The cli doesn’t have any dependencies, so only node itself is required. One should only need to run

npm install tree-sitter-cli

and

tree-sitter generate

to convert a grammar.js to parser.c. (I didn’t try this, but this is what the documentation says.)

The javascript converter itself seems pretty straightforward, too, so it’s probably not very hard to port it to something else: https://github.com/tree-sitter/tree-sitter/blob/master/cli/npm/dsl.d.ts

> 
>> though considering that Emacs isn't the first editor with TreeSitter
>> support, I wonder why this hasn't happened yet. 
> 
> My guess is lack of motivation on one side (most editors using
> Tree-sitter already provide built-in support to automatically install
> relevant grammars, which is even simpler (but not empowering) for the
> end users since they don't need administrator access to install the
> relevant grammars).
> 
> On the other side is probably the difficulty of packaging Rust and JS
> libraries which tend to be horribly misbehaved w.r.t what distributions
> expect (with things like vendoring or dependencies on very specific
> versions of libraries).

For tree-sitter, the dependency is pretty sane, with just node and a C/C++ compiler you can convert grammar.js to a loadable library.

Yuan



* Re: Tree-sitter introduction documentation
  2022-12-27 18:44                                       ` Stefan Monnier
  2022-12-27 20:06                                         ` Philip Kaludercic
@ 2022-12-28 12:56                                         ` Gregory Heytings
  2022-12-28 14:41                                           ` Stefan Monnier
  1 sibling, 1 reply; 138+ messages in thread
From: Gregory Heytings @ 2022-12-28 12:56 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, dgutov, theophilusx, emacs-devel


>> Good luck with that!  It requires installing Node.js and other stuff,
>> which I personally won't touch with a 3-mile pole, and I'm unsure (but 
>> never bothered to find out) whether free replacements even exist.
>
> That's the part of Tree-sitter which makes me a bit uneasy.
>
> In Emacs, we usually don't just stick to the letter of the Free Software 
> principles (e.g. release our code under the GPL) but we try to go the 
> extra mile so that our users can actually exercise their rights as
> easily as possible (e.g. with things like `C-h k` that lets them jump 
> straight to the source, ...).
>
> But with Tree-sitter, our users are currently a bit stuck.  The grammars 
> are Free, yes, but it takes a fair bit of extra work (and potentially 
> even some code with unclear licensing along the way) if you want to 
> modify their source code and use the result :-(
>

I think that's an exaggeration.  On a Debian-based system, building a 
grammar from source is as easy as typing three commands:

$ apt install gcc nodejs rust-all
$ cargo install tree-sitter-cli
$ tree-sitter generate grammar.js

Step 1 installs the necessary tools, all of which are free software (GPL 
for GCC, MIT for Node.js, dual MIT and Apache for Rust).

Step 2 downloads the source code of tree-sitter and of all its 
dependencies, and builds tree-sitter.  Again it is, and its dependencies 
are, free software (MIT for tree-sitter, and for its dependencies usually 
dual MIT and Apache like Rust itself, sometimes BSD or Zlib or public 
domain).

Step 3 converts the grammar.js file into the parser.c file.

Now you can use GCC to build the library.
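
Put together, steps 2 and 3 plus the final compile could look roughly
like the shell sketch below.  The helper names (missing_tools,
build_from_grammar_js) are invented for illustration, the compile line
assumes a grammar that has only src/parser.c and no external scanner,
and the libtree-sitter-LANG.so output name is an assumption about what
the caller wants.

```shell
# Hypothetical helper: print every tool from the argument list that
# cannot be found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Hypothetical helper: regenerate src/parser.c from grammar.js and
# compile it into a shared library.
#   $1 = checkout of a grammar repository (contains grammar.js)
#   $2 = language name, used only for the output file name
build_from_grammar_js() {
    missing=$(missing_tools node tree-sitter cc)
    if [ -n "$missing" ]; then
        echo "missing tools:" $missing >&2
        return 1
    fi
    (
        cd "$1" || exit 1
        # Writes src/parser.c (and the tree_sitter/ headers) from grammar.js.
        tree-sitter generate || exit 1
        # Assumes no scanner.c or scanner.cc; those would need extra steps.
        cc -O2 -fPIC -Isrc -shared src/parser.c -o "libtree-sitter-$2.so"
    )
}
```

A call such as "build_from_grammar_js ./tree-sitter-c c" would then
leave libtree-sitter-c.so in the checkout.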





* Re: Tree-sitter introduction documentation
  2022-12-28  2:52                                             ` Yuan Fu
@ 2022-12-28 13:10                                               ` Gregory Heytings
  2022-12-28 13:38                                               ` Lynn Winebarger
  2022-12-29 11:14                                               ` Philip Kaludercic
  2 siblings, 0 replies; 138+ messages in thread
From: Gregory Heytings @ 2022-12-28 13:10 UTC (permalink / raw)
  To: Yuan Fu
  Cc: Stefan Monnier, Philip Kaludercic, Eli Zaretskii, Dmitry Gutov,
	Tim Cross, emacs-devel


>
> The cli doesn't have any dependencies, so only node itself is required. 
> One should only need to run
>
> npm install tree-sitter-cli
>

Doing that is easier (it is not necessary to install the Rust toolchain), 
but it downloads and installs a binary executable of tree-sitter, e.g. 
from

https://github.com/tree-sitter/tree-sitter/releases/download/v0.20.7/tree-sitter-linux-x64.gz





* Re: Tree-sitter introduction documentation
  2022-12-28  2:52                                             ` Yuan Fu
  2022-12-28 13:10                                               ` Gregory Heytings
@ 2022-12-28 13:38                                               ` Lynn Winebarger
  2022-12-28 14:41                                                 ` Danny Freeman
  2022-12-29 11:14                                               ` Philip Kaludercic
  2 siblings, 1 reply; 138+ messages in thread
From: Lynn Winebarger @ 2022-12-28 13:38 UTC (permalink / raw)
  To: Yuan Fu
  Cc: Stefan Monnier, Philip Kaludercic, Eli Zaretskii, Dmitry Gutov,
	Tim Cross, emacs-devel


On Tue, Dec 27, 2022, 9:53 PM Yuan Fu <casouri@gmail.com> wrote:

>
>
> > On Dec 27, 2022, at 1:13 PM, Stefan Monnier <monnier@iro.umontreal.ca>
> wrote:
> >
> >> It should be possible to port this, but I question if it is worth
> >> the effort.
> >
> > I think it's worth the effort in order to help empower our users to make
> > changes to their grammars.  Otherwise we're back to grammars whose
> > source is legally-speaking Free but that most of our users wouldn't
> > know how to change.
>
> The “DSL” used to describe language grammar is reasonably straightforward,
> and our manual explains it to some degree (see the end of section 37.1
> Tree-sitter Language Definitions). Though one probably needs some additional
> knowledge on writing parsers to work on the grammar.
>

The problem is that the "cli" written in Rust is the parser generator.  I
looked into trying tree-sitter last summer, but gave up when I discovered
the Rust tool chain isn't available for cygwin.
A cursory inspection doesn't show me why the author goes to the length of
using JavaScript and nodejs, since all the real work appears to be in
Rust.
I suspect that Stefan and Eli would prefer a solution that decoupled the
use of libtreesitter from the tool that generates the shared library with
the parser tables (and whatever else is required) that the tree sitter
library loads.

Generating GLR automata is well-understood, but tree-sitter appears to have
some additional functionality in its parser generation.  How much of that
is required for libtreesitter to function is another question that would
need to be understood.

This is where the culture of free software, where the process of building
software is expected to be inclusive of the user-developer's local system,
and corporate open-source, which requires repeatable processes and
controlled build environments, are at loggerheads.  Chromium (and its
component projects) is not encumbered by a software license, but its build
process is a nightmare for any individual that wants to validate or
control, at least in principle, all software running on their system.  I'm
only using Chromium as an example with which I'm somewhat familiar.  It
sounds like Rust incorporates some bias towards "continuous integration"
builds as well.

Lynn



* Re: Tree-sitter introduction documentation
  2022-12-28 13:38                                               ` Lynn Winebarger
@ 2022-12-28 14:41                                                 ` Danny Freeman
  0 siblings, 0 replies; 138+ messages in thread
From: Danny Freeman @ 2022-12-28 14:41 UTC (permalink / raw)
  To: Lynn Winebarger
  Cc: Yuan Fu, Stefan Monnier, Philip Kaludercic, Eli Zaretskii,
	Dmitry Gutov, Tim Cross, emacs-devel


Lynn Winebarger <owinebar@gmail.com> writes:
>
> The problem is that the "cli" written in Rust is the parser generator.  I
> looked into trying tree-sitter last summer, but gave up when I discovered
> the Rust tool chain isn't available for cygwin.
> A cursory inspection doesn't show me why the author goes to the length of
> using JavaScript and nodejs, since all the real work appears to be in
> Rust.
> I suspect that Stefan and Eli would prefer a solution that decoupled the
> use of libtreesitter from the tool that generates the shared library with
> the parser tables (and whatever else is required) that the tree sitter
> library loads.
>
> Generating GLR automata is well-understood, but tree-sitter appears to have
> some additional functionality in its parser generation.  How much of that
> is required for libtreesitter to function is another question that would
> need to be understood.
>
> This is where the culture of free software, where the process of building
> software is expected to be inclusive of the user-developer's local system,
> and corporate open-source, which requires repeatable processes and
> controlled build environments, are at loggerheads.  Chromium (and its
> component projects) is not encumbered by a software license, but its build
> process is a nightmare for any individual that wants to validate or
> control, at least in principle, all software running on their system.  I'm
> only using Chromium as an example with which I'm somewhat familiar.  It
> sounds like Rust incorporates some bias towards "continuous integration"
> builds as well.
>
> Lynn


Something I haven't seen mentioned here that might be relevant to the
conversation is the tree-sitter 1.0 checklist from last year:
https://github.com/tree-sitter/tree-sitter/issues/930

It's not been updated in a while, so I'm not sure what the status is,
but one of the items on the list is:

> - Mergeable Git Repos - Make it easier to collaborate on grammars by removing generated files from version control.

which means anyone cloning the repository with the intention of
installing it would be required to have more than just a C compiler to
get started. They would need tree-sitter-cli and its dependencies
installed to make things work.

I'm assuming that tree-sitter maintainers will work their way through
this checklist one day, so maybe it's best to operate under the
assumption that this change is coming to the grammar repos in the near
future.


Also, nixos packages tree-sitter grammars, and their distribution of the
Emacs master branch uses these successfully. As an end user, I find it 
convenient to have the distro provide these.
https://search.nixos.org/packages?query=tree-sitter-grammars
Although I might be worried about breaking changes to the grammars
and distros getting out of sync with what the treesitter major modes
expect.

-- 
Danny Freeman




* Re: Tree-sitter introduction documentation
  2022-12-28 12:56                                         ` Gregory Heytings
@ 2022-12-28 14:41                                           ` Stefan Monnier
  0 siblings, 0 replies; 138+ messages in thread
From: Stefan Monnier @ 2022-12-28 14:41 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: Eli Zaretskii, dgutov, theophilusx, emacs-devel

> I think that's an exaggeration.  On a Debian-based system, building
> a grammar from source is as easy as typing three commands:
>
> $ apt install gcc nodejs rust-all
> $ cargo install tree-sitter-cli
> $ tree-sitter generate grammar.js

Great!  Then it should be easy to write the thingy I suggested.


        Stefan





* Re: Tree-sitter introduction documentation
  2022-12-28  2:52                                             ` Yuan Fu
  2022-12-28 13:10                                               ` Gregory Heytings
  2022-12-28 13:38                                               ` Lynn Winebarger
@ 2022-12-29 11:14                                               ` Philip Kaludercic
  2022-12-29 15:27                                                 ` Gregory Heytings
  2022-12-30  1:01                                                 ` Gregory Heytings
  2 siblings, 2 replies; 138+ messages in thread
From: Philip Kaludercic @ 2022-12-29 11:14 UTC (permalink / raw)
  To: Yuan Fu; +Cc: Stefan Monnier, Eli Zaretskii, Dmitry Gutov, Tim Cross,
	emacs-devel

Yuan Fu <casouri@gmail.com> writes:

>>> though considering that Emacs isn't the first editor with TreeSitter
>>> support, I wonder why this hasn't happened yet. 
>> 
>> My guess is lack of motivation on one side (most editors using
>> Tree-sitter already provide built-in support to automatically install
>> relevant grammars, which is even simpler (but not empowering) for the
>> end users since they don't need administrator access to install the
>> relevant grammars).
>> 
>> On the other side is probably the difficulty of packaging Rust and JS
>> libraries which tend to be horribly misbehaved w.r.t what distributions
>> expect (with things like vendoring or dependencies on very specific
>> versions of libraries).
>
> For tree-sitter, the dependency is pretty sane, with just node and a
> C/C++ compiler you can convert grammar.js to a loadable library.

Do you know how strong the dependency on node is?  As I said before, it
seems that it is possible to evaluate the grammar files that use the DSL
using something like quickjs as well, which is easier to build (or at
least I have bad experiences with installing tools around the Node
culture).  If grammar specifications really just stick to the DSL, then
this should be fine, but it appears that they can also load arbitrary
node.js libraries -- the node invocation doesn't appear to inhibit
this -- which would complicate the process as well as the build
procedure.




* Re: Tree-sitter introduction documentation
  2022-12-29 11:14                                               ` Philip Kaludercic
@ 2022-12-29 15:27                                                 ` Gregory Heytings
  2022-12-29 15:40                                                   ` Lynn Winebarger
                                                                     ` (2 more replies)
  2022-12-30  1:01                                                 ` Gregory Heytings
  1 sibling, 3 replies; 138+ messages in thread
From: Gregory Heytings @ 2022-12-29 15:27 UTC (permalink / raw)
  To: Philip Kaludercic
  Cc: Yuan Fu, Stefan Monnier, Eli Zaretskii, Dmitry Gutov, Tim Cross,
	emacs-devel


>
> Do you know how strong the dependency on node is?  As I said before, it 
> seems that it is possible to evaluate the grammar files that use the DSL 
> using something like quickjs as well
>

That's not possible, no, at least not without a lot of complications that 
do not seem worth the price, compared to installing Node.js.  And note 
that even if that were feasible, it would only solve the first half of the 
problem: to transform a grammar.js file into its corresponding parser.c 
file, you also need the tree-sitter command line program.

>
> but it appears that it should be possible for them to also load 
> arbitrary node.js libraries as well
>

Indeed, grammar authors are not limited to the standard Node.js API, they 
can import other libraries.  For example, tree-sitter-toml requires the 
regexp-util library.




^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-29 15:27                                                 ` Gregory Heytings
@ 2022-12-29 15:40                                                   ` Lynn Winebarger
  2022-12-29 21:50                                                     ` [SPAM UNSURE] " Stephen Leake
  2022-12-29 15:45                                                   ` Tree-sitter introduction documentation Philip Kaludercic
  2022-12-29 16:32                                                   ` Eli Zaretskii
  2 siblings, 1 reply; 138+ messages in thread
From: Lynn Winebarger @ 2022-12-29 15:40 UTC (permalink / raw)
  To: Gregory Heytings
  Cc: Philip Kaludercic, Yuan Fu, Stefan Monnier, Eli Zaretskii,
	Dmitry Gutov, Tim Cross, emacs-devel

On Thu, Dec 29, 2022 at 10:28 AM Gregory Heytings <gregory@heytings.org> wrote:
> > Do you know how strong the dependency on node is?  As I said before, it
> > seems that it is possible to evaluate the grammar files that use the DSL
> > using something like quickjs as well
> >
>
> That's not possible, no, at least not without a lot of complications that
> do not seem worth the price, compared to installing Node.js.  And note
> that even if that were feasible, it would only solve the first half of the
> problem: to transform a grammar.js file into its corresponding parser.c
> file, you also need the tree-sitter command line program.

Maybe a better question is - is it possible to adapt the semantic
parser generators (or others in emacs) to create the ".c" files for
use with libtreesitter?
The functionality of libtreesitter is probably useful independent of
the tool used to create the module it loads, as long as it satisfies
the functional requirements.  Would the treesitter authors be amenable
to establishing a documented ABI for that component so other
parser-generators could target it?

Lynn



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-29 15:27                                                 ` Gregory Heytings
  2022-12-29 15:40                                                   ` Lynn Winebarger
@ 2022-12-29 15:45                                                   ` Philip Kaludercic
  2022-12-29 17:00                                                     ` Gregory Heytings
  2022-12-29 16:32                                                   ` Eli Zaretskii
  2 siblings, 1 reply; 138+ messages in thread
From: Philip Kaludercic @ 2022-12-29 15:45 UTC (permalink / raw)
  To: Gregory Heytings
  Cc: Yuan Fu, Stefan Monnier, Eli Zaretskii, Dmitry Gutov, Tim Cross,
	emacs-devel

Gregory Heytings <gregory@heytings.org> writes:

>>
>> Do you know how strong the dependency on node is?  As I said before,
>> it seems that it is possible to evaluate the grammar files that use
>> the DSL using something like quickjs as well
>>
>
> That's not possible, no, at least not without a lot of complications
> that do not seem worth the price, compared to installing Node.js.  And
> note that even if that were feasible, it would only solve the first
> half of the problem: to transform a grammar.js file into its
> corresponding parser.c file, you also need the tree-sitter command
> line program.

Not necessarily, that could also be ported to JavaScript.  That being
said, I don't imagine it to be an easy process.  I am probably
underestimating how much of the code is shared in a library and how much
is generated.

>> but it appears that it should be possible for them to also load
>> arbitrary node.js libraries as well
>>
>
> Indeed, grammar authors are not limited to the standard Node.js API,
> they can import other libraries.  For example, tree-sitter-toml
> requires the regexp-util library.

How common is this in practice?  Is it encouraged?  The example you cite
would be trivial to fix:

--8<---------------cut here---------------start------------->8---
const { Charset } = require("regexp-util");

const getInverseRegex = charset =>
  new RegExp(`[^${charset.toString().slice(1, -1)}]`);

const control_chars = new Charset([0x0, 0x1f], 0x7f);
--8<---------------cut here---------------end--------------->8---

Just replace the "new Charset([0x0, 0x1f], 0x7f)" with the result of
evaluating the expression.
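
A minimal sketch of what that could look like, assuming the Charset
above stringifies to a bracketed character class equivalent to
[\x00-\x1f\x7f] (the exact text regexp-util emits has not been checked
here, so the literal below is an assumption):

```javascript
// Assumed hand-written equivalent of Charset([0x0, 0x1f], 0x7f), already
// without the surrounding brackets that the original code slices off.
const control_chars = "\\x00-\\x1f\\x7f";

// Same helper as in the excerpt, but taking the class body as a string,
// so the require("regexp-util") line is no longer needed.
const getInverseRegex = classBody => new RegExp(`[^${classBody}]`);

const not_control = getInverseRegex(control_chars);
console.log(not_control.test("a"));     // true: ordinary character
console.log(not_control.test("\x1f"));  // false: control character
```

With this the grammar would no longer need regexp-util at
pre-processing time, at the cost of a less self-explanatory literal.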



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-29 15:27                                                 ` Gregory Heytings
  2022-12-29 15:40                                                   ` Lynn Winebarger
  2022-12-29 15:45                                                   ` Tree-sitter introduction documentation Philip Kaludercic
@ 2022-12-29 16:32                                                   ` Eli Zaretskii
  2022-12-29 16:53                                                     ` Philip Kaludercic
  2022-12-29 17:04                                                     ` Gregory Heytings
  2 siblings, 2 replies; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-29 16:32 UTC (permalink / raw)
  To: Gregory Heytings
  Cc: philipk, casouri, monnier, dgutov, theophilusx, emacs-devel

> Date: Thu, 29 Dec 2022 15:27:12 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: Yuan Fu <casouri@gmail.com>, Stefan Monnier <monnier@iro.umontreal.ca>, 
>     Eli Zaretskii <eliz@gnu.org>, Dmitry Gutov <dgutov@yandex.ru>, 
>     Tim Cross <theophilusx@gmail.com>, emacs-devel@gnu.org
> 
> > but it appears that it should be possible for them to also load 
> > arbitrary node.js libraries as well
> >
> 
> Indeed, grammar authors are not limited to the standard Node.js API, they 
> can import other libraries.  For example, tree-sitter-toml requires the 
> regexp-util library.

The shared library I built for tree-sitter-toml has no such
dependency.



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-29 16:32                                                   ` Eli Zaretskii
@ 2022-12-29 16:53                                                     ` Philip Kaludercic
  2022-12-29 16:59                                                       ` Eli Zaretskii
  2022-12-29 17:03                                                       ` Stefan Monnier
  2022-12-29 17:04                                                     ` Gregory Heytings
  1 sibling, 2 replies; 138+ messages in thread
From: Philip Kaludercic @ 2022-12-29 16:53 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: Gregory Heytings, casouri, monnier, dgutov, theophilusx,
	emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> Date: Thu, 29 Dec 2022 15:27:12 +0000
>> From: Gregory Heytings <gregory@heytings.org>
>> cc: Yuan Fu <casouri@gmail.com>, Stefan Monnier <monnier@iro.umontreal.ca>, 
>>     Eli Zaretskii <eliz@gnu.org>, Dmitry Gutov <dgutov@yandex.ru>, 
>>     Tim Cross <theophilusx@gmail.com>, emacs-devel@gnu.org
>> 
>> > but it appears that it should be possible for them to also load 
>> > arbitrary node.js libraries as well
>> >
>> 
>> Indeed, grammar authors are not limited to the standard Node.js API, they 
>> can import other libraries.  For example, tree-sitter-toml requires the 
>> regexp-util library.
>
> The shared library I built for tree-sitter-toml has no such
> dependency.

The shared object shouldn't depend on that library, it is a build-time
or rather "pre-processing"-time dependency.



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-29 16:53                                                     ` Philip Kaludercic
@ 2022-12-29 16:59                                                       ` Eli Zaretskii
  2022-12-29 17:01                                                         ` Philip Kaludercic
  2022-12-29 17:03                                                       ` Stefan Monnier
  1 sibling, 1 reply; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-29 16:59 UTC (permalink / raw)
  To: Philip Kaludercic
  Cc: gregory, casouri, monnier, dgutov, theophilusx, emacs-devel

> From: Philip Kaludercic <philipk@posteo.net>
> Cc: Gregory Heytings <gregory@heytings.org>,  casouri@gmail.com,
>   monnier@iro.umontreal.ca,  dgutov@yandex.ru,  theophilusx@gmail.com,
>   emacs-devel@gnu.org
> Date: Thu, 29 Dec 2022 16:53:55 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> Date: Thu, 29 Dec 2022 15:27:12 +0000
> >> From: Gregory Heytings <gregory@heytings.org>
> >> cc: Yuan Fu <casouri@gmail.com>, Stefan Monnier <monnier@iro.umontreal.ca>, 
> >>     Eli Zaretskii <eliz@gnu.org>, Dmitry Gutov <dgutov@yandex.ru>, 
> >>     Tim Cross <theophilusx@gmail.com>, emacs-devel@gnu.org
> >> 
> >> > but it appears that it should be possible for them to also load 
> >> > arbitrary node.js libraries as well
> >> >
> >> 
> >> Indeed, grammar authors are not limited to the standard Node.js API, they 
> >> can import other libraries.  For example, tree-sitter-toml requires the 
> >> regexp-util library.
> >
> > The shared library I built for tree-sitter-toml has no such
> > dependency.
> 
> The shared object shouldn't depend on that library, it is a build-time
> or rather "pre-processing"-time dependency.

Yes, neither is it a build-time dependency.



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-29 15:45                                                   ` Tree-sitter introduction documentation Philip Kaludercic
@ 2022-12-29 17:00                                                     ` Gregory Heytings
  2022-12-29 17:12                                                       ` Philip Kaludercic
  0 siblings, 1 reply; 138+ messages in thread
From: Gregory Heytings @ 2022-12-29 17:00 UTC (permalink / raw)
  To: Philip Kaludercic
  Cc: Yuan Fu, Stefan Monnier, Eli Zaretskii, Dmitry Gutov, Tim Cross,
	emacs-devel


>> That's not possible, no, at least not without a lot of complications 
>> that do not seem worth the price, compared to installing Node.js.  And 
>> note that even if that were feasible, it would only solve the first 
>> half of the problem: to transform a grammar.js file into its 
>> corresponding parser.c file, you also need the tree-sitter command line 
>> program.
>
> Not necessarily, that could also be ported to JavaScript.
>

I'm puzzled.  What would be the benefit of doing that?  Installing Node.js 
and tree-sitter is easy.

>
> That being said, I don't imagine it to be an easy process.
>

Indeed.  The generator is about 13500 lines of non-trivial Rust code.

>> Indeed, grammar authors are not limited to the standard Node.js API, 
>> they can import other libraries.
>
> How common is this in practice?  Is it encouraged?
>

I don't know.  I'd guess it is not frequent, but neither encouraged nor 
discouraged.




^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-29 16:59                                                       ` Eli Zaretskii
@ 2022-12-29 17:01                                                         ` Philip Kaludercic
  0 siblings, 0 replies; 138+ messages in thread
From: Philip Kaludercic @ 2022-12-29 17:01 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gregory, casouri, monnier, dgutov, theophilusx, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Philip Kaludercic <philipk@posteo.net>
>> Cc: Gregory Heytings <gregory@heytings.org>,  casouri@gmail.com,
>>   monnier@iro.umontreal.ca,  dgutov@yandex.ru,  theophilusx@gmail.com,
>>   emacs-devel@gnu.org
>> Date: Thu, 29 Dec 2022 16:53:55 +0000
>> 
>> Eli Zaretskii <eliz@gnu.org> writes:
>> 
>> >> Date: Thu, 29 Dec 2022 15:27:12 +0000
>> >> From: Gregory Heytings <gregory@heytings.org>
>> >> cc: Yuan Fu <casouri@gmail.com>, Stefan Monnier <monnier@iro.umontreal.ca>, 
>> >>     Eli Zaretskii <eliz@gnu.org>, Dmitry Gutov <dgutov@yandex.ru>, 
>> >>     Tim Cross <theophilusx@gmail.com>, emacs-devel@gnu.org
>> >> 
>> >> > but it appears that it should be possible for them to also load 
>> >> > arbitrary node.js libraries as well
>> >> >
>> >> 
>> >> Indeed, grammar authors are not limited to the standard Node.js API, they 
>> >> can import other libraries.  For example, tree-sitter-toml requires the 
>> >> regexp-util library.
>> >
>> > The shared library I built for tree-sitter-toml has no such
>> > dependency.
>> 
>> The shared object shouldn't depend on that library, it is a build-time
>> or rather "pre-processing"-time dependency.
>
> Yes, neither is it a build-time dependency.

But it is what I called a "pre-processing"-time dependency, and that
matters if you want to change the actual grammar.js file instead of the
generated parser.c.



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-29 16:53                                                     ` Philip Kaludercic
  2022-12-29 16:59                                                       ` Eli Zaretskii
@ 2022-12-29 17:03                                                       ` Stefan Monnier
  2022-12-29 17:12                                                         ` Gregory Heytings
  2022-12-29 17:13                                                         ` Philip Kaludercic
  1 sibling, 2 replies; 138+ messages in thread
From: Stefan Monnier @ 2022-12-29 17:03 UTC (permalink / raw)
  To: Philip Kaludercic
  Cc: Eli Zaretskii, Gregory Heytings, casouri, dgutov, theophilusx,
	emacs-devel

>> The shared library I built for tree-sitter-toml has no such
>> dependency.
> The shared object shouldn't depend on that library, it is a build-time
> or rather "pre-processing"-time dependency.

Indeed, IIUC, Tree-sitter uses Javascript as a preprocessor, so the
source grammar is a Javascript program which returns a grammar
represented as a JS object (could probably be easily mapped to JSON),
which is then turned into a `.c` file by a Rust program.


        Stefan




^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-29 16:32                                                   ` Eli Zaretskii
  2022-12-29 16:53                                                     ` Philip Kaludercic
@ 2022-12-29 17:04                                                     ` Gregory Heytings
  1 sibling, 0 replies; 138+ messages in thread
From: Gregory Heytings @ 2022-12-29 17:04 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: philipk, casouri, monnier, dgutov, theophilusx, emacs-devel


>>> but it appears that it should be possible for them to also load 
>>> arbitrary node.js libraries as well
>>
>> Indeed, grammar authors are not limited to the standard Node.js API, 
>> they can import other libraries.  For example, tree-sitter-toml 
>> requires the regexp-util library.
>
> The shared library I built for tree-sitter-toml has no such dependency.
>

That (Node.js) library is needed only to produce the parser.c file.




^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-29 17:00                                                     ` Gregory Heytings
@ 2022-12-29 17:12                                                       ` Philip Kaludercic
  2022-12-29 17:31                                                         ` Gregory Heytings
  0 siblings, 1 reply; 138+ messages in thread
From: Philip Kaludercic @ 2022-12-29 17:12 UTC (permalink / raw)
  To: Gregory Heytings
  Cc: Yuan Fu, Stefan Monnier, Eli Zaretskii, Dmitry Gutov, Tim Cross,
	emacs-devel

Gregory Heytings <gregory@heytings.org> writes:

>>> That's not possible, no, at least not without a lot of
>>> complications that do not seem worth the price, compared to
>>> installing Node.js.  And note that even if that were feasible, it
>>> would only solve the first half of the problem: to transform a
>>> grammar.js file into its corresponding parser.c file, you also need
>>> the tree-sitter command line program.
>>
>> Not necessarily, that could also be ported to JavaScript.
>>
>
> I'm puzzled.  What would be the benefit of doing that?  Installing
> Node.js and tree-sitter is easy.

Not always, I always have issues with Node.js on Debian Stable.
Especially when external dependencies are added to the mix.

The advantage would be a simpler toolchain that would require less
effort for the user to get running, instead of dealing with version
mismatches and dependency resolution.

>>
>> That being said, I don't imagine it to be an easy process.
>>
>
> Indeed.  The generator is about 13500 lines of non-trivial Rust code.

I was under the impression that the main part of generating C code was
bundled in here:

  https://github.com/tree-sitter/tree-sitter/blob/master/cli/src/generate/render.rs

But I see that it appears to include some other modules, which is
probably where a lot of the logic happens :/

>>> Indeed, grammar authors are not limited to the standard Node.js
>>> API, they can import other libraries.
>>
>> How common is this in practice?  Is it encouraged?
>
> I don't know.  I'd guess it is not frequent, but neither encouraged
> nor discouraged.

This would be a nice thing to clarify.  I have also found out that there
is a Javascript interpreter written in Rust that could be used to remove
the Node.js dependency:  https://github.com/boa-dev/boa.  It would be
interesting to suggest this upstream and see if something like this
could be used at some point.



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-29 17:03                                                       ` Stefan Monnier
@ 2022-12-29 17:12                                                         ` Gregory Heytings
  2022-12-29 17:13                                                         ` Philip Kaludercic
  1 sibling, 0 replies; 138+ messages in thread
From: Gregory Heytings @ 2022-12-29 17:12 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: Philip Kaludercic, Eli Zaretskii, casouri, dgutov, theophilusx,
	emacs-devel


>
> Indeed, IIUC, Tree-sitter uses Javascript as a preprocessor, so the 
> source grammar is a Javascript program which returns a grammar 
> represented as a JS object (could probably be easily mapped to JSON), 
> which is then turned into a `.c` file by a Rust program.
>

The whole process is:

{grammar.js} -> [Node.js] -> {grammar.json} -> [tree-sitter] -> {parser.c} -> [cc or c++] -> {library file}
{scanner.c or scanner.cc} ---------------------------------------------------------^

where {} are files and [] are programs.
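
To make the first arrow concrete, here is a toy sketch of what
evaluating a grammar.js amounts to.  The helpers below are simplified
stand-ins written for this illustration, not the DSL that ships with
the tree-sitter CLI, and the JSON shape is only loosely modelled on
grammar.json:

```javascript
// Toy stand-ins for the DSL helpers; the real ones cover many more
// constructs and do more validation.
const normalize = v =>
  typeof v === "string" ? { type: "STRING", value: v } : v;
const seq = (...members) =>
  ({ type: "SEQ", members: members.map(normalize) });
const choice = (...members) =>
  ({ type: "CHOICE", members: members.map(normalize) });
const repeat = content =>
  ({ type: "REPEAT", content: normalize(content) });

// `$` turns any property access into a reference to the rule of that name.
const $ = new Proxy({}, {
  get: (_target, name) => ({ type: "SYMBOL", name }),
});

// Evaluate every rule function and collect the results in a plain object.
function grammar({ name, rules }) {
  const evaluated = {};
  for (const [ruleName, build] of Object.entries(rules)) {
    evaluated[ruleName] = normalize(build($));
  }
  return { name, rules: evaluated };
}

// What a (hypothetical) grammar.js would contain.
const toy = grammar({
  name: "toy",
  rules: {
    document: $ => repeat($.pair),
    pair: $ => seq($.key, "=", $.value),
    key: $ => choice("a", "b"),
    value: $ => choice("0", "1"),
  },
});

// The data that would be handed over to the generator.
const grammarJson = JSON.stringify(toy, null, 2);
console.log(grammarJson);
```

The point is that the JavaScript step only builds a data structure;
anything a grammar author imports is used up by the time the JSON
exists.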




^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-29 17:03                                                       ` Stefan Monnier
  2022-12-29 17:12                                                         ` Gregory Heytings
@ 2022-12-29 17:13                                                         ` Philip Kaludercic
  1 sibling, 0 replies; 138+ messages in thread
From: Philip Kaludercic @ 2022-12-29 17:13 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: Eli Zaretskii, Gregory Heytings, casouri, dgutov, theophilusx,
	emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>>> The shared library I built for tree-sitter-toml has no such
>>> dependency.
>> The shared object shouldn't depend on that library, it is a build-time
>> or rather "pre-processing"-time dependency.
>
> Indeed, IIUC, Tree-sitter uses Javascript as a preprocessor, so the
> source grammar is a Javascript program which returns a grammar
> represented as a JS object (could probably be easily mapped to JSON),
> which is then turned into a `.c` file by a Rust program.

Yes, that is my understanding from reading the tree-sitter source code as
well.



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-29 17:12                                                       ` Philip Kaludercic
@ 2022-12-29 17:31                                                         ` Gregory Heytings
  2022-12-29 18:12                                                           ` Philip Kaludercic
  0 siblings, 1 reply; 138+ messages in thread
From: Gregory Heytings @ 2022-12-29 17:31 UTC (permalink / raw)
  To: Philip Kaludercic
  Cc: Yuan Fu, Stefan Monnier, Eli Zaretskii, Dmitry Gutov, Tim Cross,
	emacs-devel


>> I'm puzzled.  What would be the benefit of doing that?  Installing 
>> Node.js and tree-sitter is easy.
>
> Not always, I always have issues with Node.js on Debian Stable. 
> Especially when external dependencies are added to the mix.
>

I don't use Debian stable, so I can't comment on that.  On Debian testing 
(which is pretty stable!) installing the complete toolchain requires only 
two commands:

$ apt install gcc g++ nodejs rust-all
$ cargo install tree-sitter-cli

That takes a couple of minutes at most.

>
> The advantage would be a simpler toolchain that would require less 
> effort for the user to get running, instead of dealing with version 
> mismatches and dependency resolution.
>

As you know, regular users do not need to install the complete toolchain, 
a C and C++ compiler is enough.  It is only those few users that want to 
change the grammars or create new grammars that need a complete toolchain.




^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-29 17:31                                                         ` Gregory Heytings
@ 2022-12-29 18:12                                                           ` Philip Kaludercic
  2022-12-29 18:28                                                             ` Eli Zaretskii
  2022-12-29 18:32                                                             ` Stefan Monnier
  0 siblings, 2 replies; 138+ messages in thread
From: Philip Kaludercic @ 2022-12-29 18:12 UTC (permalink / raw)
  To: Gregory Heytings
  Cc: Yuan Fu, Stefan Monnier, Eli Zaretskii, Dmitry Gutov, Tim Cross,
	emacs-devel

Gregory Heytings <gregory@heytings.org> writes:

>>> I'm puzzled.  What would be the benefit of doing that?  Installing
>>> Node.js and tree-sitter is easy.
>>
>> Not always, I always have issues with Node.js on Debian
>> Stable. Especially when external dependencies are added to the mix.
>>
>
> I don't use Debian stable, so I can't comment on that.  On Debian
> testing (which is pretty stable!) installing the complete toolchain
> requires only two commands:
>
> $ apt install gcc g++ nodejs rust-all
> $ cargo install tree-sitter-cli

... assuming that the grammar has no additional dependencies, which
would mean that you'd have to deal with npm.

> That takes a couple of minutes at most.

For me and you, since we have spent time reading up on tree sitter
and are familiar with the technologies it makes use of.  But it is by no
means obvious.

>>
>> The advantage would be a simpler toolchain that would require less
>> effort for the user to get running, instead of dealing with version
>> mismatches and dependency resolution.
>>
>
> As you know, regular users do not need to install the complete
> toolchain, a C and C++ compiler is enough.  It is only those few users
> that want to change the grammars or create new grammars that need a
> complete toolchain.

I should clarify that these are the users I am concerned with, and
without a reason to, I do not distinguish them from "regular" users.

My main worry with these changes, along with the popularity of LSP, is
that while they are technological improvements, they all come at the
expense of Emacs' introspectability, increasing the effort it
takes for the user to make changes.  IIUC you can't reload a .el file or
just a single expression if you want to change how completion via
Eglot or how imenu works via Tree Sitter.  A simple hack becomes a
weekend project.  This is not an unconditional good.



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-29 18:12                                                           ` Philip Kaludercic
@ 2022-12-29 18:28                                                             ` Eli Zaretskii
  2022-12-29 18:44                                                               ` Stefan Monnier
  2022-12-29 18:32                                                             ` Stefan Monnier
  1 sibling, 1 reply; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-29 18:28 UTC (permalink / raw)
  To: Philip Kaludercic
  Cc: gregory, casouri, monnier, dgutov, theophilusx, emacs-devel

> From: Philip Kaludercic <philipk@posteo.net>
> Cc: Yuan Fu <casouri@gmail.com>,  Stefan Monnier <monnier@iro.umontreal.ca>,
>   Eli Zaretskii <eliz@gnu.org>,  Dmitry Gutov <dgutov@yandex.ru>,  Tim
>  Cross <theophilusx@gmail.com>,  emacs-devel@gnu.org
> Date: Thu, 29 Dec 2022 18:12:26 +0000
> 
> > As you know, regular users do not need to install the complete
> > toolchain, a C and C++ compiler is enough.  It is only those few users
> > that want to change the grammars or create new grammars that need a
> > complete toolchain.
> 
> I should clarify that these are the users I am concerned with, and
> without a reason to, I do not distinguish from "regular" users.
> 
> My main worry with these changes, along with the popularity of LSP is
> that while they are technological improvements, they all happen at the
> deterioration of Emacs' introspectability, increasing the effort it
> takes for the user to make changes.  IIUC you can't reload a .el file or
> just a singular expression if you want to change how completion via
> Eglot or how imenu works via Tree Sitter.  A simple hack becomes a
> weekend project.  This is not an unconditional good.

Yes, TANSTAAFL.

And I think you exaggerate quite a lot.  It is wrong not to
distinguish between users who tinker with language grammars and the
rest of them.  Modifying a parser's grammar requires non-trivial
knowledge, and thus most users will not go there.  Just like most
users will not try hacking the Emacs display code or GC.  So the
possibility to modify the grammar should exist, of course, but having
to install a bunch of packages and their dependencies is nowhere near
a serious problem.  Because if it is, then we have similar problems
with librsvg, for example, because you need a Rust installation to
modify it.  And there are other similar difficulties with other
optional libraries we use.

So let's stay focused on letting our "normal" users use the benefits
of these technologies first, and care about those who want to change
the grammars second.



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-29 18:12                                                           ` Philip Kaludercic
  2022-12-29 18:28                                                             ` Eli Zaretskii
@ 2022-12-29 18:32                                                             ` Stefan Monnier
  1 sibling, 0 replies; 138+ messages in thread
From: Stefan Monnier @ 2022-12-29 18:32 UTC (permalink / raw)
  To: Philip Kaludercic
  Cc: Gregory Heytings, Yuan Fu, Eli Zaretskii, Dmitry Gutov, Tim Cross,
	emacs-devel

> My main worry with these changes, along with the popularity of LSP is
> that while they are technological improvements, they all happen at the
> deterioration of Emacs' introspectability, increasing the effort it
> takes for the user to make changes.  IIUC you can't reload a .el file or
> just a singular expression if you want to change how completion via
> Eglot or how imenu works via Tree Sitter.  A simple hack becomes a
> weekend project.  This is not an unconditional good.

Agreed.  Tree-sitter is actually not that bad in this respect: besides
the fact that it requires additional tools to work on the grammar,
"everything else" is under the control of ELisp: Tree-sitter only takes
care of parsing and giving a parse tree without imposing any particular
way to use this information.

So the only thing on which we need to work is making it easier for our
users to hack on the source grammar.  That means helping them fetch that
grammar and helping them figure out which commands to run to generate the
`.c` file from it.  Maybe we could also try and write code that jumps
from a particular position in a buffer to the BNF rule of the
corresponding Tree-sitter node.
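
The lookup half of that idea can be sketched outside Emacs.  This is
only an illustration, assuming the grammar source follows the usual
`name: $ => ...` layout; `findRuleLine` and the sample grammar are
hypothetical, and a real implementation would more likely live in
ELisp and take the node type from the parse tree:

```javascript
// Hypothetical helper: return the 1-based line of grammar.js on which
// the rule for NODETYPE is defined, or null if no such line is found.
function findRuleLine(grammarSource, nodeType) {
  // Escape regexp metacharacters in the node type.
  const escaped = nodeType.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Accept `name: $ =>`, `name: ($) =>` and `name: _ =>`.
  const pattern = new RegExp(
    `^\\s*${escaped}\\s*:\\s*(\\(?\\s*\\$\\s*\\)?|_)\\s*=>`);
  const lines = grammarSource.split("\n");
  for (let i = 0; i < lines.length; i++) {
    if (pattern.test(lines[i])) return i + 1;
  }
  return null;
}

// Made-up grammar source used only to exercise the helper.
const sample = [
  "module.exports = grammar({",
  "  name: 'toy',",
  "  rules: {",
  "    document: $ => repeat($.pair),",
  "    pair: $ => seq($.key, '=', $.value),",
  "    key: _ => /[a-z]+/,",
  "  },",
  "});",
].join("\n");

console.log(findRuleLine(sample, "pair"));     // 5
console.log(findRuleLine(sample, "missing"));  // null
```

A text search like this breaks down for rules generated by helper
functions, which is one reason a proper solution would need more than
a regexp.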

LSP is much more problematic because the protocol is structured in a way
that puts much of the decision making on the server side, while the Emacs
side is somewhat limited to supporting or not supporting
a particular feature.  I think this is a large part of the reason why
some languages see a proliferation of different LSP servers: what should
ideally be configured on the editor side is instead "configured" on the
server side by choosing which server you use :-(


        Stefan




^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-29 18:28                                                             ` Eli Zaretskii
@ 2022-12-29 18:44                                                               ` Stefan Monnier
  2022-12-29 19:34                                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 138+ messages in thread
From: Stefan Monnier @ 2022-12-29 18:44 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: Philip Kaludercic, gregory, casouri, dgutov, theophilusx,
	emacs-devel

> a serious problem.  Because if it is, then we have similar problems
> with librsvg, for example, because you need a Rust installation to

I don't think the two can really be compared.
Major modes have been parsing our files (and Emacs users have been
modifying those parsers) from the very early days of Emacs.

> So let's stay focused on letting our "normal" users use the benefits
> of these technologies first, and care about those who want to change
> the grammars second.

Agreed.  But I think the first is well on its way, so we can start
thinking about the second.


        Stefan




^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-29 18:44                                                               ` Stefan Monnier
@ 2022-12-29 19:34                                                                 ` Eli Zaretskii
  2022-12-29 19:48                                                                   ` Stefan Monnier
  0 siblings, 1 reply; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-29 19:34 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: philipk, gregory, casouri, dgutov, theophilusx, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Philip Kaludercic <philipk@posteo.net>,  gregory@heytings.org,
>   casouri@gmail.com,  dgutov@yandex.ru,  theophilusx@gmail.com,
>   emacs-devel@gnu.org
> Date: Thu, 29 Dec 2022 13:44:01 -0500
> 
> > a serious problem.  Because if it is, then we have similar problems
> > with librsvg, for example, because you need a Rust installation to
> 
> I don't think the two can really be compared.
> Major modes have been parsing our files (and Emacs users have been
> modifying those parsers) from the very early days of Emacs.

Yes, in an ad-hoc-ish way that is not scalable.  We want to do better by
using real parsers.

> > So let's stay focused on letting our "normal" users use the benefits
> > of these technologies first, and care about those who want to change
> > the grammars second.
> 
> Agreed.  But I think the first is well on its way

I wish.  Witness the other thread which just started.



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-29 19:34                                                                 ` Eli Zaretskii
@ 2022-12-29 19:48                                                                   ` Stefan Monnier
  2022-12-29 19:59                                                                     ` Eli Zaretskii
  0 siblings, 1 reply; 138+ messages in thread
From: Stefan Monnier @ 2022-12-29 19:48 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: philipk, gregory, casouri, dgutov, theophilusx, emacs-devel

>> I don't think the two can really be compared.
>> Major modes have been parsing our files (and Emacs users have been
>> modifying those parsers) from the very early days of Emacs.
> Yes, in an ad-hoc-ish way that is not scalable.  We want to do better by
> using real parsers.

I don't think the need to modify the parsers was exclusively due to the
fact that they were ad-hoc and not scalable :-)

>> > So let's stay focused on letting our "normal" users use the benefits
>> > of these technologies first, and care about those who want to change
>> > the grammars second.
>> Agreed.  But I think the first is well on its way
> I wish.  Witness the other thread which just started.

Funny, I see it as evidence in favor of the fact that it's well on
its way.


        Stefan




^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-29 19:48                                                                   ` Stefan Monnier
@ 2022-12-29 19:59                                                                     ` Eli Zaretskii
  0 siblings, 0 replies; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-29 19:59 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: philipk, gregory, casouri, dgutov, theophilusx, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: philipk@posteo.net,  gregory@heytings.org,  casouri@gmail.com,
>   dgutov@yandex.ru,  theophilusx@gmail.com,  emacs-devel@gnu.org
> Date: Thu, 29 Dec 2022 14:48:45 -0500
> 
> >> I don't think the two can really be compared.
> >> Major modes have been parsing our files (and Emacs users have been
> >> modifying those parsers) from the very early days of Emacs.
> > Yes, in an ad-hoc-ish way that is not scalable.  We want to do better by
> > using real parsers.
> 
> I don't think the need to modify the parsers was exclusively due to the
> fact that they were ad-hoc and not scalable :-)

Of course not.  We simply didn't have a parser before.

> >> > So let's stay focused on letting our "normal" users use the benefits
> >> > of these technologies first, and care about those who want to change
> >> > the grammars second.
> >> Agreed.  But I think the first is well on its way
> > I wish.  Witness the other thread which just started.
> 
> Funny, I see it as evidence in favor of the fact that it's well on
> its way.

Let's just agree to disagree here.



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: [SPAM UNSURE] Re: Tree-sitter introduction documentation
  2022-12-29 15:40                                                   ` Lynn Winebarger
@ 2022-12-29 21:50                                                     ` Stephen Leake
  2022-12-29 22:37                                                       ` Lynn Winebarger
  2022-12-30 14:10                                                       ` Lynn Winebarger
  0 siblings, 2 replies; 138+ messages in thread
From: Stephen Leake @ 2022-12-29 21:50 UTC (permalink / raw)
  To: Lynn Winebarger
  Cc: Gregory Heytings, Philip Kaludercic, Yuan Fu, Stefan Monnier,
	Eli Zaretskii, Dmitry Gutov, Tim Cross, emacs-devel

Lynn Winebarger <owinebar@gmail.com> writes:

> On Thu, Dec 29, 2022 at 10:28 AM Gregory Heytings <gregory@heytings.org> wrote:
>> > Do you know how strong the dependency on node is?  As I said before, it
>> > seems that it is possible to evaluate the grammar files that use the DSL
>> > using something like quickjs as well
>> >
>>
>> That's not possible, no, at least not without a lot of complications that
>> do not seem worth the price, compared to installing Node.js.  And note
>> that even if that were feasible, it would only solve the first half of the
>> problem: to transform a grammar.js file into its corresponding parser.c
>> file, you also need the tree-sitter command line program.
>
> Maybe a better question is - is it possible to adapt the semantic
> parser generators (or others in emacs) to create the ".c" files for
> use with libtreesitter?

This is possible in principle; I've thought about doing it with the
wisitoken parser generator.

However, the format/structure/details of the output are not documented,
and may change in future tree-sitter releases.

> The functionality of libtreesitter is probably useful independent of
> the tool used to create the module it loads, as long as it satisfies
> the functional requirements. Would the treesitter authors be amenable
> to establishing a documented ABI for that component so other
> parser-generators could target it?

That's worth filing an issue on the tree-sitter development site. I
looked briefly, and did not see a similar issue.

-- 
-- Stephe



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: [SPAM UNSURE] Re: Tree-sitter introduction documentation
  2022-12-29 21:50                                                     ` [SPAM UNSURE] " Stephen Leake
@ 2022-12-29 22:37                                                       ` Lynn Winebarger
  2022-12-30 14:10                                                       ` Lynn Winebarger
  1 sibling, 0 replies; 138+ messages in thread
From: Lynn Winebarger @ 2022-12-29 22:37 UTC (permalink / raw)
  To: Stephen Leake
  Cc: Gregory Heytings, Philip Kaludercic, Yuan Fu, Stefan Monnier,
	Eli Zaretskii, Dmitry Gutov, Tim Cross, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 2837 bytes --]

On Thu, Dec 29, 2022, 4:50 PM Stephen Leake <stephen_leake@stephe-leake.org>
wrote:

> Lynn Winebarger <owinebar@gmail.com> writes:
>
> > On Thu, Dec 29, 2022 at 10:28 AM Gregory Heytings <gregory@heytings.org> wrote:
> >> > Do you know how strong the dependency on node is?  As I said before, it
> >> > seems that it is possible to evaluate the grammar files that use the DSL
> >> > using something like quickjs as well
> >> >
> >>
> >> That's not possible, no, at least not without a lot of complications that
> >> do not seem worth the price, compared to installing Node.js.  And note
> >> that even if that were feasible, it would only solve the first half of the
> >> problem: to transform a grammar.js file into its corresponding parser.c
> >> file, you also need the tree-sitter command line program.
> >
> > Maybe a better question is - is it possible to adapt the semantic
> > parser generators (or others in emacs) to create the ".c" files for
> > use with libtreesitter?
>
> This is possible in principle; I've thought about doing it with the
> wisitoken parser generator.
>
> However, the format/structure/details of the output are not documented,
> and may change in future tree-sitter releases.
>


True, *but* each parser has an "ABI version" constant encoded into it,
and the source states the library is supposed to be backwards compatible
(currently 14; the parser.c files I saw are at 13).  So if we get
something that complies with version 13, it should be fine.
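
A minimal sketch of the version check described above, in JavaScript.  It
assumes the generated parser.c carries a "#define LANGUAGE_VERSION <n>"
line; the helper names are hypothetical, and the library's current and
minimum compatible versions are simply passed in as numbers (14 and 13
here, following the figures quoted above) rather than read from the
library.

```javascript
// Sketch only: helper names are hypothetical, not part of tree-sitter.

// Return the ABI version declared by a generated parser.c, or null if the
// "#define LANGUAGE_VERSION <n>" line (assumed format) is not present.
function parserAbiVersion(parserSource) {
  const match = /^#define\s+LANGUAGE_VERSION\s+(\d+)\s*$/m.exec(parserSource);
  return match ? Number(match[1]) : null;
}

// A parser is loadable when its ABI version lies between the oldest version
// the library still accepts and the library's current version.
function isLoadable(parserAbi, minCompatibleAbi, currentAbi) {
  return parserAbi !== null &&
         parserAbi >= minCompatibleAbi &&
         parserAbi <= currentAbi;
}

// Made-up two-line excerpt standing in for a real parser.c.
const sample = '#include <tree_sitter/parser.h>\n#define LANGUAGE_VERSION 13\n';
const abi = parserAbiVersion(sample);
console.log(abi, isLoadable(abi, 13, 14));
```

A parser generated by some other tool would have to emit the same constant
and satisfy the same range check to be accepted by the library.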

I've looked over a couple of the parser.c files, and they appear to be
pretty standard implementations of LR stack automata.  The CLI tool supports
ambiguous grammars, but if you start from one that an existing LR-type parser
generator (bison or wisent, say) can handle, then that piece should be
relatively straightforward.
The novel feature appears to be the automatic derivation of a "flattened"
AST data structure, which the library uses to construct ASTs for a given
action.

I did come up with my own calculation of a flattened AST structure a year
or two ago, though I wasn't thrilled with some of the results I got -
automatically picking a symbol to use to "break" recursive inclusion by
reverting back to a pointer was tricky.  I believe I went with a heuristic
of converting the symbol that had the most incoming edges to a reference in
each place it appeared as a field, then repeated until there were no data
structures that were required to physically contain themselves.  I used
that rule because it seemed like it would minimize the number of symbols
not allowed to appear as fields, thus maximizing the flattening effect (too
well, in fact - I had to introduce some constraints to keep some of the
nonlinear structure).
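
One possible reading of that heuristic, sketched in JavaScript.  This is
not tree-sitter's algorithm and not the original implementation: the
function names, the toy grammar, and the tie-breaking rule (first symbol
seen wins) are all assumptions made for the illustration.  Each symbol
maps to the symbols it embeds by value; the symbol with the most incoming
edges is turned into a reference everywhere, and this repeats until no
structure has to physically contain itself.

```javascript
// Sketch only: names and the toy grammar are hypothetical.

// True if some symbol (transitively) embeds itself by value.
function hasCycle(contains) {
  const state = new Map(); // symbol -> 1 (visiting) or 2 (finished)
  const visit = (sym) => {
    if (state.get(sym) === 1) return true;
    if (state.get(sym) === 2) return false;
    state.set(sym, 1);
    for (const child of contains.get(sym) ?? []) {
      if (visit(child)) return true;
    }
    state.set(sym, 2);
    return false;
  };
  return [...contains.keys()].some((sym) => visit(sym));
}

// Return the symbols that must be held by reference, in the order chosen.
function breakRecursion(grammar) {
  // Work on a copy so the caller's map is not mutated.
  const contains = new Map(
    [...grammar].map(([sym, fields]) => [sym, new Set(fields)]));
  const byReference = [];
  while (hasCycle(contains)) {
    // Count by-value incoming edges for every symbol.
    const incoming = new Map();
    for (const fields of contains.values()) {
      for (const f of fields) incoming.set(f, (incoming.get(f) ?? 0) + 1);
    }
    // Pick the symbol with the most incoming edges (first wins on ties).
    let best = null;
    for (const [sym, n] of incoming) {
      if (best === null || n > incoming.get(best)) best = sym;
    }
    // Every field of that type becomes a pointer, so nothing embeds it.
    for (const fields of contains.values()) fields.delete(best);
    byReference.push(best);
  }
  return byReference;
}

const toyGrammar = new Map([
  ['expr', ['binary', 'unary', 'literal']],
  ['binary', ['expr']],
  ['unary', ['expr']],
  ['stmt', ['expr']],
  ['literal', []],
]);
console.log(breakRecursion(toyGrammar));
```

In the toy grammar only expr needs to become a reference; binary, unary
and literal can still be flattened into their parents.  Note that this
naive version may pick a symbol that is not on any cycle, which is one way
it can flatten less than it could.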

So it might be interesting to go through tree-sitter's algorithm to see
what they came up with.

Lynn

[-- Attachment #2: Type: text/html, Size: 3790 bytes --]

^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-29 11:14                                               ` Philip Kaludercic
  2022-12-29 15:27                                                 ` Gregory Heytings
@ 2022-12-30  1:01                                                 ` Gregory Heytings
  2022-12-30 11:00                                                   ` Philip Kaludercic
  1 sibling, 1 reply; 138+ messages in thread
From: Gregory Heytings @ 2022-12-30  1:01 UTC (permalink / raw)
  To: Philip Kaludercic
  Cc: Yuan Fu, Stefan Monnier, Eli Zaretskii, Dmitry Gutov, Tim Cross,
	emacs-devel


>
> As I said before, it seems that it is possible to evaluate the grammar 
> files that use the DSL using something like quickjs as well, which is 
> easier to build
>

You asked for it, so here it is:

global = {}
module = {}
process = { env: { TREE_SITTER_GRAMMAR_PATH: './grammar.js' } }
function require(s) { std.loadScript(s); return module.exports; }
std.loadScript('/path/to/the/script/dsl.js')

Put these five lines in a file, say "gen.js", cd into the directory of a 
tree-sitter-<lang> repository, and type

qjs --std /path/to/the/script/gen.js > src/grammar.json

For simple grammars (bash, c, cmake, csharp, css, dockerfile, go, go-mod, 
java, js, json, python, rust, yaml), this will work out of the box.  For 
more complex ones (c++, toml, tsx, typescript), you'll need to edit the 
grammar.js file.  That is left as an exercise for the reader.

Disclaimer: this is NOT what I recommend anyone to do!  TRT is to install 
and use Node.js, at least if you want to spare yourself headaches.
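
The mechanism the five-line script relies on can be illustrated with a toy
CommonJS-style shim.  This is a sketch under assumptions, not what dsl.js
or QuickJS actually do: the file table, makeRequire and toyRequire are
hypothetical names, and the file is evaluated with a private "module"
object instead of the global one used in the script above.  The point is
only that a grammar file assigning to module.exports can be loaded by any
engine for which someone supplies a suitable require().

```javascript
// Toy illustration; file names and contents are made up for the example.
const files = {
  './grammar.js': "module.exports = { name: 'toy', rules: { source: 'x' } };",
};

// Build a require() that evaluates a file's text with its own `module`
// object and hands back whatever the file assigned to module.exports.
function makeRequire(readFile) {
  return function require(path) {
    const module = { exports: {} };
    new Function('module', 'require', readFile(path))(module, require);
    return module.exports;
  };
}

const toyRequire = makeRequire((path) => files[path]);
console.log(JSON.stringify(toyRequire('./grammar.js')));
```

Grammars that need more than this (for instance, requiring other grammars
by package name) are presumably where the manual edits mentioned above
come in.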




^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-30  1:01                                                 ` Gregory Heytings
@ 2022-12-30 11:00                                                   ` Philip Kaludercic
  2022-12-30 12:07                                                     ` Gregory Heytings
  0 siblings, 1 reply; 138+ messages in thread
From: Philip Kaludercic @ 2022-12-30 11:00 UTC (permalink / raw)
  To: Gregory Heytings
  Cc: Yuan Fu, Stefan Monnier, Eli Zaretskii, Dmitry Gutov, Tim Cross,
	emacs-devel

Gregory Heytings <gregory@heytings.org> writes:

>>
>> As I said before, it seems that it is possible to evaluate the
>> grammar files that use the DSL using something like quickjs as well,
>> which is easier to build
>>
>
> You asked for it, so here it is:
>
> global = {}
> module = {}
> process = { env: { TREE_SITTER_GRAMMAR_PATH: './grammar.js' } }
> function require(s) { std.loadScript(s); return module.exports; }
> std.loadScript('/path/to/the/script/dsl.js')
>
> Put these five lines in a file, say "gen.js", cd in the directory of a
> tree-sitter-<lang> repository, and type
>
> qjs --std /path/to/the/script/gen.js > src/grammar.json
>
> For simple grammars (bash, c, cmake, csharp, css, dockerfile, go,
> go-mod, java, js, json, python, rust, yaml), this will work out of the
> box.  For more complex ones (c++, toml, tsx, typescript), you'll need
> to edit the grammar.js file.  That is left as an exercise for the
> reader.
>
> Disclaimer: this is NOT what I recommend anyone to do!  TRT is to
> install and use Node.js, at least if you want to spare yourself
> headaches.

I should have clarified this further up in the thread, but I did try
this out and confirmed that it does work (or at least I hope I did say
that).  I didn't have a nice script like you give here, but that this is
possible was clear.



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-27 16:44                                 ` Philip Kaludercic
  2022-12-27 17:16                                   ` Eli Zaretskii
@ 2022-12-30 11:06                                   ` Yuan Fu
  2022-12-30 11:25                                     ` Philip Kaludercic
                                                       ` (2 more replies)
  1 sibling, 3 replies; 138+ messages in thread
From: Yuan Fu @ 2022-12-30 11:06 UTC (permalink / raw)
  To: Philip Kaludercic
  Cc: Stefan Monnier, Eli Zaretskii, Dmitry Gutov, theophilusx,
	emacs-devel



> On Dec 27, 2022, at 8:44 AM, Philip Kaludercic <philipk@posteo.net> wrote:
> 
> Stefan Monnier <monnier@iro.umontreal.ca> writes:
> 
>>> It doesn't need any project, it is literally two command lines.
>>> Here's an example:
>>> 
>>>  gcc -O2 -I.   -c -o parser.o parser.c
>>>  gcc  -shared parser.o scanner.o  -ltree-sitter -o libtree-sitter-c-sharp.dll
>> 
>> AFAIK `parser.c` is a file generated from the actual grammar's source,
>> itself written in Javascript.
>> 
>> So the above instructions are akin to downloading a precompiled binary
>> and installing it.  While it is the most convenient path for the
>> end-users, it's important w.r.t Freedom to make sure that grammars can
>> also be regenerated from source by the end users.
> 
> I have asked the question before, but freedom or not, the above is a
> nuisance to run for every language.  If the process is as automatic as
> the above example demonstrates, shouldn't Emacs have a command to take a
> grammar and compile+install it?  I guess this could be more complicated
> if the grammar is generated using a custom tool-chain for each language
> (or is it always Javascript?), but nothing impossible.

Through the magic of programming, such a command now exists: treesit-install-language-grammar. It needs recipes to work, though. The recipe would involve https://github.com, which I guess is probably too heretical to include in Emacs source, so I left the recipes empty. I tested the install command with these recipes:

(setq treesit-language-source-alist
      '((python "https://github.com/tree-sitter/tree-sitter-python.git")
        (typescript "https://github.com/tree-sitter/tree-sitter-typescript.git"
                    "typescript/src" "typescript")))

Yuan


^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-30 11:06                                   ` Yuan Fu
@ 2022-12-30 11:25                                     ` Philip Kaludercic
  2022-12-30 11:54                                       ` tomas
  2022-12-30 23:33                                       ` Yuan Fu
  2022-12-30 15:31                                     ` Eli Zaretskii
  2023-01-01  3:03                                     ` Richard Stallman
  2 siblings, 2 replies; 138+ messages in thread
From: Philip Kaludercic @ 2022-12-30 11:25 UTC (permalink / raw)
  To: Yuan Fu
  Cc: Stefan Monnier, Eli Zaretskii, Dmitry Gutov, theophilusx,
	emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1973 bytes --]

Yuan Fu <casouri@gmail.com> writes:

>> On Dec 27, 2022, at 8:44 AM, Philip Kaludercic <philipk@posteo.net> wrote:
>> 
>> Stefan Monnier <monnier@iro.umontreal.ca> writes:
>> 
>>>> It doesn't need any project, it is literally two command lines.
>>>> Here's an example:
>>>> 
>>>>  gcc -O2 -I.   -c -o parser.o parser.c
>>>>  gcc  -shared parser.o scanner.o  -ltree-sitter -o libtree-sitter-c-sharp.dll
>>> 
>>> AFAIK `parser.c` is a file generated from the actual grammar's source,
>>> itself written in Javascript.
>>> 
>>> So the above instructions are akin to downloading a precompiled binary
>>> and installing it.  While it is the most convenient path for the
>>> end-users, it's important w.r.t Freedom to make sure that grammars can
>>> also be regenerated from source by the end users.
>> 
>> I have asked the question before, but freedom or not, the above is a
>> nuisance to run for every language.  If the process is as automatic as
>> the above example demonstrates, shouldn't Emacs have a command to take a
>> grammar and compile+install it?  I guess this could be more complicated
>> if the grammar is generated using a custom tool-chain for each language
>> (or is it always Javascript?), but nothing impossible.
>
> Through the magic of programming, such a command now exists: treesit-install-language-grammar. It needs recipes to work, though. The recipe would involve https://github.com, which I guess is probably too heretical to include in Emacs source, so I left the recipes empty. I tested the install command with these recipes:
>
> (setq treesit-language-source-alist
>       '((python "https://github.com/tree-sitter/tree-sitter-python.git")
>         (typescript "https://github.com/tree-sitter/tree-sitter-typescript.git"
>                     "typescript/src" "typescript")))
>
> Yuan

If acceptable, it looks good.  I could imagine that it should be OK if
we point to GitHub, since we are just using it as a Git host.  Here are
a few suggestions


[-- Attachment #2: Type: text/plain, Size: 3482 bytes --]

diff --git a/lisp/treesit.el b/lisp/treesit.el
index b120ca68c5..651898e948 100644
--- a/lisp/treesit.el
+++ b/lisp/treesit.el
@@ -99,6 +99,15 @@ treesit
   :group 'tools
   :version "29.1")
 
+(defcustom treesit-enabled-modes nil
+  "List of modes to enable tree-sitter support if available.
+When initialising a major mode with potential tree-sitter
+support, this variable is consulted.  The special value t will
+enable tree-sitter support whenever possible."
+  :type '(choice (const :tag "Whenever possible" t)
+                 (repeat :tag "Specific modes" function))
+  :version "29.1")
+
 (defcustom treesit-max-buffer-size
   (let ((mb (* 1024 1024)))
     ;; 40MB for 64-bit systems, 15 for 32-bit.
@@ -2690,20 +2699,19 @@ treesit--install-language-grammar-1
 For LANG, URL, SOURCE-DIR, GRAMMAR-DIR, CC, C++, see
 `treesit-language-source-alist'.  If anything goes wrong, this
 function signals an error."
-  (let* ((lang (symbol-name lang))
-         (default-directory "/tmp")
-         (workdir (expand-file-name "treesit-workdir-00893133134"))
+  (let* ((default-directory (make-temp-file "treesit-workdir" t))
+         (workdir (expand-file-name "repo"))
          (source-dir (expand-file-name (or source-dir "src") workdir))
          (grammar-dir (expand-file-name (or grammar-dir "") workdir))
-         (cc (or cc "cc"))
-         (c++ (or c++ "c++"))
+         (cc (or cc (seq-find #'executable-find '("cc" "gcc" "c99"))
+                 (error "No C compiler found")))
+         (c++ (or c++ (seq-find #'executable-find '("c++" "g++"))))
          (soext (pcase system-type
                   ('darwin "dylib")
                   ((or 'ms-dos 'cywin 'windows-nt) "dll")
                   (_ "so")))
          (out-dir (or (and out-dir (expand-file-name out-dir))
-                      (expand-file-name
-                       "tree-sitter" user-emacs-directory)))
+                      (locate-user-emacs-file "tree-sitter")))
          (lib-name (format "libtree-sitter-%s.%s" lang soext)))
     (unwind-protect
         (with-temp-buffer
@@ -2713,8 +2721,8 @@ treesit--install-language-grammar-1
            "git" nil t nil "clone" url "--depth" "1" "--quiet"
            workdir)
           ;; cp "${grammardir}"/grammar.js "${sourcedir}"
-          (copy-file (concat grammar-dir "/grammar.js")
-                     (concat source-dir "/grammar.js"))
+          (copy-file (file-name-concat grammar-dir "grammar.js")
+                     (file-name-concat source-dir "grammar.js"))
           ;; cd "${sourcedir}"
           (setq default-directory source-dir)
           (message "Compiling library")
@@ -2723,6 +2731,7 @@ treesit--install-language-grammar-1
            cc nil t nil "-fPIC" "-c" "-I." "parser.c")
           ;; cc -fPIC -c -I. scanner.c
           (when (file-exists-p "scanner.c")
+            (unless c++ (error "No C++ compiler found"))
             (treesit--call-process-signal
              cc nil t nil "-fPIC" "-c" "-I." "scanner.c"))
           ;; c++ -fPIC -I. -c scanner.cc
@@ -2739,7 +2748,7 @@ treesit--install-language-grammar-1
                       (rx bos (+ anychar) ".o" eos))
                    "-o" ,lib-name))
           ;; Copy out.
-          (copy-file lib-name (concat out-dir "/") t)
+          (copy-file lib-name (file-name-as-directory out-dir) t)
           (message "Library installed to %s/%s" out-dir lib-name))
       (when (file-exists-p workdir)
         (delete-directory workdir t)))))
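
The build steps visible in the comments of the patch above (clone, compile
parser.c, compile an optional scanner, link, copy) can be summarised as a
list of command lines.  The following JavaScript sketch only constructs
those command lines, it does not run them.  buildCommands and its option
names are hypothetical, the link flags and the choice of the C++ driver
for linking when scanner.cc exists are assumptions (that part of the
function is not shown in the diff), and 'cygwin' is spelled out in full
here.

```javascript
// Sketch only: buildCommands and its option names are hypothetical.

// Shared-library extension, following the pcase in the patch.
function sharedLibExtension(systemType) {
  if (systemType === 'darwin') return 'dylib';
  if (['ms-dos', 'cygwin', 'windows-nt'].includes(systemType)) return 'dll';
  return 'so';
}

function buildCommands({ lang, url, workdir, cc = 'cc', cxx = 'c++',
                         systemType = 'gnu/linux',
                         hasScannerC = false, hasScannerCc = false }) {
  const libName = `libtree-sitter-${lang}.${sharedLibExtension(systemType)}`;
  const objects = ['parser.o'];
  const commands = [
    ['git', 'clone', url, '--depth', '1', '--quiet', workdir],
    [cc, '-fPIC', '-c', '-I.', 'parser.c'],
  ];
  if (hasScannerC) {
    commands.push([cc, '-fPIC', '-c', '-I.', 'scanner.c']);
    objects.push('scanner.o');
  }
  if (hasScannerCc) {
    commands.push([cxx, '-fPIC', '-c', '-I.', 'scanner.cc']);
    if (!objects.includes('scanner.o')) objects.push('scanner.o');
  }
  // Assumption: link with the C++ driver when a C++ scanner is present.
  const linker = hasScannerCc ? cxx : cc;
  commands.push([linker, '-fPIC', '-shared', ...objects, '-o', libName]);
  return { libName, commands };
}

const plan = buildCommands({
  lang: 'python',
  url: 'https://github.com/tree-sitter/tree-sitter-python.git',
  workdir: '/tmp/treesit-workdir/repo', // illustrative path
  hasScannerCc: true,                   // illustrative; check the repository
});
for (const argv of plan.commands) console.log(argv.join(' '));
```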

^ permalink raw reply related	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-30 11:25                                     ` Philip Kaludercic
@ 2022-12-30 11:54                                       ` tomas
  2022-12-30 11:59                                         ` Philip Kaludercic
  2022-12-30 23:33                                       ` Yuan Fu
  1 sibling, 1 reply; 138+ messages in thread
From: tomas @ 2022-12-30 11:54 UTC (permalink / raw)
  To: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 421 bytes --]

On Fri, Dec 30, 2022 at 11:25:25AM +0000, Philip Kaludercic wrote:

> If acceptable, it looks good.  I could imagine that it should be OK if
> we point to GitHub, since we are just using it as a Git host.  Here are
> a few suggestions

I won't object, as long as I can excise it with my own
hands. I don't like to see how That Company silently
slithers into more and more basic infrastructure.

Cheers
-- 
t

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-30 11:54                                       ` tomas
@ 2022-12-30 11:59                                         ` Philip Kaludercic
  2022-12-30 12:27                                           ` tomas
  0 siblings, 1 reply; 138+ messages in thread
From: Philip Kaludercic @ 2022-12-30 11:59 UTC (permalink / raw)
  To: tomas; +Cc: emacs-devel

<tomas@tuxteam.de> writes:

> On Fri, Dec 30, 2022 at 11:25:25AM +0000, Philip Kaludercic wrote:
>
>> If acceptable, it looks good.  I could imagine that it should be OK if
>> we point to GitHub, since we are just using it as a Git host.  Here are
>> a few suggestions
>
> I won't object, as long as I can excise it with my own
> hands. I don't like to see how That Company silently
> slithers into more and more basic infrastructure.

What do you mean by "exercise"?  Install the rules manually?

Perhaps it would be better to download tarballs instead of cloning a
repository of the depth 1?  They are easier to configure and usually
faster to download.  The result should be the same anyway.



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-30 11:00                                                   ` Philip Kaludercic
@ 2022-12-30 12:07                                                     ` Gregory Heytings
  2022-12-30 13:10                                                       ` Philip Kaludercic
  0 siblings, 1 reply; 138+ messages in thread
From: Gregory Heytings @ 2022-12-30 12:07 UTC (permalink / raw)
  To: Philip Kaludercic
  Cc: Yuan Fu, Stefan Monnier, Eli Zaretskii, Dmitry Gutov, Tim Cross,
	emacs-devel


>
> I did try this out and confirmed that it does work (or at least I hope I 
> did say that).  I didn't have a nice script like you give here, but that 
> this is possible was clear.
>

You didn't, no, and it wasn't clear.  You merely said "it might be 
possible" (three days ago) and "it seems that it is possible" (yesterday). 
To which I replied that it isn't possible "without a lot of 
complications".  The script I sent is meant to clarify that point: to show 
how and to what extent it is possible, and what the complications (having 
to modify the source grammars manually) are.




^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-30 11:59                                         ` Philip Kaludercic
@ 2022-12-30 12:27                                           ` tomas
  2022-12-30 12:45                                             ` Philip Kaludercic
  2022-12-30 14:26                                             ` Dmitry Gutov
  0 siblings, 2 replies; 138+ messages in thread
From: tomas @ 2022-12-30 12:27 UTC (permalink / raw)
  To: Philip Kaludercic; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1076 bytes --]

On Fri, Dec 30, 2022 at 11:59:39AM +0000, Philip Kaludercic wrote:
> <tomas@tuxteam.de> writes:
> 
> > On Fri, Dec 30, 2022 at 11:25:25AM +0000, Philip Kaludercic wrote:
> >
> >> If acceptable, it looks good.  I could imagine that it should be OK if
> >> we point to GitHub, since we are just using it as a Git host.  Here are
> >> a few suggestions
> >
> > I won't object, as long as I can excise it with my own
> > hands. I don't like to see how That Company silently
> > slithers into more and more basic infrastructure.
> 
> What do you mean by "exercise"?  Install the rules manually?

I said "excise". More traumatic :-)

> Perhaps it would be better to download tarballs instead of cloning a
> repository of the depth 1?  They are easier to configure and usually
> faster to download.  The result should be the same anyway.

Yes, perhaps. Personally, I try as hard as I can to keep a safe
distance between github and myself. Probably off-topic here going
into too much detail why.

Cheers & thanks nevertheless for your hard work :)

-- 
t

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-30 12:27                                           ` tomas
@ 2022-12-30 12:45                                             ` Philip Kaludercic
  2022-12-30 14:26                                             ` Dmitry Gutov
  1 sibling, 0 replies; 138+ messages in thread
From: Philip Kaludercic @ 2022-12-30 12:45 UTC (permalink / raw)
  To: tomas; +Cc: emacs-devel, Yuan Fu

tomas@tuxteam.de writes:

> On Fri, Dec 30, 2022 at 11:59:39AM +0000, Philip Kaludercic wrote:
>> <tomas@tuxteam.de> writes:
>> 
>> > On Fri, Dec 30, 2022 at 11:25:25AM +0000, Philip Kaludercic wrote:
>> >
>> >> If acceptable, it looks good.  I could imagine that it should be OK if
>> >> we point to GitHub, since we are just using it as a Git host.  Here are
>> >> a few suggestions
>> >
>> > I won't object, as long as I can excise it with my own
>> > hands. I don't like to see how That Company silently
>> > slithers into more and more basic infrastructure.
>> 
>> What do you mean by "exercise"?  Install the rules manually?
>
> I said "excise". More traumatic :-)

Whoops, that makes more sense.

>> Perhaps it would be better to download tarballs instead of cloning a
>> repository of the depth 1?  They are easier to configure and usually
>> faster to download.  The result should be the same anyway.
>
> Yes, perhaps. Personally, I try as hard as I can to keep a safe
> distance between github and myself. Probably off-topic here going
> into too much detail why.

That is totally relatable.

> Cheers & thanks nevertheless for your hard work :)

I did nothing here, just wrote a 10-line diff and nagged a lot :)



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-30 12:07                                                     ` Gregory Heytings
@ 2022-12-30 13:10                                                       ` Philip Kaludercic
  2022-12-30 15:23                                                         ` Gregory Heytings
  0 siblings, 1 reply; 138+ messages in thread
From: Philip Kaludercic @ 2022-12-30 13:10 UTC (permalink / raw)
  To: Gregory Heytings
  Cc: Yuan Fu, Stefan Monnier, Eli Zaretskii, Dmitry Gutov, Tim Cross,
	emacs-devel

Gregory Heytings <gregory@heytings.org> writes:

>>
>> I did try this out and confirmed that it does work (or at least I
>> hope I did say that).  I didn't have a nice script like you give
>> here, but that this is possible was clear.
>>
>
> You didn't, no, and it wasn't clear.  You merely said "it might be
> possible" (three days ago) and "it seems that it is possible"
> (yesterday). To which I replied that it isn't possible "without a lot
> of complications".  The script I sent is meant to clarify that point:
> to show how and to what extent it is possible, and what the
> complications (having to modify the source grammars manually) are.

You are right, sorry about that.  I tested it out but didn't report back
on my results.

What I am considering doing is contacting the tree-sitter developers and
arguing in favour of "specifying" that a grammar has to be written in
standardised ECMAScript, instead of node.js.  The adjustment would be
relatively minor on their end, but make the system more portable.



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: [SPAM UNSURE] Re: Tree-sitter introduction documentation
  2022-12-29 21:50                                                     ` [SPAM UNSURE] " Stephen Leake
  2022-12-29 22:37                                                       ` Lynn Winebarger
@ 2022-12-30 14:10                                                       ` Lynn Winebarger
  2022-12-30 16:25                                                         ` Targeting libtreesitter from wisent and other parser generators for emacs Lynn Winebarger
  1 sibling, 1 reply; 138+ messages in thread
From: Lynn Winebarger @ 2022-12-30 14:10 UTC (permalink / raw)
  To: Stephen Leake
  Cc: Gregory Heytings, Philip Kaludercic, Yuan Fu, Stefan Monnier,
	Eli Zaretskii, Dmitry Gutov, Tim Cross, emacs-devel

On Thu, Dec 29, 2022 at 4:50 PM Stephen Leake
<stephen_leake@stephe-leake.org> wrote:
> > The functionality of libtreesitter is probably useful independent of
> > the tool used to create the module it loads, as long as it satisfies
> > the functional requirements. Would the treesitter authors be amenable
> > to establishing a documented ABI for that component so other
> > parser-generators could target it?
>
> That's worth filing an issue on the tree-sitter development site. I
> looked briefly, and did not see a similar issue.

Done - https://github.com/tree-sitter/tree-sitter/issues/2006

Lynn



^ permalink raw reply	[flat|nested] 138+ messages in thread

* Re: Tree-sitter introduction documentation
  2022-12-30 12:27                                           ` tomas
  2022-12-30 12:45                                             ` Philip Kaludercic
@ 2022-12-30 14:26                                             ` Dmitry Gutov
  1 sibling, 0 replies; 138+ messages in thread
From: Dmitry Gutov @ 2022-12-30 14:26 UTC (permalink / raw)
  To: tomas, Philip Kaludercic; +Cc: emacs-devel

On 30/12/2022 14:27, tomas@tuxteam.de wrote:
>> Perhaps it would be better to download tarballs instead of cloning a
>> repository of the depth 1?  They are easier to configure and usually
>> faster to download.  The result should be the same anyway.
> Yes, perhaps. Personally, I try as hard as I can to keep a safe
> distance between github and myself. Probably off-topic here going
> into too much detail why.

I suppose someone might take it upon themselves to provide a mirror for 
all grammars on some more neutral host.

But this would be an ongoing effort, so unless it's backed by an entity 
with some reputation, the Emacs project wouldn't be able to rely on it 
(in the default list).




* Re: Tree-sitter introduction documentation
  2022-12-30 13:10                                                       ` Philip Kaludercic
@ 2022-12-30 15:23                                                         ` Gregory Heytings
  0 siblings, 0 replies; 138+ messages in thread
From: Gregory Heytings @ 2022-12-30 15:23 UTC (permalink / raw)
  To: Philip Kaludercic
  Cc: Yuan Fu, Stefan Monnier, Eli Zaretskii, Dmitry Gutov, Tim Cross,
	emacs-devel


>> The script I sent is meant to clarify that point: to show how and to 
>> what extent it is possible, and what the complications (having to 
>> modify the source grammars manually) are.
>
> You are right, sorry about that.
>

No worries.

I extended the hack a bit further:

global = {}
module = {}
process = { env: { TREE_SITTER_GRAMMAR_PATH: 'grammar.js' } }
function require(file) {
   const pref = [ "", "./", "../", "../../", "../../../" ];
   const suff = [ "", ".js" ];
   for (let i in pref)
     for (let j in suff) {
       const f = pref[i] + file + suff[j];
       if (std.open(f, "r") !== null) {
 	os.chdir(f.match(/.*\//));
 	eval(std.loadFile(f.replace(/.*\//, '')));
 	return module.exports;
       }
     }
   throw Error('File ' + file + ' not found');
}
std.loadScript('/path/to/the/script/dsl.js')

With that script, if you clone the 63 repositories listed on 
https://tree-sitter.github.io/tree-sitter/, and the tree-sitter-clojure 
repository (which is used by tree-sitter-commonlisp), in the same 
directory, you can generate 64 out of the 66 grammar.json files (two 
repositories contain two grammars: tree-sitter-typescript and 
tree-sitter-wasm) without any modification with:

qjs --std /path/to/that/script.js > src/grammar.json

The two exceptions are swift (in the grammar.json file produced by 
qjs, two values seem to be misplaced) and toml (because it depends on 
regexp-util).





* Re: Tree-sitter introduction documentation
  2022-12-30 11:06                                   ` Yuan Fu
  2022-12-30 11:25                                     ` Philip Kaludercic
@ 2022-12-30 15:31                                     ` Eli Zaretskii
  2022-12-30 15:54                                       ` Philip Kaludercic
                                                         ` (3 more replies)
  2023-01-01  3:03                                     ` Richard Stallman
  2 siblings, 4 replies; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-30 15:31 UTC (permalink / raw)
  To: Yuan Fu; +Cc: philipk, monnier, dgutov, theophilusx, emacs-devel

> From: Yuan Fu <casouri@gmail.com>
> Date: Fri, 30 Dec 2022 03:06:37 -0800
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>,
>  Eli Zaretskii <eliz@gnu.org>,
>  Dmitry Gutov <dgutov@yandex.ru>,
>  theophilusx@gmail.com,
>  emacs-devel@gnu.org
> 
> > I have asked the question before, but freedom or not, the above is a
> > nuisance to run for every language.  If the process is as automatic as
> > the above example demonstrates, shouldn't Emacs have a command to take a
> > grammar and compile+install it?  I guess this could be more complicated
> > if the grammar is generated using a custom tool-chain for each language
> > (or is it always Javascript?), but nothing impossible.
> 
> Through the magic of programming, such a command now exists: treesit-install-language-grammar. It needs recipes to work, though. The recipe would involve https://github.com, which I guess is probably too heretical to include in Emacs source, so I left the recipes empty. I tested the install command with these recipes:
> 
> (setq treesit-language-source-alist
>       '((python "https://github.com/tree-sitter/tree-sitter-python.git")
>         (typescript "https://github.com/tree-sitter/tree-sitter-typescript.git"
>                     "typescript/src" "typescript")))

Thanks.  I did some minor fixes to the doc strings, but this command
still "needs work"(TM).  See my comments below:

  This command requires Git, a C compiler and (sometimes) a C++ compiler,
  and the linker to be installed and on PATH.  It also requires that the
  recipe for LANG exists in `treesit-language-source-alist'.

I don't think treesit-language-source-alist is a good idea, especially
if we don't intend populating it, at least not as a user-facing
feature.  Instead, the command should ask the user for the relevant
values, and offer recording the values on some file that would be read
next time the user wants to install an updated library.

  OUT-DIR is the directory to put the compiled library file, it
  defaults to ~/.emacs.d/tree-sitter.

I don't understand what "defaults" means here, since OUT-DIR is not an
optional argument of treesit--install-language-grammar-1.

  (let* ((lang (symbol-name lang))
         (default-directory "/tmp")

A literal "/tmp" is not portable and un-Emacsy; please use
temporary-file-directory instead.
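
A minimal sketch of that suggestion, assuming a hypothetical helper
named my-treesit--make-workdir (it is not part of treesit.el).
make-temp-file resolves a relative prefix against
temporary-file-directory, so no literal "/tmp" is needed:

```elisp
;; Hypothetical helper for illustration; not part of treesit.el.
(defun my-treesit--make-workdir (lang)
  "Create and return a fresh working directory for building LANG's grammar.
The directory is created under `temporary-file-directory'."
  ;; A non-nil second argument makes `make-temp-file' create a
  ;; directory instead of a regular file.
  (make-temp-file (format "treesit-%s-" lang) t))
```

The caller would bind default-directory to the returned directory and
delete it once the build is done.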

         (soext (pcase system-type
                  ('darwin "dylib")
                  ((or 'ms-dos 'cywin 'windows-nt) "dll")

MS-DOS doesn't use DLL files.  Please use dynamic-library-suffixes
instead, it's already set up correctly.  And the code should be ready
for that variable having a nil value.
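
One way this could look, as a sketch only: the helper name
my-treesit--library-file-name is hypothetical, and taking the first
element of dynamic-library-suffixes is an assumption about which
suffix is preferred on a given platform.

```elisp
;; Hypothetical helper for illustration; not part of treesit.el.
(defun my-treesit--library-file-name (lang)
  "Return the shared library file name for LANG, a symbol.
Signal an error if no dynamic library suffix is known."
  ;; `bound-and-true-p' covers both an unbound variable and a nil value.
  (let ((suffix (car (bound-and-true-p dynamic-library-suffixes))))
    (unless suffix
      (error "Dynamic libraries are not supported on this system"))
    ;; The suffixes already include the leading dot, e.g. ".so".
    (format "libtree-sitter-%s%s" lang suffix)))
```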

          (message "Cloning repository")
          ;; git clone xxx --depth 1 --quiet workdir
          (treesit--call-process-signal
           "git" nil t nil "clone" url "--depth" "1" "--quiet"
           workdir)

Why "--depth 1"?  This should be a defcustom, and the default should
be to clone the full repository, IMO.  Also, what about updating the
library when it is already installed, and the Git repository already
exists for it?  Or are we going to clone anew each time and then
remove the repository? that could make its cloning be slow in some
cases.

          ;; cp "${grammardir}"/grammar.js "${sourcedir}"
          (copy-file (concat grammar-dir "/grammar.js")
                     (concat source-dir "/grammar.js"))

Why is this part needed?  In any case, please don't use concat to
produce file names, use expand-file-name instead.  Also, we should
call copy-file with 4th argument non-nil, I think.
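
A sketch of the same copy written with expand-file-name, assuming a
hypothetical helper name (my-treesit--copy-grammar-js).  Passing t as
the third argument, to overwrite an existing file, is an assumption
that was not in the original call:

```elisp
;; Hypothetical helper for illustration; not part of treesit.el.
(defun my-treesit--copy-grammar-js (grammar-dir source-dir)
  "Copy grammar.js from GRAMMAR-DIR into SOURCE-DIR.
Return the name of the new file."
  (let ((from (expand-file-name "grammar.js" grammar-dir))
        (to (expand-file-name "grammar.js" source-dir)))
    ;; 3rd argument t: overwrite an existing file without asking.
    ;; 4th argument t: keep the modification time of the original.
    (copy-file from to t t)
    to))
```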

          (treesit--call-process-signal
           cc nil t nil "-fPIC" "-c" "-I." "parser.c")

I wonder why we don't use 'compile' here.  That would show the
compiler output without any extra efforts.

          ;; Copy out.
          (copy-file lib-name (concat out-dir "/") t)

See above: don't use concat here.

This command should also be mentioned in NEWS, where we describe how
to install the grammar libraries.

Bottom line: I think we need first to discuss how we want such a
facility to work, and only then implement it.




* Re: Tree-sitter introduction documentation
  2022-12-30 15:31                                     ` Eli Zaretskii
@ 2022-12-30 15:54                                       ` Philip Kaludercic
  2022-12-30 16:17                                         ` Eli Zaretskii
  2022-12-31  0:06                                         ` Yuan Fu
  2022-12-31  0:03                                       ` Yuan Fu
                                                         ` (2 subsequent siblings)
  3 siblings, 2 replies; 138+ messages in thread
From: Philip Kaludercic @ 2022-12-30 15:54 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Yuan Fu, monnier, dgutov, theophilusx, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>           (message "Cloning repository")
>           ;; git clone xxx --depth 1 --quiet workdir
>           (treesit--call-process-signal
>            "git" nil t nil "clone" url "--depth" "1" "--quiet"
>            workdir)
>
> Why "--depth 1"?  This should be a defcustom, and the default should
> be to clone the full repository, IMO.  Also, what about updating the
> library when it is already installed, and the Git repository already
> exists for it?  Or are we going to clone anew each time and then
> remove the repository? that could make its cloning be slow in some
> cases.

I have proposed just downloading a tarball.  GitHub provides these for
each tag, and the tree-sitter developers appear to tag versions on a
regular basis.  The file could then be downloaded via url.el instead of
using Git.

  https://github.com/tree-sitter/tree-sitter-c/archive/refs/tags/v0.20.2.tar.gz
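
A rough sketch of the tarball approach, assuming the URL layout shown
above holds for other repositories and tags.  Both function names are
hypothetical; url-copy-file comes from url-handlers.el:

```elisp
(require 'url-handlers)

;; Hypothetical helpers for illustration; not part of treesit.el.
(defun my-treesit--github-tarball-url (owner repo tag)
  "Return the URL of the GitHub tarball for TAG of OWNER/REPO."
  (format "https://github.com/%s/%s/archive/refs/tags/%s.tar.gz"
          owner repo tag))

(defun my-treesit--download-tarball (owner repo tag dir)
  "Download the tarball for TAG of OWNER/REPO into DIR.
Return the name of the downloaded file."
  (let ((dest (expand-file-name (format "%s-%s.tar.gz" repo tag) dir)))
    ;; The third argument t overwrites a previously downloaded file.
    (url-copy-file (my-treesit--github-tarball-url owner repo tag) dest t)
    dest))
```

Unpacking the archive would still need tar and gzip, or an equivalent
written in Lisp, which this sketch leaves out.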




* Re: Tree-sitter introduction documentation
  2022-12-30 15:54                                       ` Philip Kaludercic
@ 2022-12-30 16:17                                         ` Eli Zaretskii
  2022-12-31  0:06                                         ` Yuan Fu
  1 sibling, 0 replies; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-30 16:17 UTC (permalink / raw)
  To: Philip Kaludercic; +Cc: casouri, monnier, dgutov, theophilusx, emacs-devel

> From: Philip Kaludercic <philipk@posteo.net>
> Cc: Yuan Fu <casouri@gmail.com>,  monnier@iro.umontreal.ca,
>   dgutov@yandex.ru,  theophilusx@gmail.com,  emacs-devel@gnu.org
> Date: Fri, 30 Dec 2022 15:54:11 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >           (message "Cloning repository")
> >           ;; git clone xxx --depth 1 --quiet workdir
> >           (treesit--call-process-signal
> >            "git" nil t nil "clone" url "--depth" "1" "--quiet"
> >            workdir)
> >
> > Why "--depth 1"?  This should be a defcustom, and the default should
> > be to clone the full repository, IMO.  Also, what about updating the
> > library when it is already installed, and the Git repository already
> > exists for it?  Or are we going to clone anew each time and then
> > remove the repository? that could make its cloning be slow in some
> > cases.
> 
> I have proposed just downloading a tarball.

That could be a good solution.

But again, we need a broader view and concept on how this kind of
feature should work.  We don't have anything similar to it in Emacs,
AFAIK, so we are in uncharted land here, kind of.




* Targeting libtreesitter from wisent and other parser generators for emacs
  2022-12-30 14:10                                                       ` Lynn Winebarger
@ 2022-12-30 16:25                                                         ` Lynn Winebarger
  2022-12-31  8:25                                                           ` Eli Zaretskii
  0 siblings, 1 reply; 138+ messages in thread
From: Lynn Winebarger @ 2022-12-30 16:25 UTC (permalink / raw)
  To: Stephen Leake
  Cc: Gregory Heytings, Philip Kaludercic, Yuan Fu, Stefan Monnier,
	Eli Zaretskii, Dmitry Gutov, Tim Cross, emacs-devel

On Fri, Dec 30, 2022 at 9:10 AM Lynn Winebarger <owinebar@gmail.com> wrote:
>
> On Thu, Dec 29, 2022 at 4:50 PM Stephen Leake
> <stephen_leake@stephe-leake.org> wrote:
> > > The functionality of libtreesitter is probably useful independent of
> > > the tool used to create the module it loads, as long as it satisfies
> > > the functional requirements. Would the treesitter authors be amenable
> > > to establishing a documented ABI for that component so other
> > > parser-generators could target it?
> >
> > That's worth filing an issue on the tree-sitter development site. I
> > looked briefly, and did not see a similar issue.
>
> Done - https://github.com/tree-sitter/tree-sitter/issues/2006

If I tried to do something quick and dirty with wisent to see if I
could get something working, can anyone provide some test cases so I
could tell whether the code produced behaves correctly with
libtreesitter?  I've never actually used tree-sitter, in emacs or
otherwise, so I just don't know what I would be testing for.
It's probably best to start with something simple that is easy to
write equivalent grammar specifications for both tree-sitter-cli and
wisent/bison.  JSON is one with an existing tree-sitter grammar.  A
spec for an EBNF-type grammar would be straightforward to write and
useful as well.
Would anyone be willing to supply me with an emacs mode for one of
these (or similar) that I could use to compare the behavior of
tree-sitter-cli generated library to the behavior of a wisent/bison
generated library?

Lynn




* Re: Tree-sitter introduction documentation
  2022-12-30 11:25                                     ` Philip Kaludercic
  2022-12-30 11:54                                       ` tomas
@ 2022-12-30 23:33                                       ` Yuan Fu
  1 sibling, 0 replies; 138+ messages in thread
From: Yuan Fu @ 2022-12-30 23:33 UTC (permalink / raw)
  To: Philip Kaludercic
  Cc: Stefan Monnier, Eli Zaretskii, Dmitry Gutov, theophilusx,
	emacs-devel



> On Dec 30, 2022, at 3:25 AM, Philip Kaludercic <philipk@posteo.net> wrote:
> 
> Yuan Fu <casouri@gmail.com> writes:
> 
>>> On Dec 27, 2022, at 8:44 AM, Philip Kaludercic <philipk@posteo.net> wrote:
>>> 
>>> Stefan Monnier <monnier@iro.umontreal.ca> writes:
>>> 
>>>>> It doesn't need any project, it is literally two command lines.
>>>>> Here's an example:
>>>>> 
>>>>> gcc -O2 -I.   -c -o parser.o parser.c
>>>>> gcc  -shared parser.o scanner.o  -ltree-sitter -o libtree-sitter-c-sharp.dll
>>>> 
>>>> AFAIK `parser.c` is a file generated from the actual grammar's source,
>>>> itself written in Javascript.
>>>> 
>>>> So the above instructions are akin to downloading a precompiled binary
>>>> and installing it.  While it is the most convenient path for the
>>>> end-users, it's important w.r.t Freedom to make sure that grammars can
>>>> also be regenerated from source by the end users.
>>> 
>>> I have asked the question before, but freedom or not, the above is a
>>> nuisance to run for every language.  If the process is as automatic as
>>> the above example demonstrates, shouldn't Emacs have a command to take a
>>> grammar and compile+install it?  I guess this could be more complicated
>>> if the grammar is generated using a custom tool-chain for each language
>>> (or is it always Javascript?), but nothing impossible.
>> 
>> Through the magic of programming, such a command now exists: treesit-install-language-grammar. It needs recipes to work, though. The recipe would involve https://github.com, which I guess is probably too heretical to include in Emacs source, so I left the recipes empty. I tested the install command with these recipes:
>> 
>> (setq treesit-language-source-alist
>>      '((python "https://github.com/tree-sitter/tree-sitter-python.git")
>>        (typescript "https://github.com/tree-sitter/tree-sitter-typescript.git"
>>                    "typescript/src" "typescript")))
>> 
>> Yuan
> 
> If acceptable, it looks good.  I could imagine that it should be OK if
> we point to GitHub, since we are just using it as a Git host.  Here are
> a few suggestions

Thanks, I made some changes according to your diff.

Yuan



* Re: Tree-sitter introduction documentation
  2022-12-30 15:31                                     ` Eli Zaretskii
  2022-12-30 15:54                                       ` Philip Kaludercic
@ 2022-12-31  0:03                                       ` Yuan Fu
  2022-12-31  0:25                                         ` Stefan Monnier
  2022-12-31  9:24                                         ` Eli Zaretskii
  2022-12-31  0:44                                       ` Gregory Heytings
  2023-01-03  4:08                                       ` Richard Stallman
  3 siblings, 2 replies; 138+ messages in thread
From: Yuan Fu @ 2022-12-31  0:03 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: Philip Kaludercic, monnier, dgutov, theophilusx, emacs-devel



> On Dec 30, 2022, at 7:31 AM, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Yuan Fu <casouri@gmail.com>
>> Date: Fri, 30 Dec 2022 03:06:37 -0800
>> Cc: Stefan Monnier <monnier@iro.umontreal.ca>,
>> Eli Zaretskii <eliz@gnu.org>,
>> Dmitry Gutov <dgutov@yandex.ru>,
>> theophilusx@gmail.com,
>> emacs-devel@gnu.org
>> 
>>> I have asked the question before, but freedom or not, the above is a
>>> nuisance to run for every language.  If the process is as automatic as
>>> the above example demonstrates, shouldn't Emacs have a command to take a
>>> grammar and compile+install it?  I guess this could be more complicated
>>> if the grammar is generated using a custom tool-chain for each language
>>> (or is it always Javascript?), but nothing impossible.
>> 
>> Through the magic of programming, such a command now exists: treesit-install-language-grammar. It needs recipes to work, though. The recipe would involve https://github.com, which I guess is probably too heretical to include in Emacs source, so I left the recipes empty. I tested the install command with these recipes:
>> 
>> (setq treesit-language-source-alist
>>      '((python "https://github.com/tree-sitter/tree-sitter-python.git")
>>        (typescript "https://github.com/tree-sitter/tree-sitter-typescript.git"
>>                    "typescript/src" "typescript")))
> 
> Thanks.  I did some minor fixes to the doc strings, but this command
> still "needs work"(TM).  See my comments below:
> 
>  This command requires Git, a C compiler and (sometimes) a C++ compiler,
>  and the linker to be installed and on PATH.  It also requires that the
>  recipe for LANG exists in `treesit-language-source-alist'.
> 
> I don't think treesit-language-source-alist is a good idea, especially
> if we don't intend populating it, at least not as a user-facing
> feature.  Instead, the command should ask the user for the relevant
> values, and offer recording the values on some file that would be read
> next time the user wants to install an updated library.

I consider this a fallback method for installing language grammars. Distros might not end up bundling language grammars for us, and even if they do, they can’t cover every grammar, so some users would end up needing to install grammars themselves. If we don’t include this feature, someone will definitely write something like this and make it a third-party package (indeed, someone already has). So we might as well have it in Emacs and do it right.

This is the use case that I had in mind when writing this function: some major mode xxx-mode requires the language grammar for xxx, so it has the following instruction in its readme:

Add the installation recipe for tree-sitter-xxx to your config, and run treesit-install-language-grammar:

(add-to-list 'treesit-language-source-alist
             '(xxx "https://github.com/xxx/tree-sitter-xxx.git"))

> 
>  OUT-DIR is the directory to put the compiled library file, it
>  defaults to ~/.emacs.d/tree-sitter.
> 
> I don't understand what "defaults" means here, since OUT-DIR is not an
> optional argument of treesit--install-language-grammar-1.

Ah yes, fixed.

> 
>  (let* ((lang (symbol-name lang))
>         (default-directory "/tmp")
> 
> A literal "/tmp" is not portable and un-Emacsy; please use
> temporary-file-directory instead.
> 
>         (soext (pcase system-type
>                  ('darwin "dylib")
>                  ((or 'ms-dos 'cywin 'windows-nt) "dll")
> 
> MS-DOS doesn't use DLL files.  Please use dynamic-library-suffixes
> instead, it's already set up correctly.  And the code should be ready
> for that variable having a nil value.

Fixed those, thanks.

> 
>          (message "Cloning repository")
>          ;; git clone xxx --depth 1 --quiet workdir
>          (treesit--call-process-signal
>           "git" nil t nil "clone" url "--depth" "1" "--quiet"
>           workdir)
> 
> Why "--depth 1"?  This should be a defcustom, and the default should
> be to clone the full repository, IMO.  Also, what about updating the
> library when it is already installed, and the Git repository already
> exists for it?  Or are we going to clone anew each time and then
> remove the repository? that could make its cloning be slow in some
> cases.

Since the purpose of this command is to install the grammar, why would we want a full clone? For an “average user”, all they need is the library. If they want to hack on the grammar, it makes more sense to install the toolchain and clone the repository themselves. And yes, this command clones anew each time and removes the repository.

> 
>          ;; cp "${grammardir}"/grammar.js "${sourcedir}"
>          (copy-file (concat grammar-dir "/grammar.js")
>                     (concat source-dir "/grammar.js"))
> 
> Why is this part needed?  In any case, please don't use concat to
> produce file names, use expand-file-name instead.  Also, we should
> call copy-file with 4th argument non-nil, I think.

To be honest, I don’t remember; it is in build.sh, so I copied it verbatim. I’ll see what it’s for. (But I kept it for now.)

> 
>          (treesit--call-process-signal
>           cc nil t nil "-fPIC" "-c" "-I." "parser.c")
> 
> I wonder why we don't use 'compile' here.  That would show the
> compiler output without any extra efforts.

I wanted to keep it simple, synchronous, and quiet, and didn’t think much about it.

> 
>          ;; Copy out.
>          (copy-file lib-name (concat out-dir "/") t)
> 
> See above: don't use concat here.
> 
> This command should also be mentioned in NEWS, where we describe how
> to install the grammar libraries.

I’ll do that if we decide this function is desirable and good.

> Bottom line: I think we need first to discuss how we want such a
> facility to work, and only then implement it.

I agree. I was worried about the feature freeze thing :-)

Yuan





* Re: Tree-sitter introduction documentation
  2022-12-30 15:54                                       ` Philip Kaludercic
  2022-12-30 16:17                                         ` Eli Zaretskii
@ 2022-12-31  0:06                                         ` Yuan Fu
  2022-12-31  0:12                                           ` Philip Kaludercic
  1 sibling, 1 reply; 138+ messages in thread
From: Yuan Fu @ 2022-12-31  0:06 UTC (permalink / raw)
  To: Philip Kaludercic
  Cc: Eli Zaretskii, monnier, dgutov, theophilusx, emacs-devel



> On Dec 30, 2022, at 7:54 AM, Philip Kaludercic <philipk@posteo.net> wrote:
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
>>          (message "Cloning repository")
>>          ;; git clone xxx --depth 1 --quiet workdir
>>          (treesit--call-process-signal
>>           "git" nil t nil "clone" url "--depth" "1" "--quiet"
>>           workdir)
>> 
>> Why "--depth 1"?  This should be a defcustom, and the default should
>> be to clone the full repository, IMO.  Also, what about updating the
>> library when it is already installed, and the Git repository already
>> exists for it?  Or are we going to clone anew each time and then
>> remove the repository? that could make its cloning be slow in some
>> cases.
> 
> I have proposed just downloading a tarball.  GitHub provides these for
> each tag, and the tree-sitter developers appear to tag versions on a
> regular basis.  The file could then be downloaded via url.el instead of
> using Git.
> 
>  https://github.com/tree-sitter/tree-sitter-c/archive/refs/tags/v0.20.2.tar.gz

Not all language grammars bother to make a release[1]. The fallback method had better support as many cases as possible.

[1] https://github.com/elixir-lang/tree-sitter-elixir

Yuan



* Re: Tree-sitter introduction documentation
  2022-12-31  0:06                                         ` Yuan Fu
@ 2022-12-31  0:12                                           ` Philip Kaludercic
  2023-01-01  1:18                                             ` Yuan Fu
  0 siblings, 1 reply; 138+ messages in thread
From: Philip Kaludercic @ 2022-12-31  0:12 UTC (permalink / raw)
  To: Yuan Fu; +Cc: Eli Zaretskii, monnier, dgutov, theophilusx, emacs-devel

Yuan Fu <casouri@gmail.com> writes:

>> On Dec 30, 2022, at 7:54 AM, Philip Kaludercic <philipk@posteo.net> wrote:
>> 
>> Eli Zaretskii <eliz@gnu.org> writes:
>> 
>>>          (message "Cloning repository")
>>>          ;; git clone xxx --depth 1 --quiet workdir
>>>          (treesit--call-process-signal
>>>           "git" nil t nil "clone" url "--depth" "1" "--quiet"
>>>           workdir)
>>> 
>>> Why "--depth 1"?  This should be a defcustom, and the default should
>>> be to clone the full repository, IMO.  Also, what about updating the
>>> library when it is already installed, and the Git repository already
>>> exists for it?  Or are we going to clone anew each time and then
>>> remove the repository? that could make its cloning be slow in some
>>> cases.
>> 
>> I have proposed just downloading a tarball.  GitHub provides these for
>> each tag, and the tree-sitter developers appear to tag versions on a
>> regular basis.  The file could then be downloaded via url.el instead of
>> using Git.
>> 
>>  https://github.com/tree-sitter/tree-sitter-c/archive/refs/tags/v0.20.2.tar.gz
>
> Not all language grammars bother to make a release[1]. The fallback method had better support as many cases as possible.
>
> [1] https://github.com/elixir-lang/tree-sitter-elixir

That doesn't have to be a blocker.  We can download a tarball for each
commit, if an explicit release is missing.

  https://codeload.github.com/elixir-lang/tree-sitter-elixir/tar.gz/b20eaa75565243c50be5e35e253d8beb58f45d56




* Re: Tree-sitter introduction documentation
  2022-12-31  0:03                                       ` Yuan Fu
@ 2022-12-31  0:25                                         ` Stefan Monnier
  2023-01-01  1:16                                           ` Yuan Fu
  2022-12-31  9:24                                         ` Eli Zaretskii
  1 sibling, 1 reply; 138+ messages in thread
From: Stefan Monnier @ 2022-12-31  0:25 UTC (permalink / raw)
  To: Yuan Fu; +Cc: Eli Zaretskii, Philip Kaludercic, dgutov, theophilusx,
	emacs-devel

>> Why "--depth 1"?  This should be a defcustom, and the default should
>> be to clone the full repository, IMO.  Also, what about updating the
>> library when it is already installed, and the Git repository already
>> exists for it?  Or are we going to clone anew each time and then
>> remove the repository? that could make its cloning be slow in some
>> cases.
>
> Since the purpose of this command is to install the grammar, why would we
> want a full clone?

Maybe it doesn't matter very much but:
- I suspect most of those grammars have a fairly limited history, so I'd
  expect the savings from `--depth 1` to be rather small.
- And ... [...checking the previous point's theory...]

Wow, that's really weird: they do have a fairly short history (like ~200
commits) but the full clone is *much* larger than the `--depth 1` clone.
I guess it's because they store the generated `.c` file in there and the
small changes in the source cause much larger changes in the
generated file.

> I wanted to keep it simple, synchronous, and quiet, and didn’t think
> much about it.

I think we should default to asynchronous operations as much as possible.
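
For illustration, a minimal sketch of what an asynchronous step could
look like with make-process.  The names my-treesit--run-async and
my-treesit--sentinel, and the my-callback process property, are
hypothetical.  The callback is stored on the process object so the
code does not depend on lexical binding:

```elisp
;; Hypothetical helpers for illustration; not part of treesit.el.
(defun my-treesit--sentinel (process _event)
  "Call the callback stored on PROCESS once PROCESS has finished."
  (when (memq (process-status process) '(exit signal))
    (let* ((buffer (process-buffer process))
           (output (with-current-buffer buffer (buffer-string))))
      (kill-buffer buffer)
      (funcall (process-get process 'my-callback)
               (process-exit-status process)
               output))))

(defun my-treesit--run-async (name command callback)
  "Run COMMAND, a list of strings, asynchronously as process NAME.
When it finishes, call CALLBACK with the exit status and the output."
  (let ((process (make-process
                  :name name
                  :buffer (generate-new-buffer (format " *%s*" name))
                  :command command
                  :sentinel #'my-treesit--sentinel)))
    ;; Sentinels only run while Emacs waits for input or process
    ;; output, so storing the callback after creation is safe.
    (process-put process 'my-callback callback)
    process))
```

A build would chain the clone, compile and link steps by starting the
next process from the callback of the previous one.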


        Stefan





* Re: Tree-sitter introduction documentation
  2022-12-30 15:31                                     ` Eli Zaretskii
  2022-12-30 15:54                                       ` Philip Kaludercic
  2022-12-31  0:03                                       ` Yuan Fu
@ 2022-12-31  0:44                                       ` Gregory Heytings
  2023-01-03  4:08                                       ` Richard Stallman
  3 siblings, 0 replies; 138+ messages in thread
From: Gregory Heytings @ 2022-12-31  0:44 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Yuan Fu, philipk, monnier, dgutov, theophilusx, emacs-devel


>
> Why "--depth 1"?
>

Because of cases like https://github.com/tree-sitter/tree-sitter-c-sharp, 
for which git clone takes ~3 minutes and git clone --depth 1 takes only 2 
seconds.  If the purpose is only to install a library, only the current 
version of the repository matters.
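
Both positions could be accommodated by a user option, sketched below
under the assumption of a hypothetical option my-treesit-clone-depth
and a hypothetical helper my-treesit--git-clone-args.  The default of
1 follows the timing argument above; a nil value requests a full
clone:

```elisp
;; Hypothetical option and helper for illustration; not part of treesit.el.
(defcustom my-treesit-clone-depth 1
  "Number of commits to fetch when cloning a grammar repository.
If nil, clone the full history."
  :type '(choice (const :tag "Full history" nil)
                 (integer :tag "Depth"))
  :group 'tools)

(defun my-treesit--git-clone-args (url workdir)
  "Return the list of arguments for git to clone URL into WORKDIR."
  (append (list "clone" url "--quiet")
          ;; Only pass --depth when a depth is configured.
          (when my-treesit-clone-depth
            (list "--depth" (number-to-string my-treesit-clone-depth)))
          (list workdir)))
```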





* Re: Tree-sitter introduction documentation
  2022-12-31  6:59 Pedro Andres Aranda Gutierrez
@ 2022-12-31  7:47 ` Eli Zaretskii
  0 siblings, 0 replies; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-31  7:47 UTC (permalink / raw)
  To: Pedro Andres Aranda Gutierrez; +Cc: emacs-devel

> From: Pedro Andres Aranda Gutierrez <paaguti@gmail.com>
> Date: Sat, 31 Dec 2022 07:59:18 +0100
> 
> Philip Kaludercic <philipk@posteo.net> writes:
> 
> > My main worry with these changes, along with the popularity of LSP is
> > that while they are technological improvements, they all happen at the
> > deterioration of Emacs' introspectability, increasing the effort it
> > takes for the user to make changes.  IIUC you can't reload a .el file or
> > just a singular expression if you want to change how completion via
> > Eglot or how imenu works via Tree Sitter.  A simple hack becomes a
> > weekend project.  This is not an unconditional good.
> 
> That's a very good point. My .02 cents of experience with eglot/treesit:
> 
> while I'm happy it works now on my multi-OS setup and I can seamlessly switch computers, it took me a lot
> of time to understand and duck-duck-go and set up. On top of that, there are some things I still don't
> completely understand and can't explore on the *scratch* buffer and/or slime. 

Using technology that is implemented outside Emacs inevitably means we
have less transparency in Emacs itself for the related
functionalities.  However, hoping that everything can be implemented
by the Emacs project, and refusing to use external libraries for some
areas for that reason, is an evolutionary dead end for Emacs.  So we
have to do that to some degree where key technologies applicable to
Emacs features are available out there.  The job of the maintainers is
to identify those technologies, weigh their potential contributions to
future Emacs development, and decide whether those contributions
justify their use, with all the disadvantages in transparency that
this will inevitably bring with it.

> And yes, I've also tried tree-sitter for Python on my Linux and it makes me wonder what the real gain is,
> because I'm using the plain python-mode on the other systems and I can't feel a compelling argument to
> switch.

Thank you for your feedback.  This is the reason we decided to keep
these modes separate and make trying the new tree-sitter based modes
as easy as possible.  This is also the kind of user feedback we will
need to collect when Emacs 29 is released, which will help us to
decide how to use tree-sitter based capabilities.  The decisions could
be different for different programming languages.




* Re: Targeting libtreesitter from wisent and other parser generators for emacs
  2022-12-30 16:25                                                         ` Targeting libtreesitter from wisent and other parser generators for emacs Lynn Winebarger
@ 2022-12-31  8:25                                                           ` Eli Zaretskii
  2022-12-31 13:07                                                             ` Lynn Winebarger
  0 siblings, 1 reply; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-31  8:25 UTC (permalink / raw)
  To: Lynn Winebarger
  Cc: stephen_leake, gregory, philipk, casouri, monnier, dgutov,
	theophilusx, emacs-devel

> From: Lynn Winebarger <owinebar@gmail.com>
> Date: Fri, 30 Dec 2022 11:25:15 -0500
> Cc: Gregory Heytings <gregory@heytings.org>, Philip Kaludercic <philipk@posteo.net>, 
> 	Yuan Fu <casouri@gmail.com>, Stefan Monnier <monnier@iro.umontreal.ca>, 
> 	Eli Zaretskii <eliz@gnu.org>, Dmitry Gutov <dgutov@yandex.ru>, Tim Cross <theophilusx@gmail.com>, 
> 	emacs-devel <emacs-devel@gnu.org>
> 
> If I tried to do something quick and dirty with wisent to see if I
> could get something working, can anyone provide some test cases so I
> could tell whether the code produced behaves correctly with
> libtreesitter?

Not sure what you are asking about.  There's a test suite for
treesit.el in the tests directory.  If that is not what you want, we
had past discussions about specific C files that had problems with
tree-sitter (fixed since then), maybe you could use those files?

> I've never actually used tree-sitter, in emacs or
> otherwise, so I just don't know what I would be testing for.

Fontification, indentation, and navigation by defun are the main
features supported by tree-sitter.  In some modes also Imenu.

> Would anyone be willing to supply me with an emacs mode for one of
> these (or similar) that I could use to compare the behavior of
> tree-sitter-cli generated library to the behavior of a wisent/bison
> generated library?

All of the tree-sitter based modes are called SOMETHING-ts-mode, and
they are all called out in NEWS on the emacs-29 branch.  Does this
answer your question?
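
For anyone who wants to try one of these modes, a minimal init-file
sketch might look like the following.  It assumes Emacs 29 or later and
uses Python purely as an example; the remapping is applied only when
Emacs was built with tree-sitter and the Python grammar library can be
loaded.

```elisp
;; Sketch for an init file; Python is only an example language.
(require 'treesit nil t)   ; do not fail on builds that lack treesit.el

(when (and (fboundp 'treesit-available-p)
           (treesit-available-p)                    ; built with tree-sitter?
           (treesit-language-available-p 'python))  ; grammar library found?
  ;; Visit Python files in python-ts-mode instead of python-mode.
  (add-to-list 'major-mode-remap-alist '(python-mode . python-ts-mode)))
```

When either check fails, nothing is changed and python-mode keeps being
used, so the two modes can be compared side by side.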




* Re: Tree-sitter introduction documentation
  2022-12-31  0:03                                       ` Yuan Fu
  2022-12-31  0:25                                         ` Stefan Monnier
@ 2022-12-31  9:24                                         ` Eli Zaretskii
  2022-12-31 22:14                                           ` Yuan Fu
  1 sibling, 1 reply; 138+ messages in thread
From: Eli Zaretskii @ 2022-12-31  9:24 UTC (permalink / raw)
  To: Yuan Fu; +Cc: philipk, monnier, dgutov, theophilusx, emacs-devel

> From: Yuan Fu <casouri@gmail.com>
> Date: Fri, 30 Dec 2022 16:03:42 -0800
> Cc: Philip Kaludercic <philipk@posteo.net>,
>  monnier@iro.umontreal.ca,
>  dgutov@yandex.ru,
>  theophilusx@gmail.com,
>  emacs-devel@gnu.org
> 
> > I don't think treesit-language-source-alist is a good idea, especially
> > if we don't intend populating it, at least not as a user-facing
> > feature.  Instead, the command should ask the user for the relevant
> > values, and offer recording the values on some file that would be read
> > next time the user wants to install an updated library.
> 
> I consider this a fallback method for installing language grammars. Distros might not end up bundling language grammars for us, and even if they do, they can’t cover every grammar, so some users would end up needing to install some grammars themselves. If we don’t include this feature, someone will definitely write something like this and make it a third-party package (indeed, someone already has). So we might as well have it in Emacs and do it right.
> 
> This is the use case that I had in mind when writing this function: some major mode xxx-mode requires language grammar for xxx, so it has the following instruction in its readme:
> 
> Add installation recipe of tree-sitter-xxx to your config, and run treesit-install-language-grammar:
> 
> (add-to-list 'treesit-language-source-alist
>              '(xxx "https://github.com/xxx/tree-sitter-xxx.git"))

This is a user command, so it must comply with some minimal level of
usefulness and user-friendliness.  Right now, if you invoke the
command after just loading treesit.el, you will be stuck at the first
prompt, since Emacs says "No match" whatever language you try to type
at the front.

Why not allow the user to specify all the necessary data needed for a
language, if treesit-language-source-alist lacks those details for
that language?  You can then update the data structure with those
details.  Wouldn't this be much better and user-friendlier than asking
users to read the readme and fill a data structure (which some of them
could fill wrongly, not being Lisp programmers) in advance?
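
A minimal sketch of that idea, assuming Emacs 29's treesit.el: the
helper my/treesit-install-with-prompt is hypothetical (it is not part
of treesit.el) and only illustrates asking for the missing data before
calling the real command.  It records just the URL; the optional
revision and source-directory fields of a recipe are left out.

```elisp
(require 'treesit)

(defun my/treesit-install-with-prompt (lang)
  "Install the grammar for LANG, asking for its Git URL if no recipe exists.
Hypothetical helper for illustration; not part of treesit.el."
  (interactive (list (intern (read-string "Language: "))))
  (unless (assq lang treesit-language-source-alist)
    ;; No recipe yet: ask the user, then record the answer so that a
    ;; later update in this session can reuse it.
    (let ((url (read-string (format "Git URL for the %s grammar: " lang))))
      (add-to-list 'treesit-language-source-alist (list lang url))))
  (treesit-install-language-grammar lang))
```

The recorded recipe lives only in the current session; saving it to a
file for the next session, as suggested above, is not shown here.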

> > This command should also be mentioned in NEWS, where we describe how
> > to install the grammar libraries.
> 
> I’ll do that if we decide this function is desirable and good.

?? It's already on the branch, so I think we are past that point?




* Re: Targeting libtreesitter from wisent and other parser generators for emacs
  2022-12-31  8:25                                                           ` Eli Zaretskii
@ 2022-12-31 13:07                                                             ` Lynn Winebarger
  0 siblings, 0 replies; 138+ messages in thread
From: Lynn Winebarger @ 2022-12-31 13:07 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: stephen_leake, gregory, philipk, casouri, monnier, dgutov,
	theophilusx, emacs-devel

On Sat, Dec 31, 2022 at 3:25 AM Eli Zaretskii <eliz@gnu.org> wrote:
> > Would anyone be willing to supply me with an emacs mode for one of
> > these (or similar) that I could use to compare the behavior of
> > tree-sitter-cli generated library to the behavior of a wisent/bison
> > generated library?
>
> All of the tree-sitter based modes are called SOMETHING-ts-mode, and
> they are all called out in NEWS on the emacs-29 branch.  Does this
> answer your question?

Not really.   Since I am not currently able to provide a clear
assignment to the FSF, I'm going to start a fork on github with my
experimental work until I can make a clearly unencumbered assignment.
Once I have something that produces a usable grammar module, maybe
someone will throw some tests my way in the form of issues.

Lynn




* Re: Tree-sitter introduction documentation
  2022-12-31  9:24                                         ` Eli Zaretskii
@ 2022-12-31 22:14                                           ` Yuan Fu
  2023-01-01  1:12                                             ` Yuan Fu
  0 siblings, 1 reply; 138+ messages in thread
From: Yuan Fu @ 2022-12-31 22:14 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: Philip Kaludercic, Stefan Monnier, Dmitry Gutov, Tim Cross,
	emacs-devel



> On Dec 31, 2022, at 1:24 AM, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Yuan Fu <casouri@gmail.com>
>> Date: Fri, 30 Dec 2022 16:03:42 -0800
>> Cc: Philip Kaludercic <philipk@posteo.net>,
>> monnier@iro.umontreal.ca,
>> dgutov@yandex.ru,
>> theophilusx@gmail.com,
>> emacs-devel@gnu.org
>> 
>>> I don't think treesit-language-source-alist is a good idea, especially
>>> if we don't intend populating it, at least not as a user-facing
>>> feature.  Instead, the command should ask the user for the relevant
>>> values, and offer recording the values on some file that would be read
>>> next time the user wants to install an updated library.
>> 
>> I consider this a fallback method for installing language grammars. Distros might not end up bundling language grammars for us, and even if they do, they can’t cover every grammar, so some users would end up needing to install some grammars themselves. If we don’t include this feature, someone will definitely write something like this and make it a third-party package (indeed, someone already has). So we might as well have it in Emacs and do it right.
>> 
>> This is the use case that I had in mind when writing this function: some major mode xxx-mode requires language grammar for xxx, so it has the following instruction in its readme:
>> 
>> Add installation recipe of tree-sitter-xxx to your config, and run treesit-install-language-grammar:
>> 
>> (add-to-list 'treesit-language-source-alist
>>             '(xxx "https://github.com/xxx/tree-sitter-xxx.git"))
> 
> This is a user command, so it must comply with some minimal level of
> usefulness and user-friendliness.  Right now, if you invoke the
> command after just loading treesit.el, you will be stuck at the first
> prompt, since Emacs says "No match" whatever language you try to type
> at the front.
> 
> Why not allow the user to specify all the necessary data needed for a
> language, if treesit-language-source-alist lacks those details for
> that language?  You can then update the data structure with those
> details.  Wouldn't this be much better and user-friendlier than asking
> users to read the readme and fill a data structure (which some of them
> could fill wrongly, not being Lisp programmers) in advance?

Makes sense, I can do that.

> 
>>> This command should also be mentioned in NEWS, where we describe how
>>> to install the grammar libraries.
>> 
>> I’ll do that if we decide this function is desirable and good.
> 
> ?? It's already on the branch, so I think we are past that point?

I thought that was my unilateral action which might be rejected, but anyway, I’ll add a NEWS entry :-)

Yuan





* Re: Tree-sitter introduction documentation
  2022-12-31 22:14                                           ` Yuan Fu
@ 2023-01-01  1:12                                             ` Yuan Fu
  0 siblings, 0 replies; 138+ messages in thread
From: Yuan Fu @ 2023-01-01  1:12 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: Philip Kaludercic, Stefan Monnier, Dmitry Gutov, Tim Cross,
	emacs-devel



> On Dec 31, 2022, at 2:14 PM, Yuan Fu <casouri@gmail.com> wrote:
> 
> 
> 
>> On Dec 31, 2022, at 1:24 AM, Eli Zaretskii <eliz@gnu.org> wrote:
>> 
>>> From: Yuan Fu <casouri@gmail.com>
>>> Date: Fri, 30 Dec 2022 16:03:42 -0800
>>> Cc: Philip Kaludercic <philipk@posteo.net>,
>>> monnier@iro.umontreal.ca,
>>> dgutov@yandex.ru,
>>> theophilusx@gmail.com,
>>> emacs-devel@gnu.org
>>> 
>>>> I don't think treesit-language-source-alist is a good idea, especially
>>>> if we don't intend populating it, at least not as a user-facing
>>>> feature.  Instead, the command should ask the user for the relevant
>>>> values, and offer recording the values on some file that would be read
>>>> next time the user wants to install an updated library.
>>> 
>>> I consider this a fallback method for installing language grammars. Distros might not end up bundling language grammars for us, and even if they do, they can’t cover every grammar, so some users would end up needing to install some grammars themselves. If we don’t include this feature, someone will definitely write something like this and make it a third-party package (indeed, someone already has). So we might as well have it in Emacs and do it right.
>>> 
>>> This is the use case that I had in mind when writing this function: some major mode xxx-mode requires language grammar for xxx, so it has the following instruction in its readme:
>>> 
>>> Add installation recipe of tree-sitter-xxx to your config, and run treesit-install-language-grammar:
>>> 
>>> (add-to-list 'treesit-language-source-alist
>>>            '(xxx "https://github.com/xxx/tree-sitter-xxx.git"))
>> 
>> This is a user command, so it must comply with some minimal level of
>> usefulness and user-friendliness.  Right now, if you invoke the
>> command after just loading treesit.el, you will be stuck at the first
>> prompt, since Emacs says "No match" whatever language you try to type
>> at the front.
>> 
>> Why not allow the user to specify all the necessary data needed for a
>> language, if treesit-language-source-alist lacks those details for
>> that language?  You can then update the data structure with those
>> details.  Wouldn't this be much better and user-friendlier than asking
>> users to read the readme and fill a data structure (which some of them
>> could fill wrongly, not being Lisp programmers) in advance?
> 
> Makes sense, I can do that.

Done. Feel free to adjust the wording, etc.





* Re: Tree-sitter introduction documentation
  2022-12-31  0:25                                         ` Stefan Monnier
@ 2023-01-01  1:16                                           ` Yuan Fu
  2023-01-01  6:39                                             ` Eli Zaretskii
  0 siblings, 1 reply; 138+ messages in thread
From: Yuan Fu @ 2023-01-01  1:16 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: Eli Zaretskii, Philip Kaludercic, dgutov, theophilusx,
	emacs-devel

> 
>> I wanted to keep it simple, synchronous, and quiet, and didn’t thought
>> much about it.
> 
> I think we should default to asynchonous operations as much as possible.

If you are talking about the whole command, maybe. Does call-process automatically yield when ran in a make-thread?

Yuan



* Re: Tree-sitter introduction documentation
  2022-12-31  0:12                                           ` Philip Kaludercic
@ 2023-01-01  1:18                                             ` Yuan Fu
  2023-01-02 19:10                                               ` [SPAM UNSURE] " Stephen Leake
  0 siblings, 1 reply; 138+ messages in thread
From: Yuan Fu @ 2023-01-01  1:18 UTC (permalink / raw)
  To: Philip Kaludercic
  Cc: Eli Zaretskii, monnier, dgutov, theophilusx, emacs-devel



> On Dec 30, 2022, at 4:12 PM, Philip Kaludercic <philipk@posteo.net> wrote:
> 
> Yuan Fu <casouri@gmail.com> writes:
> 
>>> On Dec 30, 2022, at 7:54 AM, Philip Kaludercic <philipk@posteo.net> wrote:
>>> 
>>> Eli Zaretskii <eliz@gnu.org> writes:
>>> 
>>>>         (message "Cloning repository")
>>>>         ;; git clone xxx --depth 1 --quiet workdir
>>>>         (treesit--call-process-signal
>>>>          "git" nil t nil "clone" url "--depth" "1" "--quiet"
>>>>          workdir)
>>>> 
>>>> Why "--depth 1"?  This should be a defcustom, and the default should
>>>> be to clone the full repository, IMO.  Also, what about updating the
>>>> library when it is already installed, and the Git repository already
>>>> exists for it?  Or are we going to clone anew each time and then
>>>> remove the repository? that could make its cloning be slow in some
>>>> cases.
>>> 
>>> I have proposed just downloading a tarball.  GitHub provides these for
>>> each tag, and the tree-sitter developers appear to tag versions on a
>>> regular basis.  The file could then be downloaded via url.el instead of
>>> using Git.
>>> 
>>> https://github.com/tree-sitter/tree-sitter-c/archive/refs/tags/v0.20.2.tar.gz
>> 
>> Not all language grammars would bother to make a release[1]. The fallback method had better support as many cases as possible.
>> 
>> [1] https://github.com/elixir-lang/tree-sitter-elixir
> 
> That doesn't have to be a blocker.  We can download a tarball for each
> commit, if an explicit release is missing.
> 
>  https://codeload.github.com/elixir-lang/tree-sitter-elixir/tar.gz/b20eaa75565243c50be5e35e253d8beb58f45d56

The command already requires a C/C++ compiler, so I don’t think requiring Git is too much to ask. The Git URL is much simpler for the user to figure out, IMO.

Yuan



* Re: Tree-sitter introduction documentation
  2022-12-27 19:53                                       ` Dmitry Gutov
@ 2023-01-01  3:03                                         ` Richard Stallman
  0 siblings, 0 replies; 138+ messages in thread
From: Richard Stallman @ 2023-01-01  3:03 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > > and I'm unsure (but
  > > never bothered to find out) whether free replacements even exist

  > Not 100% sure what you mean by that: Node.js uses the MIT/Expat license. 
  > That's Free software.

  > A specific grammar might use some proprietary modules, but that's just 
  > something for us to verify.

Does the Javascript package manager provide a simple way to reject
nonfree dependencies?  If it does not, this could be a real can of
worms.  We should not recommend the use or installation of a grammar
whose build procedure uses nonfree software, and if the package manager
doesn't give us an easy way to check that, we need to vet these grammars
one by one.

-- 
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)






* Re: Tree-sitter introduction documentation
  2022-12-30 11:06                                   ` Yuan Fu
  2022-12-30 11:25                                     ` Philip Kaludercic
  2022-12-30 15:31                                     ` Eli Zaretskii
@ 2023-01-01  3:03                                     ` Richard Stallman
  2023-01-01  6:54                                       ` Eli Zaretskii
  2 siblings, 1 reply; 138+ messages in thread
From: Richard Stallman @ 2023-01-01  3:03 UTC (permalink / raw)
  To: Yuan Fu; +Cc: emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > (setq treesit-language-source-alist
  >       '((python "https://github.com/tree-sitter/tree-sitter-python.git")
  >         (typescript "https://github.com/tree-sitter/tree-sitter-typescript.git"
  >                     "typescript/src" "typescript")))

A GNU package should not load other programs straight off of someone
else's repository.  That is vulnerable to surprise changes, in the
code and in its license.  We can hardly count on Github to make sure
these run without any nonfree software.

There are three proper ways for Emacs to handle its needs.

* To assume that they are already installed.  Normally they would be
part of some other package in your system distro.  This is what we do
with many standard tools and libraries.

With this method, we outsource the vetting of those programs to the
GNU/Linux distro.  If you use a free distro, you can count on it to make
free versions of those programs available to install.

We could in principle handle tree-sitter grammars this way,
but only if and when GNU/Linux distros generally package them.
Is that the case today?

* To include their source code in the Emacs release, and build
them along with the rest of Emacs.

* To tell the user, "Installing these external programs is your
responsibility."  That is the least helpful method, but it's
acceptable.

How does the current tree-sitter code obtain grammars to run?  Does it
download those straight from Github too?  That's not an acceptable
solution -- we should replace it with one of the three above.

-- 
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)






* Re: Tree-sitter introduction documentation
  2023-01-01  1:16                                           ` Yuan Fu
@ 2023-01-01  6:39                                             ` Eli Zaretskii
  2023-01-02  0:31                                               ` Yuan Fu
  0 siblings, 1 reply; 138+ messages in thread
From: Eli Zaretskii @ 2023-01-01  6:39 UTC (permalink / raw)
  To: Yuan Fu; +Cc: monnier, philipk, dgutov, theophilusx, emacs-devel

> From: Yuan Fu <casouri@gmail.com>
> Date: Sat, 31 Dec 2022 17:16:11 -0800
> Cc: Eli Zaretskii <eliz@gnu.org>,
>  Philip Kaludercic <philipk@posteo.net>,
>  dgutov@yandex.ru,
>  theophilusx@gmail.com,
>  emacs-devel@gnu.org
> 
> > 
> >> I wanted to keep it simple, synchronous, and quiet, and didn’t think
> >> much about it.
> > 
> > I think we should default to asynchronous operations as much as possible.
> 
> If you are talking about the whole command, maybe. Does call-process automatically yield when run in a thread created by make-thread?

No, it doesn't.  And it's threads that yield, not Emacs primitives.  A
thread will yield when it calls some API that invokes pselect.




* Re: Tree-sitter introduction documentation
  2023-01-01  3:03                                     ` Richard Stallman
@ 2023-01-01  6:54                                       ` Eli Zaretskii
  2023-01-01 19:14                                         ` Gregory Heytings
  2023-01-03  4:06                                         ` Richard Stallman
  0 siblings, 2 replies; 138+ messages in thread
From: Eli Zaretskii @ 2023-01-01  6:54 UTC (permalink / raw)
  To: rms; +Cc: casouri, emacs-devel

> From: Richard Stallman <rms@gnu.org>
> Cc: emacs-devel@gnu.org
> Date: Sat, 31 Dec 2022 22:03:56 -0500
> 
>   > (setq treesit-language-source-alist
>   >       '((python "https://github.com/tree-sitter/tree-sitter-python.git")
>   >         (typescript "https://github.com/tree-sitter/tree-sitter-typescript.git"
>   >                     "typescript/src" "typescript")))
> 
> A GNU package should not load other programs straight off of someone
> else's repository.  That is vulnerable to surprise changes, in the
> code and in its license.  We can hardly count on Github to make sure
> these run without any nonfree software.

The above was an example of user customizations.  There's no such text
anywhere in Emacs, nor will there be.  Emacs will be distributed with
that variable having the nil value, and users will have to download
and install the grammar libraries by themselves.  The major modes we
distribute all use grammars whose licenses are free, and as long as we
take care to verify this, there should be no problem here.

> There are three proper ways for Emacs to handle its needs.
> 
> * To assume that they are already installed.  Normally they would be
> part of some other package in your system distro.  This is what we do
> with many standard tools and libraries.
> 
> With this method, we outsource the vetting of those programs to the
> GNU/Linux distro.  If you use a free distro, you can count on it to make
> free versions of those programs available to install.
> 
> We could in principle handle tree-sitter grammars this way,
> but only if and when GNU/Linux distros generally package them.
> Is that the case today?
> 
> * To include their source code in the Emacs release, and build
> them along with the rest of Emacs.
> 
> * To tell the user, "Installing these external programs is your
> responsibility."  That is the least helpful method, but it's
> acceptable.

We currently assume either the 1st or the 3rd alternative.  For the
details, see the text in NEWS about installing the grammar libraries.

> How does the current tree-sitter code obtain grammars to run?

It doesn't.  It expects these grammar libraries to be installed "by
other means".  The tree-sitter code just checks that the grammar is
available, and if not, emits a warning to that effect and doesn't
enable the related features.

> Does it download those straight from Github too?

There's a command to download and install a grammar library, but it
leaves it to the user to specify from where to download the library
the user wants.
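
The availability check described above can be sketched as follows,
assuming Emacs 29 or later.  The helper my/treesit-grammar-status is a
hypothetical name used only for illustration; treesit-available-p and
treesit-language-available-p are the existing functions it relies on.

```elisp
(require 'treesit)

(defun my/treesit-grammar-status (lang)
  "Return a string saying whether the grammar for LANG can be loaded.
Hypothetical helper for illustration; not part of treesit.el."
  (if (not (treesit-available-p))
      "Emacs was built without tree-sitter support"
    ;; With a non-nil second argument the result is a cons cell:
    ;; (t . nil) when available, (nil . DATA) when loading failed.
    (let ((result (treesit-language-available-p lang t)))
      (if (car result)
          (format "%s grammar is available" lang)
        (format "%s grammar is unavailable: %S" lang (cdr result))))))
```

Evaluating (my/treesit-grammar-status 'python) then reports either
success or the reason the library could not be loaded, without
downloading anything.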




* Re: Tree-sitter introduction documentation
  2023-01-01  6:54                                       ` Eli Zaretskii
@ 2023-01-01 19:14                                         ` Gregory Heytings
  2023-01-01 20:11                                           ` Eli Zaretskii
  2023-01-03  4:06                                         ` Richard Stallman
  1 sibling, 1 reply; 138+ messages in thread
From: Gregory Heytings @ 2023-01-01 19:14 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rms, casouri, emacs-devel


>> * To assume that they are already installed.  Normally they would be 
>> part of some other package in your system distro.  This is what we do 
>> with many standard tools and libraries.
>>
>> [...]
>>
>> * To include their source code in the Emacs release, and build 
>> them along with the rest of Emacs.
>>
>> * To tell the user, "Installing these external programs is your 
>> responsibility."  That is the least helpful method, but it's 
>> acceptable.
>
> We currently assume either the 1st or the 3rd alternative.
>

ATM these libraries are not packaged by distros, and are bundled by other 
editors that use them.  So in effect, the first option means that we 
offload the responsibility of packaging these libraries not to GNU/Linux 
distros in general, but to those who are responsible for packaging Emacs 
in GNU/Linux distros.





* Re: Tree-sitter introduction documentation
  2023-01-01 19:14                                         ` Gregory Heytings
@ 2023-01-01 20:11                                           ` Eli Zaretskii
  0 siblings, 0 replies; 138+ messages in thread
From: Eli Zaretskii @ 2023-01-01 20:11 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: rms, casouri, emacs-devel

> Date: Sun, 01 Jan 2023 19:14:31 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: rms@gnu.org, casouri@gmail.com, emacs-devel@gnu.org
> 
> 
> >> * To assume that they are already installed.  Normally they would be 
> >> part of some other package in your system distro.  This is what we do 
> >> with many standard tools and libraries.
> >>
> >> [...]
> >>
> >> * To include their source code in the Emacs release, and build 
> >> them along with the rest of Emacs.
> >>
> >> * To tell the user, "Installing these external programs is your 
> >> responsibility."  That is the least helpful method, but it's 
> >> acceptable.
> >
> > We currently assume either the 1st or the 3rd alternative.
> >
> 
> ATM these libraries are not packaged by distros, and are bundled by other 
> editors that use them.

That is inaccurate, AFAIU: someone already reported one distro which
does package the grammar libraries.

Anyway, we didn't yet release Emacs 29.1, and have at least 2 months
to go.  So do the distros.  Thus, it's too early to conclude what will
be the actual situation when Emacs 29 hits the streets.

My point above was that we currently assume at least one of the two
alternatives I mentioned will be used.

> So in effect, the first option means that we offload the
> responsibility of packaging these libraries not to GNU/Linux distros
> in general, but to those who are responsible for packaging Emacs in
> GNU/Linux distros.

Yes, that's the intent.  But I don't think we will mind if the more
general distros will pick this up, and see no reason why it couldn't
happen, the data points with other editors notwithstanding.




* Re: Tree-sitter introduction documentation
  2023-01-01  6:39                                             ` Eli Zaretskii
@ 2023-01-02  0:31                                               ` Yuan Fu
  2023-01-02  0:40                                                 ` Stefan Monnier
  2023-01-02  3:34                                                 ` Eli Zaretskii
  0 siblings, 2 replies; 138+ messages in thread
From: Yuan Fu @ 2023-01-02  0:31 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: Stefan Monnier, Philip Kaludercic, Dmitry Gutov, theophilusx,
	emacs-devel



> On Dec 31, 2022, at 10:39 PM, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Yuan Fu <casouri@gmail.com>
>> Date: Sat, 31 Dec 2022 17:16:11 -0800
>> Cc: Eli Zaretskii <eliz@gnu.org>,
>> Philip Kaludercic <philipk@posteo.net>,
>> dgutov@yandex.ru,
>> theophilusx@gmail.com,
>> emacs-devel@gnu.org
>> 
>>> 
>>>> I wanted to keep it simple, synchronous, and quiet, and didn’t think
>>>> much about it.
>>> 
>>> I think we should default to asynchronous operations as much as possible.
>> 
>> If you are talking about the whole command, maybe. Does call-process automatically yield when run in a thread created by make-thread?
> 
> No, it doesn't.  And it's threads that yield, not Emacs primitives.  A
> thread will yield when it calls some API that invokes pselect.

Thanks. I guess what I’m asking is that if I run the command in make-thread, will the thread yield when it’s waiting for the subprocess? I tried it and it still blocks Emacs so I guess the answer is no. But maybe I’m doing it wrong.

Yuan



* Re: Tree-sitter introduction documentation
  2023-01-02  0:31                                               ` Yuan Fu
@ 2023-01-02  0:40                                                 ` Stefan Monnier
  2023-01-03  6:58                                                   ` Yuan Fu
  2023-01-02  3:34                                                 ` Eli Zaretskii
  1 sibling, 1 reply; 138+ messages in thread
From: Stefan Monnier @ 2023-01-02  0:40 UTC (permalink / raw)
  To: Yuan Fu
  Cc: Eli Zaretskii, Philip Kaludercic, Dmitry Gutov, theophilusx,
	emacs-devel

[-- Attachment #1: Type: text/plain, Size: 979 bytes --]

>>>> I think we should default to asynchronous operations as much as possible.
>>> If you are talking about the whole command, maybe. Does call-process
>>> automatically yield when run in a thread created by make-thread?
>> No, it doesn't.  And it's threads that yield, not Emacs primitives.  A
>> thread will yield when it calls some API that invokes pselect.
> Thanks. I guess what I’m asking is that if I run the command in make-thread,
> will the thread yield when it’s waiting for the subprocess? I tried it and
> it still blocks Emacs so I guess the answer is no. But maybe I’m doing
> it wrong.

No, indeed by "asynchronous" I wasn't thinking of using threads, but
rather using `start-process` and then putting the "rest" into its
sentinel.

It's rather cumbersome to do, admittedly.
I have a work-in-progress library of "promises/futures" for Emacs which
should (eventually) make it easier, but that's not an option for the
`emacs-29` branch :-(


        Stefan
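
The sentinel approach described above can be sketched roughly as
follows.  This is only an illustration, not the code on the branch: the
helper my/run-async is a hypothetical name, and make-process is used in
place of start-process because it accepts the sentinel directly.  The
file needs lexical binding so that the sentinel can capture ON-DONE.

```elisp
;;; -*- lexical-binding: t -*-

(defun my/run-async (program args on-done)
  "Run PROGRAM with ARGS asynchronously; call ON-DONE when it finishes.
ON-DONE receives t for a zero exit status and nil otherwise.
Hypothetical helper for illustration."
  (make-process
   :name program
   :buffer (generate-new-buffer (format " *%s*" program))
   :command (cons program args)
   :sentinel (lambda (proc _event)
               ;; The sentinel runs on every status change, so act
               ;; only once the process has really finished.
               (when (memq (process-status proc) '(exit signal))
                 (funcall on-done
                          (and (eq (process-status proc) 'exit)
                               (zerop (process-exit-status proc))))))))

;; Usage: the "rest" of the work goes into the callback.
;; (my/run-async "git" '("--version")
;;               (lambda (ok) (message "git %s" (if ok "works" "failed"))))
```

Chaining several steps (clone, then compile, then install) means
nesting one such callback inside another, which is the cumbersome part
mentioned above.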

[-- Attachment #2: futur.el --]
[-- Type: application/emacs-lisp, Size: 11104 bytes --]


* Re: Tree-sitter introduction documentation
  2023-01-02  0:31                                               ` Yuan Fu
  2023-01-02  0:40                                                 ` Stefan Monnier
@ 2023-01-02  3:34                                                 ` Eli Zaretskii
  1 sibling, 0 replies; 138+ messages in thread
From: Eli Zaretskii @ 2023-01-02  3:34 UTC (permalink / raw)
  To: Yuan Fu; +Cc: monnier, philipk, dgutov, theophilusx, emacs-devel

> From: Yuan Fu <casouri@gmail.com>
> Date: Sun, 1 Jan 2023 16:31:30 -0800
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>,
>  Philip Kaludercic <philipk@posteo.net>,
>  Dmitry Gutov <dgutov@yandex.ru>,
>  theophilusx@gmail.com,
>  emacs-devel@gnu.org
> 
> 
> 
> > On Dec 31, 2022, at 10:39 PM, Eli Zaretskii <eliz@gnu.org> wrote:
> > 
> >> From: Yuan Fu <casouri@gmail.com>
> >> Date: Sat, 31 Dec 2022 17:16:11 -0800
> >> Cc: Eli Zaretskii <eliz@gnu.org>,
> >> Philip Kaludercic <philipk@posteo.net>,
> >> dgutov@yandex.ru,
> >> theophilusx@gmail.com,
> >> emacs-devel@gnu.org
> >> 
> >>> 
> >>>> I wanted to keep it simple, synchronous, and quiet, and didn’t think
> >>>> much about it.
> >>> 
> >>> I think we should default to asynchronous operations as much as possible.
> >> 
> >> If you are talking about the whole command, maybe. Does call-process automatically yield when run in a thread created by make-thread?
> > 
> > No, it doesn't.  And it's threads that yield, not Emacs primitives.  A
> > thread will yield when it calls some API that invokes pselect.
> 
> Thanks. I guess what I’m asking is that if I run the command in make-thread, will the thread yield when it’s waiting for the subprocess? I tried it and it still blocks Emacs so I guess the answer is no. But maybe I’m doing it wrong.

No, it isn't supposed to yield, because waiting for the process in
this case is done by calling a C library function, not via the Emacs
waiting loop.
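
The alternative raised elsewhere in this thread is to start the
subprocess asynchronously and put the rest of the work into its
sentinel.  A minimal sketch of that pattern follows; the names
my/run-async and my/run-async--sentinel and the my/callback process
property are hypothetical, chosen for illustration, and are not part
of treesit.el.

```elisp
;; Hypothetical helper, for illustration only; not part of treesit.el.
(defun my/run-async--sentinel (proc _event)
  "Hand the exit status and output of PROC to its stored callback."
  ;; Sentinels run from the Emacs waiting loop, so Emacs stays
  ;; responsive while the subprocess is still running.
  (when (memq (process-status proc) '(exit signal))
    (let* ((buffer (process-buffer proc))
           (output (with-current-buffer buffer (buffer-string))))
      (kill-buffer buffer)
      (funcall (process-get proc 'my/callback)
               (process-exit-status proc)
               output))))

(defun my/run-async (program args callback)
  "Run PROGRAM with ARGS (a list of strings) without blocking Emacs.
When PROGRAM finishes, call CALLBACK with two arguments: the exit
status (or signal number) and the collected output as a string."
  (let ((proc (make-process
               :name "my/run-async"
               :buffer (generate-new-buffer " *my/run-async*")
               :command (cons program args)
               :connection-type 'pipe
               :sentinel #'my/run-async--sentinel)))
    ;; Store the callback on the process object so the sentinel does
    ;; not depend on lexical closures.
    (process-put proc 'my/callback callback)
    proc))

;; Usage (assumes git is on PATH):
;; (my/run-async "git" '("--version")
;;               (lambda (status output)
;;                 (message "git exited with %d: %s" status output)))
```

The cost of this style is that everything after the subprocess call
has to move into the callback, which is why it was described as
cumbersome.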




* Re: [SPAM UNSURE] Re: Tree-sitter introduction documentation
  2023-01-01  1:18                                             ` Yuan Fu
@ 2023-01-02 19:10                                               ` Stephen Leake
  0 siblings, 0 replies; 138+ messages in thread
From: Stephen Leake @ 2023-01-02 19:10 UTC (permalink / raw)
  To: Yuan Fu
  Cc: Philip Kaludercic, Eli Zaretskii, monnier, dgutov, theophilusx,
	emacs-devel

Yuan Fu <casouri@gmail.com> writes:

> The command already requires a C/C++ compiler, in that case I don’t
> think Git is too much to ask. 

There are other CM tools that people use; don't rule them out
unnecessarily.

> The git url is much simpler for the user to figure out IMO.

Ok, but it would be good to support a tarball download as well.

-- 
-- Stephe




* Re: Tree-sitter introduction documentation
  2023-01-01  6:54                                       ` Eli Zaretskii
  2023-01-01 19:14                                         ` Gregory Heytings
@ 2023-01-03  4:06                                         ` Richard Stallman
  2023-01-03 12:06                                           ` Eli Zaretskii
  1 sibling, 1 reply; 138+ messages in thread
From: Richard Stallman @ 2023-01-03  4:06 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > > 
  > >   > (setq treesit-language-source-alist
  > >   >       '((python "https://github.com/tree-sitter/tree-sitter-python.git")
  > >   >         (typescript "https://github.com/tree-sitter/tree-sitter-typescript.git"
  > >   >                     "typescript/src" "typescript")))
  > > 

  > The above was an example of user customizations.  There's no such text
  > anywhere in Emacs, nor will there be.  Emacs will be distributed with
  > that variable having the nil value, and users will have to download
  > and install the grammar libraries by themselves.

That is reassuring.  It avoids the specific problem I raised concern
about.

I could not tell that that code was meant only as a possible user
customization; I thought it was a proposed patch.

                                                      The major modes we
  > distribute all use grammars whose licenses are free, and as long as we
  > take care to verify this aspect, there should be no problem here from
  > this aspect.

I agree.

  > There's a command to download and install a grammar library, but it
  > leaves it to the user to specify from where to download the library
  > the user wants.

Could you please tell me more?  Or tell me how to find that source code?

I think there is a significant difference between referring users to a
released tarball of some free program, and referring users to a
development repo of that same program.

In a purely technical sense, any bad thing that is possible with a
repo is possible with a release tarball.  However, if we consider the
social practices of using the two, I think referring users to the repo
lacks proper caution -- we shouldn't recommend it to users in general.

-- 
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)






* Re: Tree-sitter introduction documentation
  2022-12-30 15:31                                     ` Eli Zaretskii
                                                         ` (2 preceding siblings ...)
  2022-12-31  0:44                                       ` Gregory Heytings
@ 2023-01-03  4:08                                       ` Richard Stallman
  2023-01-03 12:14                                         ` Eli Zaretskii
  3 siblings, 1 reply; 138+ messages in thread
From: Richard Stallman @ 2023-01-03  4:08 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: casouri, philipk, monnier, dgutov, theophilusx, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

            > (message "Cloning repository")
            > ;; git clone xxx --depth 1 --quiet workdir
            > (treesit--call-process-signal
            >  "git" nil t nil "clone" url "--depth" "1" "--quiet"
            >  workdir)

This discussion is about details of code I don't know, so I have not
been following it.  However, making it depend specifically on git
raises a concern.  In general, we should not do that.

What job is this part of?  Why propose to make it specifically require
git?  What would it be doing with git, here?


-- 
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)






* Re: Tree-sitter introduction documentation
  2023-01-02  0:40                                                 ` Stefan Monnier
@ 2023-01-03  6:58                                                   ` Yuan Fu
  0 siblings, 0 replies; 138+ messages in thread
From: Yuan Fu @ 2023-01-03  6:58 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: Eli Zaretskii, Philip Kaludercic, Dmitry Gutov, Tim Cross,
	emacs-devel



> On Jan 1, 2023, at 4:40 PM, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> 
>>>>> I think we should default to asynchronous operations as much as possible.
>>>> If you are talking about the whole command, maybe. Does call-process
>>>> automatically yield when run in a make-thread?
>>> No, it doesn't.  And it's threads that yield, not Emacs primitives.  A
>>> thread will yield when it calls some API that invokes pselect.
>> Thanks. I guess what I’m asking is whether, if I run the command in
>> make-thread, the thread will yield while it’s waiting for the subprocess.
>> I tried it and it still blocks Emacs, so I guess the answer is no. But
>> maybe I’m doing it wrong.
> 
> No, indeed by "asynchronous" I wasn't thinking of using threads, but
> rather using `start-process` and then putting the "rest" into its
> sentinel.
> 
> It's rather cumbersome to do, admittedly.
> I have a work-in-progress library of "promises/futures" for Emacs which
> should (eventually) make it easier, but that's not an option for the
> `emacs-29` branch :-(

Oh cool. This would have made some of my packages easier to write :-) I wish there were also something like emacs-async: spawn a separate interpreter process, do some computations, and return the result.

Yuan



* Re: Tree-sitter introduction documentation
  2023-01-03  4:06                                         ` Richard Stallman
@ 2023-01-03 12:06                                           ` Eli Zaretskii
  0 siblings, 0 replies; 138+ messages in thread
From: Eli Zaretskii @ 2023-01-03 12:06 UTC (permalink / raw)
  To: rms; +Cc: emacs-devel

> From: Richard Stallman <rms@gnu.org>
> Cc: emacs-devel@gnu.org
> Date: Mon, 02 Jan 2023 23:06:13 -0500
> 
>   > There's a command to download and install a grammar library, but it
>   > leaves it to the user to specify from where to download the library
>   > the user wants.
> 
> Could you please tell me more?  Or tell me how to find that source code?

If you mean the source code of the Emacs command I mentioned above,
then its name is treesit-install-language-grammar, and it was added to
lisp/treesit.el a few days ago.
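
As a rough illustration of how a user might drive that command from an
init file, here is a sketch.  It reuses the Python grammar URL from the
customization example quoted earlier in this thread; that URL is the
user's own choice, since Emacs itself ships with the variable set to
nil.  The guard on treesit-available-p is an assumption about how one
would keep the snippet harmless in a build without tree-sitter.

```elisp
;; User customization: tell Emacs where to fetch the grammar from.
;; Emacs is distributed with this variable set to nil.
(setq treesit-language-source-alist
      '((python "https://github.com/tree-sitter/tree-sitter-python.git")))

;; Clone and build the grammar only if this Emacs was built with
;; tree-sitter and the grammar is not already loadable.  Building
;; requires git and a C/C++ compiler on the user's machine.
(when (and (fboundp 'treesit-available-p)
           (treesit-available-p)
           (not (treesit-language-available-p 'python)))
  (treesit-install-language-grammar 'python))
```

The same command can also be invoked interactively with
M-x treesit-install-language-grammar.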

> I think there is a significant difference between referring users to a
> released tarball of some free program, and referring users to a
> development repo of that same program.
> 
> In a purely technical sense, any bad thing that is possible with a
> repo is possible with a release tarball.  However, if we consider the
> social practices of using the two, I think referring users to the repo
> lacks proper caution -- we shouldn't recommend it to users in general.

Unfortunately, it looks like the developers of the grammar libraries
make only infrequent releases, and some don't make any releases.  Just
as one example, Alan Mackenzie recently reported a bug in c++-ts-mode
that was apparently caused by using the last released version of the
C++ grammar, from Oct 2021; the bug is already fixed in their Git
repository.  So the ability to download and install the latest
development version seems to be important in this case, at least when
it is known that the last released version has a bug.




* Re: Tree-sitter introduction documentation
  2023-01-03  4:08                                       ` Richard Stallman
@ 2023-01-03 12:14                                         ` Eli Zaretskii
  0 siblings, 0 replies; 138+ messages in thread
From: Eli Zaretskii @ 2023-01-03 12:14 UTC (permalink / raw)
  To: rms; +Cc: casouri, philipk, monnier, dgutov, theophilusx, emacs-devel

> From: Richard Stallman <rms@gnu.org>
> Cc: casouri@gmail.com, philipk@posteo.net, monnier@iro.umontreal.ca,
> 	dgutov@yandex.ru, theophilusx@gmail.com, emacs-devel@gnu.org
> Date: Mon, 02 Jan 2023 23:08:11 -0500
> 
>             > (message "Cloning repository")
>             > ;; git clone xxx --depth 1 --quiet workdir
>             > (treesit--call-process-signal
>             >  "git" nil t nil "clone" url "--depth" "1" "--quiet"
>             >  workdir)
> 
> This discussion is about details of code I don't know, so I have not
> been following it.  However, making it depend specifically on git
> raises a concern.  In general, we should not do that.
> 
> What job is this part of?  Why propose to make it specifically require
> git?  What would it be doing with git, here?

Clone the repository and then build the library from the sources in
that repository.  The build itself uses a C/C++ compiler and linker.

As I explained in my other message, the ability to build the latest
development version of a grammar library seems to be important,
judging by even our relatively short experience with how these
grammar libraries are currently developed and released.




end of thread, other threads:[~2023-01-03 12:14 UTC | newest]

Thread overview: 138+ messages
2022-12-16 14:47 Tree-sitter introduction documentation Perry Smith
2022-12-16 15:06 ` Eli Zaretskii
2022-12-16 15:24   ` João Távora
2022-12-16 15:36     ` Perry Smith
2022-12-16 15:43       ` João Távora
2022-12-16 17:56       ` Philip Kaludercic
2022-12-16 15:38     ` Eli Zaretskii
2022-12-16 15:48       ` João Távora
2022-12-16 15:53         ` Perry Smith
2022-12-16 16:02           ` João Távora
2022-12-18  9:59             ` Eli Zaretskii
2022-12-18 14:07               ` Perry Smith
2022-12-18 17:18                 ` Eli Zaretskii
2022-12-16 16:34         ` Eli Zaretskii
2022-12-17  0:03           ` Tim Cross
2022-12-17  8:42             ` Eli Zaretskii
2022-12-17 10:40               ` João Távora
2022-12-17 11:00                 ` Eli Zaretskii
2022-12-18  0:40               ` Tim Cross
2022-12-16 16:01       ` Manuel Giraud
2022-12-16 16:40         ` Eli Zaretskii
2022-12-16 16:47           ` Perry Smith
2022-12-16 17:21             ` Eli Zaretskii
2022-12-16 15:53     ` Manuel Giraud
2022-12-16 15:56       ` João Távora
2022-12-16 16:39       ` Eli Zaretskii
2022-12-16 17:15         ` Manuel Giraud
2022-12-16 17:23           ` Eli Zaretskii
2022-12-16 20:22             ` Ken Brown
2022-12-17  4:06               ` Tim Cross
2022-12-17 15:42                 ` Stefan Monnier
2022-12-17 17:41                   ` T.V Raman
2022-12-26 22:42                   ` Dmitry Gutov
2022-12-27 12:11                     ` Eli Zaretskii
2022-12-27 12:43                       ` Dmitry Gutov
2022-12-27 13:38                         ` Eli Zaretskii
2022-12-27 14:11                           ` Dmitry Gutov
2022-12-27 14:32                             ` Eli Zaretskii
2022-12-27 16:36                               ` Stefan Monnier
2022-12-27 16:44                                 ` Philip Kaludercic
2022-12-27 17:16                                   ` Eli Zaretskii
2022-12-27 17:20                                     ` Philip Kaludercic
2022-12-27 18:06                                       ` Eli Zaretskii
2022-12-27 17:33                                     ` Stefan Monnier
2022-12-30 11:06                                   ` Yuan Fu
2022-12-30 11:25                                     ` Philip Kaludercic
2022-12-30 11:54                                       ` tomas
2022-12-30 11:59                                         ` Philip Kaludercic
2022-12-30 12:27                                           ` tomas
2022-12-30 12:45                                             ` Philip Kaludercic
2022-12-30 14:26                                             ` Dmitry Gutov
2022-12-30 23:33                                       ` Yuan Fu
2022-12-30 15:31                                     ` Eli Zaretskii
2022-12-30 15:54                                       ` Philip Kaludercic
2022-12-30 16:17                                         ` Eli Zaretskii
2022-12-31  0:06                                         ` Yuan Fu
2022-12-31  0:12                                           ` Philip Kaludercic
2023-01-01  1:18                                             ` Yuan Fu
2023-01-02 19:10                                               ` [SPAM UNSURE] " Stephen Leake
2022-12-31  0:03                                       ` Yuan Fu
2022-12-31  0:25                                         ` Stefan Monnier
2023-01-01  1:16                                           ` Yuan Fu
2023-01-01  6:39                                             ` Eli Zaretskii
2023-01-02  0:31                                               ` Yuan Fu
2023-01-02  0:40                                                 ` Stefan Monnier
2023-01-03  6:58                                                   ` Yuan Fu
2023-01-02  3:34                                                 ` Eli Zaretskii
2022-12-31  9:24                                         ` Eli Zaretskii
2022-12-31 22:14                                           ` Yuan Fu
2023-01-01  1:12                                             ` Yuan Fu
2022-12-31  0:44                                       ` Gregory Heytings
2023-01-03  4:08                                       ` Richard Stallman
2023-01-03 12:14                                         ` Eli Zaretskii
2023-01-01  3:03                                     ` Richard Stallman
2023-01-01  6:54                                       ` Eli Zaretskii
2023-01-01 19:14                                         ` Gregory Heytings
2023-01-01 20:11                                           ` Eli Zaretskii
2023-01-03  4:06                                         ` Richard Stallman
2023-01-03 12:06                                           ` Eli Zaretskii
2022-12-27 17:10                                 ` Eli Zaretskii
2022-12-27 17:31                                   ` Stefan Monnier
2022-12-27 18:08                                     ` Eli Zaretskii
2022-12-27 18:44                                       ` Stefan Monnier
2022-12-27 20:06                                         ` Philip Kaludercic
2022-12-27 21:13                                           ` Stefan Monnier
2022-12-28  2:52                                             ` Yuan Fu
2022-12-28 13:10                                               ` Gregory Heytings
2022-12-28 13:38                                               ` Lynn Winebarger
2022-12-28 14:41                                                 ` Danny Freeman
2022-12-29 11:14                                               ` Philip Kaludercic
2022-12-29 15:27                                                 ` Gregory Heytings
2022-12-29 15:40                                                   ` Lynn Winebarger
2022-12-29 21:50                                                     ` [SPAM UNSURE] " Stephen Leake
2022-12-29 22:37                                                       ` Lynn Winebarger
2022-12-30 14:10                                                       ` Lynn Winebarger
2022-12-30 16:25                                                         ` Targeting libtreesitter from wisent and other parser generators for emacs Lynn Winebarger
2022-12-31  8:25                                                           ` Eli Zaretskii
2022-12-31 13:07                                                             ` Lynn Winebarger
2022-12-29 15:45                                                   ` Tree-sitter introduction documentation Philip Kaludercic
2022-12-29 17:00                                                     ` Gregory Heytings
2022-12-29 17:12                                                       ` Philip Kaludercic
2022-12-29 17:31                                                         ` Gregory Heytings
2022-12-29 18:12                                                           ` Philip Kaludercic
2022-12-29 18:28                                                             ` Eli Zaretskii
2022-12-29 18:44                                                               ` Stefan Monnier
2022-12-29 19:34                                                                 ` Eli Zaretskii
2022-12-29 19:48                                                                   ` Stefan Monnier
2022-12-29 19:59                                                                     ` Eli Zaretskii
2022-12-29 18:32                                                             ` Stefan Monnier
2022-12-29 16:32                                                   ` Eli Zaretskii
2022-12-29 16:53                                                     ` Philip Kaludercic
2022-12-29 16:59                                                       ` Eli Zaretskii
2022-12-29 17:01                                                         ` Philip Kaludercic
2022-12-29 17:03                                                       ` Stefan Monnier
2022-12-29 17:12                                                         ` Gregory Heytings
2022-12-29 17:13                                                         ` Philip Kaludercic
2022-12-29 17:04                                                     ` Gregory Heytings
2022-12-30  1:01                                                 ` Gregory Heytings
2022-12-30 11:00                                                   ` Philip Kaludercic
2022-12-30 12:07                                                     ` Gregory Heytings
2022-12-30 13:10                                                       ` Philip Kaludercic
2022-12-30 15:23                                                         ` Gregory Heytings
2022-12-28 12:56                                         ` Gregory Heytings
2022-12-28 14:41                                           ` Stefan Monnier
2022-12-27 19:53                                       ` Dmitry Gutov
2023-01-01  3:03                                         ` Richard Stallman
2022-12-27 13:51                         ` tomas
2022-12-27 15:58                         ` Stefan Monnier
2022-12-16 17:23   ` Perry Smith
2022-12-16 17:31     ` Eli Zaretskii
2022-12-16 19:08       ` Perry Smith
2022-12-16 19:37         ` Eli Zaretskii
2022-12-16 20:05           ` Perry Smith
  -- strict thread matches above, loose matches on Subject: below --
2022-12-17  4:50 Payas Relekar
2022-12-18  6:32 Pedro Andres Aranda Gutierrez
2022-12-18  8:07 ` Eli Zaretskii
2022-12-18 10:39   ` Pedro Andres Aranda Gutierrez
2022-12-18 11:44     ` Eli Zaretskii
2022-12-31  6:59 Pedro Andres Aranda Gutierrez
2022-12-31  7:47 ` Eli Zaretskii
