unofficial mirror of emacs-devel@gnu.org 
* cc-mode fontification feels random
@ 2021-06-04  3:16 Daniel Colascione
  2021-06-04  6:10 ` Eli Zaretskii
                   ` (2 more replies)
  0 siblings, 3 replies; 206+ messages in thread
From: Daniel Colascione @ 2021-06-04  3:16 UTC (permalink / raw)
  To: emacs-devel


As long as I can remember, cc-mode fontification has felt totally 
random, with actual faces depending on happenstance of previously-parsed 
types, luck of the draw in jit-lock chunking, and so on. Is there any 
*general* way that we can make fontification more robust and consistent?

For years and years now, I've been thinking we just need more 
deterministic parser-and-based mode support, and I still think that, but 
on a realistic level, that doesn't seem to be coming any time soon.

In the meantime, is there any general approach we might be able to use 
to get stuff like the attached to stop happening?








[-- Attachment #2: types.png --]
[-- Type: image/png, Size: 33446 bytes --]

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04  3:16 cc-mode fontification feels random Daniel Colascione
@ 2021-06-04  6:10 ` Eli Zaretskii
  2021-06-04  7:10   ` Theodor Thornhill
  2021-06-04 10:05   ` Daniel Colascione
  2021-06-04 10:42 ` Ergus
  2021-06-04 15:54 ` Alan Mackenzie
  2 siblings, 2 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-04  6:10 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: emacs-devel

> From: Daniel Colascione <dancol@dancol.org>
> Date: Thu, 3 Jun 2021 20:16:53 -0700
> 
> As long as I can remember, cc-mode fontification has felt totally 
> random, with actual faces depending on happenstance of previously-parsed 
> types, luck of the draw in jit-lock chunking, and so on. Is there any 
> *general* way that we can make fontification more robust and consistent?
> 
> For years and years now, I've been thinking we just need more 
> deterministic parser-and-based mode support, and I still think that, but 
> on a realistic level, that doesn't seem to be coming any time soon.

Full agreement.  And not only for C and C-like languages, IMO.

See

  https://lists.gnu.org/archive/html/emacs-devel/2020-01/msg00059.html

See also Eglot and LSP.

Patches more than welcome, I think having this (whether tree-sitter or
some other similar technology) in core is long overdue.




* Re: cc-mode fontification feels random
  2021-06-04  6:10 ` Eli Zaretskii
@ 2021-06-04  7:10   ` Theodor Thornhill
  2021-06-04 10:08     ` João Távora
  2021-06-04 10:25     ` Eli Zaretskii
  2021-06-04 10:05   ` Daniel Colascione
  1 sibling, 2 replies; 206+ messages in thread
From: Theodor Thornhill @ 2021-06-04  7:10 UTC (permalink / raw)
  To: Eli Zaretskii, Daniel Colascione; +Cc: emacs-devel, ubolonton, joaotavora




>> As long as I can remember, cc-mode fontification has felt totally 
>> random, with actual faces depending on happenstance of previously-parsed 
>> types, luck of the draw in jit-lock chunking, and so on. Is there any 
>> *general* way that we can make fontification more robust and consistent?

Yes, tree-sitter.  Ubolonton has made a tremendous package implementing
this for emacs.  It is used in csharp-mode already, with success.  At
least for the fontification.  There are still some kinks to work out in
the indentation part of the mode.

In C#-mode we use tree sitter for:

- Fontification
- Indentation

There is also a normal CC mode version, which is enabled by default.  So
you need to install the third party packages as well as enabling
csharp-tree-sitter-mode.  You can try it out and see if it has some
benefits.  Performance-wise, the tree-sitter mode is leagues above the CC
mode one.  Another benefit is that it is extremely easy to define
these grammars.

> See also Eglot and LSP.

LSP-mode supports semantic fontification from LSP servers, which
usually use tree-sitter.  Examples of this are Rust, F# and others.
Eglot does not yet support this, though I believe there is an issue
somewhere for it.

>
> Patches more than welcome, I think having this (whether tree-sitter or
> some other similar technology) in core is long overdue.

Pinging @Ubolonton and Joao, as they probably know way more than
me about this.

--
Theodor Thornhill




* Re: cc-mode fontification feels random
  2021-06-04  6:10 ` Eli Zaretskii
  2021-06-04  7:10   ` Theodor Thornhill
@ 2021-06-04 10:05   ` Daniel Colascione
  2021-06-04 10:22     ` Eli Zaretskii
  1 sibling, 1 reply; 206+ messages in thread
From: Daniel Colascione @ 2021-06-04 10:05 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel


On 6/3/21 11:10 PM, Eli Zaretskii wrote:
>> From: Daniel Colascione <dancol@dancol.org>
>> Date: Thu, 3 Jun 2021 20:16:53 -0700
>>
>> As long as I can remember, cc-mode fontification has felt totally
>> random, with actual faces depending on happenstance of previously-parsed
>> types, luck of the draw in jit-lock chunking, and so on. Is there any
>> *general* way that we can make fontification more robust and consistent?
>>
>> For years and years now, I've been thinking we just need more
>> deterministic parser-and-based mode support, and I still think that, but
>> on a realistic level, that doesn't seem to be coming any time soon.
> Full agreement.  And not only for C and C-like languages, IMO.
>
> See
>
>    https://lists.gnu.org/archive/html/emacs-devel/2020-01/msg00059.html
>
> See also Eglot and LSP.
>
> Patches more than welcome, I think having this (whether tree-sitter or
> some other similar technology) in core is long overdue.

We could just vendor tree-sitter.





* Re: cc-mode fontification feels random
  2021-06-04  7:10   ` Theodor Thornhill
@ 2021-06-04 10:08     ` João Távora
  2021-06-04 10:39       ` Eli Zaretskii
  2021-06-04 16:43       ` Jim Porter
  2021-06-04 10:25     ` Eli Zaretskii
  1 sibling, 2 replies; 206+ messages in thread
From: João Távora @ 2021-06-04 10:08 UTC (permalink / raw)
  To: Theodor Thornhill
  Cc: Eli Zaretskii, Daniel Colascione, ubolonton, emacs-devel

Theodor Thornhill <theo@thornhill.no> writes:

> Pinging @Ubolonton and Joao, as they probably know way more than
> me about this.

Here are my quick views on this:

- Eglot can add LSP fontification support, that doesn't seem hard.

- However, LSP support for fontification seems like it's potentially
  _less_ efficient than integrating something like tree-sitter as a C
  module in Emacs.  That's because the contents of the buffer and
  fontification results are continually transmitted back and forth via
  pipes and JSON format.

- Moreover, if one wishes 100% out-of-the-box support for LSP (this or
  any other feature), one needs to also distribute a capable server
  program.  For C/C++ this is potentially problematic due to licensing
  issues: the most capable such program for C/C++ is, to the best of my
  limited knowledge, clangd.  There are others, though.

- The past few weeks I've been trying to get back to the long-stated
  goal of integrating Eglot into Emacs proper, as discussed some time
  ago.  The idea is to first let it be an independent extension much
  like it is now, then experiment with integrating its functionality
  directly in major modes, eventually evolving into an out-of-the-box,
  seamless
  "i-dont-even-know-that-LSP-is-being-leveraged-in-the-background"
  experience for documentation, definition-finding, diagnostics, etc.
  And also fontification, of course, but my gut feeling says that
  tree-sitter (or any other integrated parser) approach is more
  efficient and "tighter" for such a basic thing.

João






* Re: cc-mode fontification feels random
  2021-06-04 10:05   ` Daniel Colascione
@ 2021-06-04 10:22     ` Eli Zaretskii
  2021-06-04 10:34       ` João Távora
  2021-06-04 10:41       ` Eli Zaretskii
  0 siblings, 2 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-04 10:22 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: emacs-devel

> From: Daniel Colascione <dancol@dancol.org>
> Date: Fri, 4 Jun 2021 03:05:53 -0700
> Cc: emacs-devel@gnu.org
> 
> We could just vendor tree-sitter.

Sorry, I don't understand what that means.

My problem is that I know of now package that integrates tree-sitter
into Emacs with architecture that makes sense to me.  The ones I saw
all send the entire buffer to tree-sitter using buffer-string (and
failing to encode it), which doesn't scale.




* Re: cc-mode fontification feels random
  2021-06-04  7:10   ` Theodor Thornhill
  2021-06-04 10:08     ` João Távora
@ 2021-06-04 10:25     ` Eli Zaretskii
  1 sibling, 0 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-04 10:25 UTC (permalink / raw)
  To: Theodor Thornhill; +Cc: ubolonton, dancol, joaotavora, emacs-devel

> From: Theodor Thornhill <theo@thornhill.no>
> Cc: emacs-devel@gnu.org, ubolonton@gmail.com, joaotavora@gmail.com
> Date: Fri, 04 Jun 2021 09:10:33 +0200
> 
> >> As long as I can remember, cc-mode fontification has felt totally 
> >> random, with actual faces depending on happenstance of previously-parsed 
> >> types, luck of the draw in jit-lock chunking, and so on. Is there any 
> >> *general* way that we can make fontification more robust and consistent?
> 
> Yes, tree-sitter.  Ubolonton has made a tremendous package implementing
> this for emacs.  It is used in csharp-mode already, with success.  At
> least for the fontification.  There are still some kinks to work out in
> the indentation part of the mode.

Not from my POV, see my other message.

I welcome patches submitted to the project with the goal of
integrating that into Emacs core.  Past discussions indicated to me
that authors of the existing packages are not interested in that
enough to modify the packages according to our suggestions.  Sorry to
be blunt.




* Re: cc-mode fontification feels random
  2021-06-04 10:22     ` Eli Zaretskii
@ 2021-06-04 10:34       ` João Távora
  2021-06-04 10:43         ` Eli Zaretskii
  2021-06-04 18:25         ` Stefan Monnier
  2021-06-04 10:41       ` Eli Zaretskii
  1 sibling, 2 replies; 206+ messages in thread
From: João Távora @ 2021-06-04 10:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Daniel Colascione, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> My problem is that I know of now package that integrates tree-sitter
> into Emacs with architecture that makes sense to me.  The ones I saw
> all send the entire buffer to tree-sitter using buffer-string (and
> failing to encode it), which doesn't scale.

In this matter, the LSP approach may be more efficient, since it
transmits only changes/differences, and should (in principle) handle the
encoding troubles.

But I don't understand what's stopping these tree-sitter C modules (like
[1] and [2]) from having access to the buffer's contents directly and
having the best of both worlds.
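
To make the comparison concrete, here is a minimal sketch (Python,
illustrative only) of sending an LSP-style incremental change instead of
the whole text.  The `range`/`text` shape follows the protocol's
`contentChanges` entries; the helper names `offset_to_position` and
`incremental_change` are hypothetical, and positions are counted in
characters, which matches the protocol's default UTF-16 units only for
ASCII text.

```python
import json


def offset_to_position(text, offset):
    """Convert a character offset into an LSP-style line/character pair."""
    line = text.count("\n", 0, offset)
    line_start = text.rfind("\n", 0, offset) + 1
    return {"line": line, "character": offset - line_start}


def incremental_change(old, new):
    """Describe old -> new as one replaced range (common prefix/suffix trimmed)."""
    limit = min(len(old), len(new))
    prefix = 0
    while prefix < limit and old[prefix] == new[prefix]:
        prefix += 1
    suffix = 0
    while (suffix < limit - prefix
           and old[len(old) - 1 - suffix] == new[len(new) - 1 - suffix]):
        suffix += 1
    return {
        "range": {
            "start": offset_to_position(old, prefix),
            "end": offset_to_position(old, len(old) - suffix),
        },
        "text": new[prefix:len(new) - suffix],
    }


old = "int x;\n" * 1000 + "int y;\n"
new = old.replace("int y", "long y")

change = incremental_change(old, new)
full_sync = {"text": new}  # what a full-document sync would carry

print(len(json.dumps(change)), "bytes vs", len(json.dumps(full_sync)), "bytes")
```

The incremental payload stays small no matter how large the buffer is,
which is the property being contrasted with the buffer-string approach.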

João

[1]: https://github.com/karlotness/tree-sitter.el
[2]: https://github.com/ubolonton/emacs-tree-sitter







* Re: cc-mode fontification feels random
  2021-06-04 10:08     ` João Távora
@ 2021-06-04 10:39       ` Eli Zaretskii
  2021-06-04 10:59         ` Philipp
  2021-06-04 16:43       ` Jim Porter
  1 sibling, 1 reply; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-04 10:39 UTC (permalink / raw)
  To: João Távora; +Cc: ubolonton, dancol, theo, emacs-devel

> From: João Távora <joaotavora@gmail.com>
> Cc: Eli Zaretskii <eliz@gnu.org>,  Daniel Colascione <dancol@dancol.org>,
>   emacs-devel@gnu.org,  ubolonton@gmail.com
> Date: Fri, 04 Jun 2021 11:08:48 +0100
> 
> - However, LSP support for fontification seems like it's potentially
>   _less_ efficient than integrating something like tree-sitter as a C
>   module in Emacs.  That's because the contents of the buffer and
>   fontification results are continually transmitted back and forth via
>   pipes and JSON format.

The communication of buffer contents to these agents/servers is indeed
one aspect of the existing packages (those I had time to look at) that
I personally am unhappy about.  Sending the whole buffer or its large
chunks down the wire as buffer-substring (which requires encoding to
be correct) is non-scalable, especially if it also requires conversion
to JSON.  A core feature cannot work that way, IMO.

Unfortunately, every discussion about the alternatives, at least those
in which I participated, ended with nothing, although I think a much
better solution is possible and even not too hard.




* Re: cc-mode fontification feels random
  2021-06-04 10:22     ` Eli Zaretskii
  2021-06-04 10:34       ` João Távora
@ 2021-06-04 10:41       ` Eli Zaretskii
  1 sibling, 0 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-04 10:41 UTC (permalink / raw)
  To: dancol; +Cc: emacs-devel

> Date: Fri, 04 Jun 2021 13:22:38 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: emacs-devel@gnu.org
> 
> My problem is that I know of now package that integrates tree-sitter
                               ^^^
Sorry, should have been "no".

> into Emacs with architecture that makes sense to me.  The ones I saw
> all send the entire buffer to tree-sitter using buffer-string (and
> failing to encode it), which doesn't scale.




* Re: cc-mode fontification feels random
  2021-06-04  3:16 cc-mode fontification feels random Daniel Colascione
  2021-06-04  6:10 ` Eli Zaretskii
@ 2021-06-04 10:42 ` Ergus
  2021-06-04 15:54 ` Alan Mackenzie
  2 siblings, 0 replies; 206+ messages in thread
From: Ergus @ 2021-06-04 10:42 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: emacs-devel

On Thu, Jun 03, 2021 at 08:16:53PM -0700, Daniel Colascione wrote:

>For years and years now, I've been thinking we just need more 
>deterministic parser-and-based mode support, and I still think that, 
>but on a realistic level, that doesn't seem to be coming any time 
>soon.
>

There is something going on with lsp and lsp-mode I think, but not in
vanilla for sure. And I am not aware of the actual status.




* Re: cc-mode fontification feels random
  2021-06-04 10:34       ` João Távora
@ 2021-06-04 10:43         ` Eli Zaretskii
  2021-06-04 18:25         ` Stefan Monnier
  1 sibling, 0 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-04 10:43 UTC (permalink / raw)
  To: João Távora; +Cc: dancol, emacs-devel

> From: João Távora <joaotavora@gmail.com>
> Cc: Daniel Colascione <dancol@dancol.org>,  emacs-devel@gnu.org
> Date: Fri, 04 Jun 2021 11:34:31 +0100
> 
> But I don't understand what's stopping these tree-sitter C modules (like
> [1] and [2]) from having access to the buffer's contents directly and
> having the best of both worlds.

Exactly.  I proposed that much in past discussions, but no one seemed
to be interested enough to pick up the gauntlet.  I still offer help
in making this happen to anyone who'd like to work on this (and needs
help).

TIA




* Re: cc-mode fontification feels random
  2021-06-04 10:39       ` Eli Zaretskii
@ 2021-06-04 10:59         ` Philipp
  2021-06-04 11:05           ` João Távora
  2021-06-04 11:18           ` Eli Zaretskii
  0 siblings, 2 replies; 206+ messages in thread
From: Philipp @ 2021-06-04 10:59 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: ubolonton, dancol, theo, João Távora, emacs-devel



> Am 04.06.2021 um 12:39 schrieb Eli Zaretskii <eliz@gnu.org>:
> 
>> From: João Távora <joaotavora@gmail.com>
>> Cc: Eli Zaretskii <eliz@gnu.org>,  Daniel Colascione <dancol@dancol.org>,
>>  emacs-devel@gnu.org,  ubolonton@gmail.com
>> Date: Fri, 04 Jun 2021 11:08:48 +0100
>> 
>> - However, LSP support for fontification seems like it's potentially
>>  _less_ efficient than integrating something like tree-sitter as a C
>>  module in Emacs.  That's because the contents of the buffer and
>>  fontification results are continually transmitted back and forth via
>>  pipes and JSON format.
> 
> The communication of buffer contents to these agents/servers is indeed
> one aspect of the existing packages (those I had time to look at) that
> I personally am unhappy about.  Sending the whole buffer or its large
> chunks down the wire as buffer-substring (which requires encoding to
> be correct) is non-scalable, especially if it also requires conversion
> to JSON.

How bad is it actually; are there good numbers on this?
A while ago, I tested this hypothesis by transferring the `buffer-string' of xdisp.c to a Go module.  This goes through a full UTF-8 encoding and makes three copies (first, to create the string object; then, to copy it to the module interface; lastly, to make a Go string out of it), and it still only took a few milliseconds.
Modern CPUs are very good at copying memory, so maybe we're optimizing the wrong thing here.  We definitely should have good benchmarks and profiling data before deciding what to optimize.




* Re: cc-mode fontification feels random
  2021-06-04 10:59         ` Philipp
@ 2021-06-04 11:05           ` João Távora
  2021-06-04 11:22             ` Eli Zaretskii
  2021-06-04 11:18           ` Eli Zaretskii
  1 sibling, 1 reply; 206+ messages in thread
From: João Távora @ 2021-06-04 11:05 UTC (permalink / raw)
  To: Philipp; +Cc: Eli Zaretskii, Daniel Colascione, theo, ubolonton, emacs-devel

On Fri, Jun 4, 2021 at 11:59 AM Philipp <p.stephani2@gmail.com> wrote:

> > Am 04.06.2021 um 12:39 schrieb Eli Zaretskii <eliz@gnu.org>:
> >
> >> From: João Távora <joaotavora@gmail.com>
> >> Cc: Eli Zaretskii <eliz@gnu.org>,  Daniel Colascione <dancol@dancol.org>,
> >>  emacs-devel@gnu.org,  ubolonton@gmail.com
> >> Date: Fri, 04 Jun 2021 11:08:48 +0100
> >>
> >> - However, LSP support for fontification seems like it's potentially
> >>  _less_ efficient than integrating something like tree-sitter as a C
> >>  module in Emacs.  That's because the contents of the buffer and
> >>  fontification results are continually transmitted back and forth via
> >>  pipes and JSON format.
> >
> > The communication of buffer contents to these agents/servers is indeed
> > one aspect of the existing packages (those I had time to look at) that
> > I personally am unhappy about.  Sending the whole buffer or its large
> > chunks down the wire as buffer-substring (which requires encoding to
> > be correct) is non-scalable, especially if it also requires conversion
> > to JSON.
>
> How bad is it actually; are there good numbers on this?

Not from me.  Only gut feeling.  But I have seen latency from servers before.
That just depends on the server and its architecture, I guess.

However, there are reports of enormous latency on the Emacs side when JSON
messages get very long and complex.  Part of this is related simply to JSON
parsing and allocation of lots of Lisp objects.  My hunch is that
fontification of a big and complex buffer would give rise to one of these
big and complex JSON messages.

João




* Re: cc-mode fontification feels random
  2021-06-04 10:59         ` Philipp
  2021-06-04 11:05           ` João Távora
@ 2021-06-04 11:18           ` Eli Zaretskii
  1 sibling, 0 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-04 11:18 UTC (permalink / raw)
  To: Philipp; +Cc: ubolonton, dancol, theo, joaotavora, emacs-devel

> From: Philipp <p.stephani2@gmail.com>
> Date: Fri, 4 Jun 2021 12:59:45 +0200
> Cc: João Távora <joaotavora@gmail.com>,
>  ubolonton@gmail.com,
>  dancol@dancol.org,
>  theo@thornhill.no,
>  emacs-devel@gnu.org
> 
> > The communication of buffer contents to these agents/servers is indeed
> > one aspect of the existing packages (those I had time to look at) that
> > I personally am unhappy about.  Sending the whole buffer or its large
> > chunks down the wire as buffer-substring (which requires encoding to
> > be correct) is non-scalable, especially if it also requires conversion
> > to JSON.
> 
> How bad is it actually; are there good numbers on this?

It doesn't matter to me; we cannot go that way in core.  And there's
no reason, really.

> A while ago, I tested this hypothesis by transferring the `buffer-string' of xdisp.c to a Go module.  This goes through a full UTF-8 encoding and makes three copies (first, to create the string object; then, to copy it to the module interface; lastly, to make a Go string out of it), and it still only took a few milliseconds.
> Modern CPUs are very good at copying memory, so maybe we're optimizing the wrong thing here.  We definitely should have good benchmarks and profiling data before deciding what to optimize.

First, for LSP this is not a memory copy.

Second, buffer-string (or buffer-substring) conses a Lisp string,
which increases memory pressure and GC.  Imagine doing this for many
buffers.  E.g., I have jit-lock-stealth enabled, so Emacs fontifies
buffers in the background whenever it is idle.

And third, why settle for an inferior solution that scales badly, when
a superior one is just around the corner?  I understand why we would
want to compromise if there were no alternatives, but why compromise
up front when a better alternative exists?  It makes no sense to me.




* Re: cc-mode fontification feels random
  2021-06-04 11:05           ` João Távora
@ 2021-06-04 11:22             ` Eli Zaretskii
  2021-06-04 12:44               ` Dmitry Gutov
  2021-06-04 13:46               ` João Távora
  0 siblings, 2 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-04 11:22 UTC (permalink / raw)
  To: João Távora; +Cc: p.stephani2, dancol, theo, ubolonton, emacs-devel

> From: João Távora <joaotavora@gmail.com>
> Date: Fri, 4 Jun 2021 12:05:18 +0100
> Cc: Eli Zaretskii <eliz@gnu.org>, ubolonton@gmail.com, 
> 	Daniel Colascione <dancol@dancol.org>, theo@thornhill.no, emacs-devel <emacs-devel@gnu.org>
> 
> > > The communication of buffer contents to these agents/servers is indeed
> > > one aspect of the existing packages (those I had time to look at) that
> > > I personally am unhappy about.  Sending the whole buffer or its large
> > > chunks down the wire as buffer-substring (which requires encoding to
> > > be correct) is non-scalable, especially if it also requires conversion
> > > to JSON.
> >
> > How bad is it actually; are there good numbers on this?
> 
> Not from me.  Only gut feeling.  But I have seen latency from servers before.
> That just depends on the server and its architecture, I guess.
> 
> However, there are reports of enormous latency on the Emacs side when JSON
> messages get very long and complex.  Part of this is related simply to JSON
> parsing and allocation of lots of Lisp objects.  My hunch is that
> fontification of a big and complex buffer would give rise to one of these
> big and complex JSON messages.

Ask Dmitry about performance problems with native JSON support, and
the effort we invested (a year ago?) into optimizing UTF-8 encoding of
strings, to squeeze every last percent of performance.




* Re: cc-mode fontification feels random
  2021-06-04 11:22             ` Eli Zaretskii
@ 2021-06-04 12:44               ` Dmitry Gutov
  2021-06-04 13:46               ` João Távora
  1 sibling, 0 replies; 206+ messages in thread
From: Dmitry Gutov @ 2021-06-04 12:44 UTC (permalink / raw)
  To: Eli Zaretskii, João Távora
  Cc: p.stephani2, dancol, theo, ubolonton, emacs-devel

On 04.06.2021 14:22, Eli Zaretskii wrote:

> Ask Dmitry about performance problems with native JSON support, and
> the effort we invested (a year ago?) into optimizing UTF-8 encoding of
> strings, to squeeze every last percent of performance.

About a year ago, yes (bug#31138 plus some follow-ups).

With string encoding taken care of, IIUC the current bottleneck is in 
parsing: Lisp object allocation which still has to happen on the current 
thread (some way to use parallel heaps could help with that).

And to get all of the highlightings for the current buffer, we will need 
to parse the response JSON document, probably also fairly large.




* Re: cc-mode fontification feels random
  2021-06-04 11:22             ` Eli Zaretskii
  2021-06-04 12:44               ` Dmitry Gutov
@ 2021-06-04 13:46               ` João Távora
  2021-06-04 14:11                 ` Eli Zaretskii
  1 sibling, 1 reply; 206+ messages in thread
From: João Távora @ 2021-06-04 13:46 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, dancol, theo, ubolonton, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> a big and complex buffer would give rise to one of these big and complex
>> JSON messages.
> Ask Dmitry about performance problems with native JSON support, and
> the effort we invested (a year ago?) into optimizing UTF-8 encoding of
> strings, to squeeze every last percent of performance.

As I remember, the biggest bottleneck was parsing and allocating Lisp
objects.  Commonly, it means parsing a big JSON message even if you're
only interested in a fraction of it (and this happens in LSP when
e.g. some servers decide to serve up huge buckets of diagnostics
unrelated to the current file being edited, for instance).  The json.c
parser is faster, but ultimately borks here, too.  

My idea at the time was to develop a technique to only parse the bits of
JSON we're interested in, which dramatically improved performance.  I
had a prototype for json.el lying around (can't seem to find it) based
on lazy evaluation.  If I remember correctly, Dmitry proposed another
technique based on a "path/selector language", which can also work but
is not quite so elegant IMO.
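
As a rough illustration of the selective-parsing idea (a Python sketch
under simplifying assumptions, not the json.el prototype or the proposed
selector language), the scanner below walks the top-level members of a
JSON object, skips unwanted values without building any objects for them,
and decodes only the requested member.  The names `select_member`,
`skip_value`, `skip_string` and `skip_ws` are hypothetical, and the input
is assumed to be well-formed JSON whose top level is an object.

```python
import json

WS = " \t\r\n"


def skip_ws(text, i):
    while i < len(text) and text[i] in WS:
        i += 1
    return i


def skip_string(text, i):
    """Given text[i] == '"', return the index just past the closing quote."""
    i += 1
    while text[i] != '"':
        i += 2 if text[i] == "\\" else 1  # step over escaped characters
    return i + 1


def skip_value(text, i):
    """Return the index just past the JSON value starting at text[i]."""
    c = text[i]
    if c == '"':
        return skip_string(text, i)
    if c in "{[":
        depth = 0
        while True:
            c = text[i]
            if c == '"':
                i = skip_string(text, i)
                continue
            if c in "{[":
                depth += 1
            elif c in "}]":
                depth -= 1
                if depth == 0:
                    return i + 1
            i += 1
    while i < len(text) and text[i] not in ",}]" + WS:  # number/true/false/null
        i += 1
    return i


def select_member(text, wanted):
    """Decode only the top-level member named `wanted`; None if absent."""
    i = skip_ws(text, 0) + 1  # step past the opening '{'
    while True:
        i = skip_ws(text, i)
        if text[i] == "}":
            return None
        key_end = skip_string(text, i)
        key = json.loads(text[i:key_end])
        i = skip_ws(text, skip_ws(text, key_end) + 1)  # step past ':'
        end = skip_value(text, i)
        if key == wanted:
            return json.loads(text[i:end])
        i = skip_ws(text, end)
        if text[i] == ",":
            i += 1


message = json.dumps({
    "jsonrpc": "2.0",
    "diagnostics": [{"msg": 'tricky "}]" text'}] * 1000,
    "result": {"tokens": [1, 2, 3]},
})
print(select_member(message, "result"))
```

The skipped `diagnostics` array never becomes a list of dictionaries, so
the allocation cost is paid only for the part that is actually used.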

Of course, this is only useful if the starting assumption of much
useless JSON garbage is indeed true.  And I don't get a lot of bug
reports in Eglot about big-and-slow JSON, so it's been off the radar for
a while.

And again, for fontification, this point is probably moot if we're going
to integrate tree-sitter directly with direct access to the buffer
(which just makes sense).

João




* Re: cc-mode fontification feels random
  2021-06-04 13:46               ` João Távora
@ 2021-06-04 14:11                 ` Eli Zaretskii
  0 siblings, 0 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-04 14:11 UTC (permalink / raw)
  To: João Távora; +Cc: p.stephani2, dancol, theo, ubolonton, emacs-devel

> From: João Távora <joaotavora@gmail.com>
> Cc: p.stephani2@gmail.com,  ubolonton@gmail.com,  dancol@dancol.org,
>   theo@thornhill.no,  emacs-devel@gnu.org
> Date: Fri, 04 Jun 2021 14:46:12 +0100
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> a big and complex buffer would give rise to one of these big and complex
> >> JSON messages.
> > Ask Dmitry about performance problems with native JSON support, and
> > the effort we invested (a year ago?) into optimizing UTF-8 encoding of
> > strings, to squeeze every last percent of performance.
> 
> As I remember, the biggest bottleneck was parsing and allocating Lisp
> objects.

But that's exactly the problem: the packages I've seen try to solve
this on the Lisp level, and that just has got to involve consing of
Lisp objects, so there's no way around that problem with this
approach.

By contrast, fast access to buffer text is on the C level, similar to
what we do with regexp search, and doesn't require any Lisp objects as
intermediates.
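
A minimal sketch of what such direct access could look like, written in
Python for brevity.  It assumes a simplified gap buffer (text stored in
one array with a movable gap) and a chunked read callback in the spirit
of the one tree-sitter's C API accepts; the class `GapBuffer` and its
methods are hypothetical and are not the Emacs internals.  The point is
that the consumer receives views onto the existing storage rather than a
freshly built string.

```python
class GapBuffer:
    """Text stored as bytes with a gap, so edits near the gap are cheap."""

    def __init__(self, text, gap_size=16):
        self._data = bytearray(text) + bytearray(gap_size)
        self._gap_start = len(text)
        self._gap_end = len(text) + gap_size

    def __len__(self):
        return len(self._data) - (self._gap_end - self._gap_start)

    def move_gap(self, pos):
        """Move the gap so that it starts at logical position `pos`."""
        gap = self._gap_end - self._gap_start
        if pos < self._gap_start:
            n = self._gap_start - pos
            self._data[self._gap_end - n:self._gap_end] = \
                self._data[pos:self._gap_start]
        elif pos > self._gap_start:
            n = pos - self._gap_start
            self._data[self._gap_start:self._gap_start + n] = \
                self._data[self._gap_end:self._gap_end + n]
        self._gap_start = pos
        self._gap_end = pos + gap

    def read(self, byte_index):
        """Return a zero-copy view of the contiguous text at `byte_index`.

        An empty view signals the end of the text.
        """
        view = memoryview(self._data)
        if byte_index < self._gap_start:
            return view[byte_index:self._gap_start]
        gap = self._gap_end - self._gap_start
        return view[byte_index + gap:]


def read_all(buf):
    """Stand-in for a parser: pull chunks until the callback returns nothing."""
    chunks, i = [], 0
    while True:
        chunk = buf.read(i)
        if len(chunk) == 0:
            return b"".join(chunks)
        chunks.append(bytes(chunk))  # copied here only to check the result
        i += len(chunk)


src = b"int main(void) { return 0; }"
buf = GapBuffer(src)
buf.move_gap(4)  # pretend the user is editing right after "int "
print(read_all(buf) == src)
```

A real parser would consume each chunk in place; `read_all` copies the
chunks only so the round trip can be checked.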

The other problem with the integration of these packages into Emacs
(again, those few packages that I took a good enough look at) is that
they don't plug themselves into the JIT lock mechanism triggered by
redisplay, and instead use all kinds of hooks to put text properties
on buffer text (and turn off font-lock for that to work).  That's
another aspect of IMO poor integration into the Emacs core, probably
again because of the desire to stay away from C and the innards of the
display engine.
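
For readers unfamiliar with the on-demand model being referred to, here
is a small Python sketch of the general idea: the display side asks for
a range, and only chunks in that range that are not yet fontified get
processed; edits mark chunks stale again.  It is loosely modeled on the
concept behind JIT lock, not on its actual implementation, and the class
`LazyFontifier` and its methods are hypothetical.

```python
class LazyFontifier:
    """Fontify fixed-size chunks only when something asks to display them."""

    def __init__(self, buffer_length, chunk_size, fontify_region):
        self.buffer_length = buffer_length
        self.chunk_size = chunk_size
        self.fontify_region = fontify_region  # callback taking (start, end)
        n_chunks = (buffer_length + chunk_size - 1) // chunk_size
        self.done = [False] * n_chunks

    def _chunks(self, start, end):
        """Indices of the chunks overlapping the half-open range [start, end)."""
        if end <= start:
            return range(0)
        return range(start // self.chunk_size,
                     (end - 1) // self.chunk_size + 1)

    def ensure_fontified(self, start, end):
        """Called by the display side before showing [start, end)."""
        for i in self._chunks(start, end):
            if not self.done[i]:
                lo = i * self.chunk_size
                hi = min(lo + self.chunk_size, self.buffer_length)
                self.fontify_region(lo, hi)
                self.done[i] = True

    def invalidate(self, start, end):
        """Called after an edit touching [start, end)."""
        for i in self._chunks(start, end):
            self.done[i] = False


calls = []
fontifier = LazyFontifier(1000, 100, lambda lo, hi: calls.append((lo, hi)))

fontifier.ensure_fontified(150, 320)   # window shows positions 150..320
fontifier.ensure_fontified(150, 320)   # redisplay: nothing left to do
fontifier.invalidate(250, 251)         # user edits one character
fontifier.ensure_fontified(150, 320)   # only the touched chunk is redone
print(calls)
```

A parser-backed fontifier plugged into this kind of interface would do
work proportional to what is displayed, not to the size of the buffer.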

> Commonly, it means parsing a big JSON message even if you're
> only interested in a fraction of it (and this happens in LSP when
> e.g. some servers decide to serve up huge buckets of diagnostics
> unrelated to the current file being edited, for instance).  The json.c
> parser is faster, but ultimately borks here, too.  
> 
> My idea at the time was to develop a technique to only parse the bits of
> JSON we're interested in, which dramatically improved performance.

I think this is a separate issue.  I guess if the percentage of
"garbage" is large, then this will indeed be a win, but it must come
with some overhead (to figure out what is "garbage"), so it isn't
going to produce significant speedup with milder amounts of "garbage".

And this is only relevant if the protocol is based on JSON.

> And again, for fontification, this point is probably moot if we're going
> to integrate tree-sitter directly with direct access to the buffer
> (which just makes sense).

Only if someone does the job.




* Re: cc-mode fontification feels random
  2021-06-04  3:16 cc-mode fontification feels random Daniel Colascione
  2021-06-04  6:10 ` Eli Zaretskii
  2021-06-04 10:42 ` Ergus
@ 2021-06-04 15:54 ` Alan Mackenzie
  2021-06-04 18:30   ` Daniel Colascione
  2021-06-05 20:25   ` Dmitry Gutov
  2 siblings, 2 replies; 206+ messages in thread
From: Alan Mackenzie @ 2021-06-04 15:54 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: emacs-devel

Hello, Daniel.

On Thu, Jun 03, 2021 at 20:16:53 -0700, Daniel Colascione wrote:
> As long as I can remember, cc-mode fontification has felt totally 
> random, .....

Hmmm.  It is anything but totally random.

> ..... with actual faces depending on happenstance of previously-parsed
> types, .....

Whether a type is recognised as such depends on that, yes.  It's hard to
think of a better way without having the resources of a compiler,
particularly for ill-behaved languages like C++.
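
The order dependence can be shown with a toy model (a Python sketch, not
CC Mode's actual algorithm): a fontifier that learns type names from
`typedef` lines it has already scanned gives different results depending
on which chunk it sees first.  The function `fontify_chunk` and the
sample declarations are made up for illustration.

```python
import re

TYPEDEF = re.compile(r"\s*typedef\s+.*\b(\w+)\s*;")


def fontify_chunk(chunk, known_types):
    """Return the identifiers in `chunk` that would get the type face.

    Side effect: typedef names found in the chunk are added to known_types.
    """
    typed = []
    for line in chunk.splitlines():
        m = TYPEDEF.match(line)
        if m:
            known_types.add(m.group(1))
        typed += [w for w in re.findall(r"\w+", line) if w in known_types]
    return typed


decl = "typedef unsigned long my_size;"
use = "my_size n;"

# Declaration scanned first: the later use is recognised as a type.
known = set()
fontify_chunk(decl, known)
in_order = fontify_chunk(use, known)

# Use scanned first (say, a chunk further down is displayed first).
known = set()
out_of_order = fontify_chunk(use, known)

print(in_order, out_of_order)
```

The same line of code is highlighted in one run and not in the other,
which is the kind of inconsistency the original report describes.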

> ..... luck of the draw in jit-lock chunking, .....

That should be a thing of the past, much effort having been put into
eradicating such errors.  That is one of the main reasons for the
relative slowness of CC Mode, as compared with, say, Emacs Lisp Mode.

> ..... and so on.

And so on???

> Is there any *general* way that we can make fontification more robust
> and consistent?

Like other people have said on the thread, rewriting CC Mode to use an
LSP parser.

Less drastically, it would be possible to fix the specific bug you
allude to, by the user making a list of types and configuring CC Mode
with them, rather than attempting to recognise such types.  This feels
as though it would be tedious to use, though.

> For years and years now, I've been thinking we just need more 
> deterministic parser-and-based mode support, and I still think that, but 
> on a realistic level, that doesn't seem to be coming any time soon.

What does "parser-and-based" mean?

> In the meantime, is there any general approach we might be able to use 
> to get stuff like the attached to stop happening?

Probably none that we'd like.  Fontifying types only at their point of
declaration would be one, but I don't think people would want that.  My
impression is that the approach taken by CC Mode, like that of most
language modes in Emacs, has pretty much reached the limits of what's
possible, and it is unreasonable to expect perfect fontification (and
indentation) from languages like C++ in all cases.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: cc-mode fontification feels random
  2021-06-04 10:08     ` João Távora
  2021-06-04 10:39       ` Eli Zaretskii
@ 2021-06-04 16:43       ` Jim Porter
       [not found]         ` <83k0n9l9pv.fsf@gnu.org>
  1 sibling, 1 reply; 206+ messages in thread
From: Jim Porter @ 2021-06-04 16:43 UTC (permalink / raw)
  To: emacs-devel; +Cc: Eli Zaretskii, Daniel Colascione, ubolonton

On 6/4/2021 3:08 AM, João Távora wrote:
> - However, LSP support for fontification seems like it's potentially
>    _less_ efficient than integrating something like tree-sitter as a C
>    module in Emacs.  That's because the contents of the buffer and
>    fontification results are continually transmitted back and forth via
>    pipes and JSON format.

I imagine these potential performance issues would also be exacerbated 
by editing over TRAMP. Currently, the latest development builds of Eglot 
work nicely with TRAMP files, but having to send fontification results 
back to the local Emacs instance could be a problem over slow connections.

Having something built into Emacs (as much as possible) would also have 
the benefit of allowing users to read a properly-fontified source file 
even for languages they haven't installed tools for. For example, I 
might want to read a C# source file occasionally, despite not having a 
C# compiler/LSP server.

- Jim





* Re: cc-mode fontification feels random
  2021-06-04 10:34       ` João Távora
  2021-06-04 10:43         ` Eli Zaretskii
@ 2021-06-04 18:25         ` Stefan Monnier
  2021-06-04 18:36           ` Daniel Colascione
  2021-06-04 19:07           ` Eli Zaretskii
  1 sibling, 2 replies; 206+ messages in thread
From: Stefan Monnier @ 2021-06-04 18:25 UTC (permalink / raw)
  To: João Távora; +Cc: Eli Zaretskii, Daniel Colascione, emacs-devel

> But I don't understand what's stopping these tree-sitter C modules (like
> [1] and [2]) to have access to the buffer's contents directly and have
> the best of both worlds.

I think it's a direct result of them being "modules": the API doesn't
let modules access a buffer's content directly, so it's more efficient to
copy the content via `buffer-substring` and toss it on the other side
than having to use something like `char-after`.


        Stefan





* Re: cc-mode fontification feels random
  2021-06-04 15:54 ` Alan Mackenzie
@ 2021-06-04 18:30   ` Daniel Colascione
  2021-06-06 11:37     ` Alan Mackenzie
  2021-06-05 20:25   ` Dmitry Gutov
  1 sibling, 1 reply; 206+ messages in thread
From: Daniel Colascione @ 2021-06-04 18:30 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

On 6/4/21 8:54 AM, Alan Mackenzie wrote:
>> Is there any *general* way that we can make fontification more robust
>> and consistent?
> Like other people have said on the thread, rewriting CC Mode to use an
> LSP parser.
>
> Less drastically, it would be possible to fix the specific bug you
> allude to, by the user making a list of types and configuring CC Mode
> with them, rather than attempting to recognise such types.  This feels
> as though it would be tedious to use, though.

I understand that cc-mode can't always get it right. It's only 
asymptotically omniscient. :-) Some deficiencies in highlighting are 
bound to happen.

What's striking to me is the inconsistency in the highlighting. None of 
the types in the std::variant declaration in my screenshot is special. 
They're all declared in the same file as the std::variant typedef. So 
why is PrimitiveType fontified while the others aren't?

FWIW, fontification is correct and consistent when I set 
font-lock-support-mode to nil, so this really does look like another 
case of getting unlucky with jit-lock block divisions.

Yes, I'm sure that this particular problem is caused by some bug, and 
with the right repro, we can quickly isolate and fix it. But this kind 
of seemingly-inexplicable inconsistent highlighting has been happening 
for years and years now. There's something fundamental about the way 
cc-mode is written that makes bugs like this keep popping up. Is there 
some internal abstraction we can add, some algorithmic test suite we can 
write, that would make this whole class of bug less likely?

>> For years and years now, I've been thinking we just need more
>> deterministic parser-and-based mode support, and I still think that, but
>> on a realistic level, that doesn't seem to be coming any time soon.
> What does "parser-and-based" mean?

I'd meant to type "parser-and-ast" I think.

>> In the meantime, is there any general approach we might be able to use
>> to get stuff like the attached to stop happening?
> Probably none that we'd like.  Fontifying types only at their point of
> declaration would be one, but I don't think people would want that.  My
> impression is that the approach taken by CC Mode, like that of most
> language modes in Emacs, has pretty much reached the limits of what's
> possible, and it is unreasonable to expect perfect fontification (and
> indentation) from languages like C++ in all cases.
>




* Re: cc-mode fontification feels random
  2021-06-04 18:25         ` Stefan Monnier
@ 2021-06-04 18:36           ` Daniel Colascione
  2021-06-04 19:11             ` Eli Zaretskii
  2021-06-05  0:29             ` Stefan Monnier
  2021-06-04 19:07           ` Eli Zaretskii
  1 sibling, 2 replies; 206+ messages in thread
From: Daniel Colascione @ 2021-06-04 18:36 UTC (permalink / raw)
  To: Stefan Monnier, João Távora; +Cc: Eli Zaretskii, emacs-devel

On 6/4/21 11:25 AM, Stefan Monnier wrote:
>> But I don't understand what's stopping these tree-sitter C modules (like
>> [1] and [2]) to have access to the buffer's contents directly and have
>> the best of both worlds.
> I think it's a direct result of them being "modules": the API doesn't
> let modules access a buffer's content directly, so it's more efficient to
> copy the content via `buffer-substring` and toss it on the other side
> than having to use something like `char-after`.

The problem is more fundamental than that. Internally, each buffer has a 
gap. External tools that operate on char arrays don't expect a gap. 
(They also don't expect to operate on Emacs internal coding, but that's 
another issue.) If we *did* grant direct buffer access via modules, we'd 
at least have to memcpy half (on average) the buffer to close the gap, 
then memcpy half the buffer (on average) to open the gap again when we 
began editing. If we're going to copy anyway, let's just copy via the 
buffer-substring interface. There's no reason that it has to be 
particularly inefficient.

Besides, memory copies are really, really, ridiculously fast. My system 
can cat from /dev/zero to /dev/null at ~18GB/sec. Copying a buffer's 
contents so we can give it to tree-sitter should be no issue at all.
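
A minimal sketch of the trade-off described above, assuming a simplified gap buffer rather than Emacs's real buffer internals. The struct layout and the names gap_buffer, gb_copy_out, and gb_move_gap_to_end are hypothetical. It illustrates that handing an external tool one contiguous char array means either moving the text that follows the gap or copying the requested range out, which takes at most two memcpy calls.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified gap buffer: the text lives in
   data[0 .. gap_start) and data[gap_end .. size); the bytes in
   between are the unused gap.  */
struct gap_buffer {
  char *data;
  size_t size;       /* total allocated bytes, gap included */
  size_t gap_start;  /* first byte of the gap */
  size_t gap_end;    /* first byte after the gap */
};

/* Number of text bytes, i.e. everything except the gap.  */
static size_t gb_text_len(const struct gap_buffer *gb)
{
  return gb->size - (gb->gap_end - gb->gap_start);
}

/* Copy text positions [from, to) into OUT, skipping the gap.
   This is the "copy" option: at most two memcpy calls.  */
static void gb_copy_out(const struct gap_buffer *gb,
                        size_t from, size_t to, char *out)
{
  assert(from <= to && to <= gb_text_len(gb));
  if (from < gb->gap_start) {
    /* Part of the range that lies before the gap.  */
    size_t end = to < gb->gap_start ? to : gb->gap_start;
    memcpy(out, gb->data + from, end - from);
    out += end - from;
    from = end;
  }
  if (from < to) {
    /* Part of the range that lies after the gap.  */
    size_t gap_len = gb->gap_end - gb->gap_start;
    memcpy(out, gb->data + gap_len + from, to - from);
  }
}

/* The "close the gap" option: slide the text after the gap down so
   the whole text is contiguous at data[0 .. text_len).  Returns the
   number of bytes that had to be moved.  */
static size_t gb_move_gap_to_end(struct gap_buffer *gb)
{
  size_t tail = gb->size - gb->gap_end;
  memmove(gb->data + gb->gap_start, gb->data + gb->gap_end, tail);
  gb->gap_start += tail;
  gb->gap_end = gb->size;
  return tail;
}
```

With the gap in the middle of the buffer, gb_move_gap_to_end moves about half of the text, which matches the "half (on average)" estimate above; gb_copy_out touches only the requested range.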





* Re: cc-mode fontification feels random
  2021-06-04 18:25         ` Stefan Monnier
  2021-06-04 18:36           ` Daniel Colascione
@ 2021-06-04 19:07           ` Eli Zaretskii
  2021-06-04 19:26             ` Daniel Colascione
  1 sibling, 1 reply; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-04 19:07 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: dancol, joaotavora, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Eli Zaretskii <eliz@gnu.org>,  Daniel Colascione <dancol@dancol.org>,
>   emacs-devel@gnu.org
> Date: Fri, 04 Jun 2021 14:25:23 -0400
> 
> > But I don't understand what's stopping these tree-sitter C modules (like
> > [1] and [2]) to have access to the buffer's contents directly and have
> > the best of both worlds.
> 
> I think it's a direct result of them being "modules"

In the _real_ integration of those into Emacs, there's no reason for
them to be modules.




* Re: cc-mode fontification feels random
  2021-06-04 18:36           ` Daniel Colascione
@ 2021-06-04 19:11             ` Eli Zaretskii
  2021-06-04 19:16               ` Daniel Colascione
  2021-06-05  0:29             ` Stefan Monnier
  1 sibling, 1 reply; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-04 19:11 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: emacs-devel, monnier, joaotavora

> Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org
> From: Daniel Colascione <dancol@dancol.org>
> Date: Fri, 4 Jun 2021 11:36:05 -0700
> 
> On 6/4/21 11:25 AM, Stefan Monnier wrote:
> >> But I don't understand what's stopping these tree-sitter C modules (like
> >> [1] and [2]) to have access to the buffer's contents directly and have
> >> the best of both worlds.
> > I think it's a direct result of them being "modules": the API doesn't
> > let modules access a buffer's content directly, so it's more efficient to
> > copy the content via `buffer-substring` and toss it on the other side
> > than having to use something like `char-after`.
> 
> The problem is more fundamental than that. Internally, each buffer has a 
> gap. External tools that operate on char arrays don't expect a gap. 
> (They also don't expect to operate on Emacs internal coding, but that's 
> another issue.) If we *did* grant direct buffer access via modules, we'd 
> at least have to memcpy half (on average) the buffer to close the gap, 
> then memcpy half the buffer (on average) to open the gap again when we 
> began editing.

I see no reason for copying, nor for making these tools aware of the
gap.  At least tree-sitter allows the application to provide a
function through which tree-sitter will access the edited text.  It
should be simple to write such a function, because on the C level we
always know where the gap is.

> Besides, memory copies are really, really, ridiculously fast. My system 
> can cat from /dev/zero to /dev/null at ~18GB/sec. Copying a buffer's 
> contents so we can give it to tree-sitter should be no issue at all.

Why copy at all?  All these libraries need is access to buffer text.
We can just give it to them.
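
A rough sketch of such an access function, assuming a simplified gap layout and not Emacs's actual buffer code. It is loosely modeled on the idea of a chunked read callback; the names gap_text, gap_text_read, and count_newlines are hypothetical, and the signature is deliberately simpler than any real parser library's. The callback returns a pointer into the existing storage plus a length, so the consumer walks the text in at most two chunks without an intermediate copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical read-only view of gapped text: the text occupies
   data[0 .. gap_start) and data[gap_end .. size).  */
struct gap_text {
  const char *data;
  size_t size;
  size_t gap_start;
  size_t gap_end;
};

/* Return a pointer to the contiguous run of text starting at text
   position POS and store its length in *LEN.  *LEN is 0 at end of
   text.  Nothing is copied; the pointer aims into T->data.  */
static const char *gap_text_read(const struct gap_text *t,
                                 size_t pos, size_t *len)
{
  assert(t->gap_start <= t->gap_end && t->gap_end <= t->size);
  if (pos < t->gap_start) {
    /* Everything up to the gap is one chunk.  */
    *len = t->gap_start - pos;
    return t->data + pos;
  }
  /* Past the gap: translate the text position to a storage offset.  */
  size_t raw = pos + (t->gap_end - t->gap_start);
  if (raw >= t->size) {
    *len = 0;
    return t->data + t->size;
  }
  *len = t->size - raw;
  return t->data + raw;
}

/* Hypothetical consumer standing in for a parser: it pulls chunks
   through the callback and never sees the gap.  */
static size_t count_newlines(const struct gap_text *t)
{
  size_t pos = 0, count = 0;
  for (;;) {
    size_t len;
    const char *chunk = gap_text_read(t, pos, &len);
    if (len == 0)
      break;
    for (size_t i = 0; i < len; i++)
      count += chunk[i] == '\n';
    pos += len;
  }
  return count;
}
```

Whether a given library can consume text this way depends on its input interface; this only illustrates that the caller, who knows where the gap is, can hide it behind a small function.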




* Re: cc-mode fontification feels random
  2021-06-04 19:11             ` Eli Zaretskii
@ 2021-06-04 19:16               ` Daniel Colascione
  2021-06-04 19:26                 ` Eli Zaretskii
  0 siblings, 1 reply; 206+ messages in thread
From: Daniel Colascione @ 2021-06-04 19:16 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, monnier, joaotavora



On June 4, 2021 12:11:35 PM Eli Zaretskii <eliz@gnu.org> wrote:

>> Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org
>> From: Daniel Colascione <dancol@dancol.org>
>> Date: Fri, 4 Jun 2021 11:36:05 -0700
>>
>> On 6/4/21 11:25 AM, Stefan Monnier wrote:
>>>> But I don't understand what's stopping these tree-sitter C modules (like
>>>> [1] and [2]) to have access to the buffer's contents directly and have
>>>> the best of both worlds.
>>> I think it's a direct result of them being "modules": the API doesn't
>>> let modules access a buffer's content directly, so it's more efficient to
>>> copy the content via `buffer-substring` and toss it on the other side
>>> than having to use something like `char-after`.
>>
>> The problem is more fundamental than that. Internally, each buffer has a
>> gap. External tools that operate on char arrays don't expect a gap.
>> (They also don't expect to operate on Emacs internal coding, but that's
>> another issue.) If we *did* grant direct buffer access via modules, we'd
>> at least have to memcpy half (on average) the buffer to close the gap,
>> then memcpy half the buffer (on average) to open the gap again when we
>> began editing.
>
> I see no reason for copying, nor for making these tools aware of the
> gap.  At least tree-sitter allows the application to provide a
> function through which tree-sitter will access the edited text.  It
> should be simple to write such a function, because on the C level we
> always know where the gap is.

So you propose providing a "char get_buffer_char(size_t POS)" function? 
That *is* copying. If you run that over all values of POS, all you've done 
is make a slow and shitty memcpy.

So you want to amortize the call over several characters? Okay. Now you've 
reinvented buffer-substring.

>
>
>> Besides, memory copies are really, really, ridiculously fast. My system
>> can cat from /dev/zero to /dev/null at ~18GB/sec. Copying a buffer's
>> contents so we can give it to tree-sitter should be no issue at all.
>
> Why copy at all?  All these libraries need is access to buffer text.
> We can just give it to them.

Because any kind of "access" to the buffer that doesn't expose the gap is 
going to be a copy anyway.






* Re: cc-mode fontification feels random
  2021-06-04 19:07           ` Eli Zaretskii
@ 2021-06-04 19:26             ` Daniel Colascione
  2021-06-04 19:32               ` Eli Zaretskii
  0 siblings, 1 reply; 206+ messages in thread
From: Daniel Colascione @ 2021-06-04 19:26 UTC (permalink / raw)
  To: Eli Zaretskii, Stefan Monnier; +Cc: joaotavora, emacs-devel



On June 4, 2021 12:09:32 PM Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Stefan Monnier <monnier@iro.umontreal.ca>
>> Cc: Eli Zaretskii <eliz@gnu.org>,  Daniel Colascione <dancol@dancol.org>,
>> emacs-devel@gnu.org
>> Date: Fri, 04 Jun 2021 14:25:23 -0400
>>
>>> But I don't understand what's stopping these tree-sitter C modules (like
>>> [1] and [2]) to have access to the buffer's contents directly and have
>>> the best of both worlds.
>>
>> I think it's a direct result of them being "modules"
>
> In the _real_ integration of those into Emacs, there's no reason for
> them to be modules.

Eh. There's a benefit to keeping components loosely coupled.







* Re: cc-mode fontification feels random
  2021-06-04 19:16               ` Daniel Colascione
@ 2021-06-04 19:26                 ` Eli Zaretskii
  2021-06-04 19:33                   ` Daniel Colascione
  0 siblings, 1 reply; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-04 19:26 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: emacs-devel, monnier, joaotavora

> From: Daniel Colascione <dancol@dancol.org>
> CC: <monnier@iro.umontreal.ca>, <joaotavora@gmail.com>, <emacs-devel@gnu.org>
> Date: Fri, 04 Jun 2021 12:16:47 -0700
> 
> > I see no reason for copying, nor for making these tools aware of the
> > gap.  At least tree-sitter allows the application to provide a
> > function through which tree-sitter will access the edited text.  It
> > should be simple to write such a function, because on the C level we
> > always know where the gap is.
> 
> So you propose providing a "char get_buffer_char(size_t POS)" function? 
> That *is* copying. If you run that over all values of POS, all you've done 
> is make a slow and shitty memcpy.

What do you think tree-sitter does with the fast copy you hand to it?
Doesn't it walk it one character at a time?

And if you studied the tree-sitter's internals, and it uses
get_buffer_char as a means of copying text into its own buffer, then
perhaps we could ask tree-sitter developers to avoid the copy and use
the text directly.

> So you want to amortize the call over several characters? Okay. Now you've 
> reinvented buffer-substring.

buffer-substring is not just a copy of a chunk of text, it's much
more.  Even if eventually we need to use a memory copy, that'll run
circles around buffer-substring, and will avoid triggering GC.

> Because any kind of "access" to the buffer that doesn't expose the gap is 
> going to be a copy anyway.

The regexp routines aren't.




* Re: cc-mode fontification feels random
  2021-06-04 19:26             ` Daniel Colascione
@ 2021-06-04 19:32               ` Eli Zaretskii
  0 siblings, 0 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-04 19:32 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: emacs-devel, monnier, joaotavora

> From: Daniel Colascione <dancol@dancol.org>
> CC: <joaotavora@gmail.com>, <emacs-devel@gnu.org>
> Date: Fri, 04 Jun 2021 12:26:28 -0700
> 
> >> I think it's a direct result of them being "modules"
> >
> > In the _real_ integration of those into Emacs, there's no reason for
> > them to be modules.
> 
> Eh. There's a benefit to keeping components loosely coupled.

That's an advantage, but we need to weigh it against the
disadvantages.  Maybe eventually we will decide it's worth making that
a module, but my point is that it isn't a restriction we cannot lift.




* Re: cc-mode fontification feels random
  2021-06-04 19:26                 ` Eli Zaretskii
@ 2021-06-04 19:33                   ` Daniel Colascione
  2021-06-04 19:51                     ` Eli Zaretskii
  0 siblings, 1 reply; 206+ messages in thread
From: Daniel Colascione @ 2021-06-04 19:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, monnier, joaotavora



On June 4, 2021 12:26:47 PM Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Daniel Colascione <dancol@dancol.org>
>> CC: <monnier@iro.umontreal.ca>, <joaotavora@gmail.com>, <emacs-devel@gnu.org>
>> Date: Fri, 04 Jun 2021 12:16:47 -0700
>>
>>> I see no reason for copying, nor for making these tools aware of the
>>> gap.  At least tree-sitter allows the application to provide a
>>> function through which tree-sitter will access the edited text.  It
>>> should be simple to write such a function, because on the C level we
>>> always know where the gap is.
>>
>> So you propose providing a "char get_buffer_char(size_t POS)" function?
>> That *is* copying. If you run that over all values of POS, all you've done
>> is make a slow and shitty memcpy.
>
> What do you think tree-sitter does with the fast copy you hand to it?
> Doesn't it walk it one character at a time?
>
> And if you studied the tree-sitter's internals, and it uses
> get_buffer_char as a means of copying text into its own buffer, then
> perhaps we could ask tree-sitter developers to avoid the copy and use
> the text directly.

Teaching TS to use a generic cursor interface would be great.
>
>
>> So you want to amortize the call over several characters? Okay. Now you've
>> reinvented buffer-substring.
>
> buffer-substring is not just a copy of a chunk of text, it's much
> more.

The variant without text properties doesn't do much.

> Even if eventually we need to use a memory copy, that'll run
> circles around buffer-substring, and will avoid triggering GC.

Sure. I'm not opposed to adding an API that's basically a more efficient 
buffer substring for C callers. I'm just pointing out that the idea of 
giving TS "direct access" to a buffer without any copy at all doesn't make 
a lot of sense.


>
>
>> Because any kind of "access" to the buffer that doesn't expose the gap is
>> going to be a copy anyway.
>
> The regexp routines aren't.

The regexp routines have Emacs specific knowledge. My argument doesn't 
apply to code we can customize for Emacs.







* Re: cc-mode fontification feels random
       [not found]         ` <83k0n9l9pv.fsf@gnu.org>
@ 2021-06-04 19:41           ` Jim Porter
  2021-06-04 19:53             ` Eli Zaretskii
  0 siblings, 1 reply; 206+ messages in thread
From: Jim Porter @ 2021-06-04 19:41 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: ubolonton, dancol, theo, joaotavora, emacs-devel

(Note: re-adding emacs-devel here, since I posted through Gmane and
attempted to eliminate dupe messages by posting only to the Gmane
mirror and not mailing the list directly. That was backwards, and I
should have removed the Gmane mirror, or perhaps just ignored the
issue and let the mailing list handle dupes.)

On Fri, Jun 4, 2021 at 12:18 PM Eli Zaretskii <eliz@gnu.org> wrote:
>
> > Cc: Eli Zaretskii <eliz@gnu.org>, Daniel Colascione <dancol@dancol.org>,
> >  ubolonton@gmail.com
> > From: Jim Porter <jporterbugs@gmail.com>
> > Date: Fri, 4 Jun 2021 09:43:26 -0700
> >
> > On 6/4/2021 3:08 AM, João Távora wrote:
> > > - However, LSP support for fontification seems like it's potentially
> > >    _less_ efficient than integrating something like tree-sitter as a C
> > >    module in Emacs.  That's because the contents of the buffer and
> > >    fontification results are continually transmitted back and forth via
> > >    pipes and JSON format.
> >
> > I imagine these potential performance issues would also be exacerbated
> > by editing over TRAMP.
>
> Why?  Fontification is always local, even if the files you edit are on
> a remote host.

The way I understand this particular hypothetical is that Eglot would
be responsible for asking the LSP server for syntax highlighting and
would then do the necessary work to tell Emacs how to fontify the
buffer. Currently, the way Eglot works for remote files is that it
runs the LSP server on the remote host via TRAMP. That works out
nicely right now, but if we wanted to get the syntax highlighting from
the (remote) LSP server to the (local) Emacs instance, that data would
have to go through TRAMP. I'm not sure how much data we're talking
about here, but if there are performance concerns about doing this
locally via pipes, it would be exacerbated by going through a slow
network.

- Jim




* Re: cc-mode fontification feels random
  2021-06-04 19:33                   ` Daniel Colascione
@ 2021-06-04 19:51                     ` Eli Zaretskii
  0 siblings, 0 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-04 19:51 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: emacs-devel, monnier, joaotavora

> From: Daniel Colascione <dancol@dancol.org>
> CC: <monnier@iro.umontreal.ca>, <joaotavora@gmail.com>, <emacs-devel@gnu.org>
> Date: Fri, 04 Jun 2021 12:33:25 -0700
> 
> > What do you think tree-sitter does with the fast copy you hand to it?
> > Doesn't it walk it one character at a time?
> >
> > And if you studied the tree-sitter's internals, and it uses
> > get_buffer_char as a means of copying text into its own buffer, then
> > perhaps we could ask tree-sitter developers to avoid the copy and use
> > the text directly.
> 
> Teaching TS to use a generic cursor interface would be great.

I don't remember if I looked at how it does it now, but are you sure
it doesn't already know how to do that?  Sounds like a natural thing
to me, but maybe I'm missing something.

> > buffer-substring is not just a copy of a chunk of text, it's much
> > more.
> 
> The variant without text properties doesn't do much.

It allocates memory!  For a large buffer (think xdisp.c) that is best
avoided.  I hope if we need to memcpy, we could at least use a pointer
to a buffer allocated by the parser library, so we won't need to.

> > Even if eventually we need to use a memory copy, that'll run
> > circles around buffer-substring, and will avoid triggering GC.
> 
> Sure. I'm not opposed to adding an API that's basically a more efficient 
> buffer substring for C callers. I'm just pointing out that the idea of 
> giving TS "direct access" to a buffer without any copy at all doesn't make 
> a lot of sense.

If it can use that wisely, I don't see why it wouldn't make sense.  If
it cannot, then I agree.  But still, I'd rather not give up from the
get-go and use buffer-substring just because it's there, I'd try
looking for something more scalable and less Lisp-consing.

Also, I hope we could arrange the copying to be driven by the display
engine through the JIT font-lock machinery, rather than sending the
entire buffer or its large parts.

> >> Because any kind of "access" to the buffer that doesn't expose the gap is
> >> going to be a copy anyway.
> >
> > The regexp routines aren't.
> 
> The regexp routines have Emacs specific knowledge.

I mean the way regexp routines use the buffer text as a C string (as 2
C strings, actually).  That doesn't use any Emacs specific knowledge
except the gap, and even the latter is largely solved by the caller.
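
To illustrate the "2 C strings" calling convention, here is a small sketch assuming the caller has already split the text at the gap. The function find_byte_2 is hypothetical and much simpler than a regexp matcher; it only shows how a routine can accept the two halves and report a position as if the text were contiguous.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Search for byte C in the logical text formed by S1 (N1 bytes)
   followed by S2 (N2 bytes).  The caller passes the text before the
   gap as S1 and the text after it as S2, so this routine needs no
   knowledge of the gap.  Returns the position in the logical text,
   or -1 if C does not occur.  */
static ptrdiff_t find_byte_2(const char *s1, size_t n1,
                             const char *s2, size_t n2, char c)
{
  assert((s1 != NULL || n1 == 0) && (s2 != NULL || n2 == 0));

  /* Look in the first segment.  */
  const char *p = n1 ? memchr(s1, c, n1) : NULL;
  if (p)
    return p - s1;

  /* Then in the second, offsetting by the first segment's length.  */
  p = n2 ? memchr(s2, c, n2) : NULL;
  if (p)
    return (ptrdiff_t) n1 + (p - s2);

  return -1;
}
```

A real matcher also has to handle matches that straddle the boundary between the two segments, which a single-byte search never needs to do.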




* Re: cc-mode fontification feels random
  2021-06-04 19:41           ` Jim Porter
@ 2021-06-04 19:53             ` Eli Zaretskii
  2021-06-04 20:05               ` Jim Porter
  2021-06-04 20:14               ` Yuri Khan
  0 siblings, 2 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-04 19:53 UTC (permalink / raw)
  To: Jim Porter; +Cc: ubolonton, dancol, theo, joaotavora, emacs-devel

> From: Jim Porter <jporterbugs@gmail.com>
> Date: Fri, 4 Jun 2021 12:41:56 -0700
> Cc: joaotavora@gmail.com, theo@thornhill.no, dancol@dancol.org, 
> 	ubolonton@gmail.com, emacs-devel@gnu.org
> 
> Currently, the way Eglot works for remote files is that it
> runs the LSP server on the remote host via TRAMP.

Why does it do that?  Does the LSP server have to access the file
itself?  We have all the contents of that file locally in a buffer, so
we could hand it to LSP locally.




* Re: cc-mode fontification feels random
  2021-06-04 19:53             ` Eli Zaretskii
@ 2021-06-04 20:05               ` Jim Porter
  2021-06-04 20:11                 ` Joost Kremers
  2021-06-05  6:41                 ` Eli Zaretskii
  2021-06-04 20:14               ` Yuri Khan
  1 sibling, 2 replies; 206+ messages in thread
From: Jim Porter @ 2021-06-04 20:05 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: ubolonton, Daniel Colascione, Theodor Thornhill,
	João Távora, emacs-devel

On Fri, Jun 4, 2021 at 12:53 PM Eli Zaretskii <eliz@gnu.org> wrote:
>
> > From: Jim Porter <jporterbugs@gmail.com>
> > Date: Fri, 4 Jun 2021 12:41:56 -0700
> > Cc: joaotavora@gmail.com, theo@thornhill.no, dancol@dancol.org,
> >       ubolonton@gmail.com, emacs-devel@gnu.org
> >
> > Currently, the way Eglot works for remote files is that it
> > runs the LSP server on the remote host via TRAMP.
>
> Why does it do that?  Does the LSP server have to access the file
> itself?  We have all the contents of that file locally in a buffer, so
> we could hand it to LSP locally.

I'm not an expert on the internals of LSP servers, but it's my
understanding that for a language server like clangd, it needs access
not just to the current file, but the entire source tree[1]. That
allows for things like completion of member function names of classes
defined in another file, etc. For clangd in particular, it might be
possible to run a local clangd that pulls from a remote index[2], but
I don't know if every LSP server has such capabilities.

Moreover, in my own usage of Eglot, I find it very convenient that it
runs the LSP server remotely. I often find myself editing files remotely over
TRAMP from a local machine with a minimal set of devtools. While I
could install all the LSP servers I need on all the machines I connect
from, it's less effort to rely on the fact that the machine that'll be
doing the compilation has all the devtools I need.

- Jim

[1] As well as instructions about how to *build* the source, contained
in `compile_commands.json'.
[2] https://clangd.llvm.org/remote-index.html




* Re: cc-mode fontification feels random
  2021-06-04 20:05               ` Jim Porter
@ 2021-06-04 20:11                 ` Joost Kremers
  2021-06-05  6:51                   ` Eli Zaretskii
  2021-06-05  6:41                 ` Eli Zaretskii
  1 sibling, 1 reply; 206+ messages in thread
From: Joost Kremers @ 2021-06-04 20:11 UTC (permalink / raw)
  To: emacs-devel


On Fri, Jun 04 2021, Jim Porter wrote:
> On Fri, Jun 4, 2021 at 12:53 PM Eli Zaretskii <eliz@gnu.org> wrote:
>> Why does it do that?  Does the LSP server have to access the file
>> itself?  We have all the contents of that file locally in a buffer, so
>> we could hand it to LSP locally.
>
> I'm not an expert on the internals of LSP servers, but it's my
> understanding that for a language server like clangd, it needs access
> not just to the current file, but the entire source tree[1].

And speaking from my experience with lsp-mode (not eglot) and Python, it needs
access to the entire virtual env so it can provide type information and
completions for built-in Python packages and for 3rd-party packages that you use
in your code.


-- 
Joost Kremers
Life has its moments




* Re: cc-mode fontification feels random
  2021-06-04 19:53             ` Eli Zaretskii
  2021-06-04 20:05               ` Jim Porter
@ 2021-06-04 20:14               ` Yuri Khan
  1 sibling, 0 replies; 206+ messages in thread
From: Yuri Khan @ 2021-06-04 20:14 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: Jim Porter, ubolonton, theo, Emacs developers,
	João Távora, Daniel Colascione

On Sat, 5 Jun 2021 at 02:53, Eli Zaretskii <eliz@gnu.org> wrote:

> > Currently, the way Eglot works for remote files is that it
> > runs the LSP server on the remote host via TRAMP.
>
> Why does it do that?  Does the LSP server have to access the file
> itself?  We have all the contents of that file locally in a buffer, so
> we could hand it to LSP locally.

The contents of the file are not sufficient to parse it. A C file will
#include some headers. A Python program will import some modules. Some
of these (e.g. the standard library) will likely be installed both
locally and remotely, but might be different versions. Some
(first-party dependencies) will be resolvable from the source file.
Some (third-party dependencies) will be installed remotely but not
locally.

A useful pattern is to build a Docker container, mount the source tree
as a volume, install the toolchain and any third-party dependencies
into the container, and run the LSP server in there. This way, these
dependencies do not contaminate the developer’s machine, while still
being available to the LSP server. The container can also run
different versions of compilers, interpreters, etc. than are installed
on the developer’s machine.




* Re: cc-mode fontification feels random
  2021-06-04 18:36           ` Daniel Colascione
  2021-06-04 19:11             ` Eli Zaretskii
@ 2021-06-05  0:29             ` Stefan Monnier
  2021-06-05  6:32               ` Eli Zaretskii
  1 sibling, 1 reply; 206+ messages in thread
From: Stefan Monnier @ 2021-06-05  0:29 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: João Távora, Eli Zaretskii, emacs-devel

> The problem is more fundamental than that. Internally, each buffer has
> a gap. External tools that operate on char arrays don't expect a gap. (They

Yes, there's that as well.

> Besides, memory copies are really, really, ridiculously fast. My system can
>  cat from /dev/zero to /dev/null at ~18GB/sec. Copying a buffer's contents
> so we can give it to tree-sitter should be no issue at all.

Yes, besides the potential difficulty of giving direct access to the
buffer's content, there's the fact that the time needed to make a copy
will be dwarfed by the time needed by tree-sitter to parse it, turn it
into a tree, and for us to process the returned parse tree (unless we
copy a lot more than the part that tree-sitter parses, admittedly, but
presumably we shouldn't need to copy text at which tree-sitter won't
look).


        Stefan





* Re: cc-mode fontification feels random
  2021-06-05  0:29             ` Stefan Monnier
@ 2021-06-05  6:32               ` Eli Zaretskii
  0 siblings, 0 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-05  6:32 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: dancol, joaotavora, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: João Távora <joaotavora@gmail.com>,  Eli
>  Zaretskii <eliz@gnu.org>,
>   emacs-devel@gnu.org
> Date: Fri, 04 Jun 2021 20:29:02 -0400
> 
> > Besides, memory copies are really, really, ridiculously fast. My system can
> >  cat from /dev/zero to /dev/null at ~18GB/sec. Copying a buffer's contents
> > so we can give it to tree-sitter should be no issue at all.
> 
> Yes, beside the potential difficulty of giving direct access to the
> buffer's content, there's the fact that the time needed to make a copy
> will be dwarfed by the time needed by tree-sitter to parse it, turn it
> into a tree, and for us to process the returned parse tree

Are you sure?  Tree-sitter advertises itself as being very fast in
that department.  Do we have any benchmark somewhere showing its
parsing speed?




* Re: cc-mode fontification feels random
  2021-06-04 20:05               ` Jim Porter
  2021-06-04 20:11                 ` Joost Kremers
@ 2021-06-05  6:41                 ` Eli Zaretskii
  2021-06-05  9:32                   ` João Távora
  2021-06-05  9:46                   ` Ergus
  1 sibling, 2 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-05  6:41 UTC (permalink / raw)
  To: Jim Porter; +Cc: ubolonton, dancol, theo, joaotavora, emacs-devel

> From: Jim Porter <jporterbugs@gmail.com>
> Date: Fri, 4 Jun 2021 13:05:40 -0700
> Cc: João Távora <joaotavora@gmail.com>, 
> 	Theodor Thornhill <theo@thornhill.no>, Daniel Colascione <dancol@dancol.org>, ubolonton@gmail.com, 
> 	emacs-devel@gnu.org
> 
> On Fri, Jun 4, 2021 at 12:53 PM Eli Zaretskii <eliz@gnu.org> wrote:
> >
> > > From: Jim Porter <jporterbugs@gmail.com>
> > > Date: Fri, 4 Jun 2021 12:41:56 -0700
> > > Cc: joaotavora@gmail.com, theo@thornhill.no, dancol@dancol.org,
> > >       ubolonton@gmail.com, emacs-devel@gnu.org
> > >
> > > Currently, the way Eglot works for remote files is that it
> > > runs the LSP server on the remote host via TRAMP.
> >
> > Why does it do that?  Does the LSP server have to access the file
> > itself?  We have all the contents of that file locally in a buffer, so
> > we could hand it to LSP locally.
> 
> I'm not an expert on the internals of LSP servers, but it's my
> understanding that for a language server like clangd, it needs access
> not just to the current file, but the entire source tree[1].

I see, thanks.

So is Emacs the only editor using LSP with remote files?  If other
editors support that, how do they solve this problem without incurring
delays?

> Moreover, in my own usage of Eglot, I find it very convenient that it
> runs the LSP server remotely. I often find myself editing files remotely over
> TRAMP from a local machine with a minimal set of devtools. While I
> could install all the LSP servers I need on all the machines I connect
> from, it's less effort to rely on the fact that the machine that'll be
> doing the compilation has all the devtools I need.

That sounds like a use case for running Emacs on the remote machine,
and only having the display on the local machine, like via X
forwarding or similar technology?




* Re: cc-mode fontification feels random
  2021-06-04 20:11                 ` Joost Kremers
@ 2021-06-05  6:51                   ` Eli Zaretskii
  2021-06-05 10:14                     ` Joost Kremers
                                       ` (2 more replies)
  0 siblings, 3 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-05  6:51 UTC (permalink / raw)
  To: Joost Kremers; +Cc: emacs-devel

> From: Joost Kremers <joostkremers@fastmail.fm>
> Date: Fri, 04 Jun 2021 22:11:06 +0200
> 
> > I'm not an expert on the internals of LSP servers, but it's my
> > understanding that for a language server like clangd, it needs access
> > not just to the current file, but the entire source tree[1].
> 
> And speaking from my experience with lsp-mode (not eglot) and Python, it needs
> access to the entire virtual env so it can provide type information and
> completions for built-in Python packages and for 3rd-party packages that you use
> in your code.

That cannot be a mandatory requirement, right?  Because otherwise LSP
wouldn't be able to support editing of an unfinished project, where
not everything is laid out 100% yet.  The user will expect that some
completion cases could be inaccurate when not everything is coded yet,
but the user will NOT expect to see inaccurate "syntax highlighting"
or indentation, nor incorrect "show definition" and "show callers"
results for the code that was already written, and in particular for
the code in the file being edited.

Thus, I'd expect LSP to be able to deal with missing information,
which then means it shouldn't require access to the entire tree as a
prerequisite for useful functionality.

Am I missing something?




* Re: cc-mode fontification feels random
  2021-06-05  6:41                 ` Eli Zaretskii
@ 2021-06-05  9:32                   ` João Távora
  2021-06-05  9:59                     ` Ergus
  2021-06-05 11:25                     ` cc-mode fontification feels random Eli Zaretskii
  2021-06-05  9:46                   ` Ergus
  1 sibling, 2 replies; 206+ messages in thread
From: João Távora @ 2021-06-05  9:32 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Jim Porter, ubolonton, dancol, theo, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> That sounds like a use case for running Emacs on the remote machine,
> and only having the display on the local machine, like via X
> forwarding or similar technology?

Then why use TRAMP at all?  Anyway, this is just an aside, but running
the server remotely and also editing the files via TRAMP is still
popular and predates LSP by many years: Many use the SLIME or SLY Common
Lisp IDEs like that, pretty effectively.  Personally I prefer running an
Emacs on the remote machine, but there's clearly a share of users who
like to keep using one local Emacs for everything.

João




* Re: cc-mode fontification feels random
  2021-06-05  6:41                 ` Eli Zaretskii
  2021-06-05  9:32                   ` João Távora
@ 2021-06-05  9:46                   ` Ergus
  2021-06-05 11:27                     ` Eli Zaretskii
  1 sibling, 1 reply; 206+ messages in thread
From: Ergus @ 2021-06-05  9:46 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: Jim Porter, ubolonton, dancol, theo, joaotavora, emacs-devel

On Sat, Jun 05, 2021 at 09:41:15AM +0300, Eli Zaretskii wrote:
>
>I see, thanks.
>
>So is Emacs the only editor using LSP with remote files?  If other
>editors support that, how do they solve this problem without incurring
>delays?
>
I work with servers all the time and I have tried all kinds of tools for
remote editing.

So far, the only other editor I am aware of that supports completions
for remote files and just works is the Visual Studio [Code]
family. And actually, they use the LSP protocol for that (completion and
indentation).

If I understand more or less how it works in VS it seems like the LSP
server runs locally (because it does not require any remote
modification/installation or so).

They do a kind of local mirror for completion (probably something
similar to sshfs to access all the unmodified files "on demand" and get
the best possible information) and they store a cache of the project in
the local filesystem to avoid recompiling everything the next time they
use the project.

Of course, there are some problems when the remote environment is not
available locally (missing modules, compilers, libraries). But in
general it is easier to install/modify locally than remotely (in our
tramp approach + lsp-mode or eglot; if clangd is not installed in the
remote server the user has nothing at all... and installing clangd in
every single remote system we use is not an option due to time,
permissions or resources.).

It seems that there are some heuristics there too, to reduce the errors
exposed to the user and do the best possible, but in general it works
pretty well, especially for C/C++, JavaScript, Node.js, and Python.

>> Moreover, in my own usage of Eglot, I find it very convenient that it
>> runs the LSP server remotely. I often find myself editing files remotely over
>> TRAMP from a local machine with a minimal set of devtools. While I
>> could install all the LSP servers I need on all the machines I connect
>> from, it's less effort to rely on the fact that the machine that'll be
>> doing the compilation has all the devtools I need.
>
>That sounds like a use case for running Emacs on the remote machine,
>and only having the display on the local machine, like via X
>forwarding or similar technology?
>
Some time ago, one of my first questions in this mailing list was how to
run emacsserver on a remote machine and connect to that with the local
emacsclient. At that time, that was not supported.

The approach has some practical issues anyway.

1) We can't always open ports on the remote server, so in the end we
need to proxy through ssh.

2) Many remote servers (for example login nodes in HPC servers) kill
the running processes if the user disconnects or if the process is in
the background.





* Re: cc-mode fontification feels random
  2021-06-05  9:32                   ` João Távora
@ 2021-06-05  9:59                     ` Ergus
  2021-06-05 11:29                       ` Eli Zaretskii
  2021-06-05 13:59                       ` Remote GUI Emacs really works (was: cc-mode fontification feels random) Óscar Fuentes
  2021-06-05 11:25                     ` cc-mode fontification feels random Eli Zaretskii
  1 sibling, 2 replies; 206+ messages in thread
From: Ergus @ 2021-06-05  9:59 UTC (permalink / raw)
  To: João Távora
  Cc: Eli Zaretskii, Jim Porter, ubolonton, dancol, theo, emacs-devel

On Sat, Jun 05, 2021 at 10:32:12AM +0100, João Távora wrote:
>Eli Zaretskii <eliz@gnu.org> writes:
>
>> That sounds like a use case for running Emacs on the remote machine,
>> and only having the display on the local machine, like via X
>> forwarding or similar technology?
>
>Then why use TRAMP at all?  Anyway, this is just an aside, but running
>the server remotely and also editing the files via TRAMP is still
>popular and predates LSP by many years: Many use the SLIME or SLY Common
>Lisp IDEs like that, pretty effectively.  Personally I prefer running an
>Emacs on the remote machine, but there's clearly a share of users who
>like to keep using one local Emacs for everything.
>
>João
>
Usually running a remote emacs is extremely slow if using gui and
creates all kinds of issues if the connection fails or hangs.

When using tui there are also some issues due to terminfo in the remote
system; because the local TERM is informed to the remote system when ssh
connection starts, but if the remote system does not have terminfo for
that term, then it tries to do the best (use a default). In that case,
for normal uses it just works, even for vim and nano; but it seems like
emacs tries to use more advanced features or characters. Also a
connection hang is very problematic because emacs totally blocks and you
lose your changes. And packages like xclip don't work (as expected).

The other issue is "resources": when using remote systems like a
Raspberry Pi or permission-restricted environments, installing emacs
remotely is not always possible either.









* Re: cc-mode fontification feels random
  2021-06-05  6:51                   ` Eli Zaretskii
@ 2021-06-05 10:14                     ` Joost Kremers
  2021-06-05 11:31                       ` Eli Zaretskii
  2021-06-05 13:23                     ` Stefan Monnier
  2021-06-05 18:46                     ` João Távora
  2 siblings, 1 reply; 206+ messages in thread
From: Joost Kremers @ 2021-06-05 10:14 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel


On Sat, Jun 05 2021, Eli Zaretskii wrote:
> Thus, I'd expect LSP to be able to deal with missing information,
> which then means it shouldn't require access to the entire tree as a
> prerequisite for useful functionality.

Of course the LSP server won't just give up if there is missing information, but
its functionality will be reduced. For me, if my development environment were on
a remote machine and I couldn't (or don't want to) replicate that environment on
my local machine, running the LSP server locally would probably take away much
of the appeal of using one.


-- 
Joost Kremers
Life has its moments




* Re: cc-mode fontification feels random
  2021-06-05  9:32                   ` João Távora
  2021-06-05  9:59                     ` Ergus
@ 2021-06-05 11:25                     ` Eli Zaretskii
  1 sibling, 0 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-05 11:25 UTC (permalink / raw)
  To: João Távora; +Cc: jporterbugs, ubolonton, dancol, theo, emacs-devel

> From: João Távora <joaotavora@gmail.com>
> Cc: Jim Porter <jporterbugs@gmail.com>,  theo@thornhill.no,
>   dancol@dancol.org,  ubolonton@gmail.com,  emacs-devel@gnu.org
> Date: Sat, 05 Jun 2021 10:32:12 +0100
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > That sounds like a use case for running Emacs on the remote machine,
> > and only having the display on the local machine, like via X
> > forwarding or similar technology?
> 
> Then why use TRAMP at all?

Because it could be the other way around: the local machine has the
full development environment available, while the remote one doesn't.




* Re: cc-mode fontification feels random
  2021-06-05  9:46                   ` Ergus
@ 2021-06-05 11:27                     ` Eli Zaretskii
  0 siblings, 0 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-05 11:27 UTC (permalink / raw)
  To: Ergus; +Cc: jporterbugs, ubolonton, theo, emacs-devel, joaotavora, dancol

> Date: Sat, 5 Jun 2021 11:46:39 +0200
> From: Ergus <spacibba@aol.com>
> Cc: Jim Porter <jporterbugs@gmail.com>, ubolonton@gmail.com,
> 	dancol@dancol.org, theo@thornhill.no, joaotavora@gmail.com,
> 	emacs-devel@gnu.org
> 
> >That sounds like a use case for running Emacs on the remote machine,
> >and only having the display on the local machine, like via X
> >forwarding or similar technology?
> >
> Some time ago, one of my first questions in this mailing list was how to
> run emacsserver on a remote machine and connect to that with the local
> emacsclient. At that time, that was not supported.

I didn't mean to suggest using emacsclient, I meant to suggest a
remote display via the X capabilities.




* Re: cc-mode fontification feels random
  2021-06-05  9:59                     ` Ergus
@ 2021-06-05 11:29                       ` Eli Zaretskii
  2021-06-05 11:55                         ` Daniel Colascione
  2021-06-05 12:43                         ` Ergus
  2021-06-05 13:59                       ` Remote GUI Emacs really works (was: cc-mode fontification feels random) Óscar Fuentes
  1 sibling, 2 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-05 11:29 UTC (permalink / raw)
  To: Ergus; +Cc: jporterbugs, ubolonton, theo, emacs-devel, joaotavora, dancol

> Date: Sat, 5 Jun 2021 11:59:04 +0200
> From: Ergus <spacibba@aol.com>
> Cc: Eli Zaretskii <eliz@gnu.org>, Jim Porter <jporterbugs@gmail.com>,
> 	ubolonton@gmail.com, dancol@dancol.org, theo@thornhill.no,
> 	emacs-devel@gnu.org
> 
> Usually running a remote emacs is extremely slow if using gui and
> creates all kinds of issues if the connection fails or hangs.

And using Tramp with bad connections doesn't create any issues?




* Re: cc-mode fontification feels random
  2021-06-05 10:14                     ` Joost Kremers
@ 2021-06-05 11:31                       ` Eli Zaretskii
  2021-06-05 12:12                         ` Joost Kremers
  0 siblings, 1 reply; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-05 11:31 UTC (permalink / raw)
  To: Joost Kremers; +Cc: emacs-devel

> From: Joost Kremers <joostkremers@fastmail.fm>
> Cc: emacs-devel@gnu.org
> Date: Sat, 05 Jun 2021 12:14:42 +0200
> 
> Of course the LSP server won't just give up if there is missing information, but
> its functionality will be reduced. For me, if my development environment were on
> a remote machine and I couldn't (or don't want to) replicate that environment on
> my local machine, running the LSP server locally would probably take away much
> of the appeal of using one.

And the speedup doesn't matter?




* Re: cc-mode fontification feels random
  2021-06-05 11:29                       ` Eli Zaretskii
@ 2021-06-05 11:55                         ` Daniel Colascione
  2021-06-05 12:27                           ` Eli Zaretskii
  2021-06-05 12:43                         ` Ergus
  1 sibling, 1 reply; 206+ messages in thread
From: Daniel Colascione @ 2021-06-05 11:55 UTC (permalink / raw)
  To: Eli Zaretskii, Ergus
  Cc: jporterbugs, ubolonton, theo, joaotavora, emacs-devel

On 6/5/21 4:29 AM, Eli Zaretskii wrote:
>> Date: Sat, 5 Jun 2021 11:59:04 +0200
>> From: Ergus <spacibba@aol.com>
>> Cc: Eli Zaretskii <eliz@gnu.org>, Jim Porter <jporterbugs@gmail.com>,
>> 	ubolonton@gmail.com, dancol@dancol.org, theo@thornhill.no,
>> 	emacs-devel@gnu.org
>>
>> Usually running a remote emacs is extremely slow if using gui and
>> creates all kinds of issues if the connection fails or hangs.
> 
> And using Tramp with bad connections doesn't create any issues?

Fewer than running a remote Emacs: you don't interact with Tramp on each 
keystroke. There are tons of advantages to Tramp; it's a first-class 
feature and it's worth making language server support work properly with it.




* Re: cc-mode fontification feels random
  2021-06-05 11:31                       ` Eli Zaretskii
@ 2021-06-05 12:12                         ` Joost Kremers
  0 siblings, 0 replies; 206+ messages in thread
From: Joost Kremers @ 2021-06-05 12:12 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel


On Sat, Jun 05 2021, Eli Zaretskii wrote:
>> From: Joost Kremers <joostkremers@fastmail.fm>
>> Cc: emacs-devel@gnu.org
>> Date: Sat, 05 Jun 2021 12:14:42 +0200
>> 
>> Of course the LSP server won't just give up if there is missing information,
>> but its functionality will be reduced. For me, if my development environment
>> were on a remote machine and I couldn't (or don't want to) replicate that
>> environment on my local machine, running the LSP server locally would
>> probably take away much of the appeal of using one.
>
> And the speedup doesn't matter?

I guess it would be a trade-off. Last time I had to run my code on a remote
machine I wasn't using lsp-mode yet, so I'm not speaking from experience here. I
just wanted to say that running an LSP server without the full development
environment makes the LSP server less capable, and if all the server does is
provide syntax highlighting and indentation, it might be less of a hassle to use
something else instead (e.g., elpy, or just plain old python-mode).

-- 
Joost Kremers
Life has its moments




* Re: cc-mode fontification feels random
  2021-06-05 11:55                         ` Daniel Colascione
@ 2021-06-05 12:27                           ` Eli Zaretskii
  2021-06-05 17:59                             ` Jim Porter
  0 siblings, 1 reply; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-05 12:27 UTC (permalink / raw)
  To: Daniel Colascione
  Cc: jporterbugs, spacibba, theo, ubolonton, emacs-devel, joaotavora

> Cc: joaotavora@gmail.com, jporterbugs@gmail.com, ubolonton@gmail.com,
>  theo@thornhill.no, emacs-devel@gnu.org
> From: Daniel Colascione <dancol@dancol.org>
> Date: Sat, 5 Jun 2021 04:55:21 -0700
> 
> On 6/5/21 4:29 AM, Eli Zaretskii wrote:
> >> Date: Sat, 5 Jun 2021 11:59:04 +0200
> >> From: Ergus <spacibba@aol.com>
> >> Cc: Eli Zaretskii <eliz@gnu.org>, Jim Porter <jporterbugs@gmail.com>,
> >> 	ubolonton@gmail.com, dancol@dancol.org, theo@thornhill.no,
> >> 	emacs-devel@gnu.org
> >>
> >> Usually running a remote emacs is extremely slow if using gui and
> >> creates all kinds of issues if the connection fails or hangs.
> > 
> > And using Tramp with bad connections doesn't create any issues?
> 
> Fewer than running a remote Emacs: you don't interact with Tramp on each 
> keystroke. There are tons of advantages to Tramp; it's a first-class 
> feature and it's worth making language server support work properly with it.

No argument that we need to support that properly.  This sub-thread
started when someone said it would probably be slow with Tramp in the
loop.




* Re: cc-mode fontification feels random
  2021-06-05 11:29                       ` Eli Zaretskii
  2021-06-05 11:55                         ` Daniel Colascione
@ 2021-06-05 12:43                         ` Ergus
  1 sibling, 0 replies; 206+ messages in thread
From: Ergus @ 2021-06-05 12:43 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: joaotavora, jporterbugs, ubolonton, dancol, theo, emacs-devel

On Sat, Jun 05, 2021 at 02:29:15PM +0300, Eli Zaretskii wrote:
>> Date: Sat, 5 Jun 2021 11:59:04 +0200
>> From: Ergus <spacibba@aol.com>
>> Cc: Eli Zaretskii <eliz@gnu.org>, Jim Porter <jporterbugs@gmail.com>,
>> 	ubolonton@gmail.com, dancol@dancol.org, theo@thornhill.no,
>> 	emacs-devel@gnu.org
>>
>> Usually running a remote emacs is extremely slow if using gui and
>> creates all kinds of issues if the connection fails or hangs.
>
>And using Tramp with bad connections doesn't create any issues?

Yes, but much, much less than forwarding X over ssh or even using emacs on
a tty over ssh.

E.g., Tramp is capable of reconnecting when the connection fails for some
minutes, and doesn't constantly exchange such a huge amount of information
over the network, so poor bandwidth hardly affects it either.




* Re: cc-mode fontification feels random
  2021-06-05  6:51                   ` Eli Zaretskii
  2021-06-05 10:14                     ` Joost Kremers
@ 2021-06-05 13:23                     ` Stefan Monnier
  2021-06-05 17:08                       ` Óscar Fuentes
  2021-06-05 18:46                     ` João Távora
  2 siblings, 1 reply; 206+ messages in thread
From: Stefan Monnier @ 2021-06-05 13:23 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Joost Kremers, emacs-devel

> not everything is laid out 100% yet.  The user will expect that some
> completion cases could be inaccurate when not everything is coded yet,
> but the user will NOT expect to see inaccurate "syntax highlighting"
> or indentation, nor incorrect "show definition" and "show callers"
> results for the code that was already written, and in particular for
> the code in the file being edited.

I think that's where tree-sitter shines, because AFAIK it does not rely
on access to other files.

I think you'd expect a good LSP server to "degrade gracefully" and still
provide good info for indentation and syntax highlighting even if you
only have the one file and all the other files in the project
are missing.


        Stefan





* Remote GUI Emacs really works (was: cc-mode fontification feels random)
  2021-06-05  9:59                     ` Ergus
  2021-06-05 11:29                       ` Eli Zaretskii
@ 2021-06-05 13:59                       ` Óscar Fuentes
  1 sibling, 0 replies; 206+ messages in thread
From: Óscar Fuentes @ 2021-06-05 13:59 UTC (permalink / raw)
  To: emacs-devel

Ergus <spacibba@aol.com> writes:

> Usually running a remote emacs is extremely slow if using gui and
> creates all kinds of issues if the connection fails or hangs.

Use the right method: something based on the NX protocol, like x2go.

Until a year ago, I used that with no issues on ADSL lines with 60KBps
upstream bandwidth. Connection failures are a non-issue: the session is
kept live on the remote machine, you simply reconnect to it and
everything comes back as it was. The same mechanism allows you to
suspend and resume the session at your convenience.

Text-based applications such as Emacs work very well with this setup
over slow networks, just some tens of KBps are enough. As long as you
don't have too much latency (as it is often the case with cellular
networks) the experience is quite good.





* Re: cc-mode fontification feels random
  2021-06-05 13:23                     ` Stefan Monnier
@ 2021-06-05 17:08                       ` Óscar Fuentes
  2021-06-05 17:31                         ` Stefan Monnier
  2021-06-05 17:32                         ` Eli Zaretskii
  0 siblings, 2 replies; 206+ messages in thread
From: Óscar Fuentes @ 2021-06-05 17:08 UTC (permalink / raw)
  To: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> not everything is laid out 100% yet.  The user will expect that some
>> completion cases could be inaccurate when not everything is coded yet,
>> but the user will NOT expect to see inaccurate "syntax highlighting"
>> or indentation, nor incorrect "show definition" and "show callers"
>> results for the code that was already written, and in particular for
>> the code in the file being edited.
>
> I think that's where tree-sitter shines, because AFAIK it does not rely
> on access to other files.

I took a look at tree-sitter and, IIUC, it suffers from the same
limitations as CC mode: it gets the information provided by a parser.

For starters, in C++, being limited to the current file means that it is
unable to determine if Foo::bar is a type, a value or a function when
Foo is defined in a header file.

But most fundamentally, it is unable to determine what
Foo<whatever>::bar is even when it is defined on the current file.
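
The ambiguity can be sketched in a few lines, even with everything in
one file. This is only an illustration; the names `Tag`, `UsesType`
and `uses_value` are hypothetical, and the static member value is
arbitrary. The same spelling `Foo<X>::bar` names a type for one `X`
and a value for another, so classifying it requires instantiating the
template rather than just parsing:

```cpp
#include <type_traits>

// Primary template: here `bar` is a type.
template <typename T>
struct Foo {
    using bar = int;
};

struct Tag {};  // hypothetical tag type used to select the specialization

// Explicit specialization: here `bar` is a value.
template <>
struct Foo<Tag> {
    static constexpr int bar = 42;
};

// Inside a template the programmer has to disambiguate by hand: without
// `typename`, the compiler assumes the dependent name Foo<T>::bar is a value.
template <typename T>
struct UsesType {
    using type = typename Foo<T>::bar;
};

template <typename T>
constexpr int uses_value() {
    return Foo<T>::bar;
}
```

`UsesType<int>` and `uses_value<Tag>()` both compile, while swapping
the arguments does not; a purely syntactic tool sees the same tokens
in both cases.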

If we are going to really modernize Emacs' programming language support
we need to provide more than parser-based syntax highlighting and
indentation. We need smart code completion, code hints, transformations,
etc. That means we need something like LSP. Tree-sitter might be useful
for the languages not yet supported by LSP, though (but, if I got it
right, tree-sitter is implemented in JavaScript, so it requires a JS
engine to work, maybe too much of a dependency for something that
doesn't add that much over what we have now.)

> I think you'd expect a good LSP server to "degrade gracefully" and still
> provide good info for indentation and syntax highlighting even if you
> only have the one file and all the other files in the project
> are missing.

As already mentioned elsewhere on this thread, an LSP server with access
to just the current file is severely handicapped. One thing is to miss
the information about some functions yet-to-be-written and another thing
entirely is to ignore everything not defined on the current file.





* Re: cc-mode fontification feels random
  2021-06-05 17:08                       ` Óscar Fuentes
@ 2021-06-05 17:31                         ` Stefan Monnier
  2021-06-05 17:32                         ` Eli Zaretskii
  1 sibling, 0 replies; 206+ messages in thread
From: Stefan Monnier @ 2021-06-05 17:31 UTC (permalink / raw)
  To: Óscar Fuentes; +Cc: emacs-devel

>> I think that's where tree-sitter shines, because AFAIK it does not rely
>> on access to other files.
>
> I took a look at tree-sitter and, IIUC, it suffers from the same
> limitations as CC mode: it gets the information provided by a parser.

To a large extent that's unavoidable.  AFAIU it should be able to do
a slightly better job in some cases by just trying out all possible
interpretations and only keeping those that make sense (from a purely
syntactic point of view).

> But most fundamentally, it is unable to determine what
> Foo<whatever>::bar is even when it is defined on the current file.

Indeed it's quite possible that there are also cases where tree-sitter
does a worse job than CC-mode, e.g. by not taking into account semantic
information that can be extracted from the current file.

> If we are going to really modernize Emacs' programming language
> support we need to provide more than parser-based syntax highlighting
> and indentation. [...] That means we need something like LSP.

I believe/hope this is obvious to everyone, yes.

> Tree-sitter might be useful for the languages not yet supported by
> LSP, though

My impression is that tree-sitter might be useful for
syntax-highlighting and indentation.  I'm not sure how well those two
features are supported/handled by LSP servers and clients currently.

> (but, if I got it right, tree-sitter is implemented in JavaScript,

AFAIK only the source grammars and the grammar-compiler is written in
Javascript: the parsing engine is written in C and exposed as
a C library.

>> I think you'd expect a good LSP server to "degrade gracefully" and still
>> provide good info for indentation and syntax highlighting even if you
>> only have the one file and all the other files in the project
>> are missing.
> As already mentioned elsewhere on this thread, an LSP server with access
> to just the current file is severely handicapped.

Of course.  The question is whether it can still provide a good enough
behavior in that case compared to tree-sitter.  If not, it might be
an argument in favor of using both LSP and tree-sitter.


        Stefan





* Re: cc-mode fontification feels random
  2021-06-05 17:08                       ` Óscar Fuentes
  2021-06-05 17:31                         ` Stefan Monnier
@ 2021-06-05 17:32                         ` Eli Zaretskii
  1 sibling, 0 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-05 17:32 UTC (permalink / raw)
  To: Óscar Fuentes; +Cc: emacs-devel

> From: Óscar Fuentes <ofv@wanadoo.es>
> Date: Sat, 05 Jun 2021 19:08:13 +0200
> 
> If we are going to really modernize Emacs' programming language support
> we need to provide more than parser-based syntax highlighting and
> indentation. We need smart code completion, code hints, transformations,
> etc.

Yes, we need that, and much more.  But if we reject partial solutions
because they aren't 110% perfect with every PL out there, we get to
stay with what we have now, which is much worse.  So in my book
incremental improvements using contemporary technology are a win, even
if they don't get us all the way to the ultimate goal.  Let's not
discourage potential volunteers from taking up the job of bringing
stuff like tree-sitter to Emacs because it may not be perfect for some
demanding languages.

> That means we need something like LSP.

We need to try both these technologies, before we make the decision.
Each one of them has upsides and downsides, and it is therefore
unwise, IMO, to put all the eggs into a single basket.  Chances are we
will want to keep both solutions handy, because they can be
complementary.

> Tree-sitter might be useful for the languages not yet supported by
> LSP, though (but, if I got it right, tree-sitter is implemented in
> JavaScript, so it requires a JS engine to work, maybe too much of a
> dependency for something that doesn't add that much over what we
> have now.)

That's secondary, IMO.  If the main issues are solved satisfactorily,
I don't expect too much time to pass before someone comes up with a
way of producing the tree-sitter grammars in Emacs Lisp.

> > I think you'd expect a good LSP server to "degrade gracefully" and still
> > provide good info for indentation and syntax highlighting even if you
> > only have the one file and all the other files in the project
> > are missing.
> 
> As already mentioned elsewhere on this thread, an LSP server with access
> to just the current file is severely handicapped. One thing is to miss
> the information about some functions yet-to-be-written and another thing
> entirely is to ignore everything not defined on the current file.

Once again, my suggestion is not to require perfect solutions,
especially since what we have now is nowhere near perfection.




* Re: cc-mode fontification feels random
  2021-06-05 12:27                           ` Eli Zaretskii
@ 2021-06-05 17:59                             ` Jim Porter
  2021-06-05 18:56                               ` Daniel Martín
  0 siblings, 1 reply; 206+ messages in thread
From: Jim Porter @ 2021-06-05 17:59 UTC (permalink / raw)
  To: Eli Zaretskii, Daniel Colascione
  Cc: joaotavora, spacibba, theo, ubolonton, emacs-devel

On 6/5/2021 5:27 AM, Eli Zaretskii wrote:
>> Cc: joaotavora@gmail.com, jporterbugs@gmail.com, ubolonton@gmail.com,
>>   theo@thornhill.no, emacs-devel@gnu.org
>> From: Daniel Colascione <dancol@dancol.org>
>> Date: Sat, 5 Jun 2021 04:55:21 -0700
>>
>> Fewer than running a remote Emacs: you don't interact with Tramp on each
>> keystroke. There are tons of advantages to Tramp; it's a first-class
>> feature and it's worth making language server support work properly with it.
> 
> No argument that we need to support that properly.  This sub-thread
> started when someone said it would probably be slow with Tramp in the
> loop.

Just to clarify: it may turn out that communicating with a remote LSP 
server is fast enough for this purpose. However, if performance issues 
do crop up, they'll be more severe with TRAMP. Having used Eglot over 
TRAMP on a fast connection (within a LAN), nothing jumps out as 
annoyingly slow, although the things I use LSP for aren't 
latency-sensitive. It might be worth collecting some numbers on this to 
see how slow it really is.

(As mentioned elsewhere in the thread, I'm very happy with how Eglot 
works with remote files currently, since it means that my dev tools for 
a particular environment can live on the same system as my source code. 
Having to run the appropriate LSP server locally to edit remote files 
would be inconvenient, although I suppose I could tolerate it.)

Looking into this a bit more, I'm not actually 100% sure how much VSCode 
(or other LSP-aware editors) use LSP for syntax highlighting today. 
Semantic tokens are only available in the most recent specification of 
LSP (3.16)[1], so many LSP clients/servers likely wouldn't be using this 
yet. It might be helpful to see what they were doing prior to this; 
there may be some relatively non-invasive changes that could improve things.

Perhaps there's a way to use something like tree-sitter (or even cc-mode 
as it currently stands) to get 90% of the way there and then augment 
that with results from the LSP server. For example, to address the 
original post, it seems the main issue is that cc-mode doesn't know 
what's a type and what isn't. If we could get type information from the 
LSP server, then cc-mode could take that into account. In the example, 
we could even rely on the fact that `std::variant' takes types as 
arguments, so we know that the arguments are types (or the code is 
incorrect).
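
To make the idea of augmenting existing fontification with server results concrete, here is a minimal sketch (in Python, purely for illustration) of how a client might decode the relative integer encoding that LSP 3.16 uses for semantic tokens and pick out the spans the server considers types. The names `Token`, `decode_semantic_tokens` and `type_spans` are hypothetical, and how a mode like cc-mode would consume such spans is an assumption, not something any existing client is claimed to do.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    line: int
    start: int
    length: int
    token_type: str
    modifiers: List[str]


def decode_semantic_tokens(data, token_types, token_modifiers):
    """Decode the flat integer array of a semantic tokens response.

    Each token is five integers: deltaLine, deltaStart, length,
    tokenType (index into the legend) and tokenModifiers (bit set).
    """
    tokens = []
    line = start = 0
    for i in range(0, len(data), 5):
        d_line, d_start, length, type_idx, mod_bits = data[i:i + 5]
        if d_line:
            line += d_line
            start = d_start          # a new line: start is absolute
        else:
            start += d_start         # same line: start is relative
        mods = [m for bit, m in enumerate(token_modifiers)
                if mod_bits & (1 << bit)]
        tokens.append(Token(line, start, length, token_types[type_idx], mods))
    return tokens


def type_spans(tokens):
    """Return (line, start, end) for tokens the server marks as types."""
    return [(t.line, t.start, t.start + t.length)
            for t in tokens if t.token_type in ("type", "class")]


# Hypothetical legend and response data for a three-token document.
legend_types = ["property", "type", "class"]
legend_mods = ["private", "static"]
data = [2, 5, 3, 0, 3,  0, 5, 4, 1, 0,  3, 2, 7, 2, 0]
tokens = decode_semantic_tokens(data, legend_types, legend_mods)
spans = type_spans(tokens)
```

A mode could, under this assumption, treat the returned spans as authoritative "this is a type" hints and keep its own heuristics for everything else.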

- Jim

[1] 
https://microsoft.github.io/language-server-protocol/specifications/specification-current/#textDocument_semanticTokens




* Re: cc-mode fontification feels random
  2021-06-05  6:51                   ` Eli Zaretskii
  2021-06-05 10:14                     ` Joost Kremers
  2021-06-05 13:23                     ` Stefan Monnier
@ 2021-06-05 18:46                     ` João Távora
  2 siblings, 0 replies; 206+ messages in thread
From: João Távora @ 2021-06-05 18:46 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Joost Kremers, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Joost Kremers <joostkremers@fastmail.fm>
>> Date: Fri, 04 Jun 2021 22:11:06 +0200
>> 
>> > I'm not an expert on the internals of LSP servers, but it's my
>> > understanding that for a language server like clangd, it needs access
>> > not just to the current file, but the entire source tree[1].
>> 
>> And speaking from my experience with lsp-mode (not eglot) and Python, it needs
>> access to the entire virtual env so it can provide type information and
>> completions for built-in Python packages and for 3rd-party packages that you use
>> in your code.
>
> That cannot be a mandatory requirement, right?  Because otherwise LSP
> wouldn't be able to support editing of an unfinished project, where
> not everything is laid out 100% yet.  

You're mostly right.  Most good servers give some level of support even
if they can't make out the whole project.  And clangd is one of them, in
my experience.  It'd likely be able to fontify perfectly just by looking
at the file.  Of course, to be able to relate compilation units and
provide full completion they must understand the project and the linking
between units (unfortunately, this requires duplicating much of one's
makefile in a compile-commands.json or equivalent, though there are
tools that try to automate that).

But "seeing" the whole project isn't generally a problem as LSP servers
usually run on the same host where the project lives.  They don't see the
project through Emacs, they only see the "document" through Emacs, which
acts as the LSP client.  A "document" is similar to a
file-visiting-buffer.

João




* Re: cc-mode fontification feels random
  2021-06-05 17:59                             ` Jim Porter
@ 2021-06-05 18:56                               ` Daniel Martín
  0 siblings, 0 replies; 206+ messages in thread
From: Daniel Martín @ 2021-06-05 18:56 UTC (permalink / raw)
  To: Jim Porter
  Cc: Eli Zaretskii, Daniel Colascione, joaotavora, spacibba, theo,
	ubolonton, emacs-devel

Jim Porter <jporterbugs@gmail.com> writes:

> Looking into this a bit more, I'm not actually 100% sure how much
> VSCode (or other LSP-aware editors) use LSP for syntax highlighting
> today. Semantic tokens are only available in the most recent
> specification of LSP (3.16)[1], so many LSP clients/servers likely
> wouldn't be using this yet. It might be helpful to see what they were
> doing prior to this; there may be some relatively non-invasive changes
> that could improve things.

VSCode uses TextMate grammars[1] for syntax highlighting.  If the
language server supports it, it adds semantic highlighting on top of it.

I think TextMate grammars have more or less the same problems our
current syntax highlighting engine has: They are regexp-based, and
difficult to write and maintain.

Two editors that I think are already using Tree-sitter for syntax
highlighting are Atom and Neovim.

[1]: https://macromates.com/manual/en/language_grammars




* Re: cc-mode fontification feels random
  2021-06-04 15:54 ` Alan Mackenzie
  2021-06-04 18:30   ` Daniel Colascione
@ 2021-06-05 20:25   ` Dmitry Gutov
  2021-06-06 11:53     ` Alan Mackenzie
  1 sibling, 1 reply; 206+ messages in thread
From: Dmitry Gutov @ 2021-06-05 20:25 UTC (permalink / raw)
  To: Alan Mackenzie, Daniel Colascione; +Cc: emacs-devel

On 04.06.2021 18:54, Alan Mackenzie wrote:
> Whether a type is recognised as such depends on that, yes.  It's hard to
> think of a better way without having the resources of a compiler,
> particularly for ill-behaved languages like C++.

Would it work much worse if you took the approach of not applying the 
highlighting when you frequently cannot be sure of what the type of the 
term is?

That would mean none of the types in brackets would be highlighted in 
the original example, but perhaps that is still better than the current 
result?




* Re: cc-mode fontification feels random
  2021-06-04 18:30   ` Daniel Colascione
@ 2021-06-06 11:37     ` Alan Mackenzie
  2021-06-06 11:57       ` Eli Zaretskii
  2021-06-06 17:44       ` Stefan Monnier
  0 siblings, 2 replies; 206+ messages in thread
From: Alan Mackenzie @ 2021-06-06 11:37 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: emacs-devel

Hello, Daniel.

On Fri, Jun 04, 2021 at 11:30:09 -0700, Daniel Colascione wrote:
> On 6/4/21 8:54 AM, Alan Mackenzie wrote:
> >> Is there any *general* way that we can make fontification more robust
> >> and consistent?
> > Like other people have said on the thread, rewriting CC Mode to use an
> > LSP parser.

> > Less drastically, it would be possible to fix the specific bug you
> > allude to, by the user making a list of types and configuring CC Mode
> > with them, rather than attempting to recognise such types.  This feels
> > as though it would be tedious to use, though.

> I understand that cc-mode can't always get it right. It's only 
> asymptotically omniscient. :-) Some deficiencies in highlighting are 
> bound to happen.

> What's striking to me is the inconsistency in the highlighting. None of 
> the types in the std::variant declaration in my screenshot is special. 
> They're all declared in the same file as the std::variant typedef. So 
> why is PrimitiveType fontified while the others aren't?

Because of the order various jit-lock chunks are fontified.  If the
chunk which establishes foo as a type is fontified first, subsequent
fontifications of foo will use font-lock-type-face.  Otherwise, not.

> FWIW, fontification is correct and consistent when I set 
> font-lock-support-mode to nil, so this really does look like another 
> case of getting unlucky with jit-lock block divisions.

Maybe an improvement might come from scanning the buffer for occurrences
of foo after foo has been recognised as a type and entered into the CC
Mode table.  That way, the lack of fontification on foo would be
temporary, at least provided your Emacs is configured to fontify
non-displayed bits of the buffer in the background (which it is by
default).
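
The order dependence and the proposed rescan can be illustrated with a toy model. This is a sketch in Python under loud assumptions: `ToyBuffer`, its `struct Foo` declaration rule and the chunk contents are all hypothetical, and none of it reflects how CC Mode actually parses or stores types. It only shows why fontifying the "use" chunk before the "declaration" chunk leaves the uses unhighlighted, and how revisiting already-fontified chunks when a new type is learned removes that dependence.

```python
import re

DECL = re.compile(r"\bstruct[ \t]+(\w+)")   # toy rule: "struct Foo" declares Foo
WORD = re.compile(r"[A-Za-z_]\w*")


class ToyBuffer:
    def __init__(self, chunks, rescan=False):
        self.chunks = chunks
        self.rescan = rescan
        self.known = set()   # the table of types found so far
        self.faces = {}      # chunk index -> identifiers shown as types

    def _type_faces(self, i):
        return [w for w in WORD.findall(self.chunks[i]) if w in self.known]

    def fontify(self, i):
        """Fontify chunk i, learning any types it declares."""
        new = set(DECL.findall(self.chunks[i])) - self.known
        self.known |= new
        self.faces[i] = self._type_faces(i)
        if self.rescan and new:
            # The proposed fix: refontify chunks that were done earlier.
            for j in self.faces:
                if j != i:
                    self.faces[j] = self._type_faces(j)


chunks = ["struct PrimitiveType {};\nstruct ArrayType {};\n",
          "using T = std::variant<PrimitiveType, ArrayType>;\n"]

plain = ToyBuffer(chunks)
plain.fontify(1)             # the use is displayed first ...
plain.fontify(0)             # ... the declarations only later

fixed = ToyBuffer(chunks, rescan=True)
fixed.fontify(1)
fixed.fontify(0)
```

In this model `plain.faces[1]` stays empty while `fixed.faces[1]` ends up containing both type names; the price is an extra pass over earlier chunks each time a type is learned.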

This might need enhanced support from jit-lock, such as some sort of
signal indicating a buffer has been completely fontified.  I haven't
thought this through, yet.

> Yes, I'm sure that this particular problem is caused by some bug, and
> with the right repro, we can quickly isolate and fix it. But this kind
> of seemingly-inexplicable inconsistent highlighting has been happening
> for years and years now.  There's something fundamental about the way
> cc-mode is written that makes bugs like this keep popping up. Is there
> some internal abstraction we can add, some algorithmic test suite we
> can write, that would make this whole class of bug less likely?

Well, "seemingly-inexplicable inconsistent highlighting" isn't much to
go on.  If this means "problems with types not getting fontified", then
see above.  Otherwise, particulars help.  It may well be that the ad-hoc
parsing method which CC Mode uses is no longer appropriate for the
modern languages it supports; that's what a lot of this thread has been
discussing.  By "internal abstraction" I think you might mean getting
information from a compiler, or building a partial compiler into CC
Mode.  This is surely possible in theory, but in practice?

[ .... ]

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: cc-mode fontification feels random
  2021-06-05 20:25   ` Dmitry Gutov
@ 2021-06-06 11:53     ` Alan Mackenzie
  2021-06-06 17:08       ` Dmitry Gutov
  0 siblings, 1 reply; 206+ messages in thread
From: Alan Mackenzie @ 2021-06-06 11:53 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Daniel Colascione, emacs-devel

Hello, Dmitry.

On Sat, Jun 05, 2021 at 23:25:41 +0300, Dmitry Gutov wrote:
> On 04.06.2021 18:54, Alan Mackenzie wrote:
> > Whether a type is recognised as such depends on that, yes.  It's hard to
> > think of a better way without having the resources of a compiler,
> > particularly for ill-behaved languages like C++.

> Would it work much worse if you took the approach of not applying the 
> highlighting when you frequently cannot be sure of what the type of the 
> term is?

Cases of "not being sure" are common indeed.  The whole of CC Mode is
based on heuristics.

> That would mean none of the types in brackets would be highlighted in 
> the original example, but perhaps that is still better than the current 
> result?

That would mean adding complicated decision functions for "not being
sure".  If the fontification of types where they are used (as opposed to
being declared) were to become less common, people would notice and
complain too.

There's the idea I proposed in my post to Daniel C of today - when a
type is newly recognised, then go through the buffer fontifying
occurrences of it.  That would probably help a lot, possibly at the cost
of slowing the mode down a bit.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: cc-mode fontification feels random
  2021-06-06 11:37     ` Alan Mackenzie
@ 2021-06-06 11:57       ` Eli Zaretskii
  2021-06-06 12:27         ` Alan Mackenzie
  2021-06-06 17:44       ` Stefan Monnier
  1 sibling, 1 reply; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-06 11:57 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: dancol, emacs-devel

> Date: Sun, 6 Jun 2021 11:37:47 +0000
> From: Alan Mackenzie <acm@muc.de>
> Cc: emacs-devel@gnu.org
> 
> > FWIW, fontification is correct and consistent when I set 
> > font-lock-support-mode to nil, so this really does look like another 
> > case of getting unlucky with jit-lock block divisions.
> 
> Maybe an improvement might come from scanning the buffer for occurrences
> of foo after foo has been recognised as a type and entered into the CC
> Mode table.  That way, the lack of fontification on foo would be
> temporary, at least provided your Emacs is configured to fontify
> non-displayed bits of the buffer in the background (which it is by
> default).
> 
> This might need enhanced support from jit-lock, such as some sort of
> signal indicating a buffer has been completely fontified.  I haven't
> thought this through, yet.

AFAIR, the way to tell JIT font-lock that a chunk of text was already
fontified is to set the 'fontified' property on that text.




* Re: cc-mode fontification feels random
  2021-06-06 11:57       ` Eli Zaretskii
@ 2021-06-06 12:27         ` Alan Mackenzie
  2021-06-06 12:44           ` Eli Zaretskii
  0 siblings, 1 reply; 206+ messages in thread
From: Alan Mackenzie @ 2021-06-06 12:27 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dancol, emacs-devel

Hello, Eli.

On Sun, Jun 06, 2021 at 14:57:35 +0300, Eli Zaretskii wrote:
> > Date: Sun, 6 Jun 2021 11:37:47 +0000
> > From: Alan Mackenzie <acm@muc.de>
> > Cc: emacs-devel@gnu.org

> > > FWIW, fontification is correct and consistent when I set 
> > > font-lock-support-mode to nil, so this really does look like another 
> > > case of getting unlucky with jit-lock block divisions.

> > Maybe an improvement might come from scanning the buffer for occurrences
> > of foo after foo has been recognised as a type and entered into the CC
> > Mode table.  That way, the lack of fontification on foo would be
> > temporary, at least provided your Emacs is configured to fontify
> > non-displayed bits of the buffer in the background (which it is by
> > default).

> > This might need enhanced support from jit-lock, such as some sort of
> > signal indicating a buffer has been completely fontified.  I haven't
> > thought this through, yet.

> AFAIR, the way to tell JIT font-lock that a chunk of text was already
> fontified is to set the 'fontified' property on that text.

Sorry, I was unclear.  I was thinking of a signal from jit-lock to the
major mode, indicating that background fontification had been completed.
CC Mode could react to this by fontifying all occurrences in the buffer
of "newly found" types.  Or something like that.

Or maybe the fontification could be done immediately after parsing a new
type.  This might be a bit sluggish, but it might be OK.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: cc-mode fontification feels random
  2021-06-06 12:27         ` Alan Mackenzie
@ 2021-06-06 12:44           ` Eli Zaretskii
  2021-06-06 14:19             ` Alan Mackenzie
  0 siblings, 1 reply; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-06 12:44 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: dancol, emacs-devel

> Date: Sun, 6 Jun 2021 12:27:05 +0000
> Cc: dancol@dancol.org, emacs-devel@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> > AFAIR, the way to tell JIT font-lock that a chunk of text was already
> > fontified is to set the 'fontified' property on that text.
> 
> Sorry, I was unclear.  I was thinking of a signal from jit-lock to the
> major mode, indicating that background fontification had been completed.
> CC Mode could react to this by fontifying all occurrences in the buffer
> of "newly found" types.  Or something like that.

Sorry, I don't understand (probably because I missed the beginning of
this discussion): what do you mean by "background fontification", and
what does it mean for that to have been "completed"?  I'm afraid we
are not on the same page wrt JIT font-lock related terminology.

> Or maybe the fontification could be done immediately after parsing a new
> type.

Parsing by whom? by CC Mode?  If so, CC Mode parsing is itself part of
fontification, AFAIU, and is invoked by the JIT font-lock machinery.
So I'm confused wrt what you are looking for.




* Re: cc-mode fontification feels random
  2021-06-06 12:44           ` Eli Zaretskii
@ 2021-06-06 14:19             ` Alan Mackenzie
  2021-06-06 17:06               ` Eli Zaretskii
  0 siblings, 1 reply; 206+ messages in thread
From: Alan Mackenzie @ 2021-06-06 14:19 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dancol, emacs-devel

Hello, Eli.

On Sun, Jun 06, 2021 at 15:44:38 +0300, Eli Zaretskii wrote:
> > Date: Sun, 6 Jun 2021 12:27:05 +0000
> > Cc: dancol@dancol.org, emacs-devel@gnu.org
> > From: Alan Mackenzie <acm@muc.de>

> > > AFAIR, the way to tell JIT font-lock that a chunk of text was already
> > > fontified is to set the 'fontified' property on that text.

> > Sorry, I was unclear.  I was thinking of a signal from jit-lock to the
> > major mode, indicating that background fontification had been completed.
> > CC Mode could react to this by fontifying all occurrences in the buffer
> > of "newly found" types.  Or something like that.

> Sorry, I don't understand (probably because I missed the beginning of
> this discussion): what do you mean by "background fontification", and
> what does it mean for that to have been "completed"?  I'm afraid we
> are not on the same page wrt JIT font-lock related terminology.

CC Mode maintains a simple table of a buffer's types, which it uses to
fontify the same types when they occur again in the buffer.  Daniel's
main problem was that with JIT fontification, the occurrences of foo get
"fontified" to default face before foo has been entered into the table.
This happens because jit-lock doesn't scan the buffer from (point-min).

By "background fontification" I meant stealth fontification (and should
have said so).  This is, sadly, disabled by default.  If it were to be
enabled again, I was envisaging some sort of signal from jit-lock stealth
fontification when the stealth had determined a buffer was completely
fontified.  Reacting to this signal, CC Mode could then fontify all the
types which the stealth had caused to be added to the CC Mode table.

I no longer think this is a good idea.

> > Or maybe the fontification could be done immediately after parsing a new
> > type.

> Parsing by whom? by CC Mode?

Yes.  By CC Mode's fontification detecting that a symbol, foo, must be a type,
and entering it into its internal table.  I am thinking that immediately
following, CC Mode could scan the entire buffer and refontify occurrences
of foo which hadn't yet got font-lock-type-face.

> If so, CC Mode parsing is itself part of fontification, AFAIU, and is
> invoked by the JIT font-lock machinery.  So I'm confused wrt what you
> are looking for.

I was looking for jit stealth locking to detect when it had completely
fontified a buffer (i.e. the `fontified' property was on the entire
buffer) and do something like calling a major mode function.  As I said,
I don't think this is a good idea, any more.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: cc-mode fontification feels random
  2021-06-06 14:19             ` Alan Mackenzie
@ 2021-06-06 17:06               ` Eli Zaretskii
  0 siblings, 0 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-06 17:06 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: dancol, emacs-devel

> Date: Sun, 6 Jun 2021 14:19:00 +0000
> From: Alan Mackenzie <acm@muc.de>
> Cc: dancol@dancol.org, emacs-devel@gnu.org
> 
> By "background fontification" I meant stealth fontification (and should
> have said so).  This is, sadly, disabled by default.  If it were to be
> enabled again, I was envisaging some sort of signal from jit-lock stealth
> fontification when the stealth had determined a buffer was completely
> fontified.

When a buffer has been completely fontified by jit-lock stealth
fontification, that buffer no longer appears in
jit-lock-stealth-buffers.  Is that good enough?

But yes, since stealth fontifications are disabled by default, this
isn't the way to make CC mode fontifications more accurate.

> Yes.  By CC Mode's fontification detecting that a symbol, foo, must be a type,
> and entering it into its internal table.  I am thinking that immediately
> following, CC Mode could scan the entire buffer and refontify occurrences
> of foo which hadn't yet got font-lock-type-face.

You could still do that, but it could be costly, and slow down
redisplay, no?




* Re: cc-mode fontification feels random
  2021-06-06 11:53     ` Alan Mackenzie
@ 2021-06-06 17:08       ` Dmitry Gutov
  0 siblings, 0 replies; 206+ messages in thread
From: Dmitry Gutov @ 2021-06-06 17:08 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Daniel Colascione, emacs-devel

Hi Alan,

On 06.06.2021 14:53, Alan Mackenzie wrote:

>> Would it work much worse if you took the approach of not applying the
>> highlighting when you frequently cannot be sure of what the type of the
>> term is?
> 
> Cases of "not being sure" are common indeed.  The whole of CC Mode is
> based on heuristics.

I would differentiate between approaches like

   need to parse around the callsite/usage site [of identifier]

and

   need to parse the identifier's definition itself

and, as far as Emacs major modes go, only use the first approach, plus 
perhaps some predefined/customizable list of built-ins.

Because it's pretty much a given that in a big enough project a lot of 
functions/classes/etc will be defined in files that the user will never 
visit in the current session.

>> That would mean none of the types in brackets would be highlighted in
>> the original example, but perhaps that is still better than the current
>> result?
> 
> That would mean adding complicated decision functions for "not being
> sure".  If the fontification of types where they are used (as opposed to
> being declared) were to become less common, people would notice and
> complain too.

Some might be relieved, too, seeing more stability of what is highlighted 
and what is not (and when).

> There's the idea I proposed in my post to Daniel C of today - when a
> type is newly recognised, then go through the buffer fontifying
> occurrences of it.  That would probably help a lot, possibly at the cost
> of slowing the mode down a bit.

What about the types that are defined in files you never visited?




* Re: cc-mode fontification feels random
  2021-06-06 11:37     ` Alan Mackenzie
  2021-06-06 11:57       ` Eli Zaretskii
@ 2021-06-06 17:44       ` Stefan Monnier
  2021-06-06 18:00         ` Eli Zaretskii
  1 sibling, 1 reply; 206+ messages in thread
From: Stefan Monnier @ 2021-06-06 17:44 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Daniel Colascione, emacs-devel

> Because of the order various jit-lock chunks are fontified.  If the
> chunk which establishes foo as a type is fontified first, subsequent
> fontifications of foo will use font-lock-type-face.  Otherwise, not.

The way this is handled in other modes is to keep a highwater mark of
the buffer position up to which the text has been scanned for type
definitions and then in the font-lock-keywords you start by scanning the
text between this mark and the text that needs to be fontified (and
then moving the mark, of course).

Of course, this presumes that text later in the buffer can't affect
highlighting of earlier text (e.g. a type definition has to come before
its first use).  And it can have other downsides (e.g. if you already do
the scan for highlighting itself, it means you now have to do the scan
twice (once to collect and once to highlight), and it also means that if
the user jumps to the end of the buffer you'll have to scan the whole
buffer before you can start highlighting the last screenful of text).
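
The highwater-mark scheme can be sketched as follows. This is a Python toy under stated assumptions: `TypeScanner`, its `struct Foo` declaration rule and the immutable `text` are hypothetical stand-ins, buffer edits are not handled, and it is not the code of any existing mode. The cheap "collect" scan always runs from the mark up to the end of the requested region, so the result no longer depends on which region is requested first.

```python
import re

DECL = re.compile(r"\bstruct[ \t]+(\w+)")   # toy rule: "struct Foo" declares Foo
WORD = re.compile(r"[A-Za-z_]\w*")


class TypeScanner:
    def __init__(self, text):
        self.text = text
        self.mark = 0        # text before this offset has been scanned
        self.types = set()

    def ensure_scanned(self, upto):
        """Collect type declarations between the mark and UPTO."""
        if upto <= self.mark:
            return
        # Round up to a line end so a declaration is never cut in two.
        nl = self.text.find("\n", max(upto - 1, 0))
        upto = len(self.text) if nl == -1 else nl + 1
        self.types.update(DECL.findall(self.text[self.mark:upto]))
        self.mark = upto

    def fontify(self, beg, end):
        """Return the identifiers in [beg, end) that are known types."""
        self.ensure_scanned(end)
        return [m.group() for m in WORD.finditer(self.text, beg, end)
                if m.group() in self.types]


text = "struct Foo {};\nint x;\nFoo y;\n"
scanner = TypeScanner(text)
last_line = text.index("Foo y;")
# Jump straight to the last line: the scan still covers everything before it.
at_end = scanner.fontify(last_line, len(text))
at_start = scanner.fontify(0, text.index("int x;"))
```

The second call does no scanning at all, because the mark already sits at the end of the text; the first call pays for the whole buffer, which is the downside described above.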

> Maybe an improvement might come from scanning the buffer for occurrences
> of foo after foo has been recognised as a type and entered into the CC
> Mode table.  That way, the lack of fontification on foo would be
> temporary, at least provided your Emacs is configured to fontify
> non-displayed bits of the buffer in the background (which it is by
> default).

Not since:

    commit d0483d25c034c38a8c6f0d718e9780c50e6ba03a
    Author: David Kastrup <dak@gnu.org>
    Date:   Sun Mar 4 08:41:08 2007 +0000
    
        * NEWS (fontification): Mention that the new default for
        jit-lock-stealth-time is now nil.
        
        * jit-lock.el (jit-lock-stealth-time): Change default to nil.
        Preserve 16 as default value for "seconds" when customizing.
    
    diff --git a/lisp/jit-lock.el b/lisp/jit-lock.el
    --- a/lisp/jit-lock.el
    +++ b/lisp/jit-lock.el
    @@ -77,9 +77,9 @@
    -(defcustom jit-lock-stealth-time 16
    +(defcustom jit-lock-stealth-time nil
       "*Time in seconds to wait before beginning stealth fontification.
     Stealth fontification occurs if there is no input within this time.
     If nil, stealth fontification is never performed.

> This might need enhanced support from jit-lock, such as some sort of
> signal indicating a buffer has been completely fontified.

Indeed, there's no way currently for font-lock to tell jit-lock that it
has decided to fontify a particular chunk without being requested to do
so (Eli suggests setting the `fontified` property, but this means that
all the clients of jit-lock have done their work, so it's only correct
to set it from font-lock if you run the other `jit-lock-functions` (or
if there are currently no other `jit-lock-functions`)).

The closest related functionality is that a jit-lock function
(e.g. `font-lock-fontify-region`) can return a value of the form
(jit-lock-bounds BEG . END) to indicate the region it actually
fontified (which should cover the region they were asked to fontify).


        Stefan





* Re: cc-mode fontification feels random
  2021-06-06 17:44       ` Stefan Monnier
@ 2021-06-06 18:00         ` Eli Zaretskii
  2021-06-06 18:18           ` Stefan Monnier
  0 siblings, 1 reply; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-06 18:00 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: acm, dancol, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Daniel Colascione <dancol@dancol.org>,  emacs-devel@gnu.org
> Date: Sun, 06 Jun 2021 13:44:06 -0400
> 
> > Because of the order various jit-lock chunks are fontified.  If the
> > chunk which establishes foo as a type is fontified first, subsequent
> > fontifications of foo will use font-lock-type-face.  Otherwise, not.
> 
> The way this is handled in other modes is to keep a highwater mark of
> the buffer position up to which the text has been scanned for type
> definitions and then in the font-lock-keywords you start by scanning the
> text between this mark and the text that needs to be fontified (and
> then moving the mark, of course).

So if the first windowful of a file that's displayed is at EOB,
fontification must go all the way back to BOB and start scanning
there, until it comes to the end?

> Indeed, there's no way currently for font-lock to tell jit-lock that it
> has decided to fontify a particular chunk without being requested to do
> so (Eli suggests setting the `fontified` property, but this means that
> all the clients of jit-lock have done their work, so it's only correct
> to set it from font-lock if you run the other `jit-lock-functions` (or
> > if there are currently no other `jit-lock-functions`)).

By "other clients" you mean those which don't fontify, but instead
piggy-back jit-lock to do other jobs?



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-06 18:00         ` Eli Zaretskii
@ 2021-06-06 18:18           ` Stefan Monnier
  2021-06-06 18:33             ` Daniel Colascione
  2021-06-06 19:03             ` Eli Zaretskii
  0 siblings, 2 replies; 206+ messages in thread
From: Stefan Monnier @ 2021-06-06 18:18 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: acm, dancol, emacs-devel

> So if the first windowful of a file that's displayed is at EOB,
> fontification must go all the way back to BOB and start scanning
> there, until it comes to the end?

Yup.  The way to make it bearable is to make that scan be as simple and
fast as possible.

Note that `syntax-propertize` and `syntax-ppss` also work this way, so
it's already the case that when we start by displaying EOB we first have
to apply `syntax-propertize` over the whole buffer :-(

In theory, there are various cases (which depend on the specific
programming language under consideration) where we could avoid such
a scan, but it would introduce a lot of complexity so we don't bother.

>> Indeed, there's no way currently for font-lock to tell jit-lock that it
>> has decided to fontify a particular chunk without being requested to do
>> so (Eli suggests setting the `fontified` property, but this means that
>> all the clients of jit-lock have done their work, so it's only correct
>> to set it from font-lock if you run the other `jit-lock-functions` (or
>> if there are currently no other `jit-lock-functions`)).
>
> By "other clients" you mean those which don't fontify, but instead
> piggy-back jit-lock to do other jobs?

`grep jit-lock-register` in Emacs' bundled files gives
bug-reference-mode, glasses-mode, and goto-address-mode as packages
which use jit-lock.


        Stefan





* Re: cc-mode fontification feels random
  2021-06-06 18:18           ` Stefan Monnier
@ 2021-06-06 18:33             ` Daniel Colascione
  2021-06-06 20:24               ` Stefan Monnier
  2021-06-06 19:03             ` Eli Zaretskii
  1 sibling, 1 reply; 206+ messages in thread
From: Daniel Colascione @ 2021-06-06 18:33 UTC (permalink / raw)
  To: Stefan Monnier, Eli Zaretskii; +Cc: acm, emacs-devel

On 6/6/21 11:18 AM, Stefan Monnier wrote:
>> So if the first windowful of a file that's displayed is at EOB,
>> fontification must go all the way back to BOB and start scanning
>> there, until it comes to the end?
> Yup.  The way to make it bearable is to make that scan be as simple and
> fast as possible.
>
> Note that `syntax-propertize` and `syntax-ppss` also work this way, so
> it's already the case that when we start by displaying EOB we first have
> to apply `syntax-propertize` over the whole buffer :-(
>
> In theory, there are various cases (which depend on the specific
> programming language under consideration) where we could avoid such
> a scan, but it would introduce a lot of complexity so we don't bother.

I've been thinking of a new core facility for helping modes implement 
this kind of incremental buffer analysis. Basically, it works like this: 
fontification logically proceeds from bob to eob in fixed-size chunks. 
After each chunk, we checkpoint the state of the fontification engine in 
a text property. Whenever we modify the buffer, we invalidate chunks 
that the modification might have affected and proceed from the last 
known-valid checkpoint.

It's more subtle than it sounds though.

First, we need to support lookahead. Fontification of region [A, B) 
might do lookahead and depend on text in region [B, C). If it does, and a 
modification occurs somewhere between B and C, we need to invalidate the 
[A, B) chunk. If we put the fontification-by-chunking code in core, we 
can track (via core magic) a high-water-mark of accessed buffer position 
for fontification of each chunk. This way, invalidation becomes 
automatically correct.

Second, writing fontification as some kind of callback with explicit 
checkpoint and restore support is annoying, and nobody's going to do 
that. If it were possible to write fontification programs as coroutines, 
we could keep mode fontification routines simple and declarative and 
automatically do both the chunking and the checkpointing.
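
A rough sketch of the checkpoint-and-invalidate part, under some loud
assumptions: every `my-ckpt-*` name is hypothetical, the checkpoints
live in a buffer-local list rather than in a text property, and the
step function reports its own high-water mark instead of core tracking
it automatically:

```elisp
(defvar-local my-ckpt-alist nil
  "Checkpoints, newest first: entries of (CHUNK-END HIGH-WATER STATE).")

(defun my-ckpt-scan (step-fn chunk-size)
  "Scan to `point-max' in chunks, resuming from the last valid checkpoint.
STEP-FN is called as (STEP-FN STATE BEG END) and must return
(NEW-STATE . HIGH-WATER), HIGH-WATER being the furthest position it read."
  (let* ((ckpt (car my-ckpt-alist))
         (pos (if ckpt (nth 0 ckpt) (point-min)))
         (hw (if ckpt (nth 1 ckpt) (point-min)))
         (state (nth 2 ckpt)))
    (while (< pos (point-max))
      (let* ((end (min (point-max) (+ pos chunk-size)))
             (res (funcall step-fn state pos end)))
        ;; The state after this chunk depends on everything any earlier
        ;; chunk read, so the high-water mark is cumulative.
        (setq state (car res)
              hw (max hw end (cdr res)))
        (push (list end hw state) my-ckpt-alist)
        (setq pos end)))
    state))

(defun my-ckpt-invalidate (beg &rest _)
  "Drop every checkpoint that depends on text at or after BEG.
The signature is compatible with `after-change-functions'."
  (while (and my-ckpt-alist (> (nth 1 (car my-ckpt-alist)) beg))
    (pop my-ckpt-alist)))

;; Toy step function: count open parens.  It never looks ahead, so its
;; high-water mark is simply END.
(defun my-ckpt-count-parens (state beg end)
  (cons (+ (or state 0) (how-many "(" beg end)) end))
```

A step function that peeks past END would return a larger high-water
mark, and an edit inside the peeked-at text would then throw away that
chunk's checkpoint as well.  The coroutine part of the idea is not
covered here.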




^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-06 18:18           ` Stefan Monnier
  2021-06-06 18:33             ` Daniel Colascione
@ 2021-06-06 19:03             ` Eli Zaretskii
  2021-06-06 20:28               ` Stefan Monnier
  1 sibling, 1 reply; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-06 19:03 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: acm, dancol, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: acm@muc.de,  dancol@dancol.org,  emacs-devel@gnu.org
> Date: Sun, 06 Jun 2021 14:18:15 -0400
> 
> > So if the first windowful of a file that's displayed is at EOB,
> > fontification must go all the way back to BOB and start scanning
> > there, until it comes to the end?
> 
> Yup.  The way to make it bearable is to make that scan be as simple and
> fast as possible.
> 
> Note that `syntax-propertize` and `syntax-ppss` also work this way, so
> it's already the case that when we start by displaying EOB we first have
> to apply `syntax-propertize` over the whole buffer :-(

What exactly are the reasons that we need to scan from BOB?  With the
exception of data type declarations, what else requires going back
farther than the beginning of the defun in which we start fontifying?



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-06 18:33             ` Daniel Colascione
@ 2021-06-06 20:24               ` Stefan Monnier
  2021-06-06 20:27                 ` Daniel Colascione
  0 siblings, 1 reply; 206+ messages in thread
From: Stefan Monnier @ 2021-06-06 20:24 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: Eli Zaretskii, acm, emacs-devel

> I've been thinking of a new core facility for helping modes implement this
> kind of incremental buffer analysis. Basically, it works like this:
> fontification logically proceeds from bob to eob in fixed-size chunks. After
> each chunk, we checkpoint the state of the fontification engine in a text
> property. Whenever we modify the buffer, we invalidate chunks that the
> modification might have affected and proceed from the last
> known-valid checkpoint.

[ I assume that what you mean by "fontification" is not literally
  placing faces (which is typically what font-lock does), but only
  a subset of that job (the subset that needs to proceed sequentially
  from BOB).  ]

You mean like what we do for `syntax-ppss` (except we keep the
checkpoint data in an alist indexed by positions, rather than in
text-properties)?

I think it would be fairly easy to add some way to keep extra data in
`syntax-ppss-wide/narrow`.

> It's more subtle than it sounds though.
>
> First, we need to support lookahead. Fontification of region [A, B) might do
> lookahead and depend on text in region [B, C).

For `syntax-propertize` we handle this via a `syntax-multiline` text
property, so that changes in the B region cause re-propertization of the
A region.
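
As a sketch of how that looks in a mode: assume a made-up language
whose `<<<` ... `>>>` blocks should be treated as strings.  `my-lang-mode`
and `my-lang-syntax-propertize` are hypothetical; the `syntax-multiline`
property and `syntax-propertize-multiline` are the existing facilities:

```elisp
(defun my-lang-syntax-propertize (start end)
  "Mark <<< ... >>> blocks between START and END as generic strings."
  (goto-char start)
  (while (search-forward "<<<" end t)
    (let ((b (match-beginning 0)))
      (when (search-forward ">>>" end t)
        (let ((e (point)))
          ;; First and last characters become generic string delimiters.
          (put-text-property b (1+ b) 'syntax-table (string-to-syntax "|"))
          (put-text-property (1- e) e 'syntax-table (string-to-syntax "|"))
          ;; Ask for the whole block to be re-propertized when any part
          ;; of it changes.
          (put-text-property b e 'syntax-multiline t))))))

(define-derived-mode my-lang-mode prog-mode "MyLang"
  "Toy major mode (illustration only)."
  (setq-local syntax-propertize-function #'my-lang-syntax-propertize)
  ;; Make `syntax-propertize' extend its region over `syntax-multiline' text.
  (add-hook 'syntax-propertize-extend-region-functions
            #'syntax-propertize-multiline 'append 'local))
```

This only handles blocks that are complete within the region being
propertized; a real mode would also have to deal with a block whose
closing `>>>` has not been typed yet.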

> Second, writing fontification as some kind of callback with explicit
> checkpoint and restore support is annoying, and nobody's going to do
> that. If it were possible to write fontification programs as coroutines, we
> could keep mode fontification routines simple and declarative and
> automatically do both the chunking and the checkpointing.

When I wrote the `syntax-ppss` code, I did expect to add facilities to
keep extra data in there, but so far the need has not really come up.


        Stefan




^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-06 20:24               ` Stefan Monnier
@ 2021-06-06 20:27                 ` Daniel Colascione
  2021-06-06 20:38                   ` Stefan Monnier
  0 siblings, 1 reply; 206+ messages in thread
From: Daniel Colascione @ 2021-06-06 20:27 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: acm, Eli Zaretskii, emacs-devel

On 6/6/21 1:24 PM, Stefan Monnier wrote:
>> I've been thinking of a new core facility for helping modes implement this
>> kind of incremental buffer analysis. Basically, it works like this:
>> fontification logically proceeds from bob to eob in fixed-size chunks. After
>> each chunk, we checkpoint the state of the fontification engine in a text
>> property. Whenever we modify the buffer, we invalidate chunks that the
>> modification might have affected and proceed from the last
>> known-valid checkpoint.
> [ I assume that what you mean by "fontification" is not literally
>    placing faces (which is typically what font-lock does), but only
>    a subset of that job (the subset that needs to proceed sequentially
>    from BOB).  ]
>
> You mean like what we do for `syntax-ppss` (except we keep the
> checkpoint data in an alist indexed by positions, rather than in
> text-properties)?
Yes, but generic.
>
> I think it would be fairly easy to add some way to keep extra data in
> `syntax-ppss-wide/narrow`.
>
>> It's more subtle than it sounds though.
>>
>> First, we need to support lookahead. Fontification of region [A, B) might do
>> lookahead and depend on text in region [B, C).
> For `syntax-propertize` we handle this via a `syntax-multiline` text
> property, so that changes in the B region cause re-propertization of the
> A region.

Manually placing syntax-multiline is annoying and error-prone. Can't we 
instead keep track of what buffer positions were actually inspected?




^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-06 19:03             ` Eli Zaretskii
@ 2021-06-06 20:28               ` Stefan Monnier
  2021-06-07  7:35                 ` martin rudalics
  2021-06-07 12:08                 ` Eli Zaretskii
  0 siblings, 2 replies; 206+ messages in thread
From: Stefan Monnier @ 2021-06-06 20:28 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: acm, dancol, emacs-devel

> What exactly are the reasons that we need to scan from BOB?  With the
> exception of data type declarations, what else requires going back
> farther than the beginning of the defun in which we start fontifying?

It all depends on the language.

E.g. in ELisp, what looks like a defun might actually be in the middle
of a string and there's no reliable way to know if something's in
a string other than to parse from BOB.
In C the situation is somewhat similar but for comments.


        Stefan




^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-06 20:27                 ` Daniel Colascione
@ 2021-06-06 20:38                   ` Stefan Monnier
  0 siblings, 0 replies; 206+ messages in thread
From: Stefan Monnier @ 2021-06-06 20:38 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: Eli Zaretskii, acm, emacs-devel

> Manually placing syntax-multiline is annoying and error-prone.
> Can't we instead keep track of what buffer positions were actually inspected?

Depends how the inspection is done, but of course that could be done.
Note that in the current uses of `syntax-propertize`, it's rather
unusual to need `syntax-multiline` (and it's fairly easy to add it in
most cases).  So while I agree with "annoying and error-prone" the
motivation to come up with some automatic way to do it has been
rather low.

In any case, I think this is a very secondary issue compared to the
issue of deciding what it is you want to do in that
"fontification" scan (and then how you want to do it, etc...).

If you want to do something fancier than `parse-partial-sexp`, then that
probably means inventing a new parsing engine, along with
corresponding grammars.  If so, using tree-sitter as that parsing engine
is probably one of the most attractive options since it lets us reuse
existing grammars.


        Stefan




^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-06 20:28               ` Stefan Monnier
@ 2021-06-07  7:35                 ` martin rudalics
  2021-06-07 13:20                   ` Stefan Monnier
  2021-06-07 12:08                 ` Eli Zaretskii
  1 sibling, 1 reply; 206+ messages in thread
From: martin rudalics @ 2021-06-07  7:35 UTC (permalink / raw)
  To: Stefan Monnier, Eli Zaretskii; +Cc: acm, dancol, emacs-devel

 >> What exactly are the reasons that we need to scan from BOB?  With the
 >> exception of data type declarations, what else requires going back
 >> farther than the beginning of the defun in which we start fontifying?
 >
 > It all depends on the language.
 >
 > E.g. in ELisp, what looks like a defun might actually be in the middle
 > of a string and there's no reliable way to know if something's in
 > a string other than to parse from BOB.

Unless `open-paren-in-column-0-is-defun-start' is non-nil.

martin



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-06 20:28               ` Stefan Monnier
  2021-06-07  7:35                 ` martin rudalics
@ 2021-06-07 12:08                 ` Eli Zaretskii
  2021-06-08 15:22                   ` Stefan Monnier
  1 sibling, 1 reply; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-07 12:08 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: acm, dancol, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: acm@muc.de,  dancol@dancol.org,  emacs-devel@gnu.org
> Date: Sun, 06 Jun 2021 16:28:02 -0400
> 
> > What exactly are the reasons that we need to scan from BOB?  With the
> > exception of data type declarations, what else requires going back
> > farther than the beginning of the defun in which we start fontifying?
> 
> It all depends on the language.
> 
> E.g. in ELisp, what looks like a defun might actually be in the middle
> of a string and there's no reliable way to know if something's in
> a string other than to parse from BOB.
> In C the situation is somewhat similar but for comments.

So you are saying we need that just to know where the current defun
begins?  Any other needs to start from BOB?



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-07  7:35                 ` martin rudalics
@ 2021-06-07 13:20                   ` Stefan Monnier
  2021-06-07 13:37                     ` Eli Zaretskii
                                       ` (2 more replies)
  0 siblings, 3 replies; 206+ messages in thread
From: Stefan Monnier @ 2021-06-07 13:20 UTC (permalink / raw)
  To: martin rudalics; +Cc: Eli Zaretskii, acm, dancol, emacs-devel

>>> What exactly are the reasons that we need to scan from BOB?  With the
>>> exception of data type declarations, what else requires going back
>>> farther than the beginning of the defun in which we start fontifying?
>> It all depends on the language.
>> E.g. in ELisp, what looks like a defun might actually be in the middle
>> of a string and there's no reliable way to know if something's in
>> a string other than to parse from BOB.
> Unless `open-paren-in-column-0-is-defun-start' is non-nil.

We can use hacks like this one, indeed, but it's not in fashion
nowadays.


        Stefan




^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-07 13:20                   ` Stefan Monnier
@ 2021-06-07 13:37                     ` Eli Zaretskii
  2021-06-08  0:06                       ` Daniel Colascione
  2021-06-08 15:16                       ` Stefan Monnier
  2021-06-07 15:58                     ` martin rudalics
  2021-06-08  4:01                     ` Richard Stallman
  2 siblings, 2 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-07 13:37 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: rudalics, dancol, emacs-devel, acm

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Eli Zaretskii <eliz@gnu.org>,  acm@muc.de,  dancol@dancol.org,
>   emacs-devel@gnu.org
> Date: Mon, 07 Jun 2021 09:20:22 -0400
> 
> > Unless `open-paren-in-column-0-is-defun-start' is non-nil.
> 
> We can use hacks like this one, indeed, but it's not in fashion
> nowadays.

Yes, we prefer waiting forever for Emacs to respond to a TAB or RET,
and are okay with "random" fontification which triggered this thread.
The price of fashion, I guess.



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-07 13:20                   ` Stefan Monnier
  2021-06-07 13:37                     ` Eli Zaretskii
@ 2021-06-07 15:58                     ` martin rudalics
  2021-06-08  4:01                     ` Richard Stallman
  2 siblings, 0 replies; 206+ messages in thread
From: martin rudalics @ 2021-06-07 15:58 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: acm, Eli Zaretskii, dancol, emacs-devel

 > We can use hacks like this one, indeed, but it's not in fashion
 > nowadays.

So we joined the Carnabetian army.

martin



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-07 13:37                     ` Eli Zaretskii
@ 2021-06-08  0:06                       ` Daniel Colascione
  2021-06-08 15:16                       ` Stefan Monnier
  1 sibling, 0 replies; 206+ messages in thread
From: Daniel Colascione @ 2021-06-08  0:06 UTC (permalink / raw)
  To: Eli Zaretskii, Stefan Monnier; +Cc: rudalics, emacs-devel, acm

On 6/7/21 6:37 AM, Eli Zaretskii wrote:

>> From: Stefan Monnier <monnier@iro.umontreal.ca>
>> Cc: Eli Zaretskii <eliz@gnu.org>,  acm@muc.de,  dancol@dancol.org,
>>    emacs-devel@gnu.org
>> Date: Mon, 07 Jun 2021 09:20:22 -0400
>>
>>> Unless `open-paren-in-column-0-is-defun-start' is non-nil.
>> We can use hacks like this one, indeed, but it's not in fashion
>> nowadays.
> Yes, we prefer waiting forever for Emacs to respond to a TAB or RET,
> and are okay with "random" fontification which triggered this thread.
> The price of fashion, I guess.

If on a modern machine you're waiting "forever" to syntactically scan the 
buffer from BOB, something is very wrong. There's just no reason to use 
hacks like open-paren-in-column-0-is-defun-start, especially if we can 
checkpoint parsing.




^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-07 13:20                   ` Stefan Monnier
  2021-06-07 13:37                     ` Eli Zaretskii
  2021-06-07 15:58                     ` martin rudalics
@ 2021-06-08  4:01                     ` Richard Stallman
  2021-06-08 15:29                       ` Stefan Monnier
  2 siblings, 1 reply; 206+ messages in thread
From: Richard Stallman @ 2021-06-08  4:01 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: rudalics, eliz, dancol, emacs-devel, acm

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > > Unless `open-paren-in-column-0-is-defun-start' is non-nil.

  > We can use hacks like this one, indeed, but it's not in fashion
  > nowadays.

We have to choose between imperfect options.  We can't afford to
let fashion dictate our choice.

-- 
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)





^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-07 13:37                     ` Eli Zaretskii
  2021-06-08  0:06                       ` Daniel Colascione
@ 2021-06-08 15:16                       ` Stefan Monnier
  1 sibling, 0 replies; 206+ messages in thread
From: Stefan Monnier @ 2021-06-08 15:16 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rudalics, acm, dancol, emacs-devel

>> > Unless `open-paren-in-column-0-is-defun-start' is non-nil.
>> We can use hacks like this one, indeed, but it's not in fashion
>> nowadays.
> Yes, we prefer waiting forever for Emacs to respond to a TAB or RET,

I'd be interested to hear about the cases where you think the time to
reply to RET or TAB would be sped up by
`open-paren-in-column-0-is-defun-start`.

The cases I know of where it would make a significant difference in
practice are things like:
- open a large file in a mode that uses a heavy
  `syntax-propertize-function`, such as perl-mode, and jump to the end.
- turn off font-lock-mode, do the same as above (which should be
  quick this time around), and then hit TAB (at which point you should
  see the same delay as you saw above).

So, yes, there is a performance price to pay, but in return you get
simpler ELisp code (because you don't need to implement the hacks), and
a more reliable behavior.

> and are okay with "random" fontification which triggered this thread.
> The price of fashion, I guess.

I think you're confused:
- the "random" fontification in this thread is in CC-mode, which does
  not use the approach I described (and used in syntax-propertize).
  E.g. Alan mentioned that the problematic behavior of CC-mode's
  highlighting can depend on the order in which the chunks are
  fontified, and `syntax-propertize` specifically aims to avoid
  such order-dependency [ And please don't get me wrong:
  an approach like that of `syntax-propertize` wouldn't solve the
  problematic fontification, but it would (mis)fontify the same way
  every time.  ]
- it's with `open-paren-in-column-0-is-defun-start` that we had
  occasional/random misfontification, and it's indeed to get rid of
  those that we finally changed its default value.

The performance cost is real, but AFAIK this cost gives *less random*
behavior contrary to what you state.  The whole point of the design of
`syntax-propertize` is to try and make it eas(y|ier) to get
correct&reliable behavior (at the cost of sometimes sub-optimal
performance).

AFAIK one of the reasons why Alan doesn't want to use an approach like
that of syntax-propertize in CC-mode is because his guts tell him that
it would be too inefficient for C++.


        Stefan




^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-07 12:08                 ` Eli Zaretskii
@ 2021-06-08 15:22                   ` Stefan Monnier
  2021-06-08 15:46                     ` Eli Zaretskii
  0 siblings, 1 reply; 206+ messages in thread
From: Stefan Monnier @ 2021-06-08 15:22 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: acm, dancol, emacs-devel

>> E.g. in ELisp, what looks like a defun might actually be in the middle
>> of a string and there's no reliable way to know if something's in
>> a string other than to parse from BOB.
>> In C the situation is somewhat similar but for comments.
>
> So you are saying we need that just to know where the current defun
> begins?

Not really: the dependency goes the other way around.

The real question is "given a POS determine whether it is inside
a string or a comment or neither", which we need in all kinds of
circumstances (sometimes we need a bit more info than that, of course,
but this one is the killer).

Approaches like `open-paren-in-column-0-is-defun-start` try to answer
this question without parsing from BOB by making an assumption that if
something looks like a defun, then it is neither inside a string nor
a comment.
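
Concretely, the question can be asked of `syntax-ppss` as in the sketch
below.  `my-pos-context` is a hypothetical helper written for this
example; the sample buffer has a docstring containing text that looks
like a defun in column 0:

```elisp
(defun my-pos-context (pos)
  "Return `string', `comment' or nil for POS, as seen by `syntax-ppss'."
  ;; `syntax-ppss' moves point to POS, hence the `save-excursion'.
  (let ((ppss (save-excursion (syntax-ppss pos))))
    (cond ((nth 3 ppss) 'string)     ; element 3: non-nil inside a string
          ((nth 4 ppss) 'comment)))) ; element 4: non-nil inside a comment

(with-temp-buffer
  (emacs-lisp-mode)
  (insert "(defun real ()\n  \"Doc.\n(defun fake () nil)\"\n  nil)\n")
  (goto-char (point-min))
  (search-forward "(defun fake")
  ;; The open paren of the fake defun sits in column 0, inside the docstring.
  (my-pos-context (match-beginning 0)))
```

On a recent Emacs I would expect the last form to return `string`,
because the parse state is computed from BOB (or from a cached state
derived from it) rather than guessed from the column-0 paren.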


        Stefan




^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-08  4:01                     ` Richard Stallman
@ 2021-06-08 15:29                       ` Stefan Monnier
  2021-06-08 15:52                         ` Eli Zaretskii
                                           ` (2 more replies)
  0 siblings, 3 replies; 206+ messages in thread
From: Stefan Monnier @ 2021-06-08 15:29 UTC (permalink / raw)
  To: Richard Stallman; +Cc: rudalics, eliz, acm, dancol, emacs-devel

>   > We can use hacks like this one, indeed, but it's not in fashion
>   > nowadays.
> We have to choose between imperfect options.  We can't afford to
> let fashion dictate our choice.

Oh boy, I see my use of the term "fashion" has really tipped
people's sensitivities.

All I meant is that given the increase of performance of CPUs (until the
beginning of this century) and a non-corresponding increase in file size
and complexity of language syntax, programmers nowadays prefer correct
behavior over fast behavior, since the correct behavior is fast enough
anyway to be bearable.

Given the lack of improvement in CPU performance over the last decade,
this may well change again, of course, but so far I haven't seen people
shy away from Python and IDEs, so I expect this won't happen in the
near future.


        Stefan




^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-08 15:22                   ` Stefan Monnier
@ 2021-06-08 15:46                     ` Eli Zaretskii
  0 siblings, 0 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-08 15:46 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: acm, dancol, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: acm@muc.de,  dancol@dancol.org,  emacs-devel@gnu.org
> Date: Tue, 08 Jun 2021 11:22:21 -0400
> 
> >> E.g. in ELisp, what looks like a defun might actually be in the middle
> >> of a string and there's no reliable way to know if something's in
> >> a string other than to parse from BOB.
> >> In C the situation is somewhat similar but for comments.
> >
> > So you are saying we need that just to know where the current defun
> > begins?
> 
> Not really: the dependency goes the other way around.
> 
> The real question is "given a POS determine whether it is inside
> a string or a comment or neither", which we need in all kinds of
> circumstances (sometimes we need a bit more info than that, of course,
> but this one is the killer).
> 
> Approaches like `open-paren-in-column-0-is-defun-start` try to answer
> this question without parsing from BOB by making an assumption that if
> something looks like a defun, then it is neither inside a string nor
> a comment.

Then I guess you are not describing what CC Mode does, are you?  Which
is the subject of this discussion, AFAIU.  Doesn't CC Mode go to BOB
_a_lot_, and not just to determine whether we are inside a string or a
comment?



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-08 15:29                       ` Stefan Monnier
@ 2021-06-08 15:52                         ` Eli Zaretskii
  2021-06-08 16:36                           ` Stefan Monnier
  2021-06-09  3:39                         ` Richard Stallman
  2021-06-09  8:34                         ` martin rudalics
  2 siblings, 1 reply; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-08 15:52 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: rudalics, dancol, emacs-devel, rms, acm

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: rudalics@gmx.at,  eliz@gnu.org,  acm@muc.de,  dancol@dancol.org,
>   emacs-devel@gnu.org
> Date: Tue, 08 Jun 2021 11:29:07 -0400
> 
> All I meant is that given the increase of performance of CPUs (until the
> beginning of this century) and a non-corresponding increase in file size
> and complexity of language syntax, programmers nowadays prefer correct
> behavior over fast behavior, since the correct behavior is fast enough
> anyway to be bearable.

Not in CC Mode, not IMO anyway.  But perhaps you don't consider what
CC Mode does to be "correct behavior".

And then, of course, there's a question "what is correct"?  When I see
something like

   static foo_t __attribute__((bar)) myvar;

I'm not sure I'd care if everything before "myvar" would be in the
same face and "myvar" in another face.  IOW, it isn't necessarily
important to me that fontification knows that foo_t is a type and not
a keyword.  So searching the file (and perhaps other files) for the
definition of foo_t isn't important -- for the purposes of
fontification.



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-08 15:52                         ` Eli Zaretskii
@ 2021-06-08 16:36                           ` Stefan Monnier
  2021-06-08 18:11                             ` Daniel Colascione
  2021-06-08 18:11                             ` Eli Zaretskii
  0 siblings, 2 replies; 206+ messages in thread
From: Stefan Monnier @ 2021-06-08 16:36 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rms, rudalics, acm, dancol, emacs-devel

>> All I meant is that given the increase of performance of CPUs (until the
>> beginning of this century) and a non-corresponding increase in file size
>> and complexity of language syntax, programmers nowadays prefer correct
>> behavior over fast behavior, since the correct behavior is fast enough
>> anyway to be bearable.
> Not in CC Mode, not IMO anyway.  But perhaps you don't consider what
> CC Mode does to be "correct behavior".

My comment was about using hacks like
`open-paren-in-column-0-is-defun-start` to avoid scanning from BOB in
`syntax-ppss/propertize`.

> And then, of course, there's a question "what is correct"?  When I see
> something like
>
>    static foo_t __attribute__((bar)) myvar;
>
> I'm not sure I'd care if everything before "myvar" would be in the
> same face and "myvar" in another face.  IOW, it isn't necessarily
> important to me that fontification knows that foo_t is a type and not
> a keyword.  So searching the file (and perhaps other files) for the
> definition of foo_t isn't important -- for the purposes of
> fontification.

FWIW, my `font-lock-type-face` is customized to:

    '(font-lock-type-face ((t)))

so I'll let you guess my opinion on this ;-)


        Stefan




^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-08 16:36                           ` Stefan Monnier
@ 2021-06-08 18:11                             ` Daniel Colascione
  2021-06-08 18:25                               ` Eli Zaretskii
  2021-06-08 18:11                             ` Eli Zaretskii
  1 sibling, 1 reply; 206+ messages in thread
From: Daniel Colascione @ 2021-06-08 18:11 UTC (permalink / raw)
  To: Stefan Monnier, Eli Zaretskii; +Cc: rudalics, emacs-devel, rms, acm

On 6/8/21 9:36 AM, Stefan Monnier wrote:

>>> All I meant is that given the increase of performance of CPUs (until the
>>> beginning of this century) and a non-corresponding increase in file size
>>> and complexity of language syntax, programmers nowadays prefer correct
>>> behavior over fast behavior, since the correct behavior is fast enough
>>> anyway to be bearable.
>> Not in CC Mode, not IMO anyway.  But perhaps you don't consider what
>> CC Mode does to be "correct behavior".
> My comment was about using hacks like
> `open-paren-in-column-0-is-defun-start` to avoid scanning from BOB in
> `syntax-ppss/propertize`.
>
>> And then, of course, there's a question "what is correct"?  When I see
>> something like
>>
>>     static foo_t __attribute__((bar)) myvar;
>>
>> I'm not sure I'd care if everything before "myvar" would be in the
>> same face and "myvar" in another face.  IOW, it isn't necessarily
>> important to me that fontification knows that foo_t is a type and not
>> a keyword.  So searching the file (and perhaps other files) for the
>> definition of foo_t isn't important -- for the purposes of
>> fontification.
> FWIW, my `font-lock-type-face` is customized to:
>
>      '(font-lock-type-face ((t)))
>
> so I'll let you guess my opinion on this ;-)

The whole point of fontification is to provide visual hints about the 
semantic structure of source code. If cc-mode can't do that reliably, my 
preference would be for it to not do it at all. Fontification of a 
type-using expression shouldn't change if I move the definition of that 
type from one file to another.

IMHO, we should rely on LSP to figure out what symbols are types, and if 
an LSP server isn't available, we shouldn't try to guess.







^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-08 16:36                           ` Stefan Monnier
  2021-06-08 18:11                             ` Daniel Colascione
@ 2021-06-08 18:11                             ` Eli Zaretskii
  2021-06-08 21:25                               ` Stefan Monnier
  1 sibling, 1 reply; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-08 18:11 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: rudalics, dancol, emacs-devel, rms, acm

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: rms@gnu.org,  rudalics@gmx.at,  acm@muc.de,  dancol@dancol.org,
>   emacs-devel@gnu.org
> Date: Tue, 08 Jun 2021 12:36:40 -0400
> 
> >    static foo_t __attribute__((bar)) myvar;
> >
> > I'm not sure I'd care if everything before "myvar" would be in the
> > same face and "myvar" in another face.  IOW, it isn't necessarily
> > important to me that fontification knows that foo_t is a type and not
> > a keyword.  So searching the file (and perhaps other files) for the
> > definition of foo_t isn't important -- for the purposes of
> > fontification.
> 
> FWIW, my `font-lock-type-face` is customized to:
> 
>     '(font-lock-type-face ((t)))

Does that make CC Mode bypass those scans from BOB?



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-08 18:11                             ` Daniel Colascione
@ 2021-06-08 18:25                               ` Eli Zaretskii
  2021-06-08 18:28                                 ` Daniel Colascione
  2021-06-09 18:22                                 ` Alan Mackenzie
  0 siblings, 2 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-08 18:25 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: rudalics, acm, monnier, rms, emacs-devel

> From: Daniel Colascione <dancol@dancol.org>
> Date: Tue, 8 Jun 2021 11:11:21 -0700
> Cc: rudalics@gmx.at, emacs-devel@gnu.org, rms@gnu.org, acm@muc.de
> 
> The whole point of fontification is to provide visual hints about the 
> semantic structure of source code. If cc-mode can't do that reliably, my 
> preference would be for it to not do it at all. Fontification of a 
> type-using expression shouldn't change if I move the definition of that 
> type from one file to another.

I think we agree.  Except that for me, it should also not try if it
cannot do it quickly enough, not only reliably enough.

> IMHO, we should rely on LSP to figure out what symbols are types, and if 
> a LSP isn't available, we shouldn't try to guess.

I was talking about what to do (or not to do) with our existing
regexp- and "syntax"-based fontifications.  I still remember the days
when CC Mode handled that well enough without being the snail it
frequently is now, and that was on a machine about 10 times slower
than the one I use nowadays.  The C language didn't change too much
since then, at least not the flavor I frequently edit.



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-08 18:25                               ` Eli Zaretskii
@ 2021-06-08 18:28                                 ` Daniel Colascione
  2021-06-08 18:54                                   ` Eli Zaretskii
  2021-06-09 18:22                                 ` Alan Mackenzie
  1 sibling, 1 reply; 206+ messages in thread
From: Daniel Colascione @ 2021-06-08 18:28 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rudalics, acm, monnier, rms, emacs-devel

On 6/8/21 11:25 AM, Eli Zaretskii wrote:

>> From: Daniel Colascione <dancol@dancol.org>
>> Date: Tue, 8 Jun 2021 11:11:21 -0700
>> Cc: rudalics@gmx.at, emacs-devel@gnu.org, rms@gnu.org, acm@muc.de
>>
>> The whole point of fontification is to provide visual hints about the
>> semantic structure of source code. If cc-mode can't do that reliably, my
>> preference would be for it to not do it at all. Fontification of a
>> type-using expression shouldn't change if I move the definition of that
>> type from one file to another.
> I think we agree.  Except that for me, it should also not try if it
> cannot do it quickly enough, not only reliably enough.
>
>> IMHO, we should rely on LSP to figure out what symbols are types, and if
>> a LSP isn't available, we shouldn't try to guess.
> I was talking about what to do (or not to do) with our existing
> regexp- and "syntax"-based fontifications.  I still remember the days
> when CC Mode handled that well enough without being a snail it
> frequently is now, and that was on a machine about 10 times slower
> than the one I use nowadays.  The C language didn't change too much
> since then, at least not the flavor I frequently edit.

C++ is a much more complex language and a lot more relevant for modern 
software development.




^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-08 18:28                                 ` Daniel Colascione
@ 2021-06-08 18:54                                   ` Eli Zaretskii
  0 siblings, 0 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-08 18:54 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: rudalics, acm, monnier, rms, emacs-devel

> Cc: monnier@iro.umontreal.ca, rudalics@gmx.at, emacs-devel@gnu.org,
>  rms@gnu.org, acm@muc.de
> From: Daniel Colascione <dancol@dancol.org>
> Date: Tue, 8 Jun 2021 11:28:41 -0700
> 
> > I was talking about what to do (or not to do) with our existing
> > regexp- and "syntax"-based fontifications.  I still remember the days
> > when CC Mode handled that well enough without being a snail it
> > frequently is now, and that was on a machine about 10 times slower
> > than the one I use nowadays.  The C language didn't change too much
> > since then, at least not the flavor I frequently edit.
> 
> C++ is a much more complex language and a lot more relevant for modern 
> software development.

Sure, but that doesn't justify the slowdown in C editing I have
experienced over the years.



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-08 18:11                             ` Eli Zaretskii
@ 2021-06-08 21:25                               ` Stefan Monnier
  0 siblings, 0 replies; 206+ messages in thread
From: Stefan Monnier @ 2021-06-08 21:25 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rms, rudalics, acm, dancol, emacs-devel

>> FWIW, my `font-lock-type-face` is customized to:
>> 
>>     '(font-lock-type-face ((t)))
>
> Does that make CC Mode bypass those scans from BOB?

It doesn't affect CC-mode, of course.  But I don't know which scans
you're referring to, since CC-mode does not perform many scans from
BOB, AFAIK.


        Stefan




^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-08 15:29                       ` Stefan Monnier
  2021-06-08 15:52                         ` Eli Zaretskii
@ 2021-06-09  3:39                         ` Richard Stallman
  2021-06-09  8:34                         ` martin rudalics
  2 siblings, 0 replies; 206+ messages in thread
From: Richard Stallman @ 2021-06-09  3:39 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: rudalics, eliz, dancol, emacs-devel, acm

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > All I meant is that given the increase of performance of CPUs (until the
  > beginning of this century) and a non-corresponding increase in file size
  > and complexity of language syntax, programmers nowadays prefer correct
  > behavior over fast behavior, since the correct behavior is fast enough
  > anyway to be bearable.

That makes sense, in general.  But the alternatives available to us
may not give us a good way to adjust that tradeoff.

-- 
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)





^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-08 15:29                       ` Stefan Monnier
  2021-06-08 15:52                         ` Eli Zaretskii
  2021-06-09  3:39                         ` Richard Stallman
@ 2021-06-09  8:34                         ` martin rudalics
  2021-06-09 13:14                           ` `open-paren-in-column-0-is-defun-start` (was: cc-mode fontification feels random) Stefan Monnier
  2021-06-12 17:29                           ` cc-mode fontification feels random João Távora
  2 siblings, 2 replies; 206+ messages in thread
From: martin rudalics @ 2021-06-09  8:34 UTC (permalink / raw)
  To: Stefan Monnier, Richard Stallman; +Cc: acm, eliz, dancol, emacs-devel

 > Oh boy, I see my use of the term "fashion" has really tipped
 > people's sensitivities.

It was rather the use of the idiom "We can use hacks like this one".  I
see `open-paren-in-column-0-is-defun-start' as a way to subdivide code
into chunks that may be edited and processed independently.  Currently,
we use a monolithic approach (one that works on the whole buffer from
its beginning) for fontification and a chunk-wise approach (as in the
default `beginning-of-defun') for editing proper.

I do not like, for example, that inserting a quotation mark somewhere
into a Lisp buffer repaints, after some delay, the entire rest of the
buffer, only to undo that when I insert the closing quotation mark.
Maybe these are bad editing habits but I won't change them any more.  So
for me `open-paren-in-column-0-is-defun-start' is not a hack but an
entire philosophy which, unfortunately, doesn't work with fontification.

martin



^ permalink raw reply	[flat|nested] 206+ messages in thread

* `open-paren-in-column-0-is-defun-start` (was: cc-mode fontification feels random)
  2021-06-09  8:34                         ` martin rudalics
@ 2021-06-09 13:14                           ` Stefan Monnier
  2021-06-09 15:15                             ` Yuri Khan
  2021-06-12 17:29                           ` cc-mode fontification feels random João Távora
  1 sibling, 1 reply; 206+ messages in thread
From: Stefan Monnier @ 2021-06-09 13:14 UTC (permalink / raw)
  To: martin rudalics; +Cc: Richard Stallman, eliz, acm, dancol, emacs-devel

> It was rather the use of the idiom "We can use hacks like this one".  I
> see `open-paren-in-column-0-is-defun-start' as a way to subdivide code
> into chunks that may be edited and processed independently.  Currently,
> we use a monolithic approach (one that works on the whole buffer from
> its beginning) for fontification and a chunk-wise approach (as in the
> default `beginning-of-defun') for editing proper.

I see two problems with `open-paren-in-column-0-is-defun-start` (opic0ids):

- The implementation was a lot simpler than what's needed for your
  notion of "chunk-wise editing", thus leading to somewhat arbitrary
  behaviors because we only used the opic0ids property when it was
  convenient, rather than using it at every place where it could change
  the behavior.

- This convention is imposed on top of the definition of the language,
  so it's like editing "C with the opic0ids convention" rather than
  editing "C".  This works fine if your file is indeed written in "C
  with the opic0ids convention", but not so well otherwise.  And that
  convention is specific to Emacs (I can imagine other editors
  supporting a similar convention, but most likely it won't be exactly
  the same one since it's not a widely known convention), so unless all
  the coders agree to use Emacs you'll probably want to enforce that
  convention via some kind of "sanity check" maybe running in a CI.

- I don't think a major mode for language Foo should default to
  assuming that the buffer is written in "Foo with the opic0ids
  convention".
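
A minimal sketch of opting out of that assumption for a single mode.  The
variable is the real one discussed above; `foo-mode-hook` is a
hypothetical hook name standing in for a mode for language Foo:

```elisp
;; `foo-mode-hook' is hypothetical; substitute the hook of the mode in use.
(add-hook 'foo-mode-hook
          (lambda ()
            ;; Do not assume that an open paren in column 0 starts a
            ;; defun in this buffer.
            (setq-local open-paren-in-column-0-is-defun-start nil)))
```

Making the setting buffer-local keeps the convention available to
buffers that really are written with it.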


        Stefan




^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: `open-paren-in-column-0-is-defun-start` (was: cc-mode fontification feels random)
  2021-06-09 13:14                           ` `open-paren-in-column-0-is-defun-start` (was: cc-mode fontification feels random) Stefan Monnier
@ 2021-06-09 15:15                             ` Yuri Khan
  2021-06-09 15:16                               ` Yuri Khan
  0 siblings, 1 reply; 206+ messages in thread
From: Yuri Khan @ 2021-06-09 15:15 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: Richard Stallman, Emacs developers, martin rudalics,
	Alan Mackenzie, Eli Zaretskii, Daniel Colascione

On Wed, 9 Jun 2021 at 20:16, Stefan Monnier <monnier@iro.umontreal.ca> wrote:

> I see two problems with `open-paren-in-column-0-is-defun-start` (opic0ids):
[…]
> - this convention is imposed on top of the definition of the language,
>   so it's like editing "C with the opic0ids convention" rather than
>   editing "C".  This works fine if your file is indeed written in "C
>   with the opic0ids convention", but not so well otherwise.  And that
>   convention is specific to Emacs

The convention of not indenting lines that start a function, or, at
least, an important landmark in the code, is also supported by ‘diff
--show-c-function’ and ‘git show-function’.



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: `open-paren-in-column-0-is-defun-start` (was: cc-mode fontification feels random)
  2021-06-09 15:15                             ` Yuri Khan
@ 2021-06-09 15:16                               ` Yuri Khan
  0 siblings, 0 replies; 206+ messages in thread
From: Yuri Khan @ 2021-06-09 15:16 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: Richard Stallman, Emacs developers, martin rudalics,
	Alan Mackenzie, Eli Zaretskii, Daniel Colascione

On Wed, 9 Jun 2021 at 22:15, Yuri Khan <yuri.v.khan@gmail.com> wrote:

> The convention of not indenting lines that start a function, or, at
> least, an important landmark in the code, is also supported by ‘diff
> --show-c-function’ and ‘git show-function’.

‘git grep --show-function’ I meant, of course.



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-08 18:25                               ` Eli Zaretskii
  2021-06-08 18:28                                 ` Daniel Colascione
@ 2021-06-09 18:22                                 ` Alan Mackenzie
  2021-06-09 18:36                                   ` Eli Zaretskii
  2021-06-09 19:05                                   ` Daniel Colascione
  1 sibling, 2 replies; 206+ messages in thread
From: Alan Mackenzie @ 2021-06-09 18:22 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rudalics, Daniel Colascione, monnier, rms, emacs-devel

Hello, Eli.

On Tue, Jun 08, 2021 at 21:25:49 +0300, Eli Zaretskii wrote:
> > From: Daniel Colascione <dancol@dancol.org>
> > Date: Tue, 8 Jun 2021 11:11:21 -0700
> > Cc: rudalics@gmx.at, emacs-devel@gnu.org, rms@gnu.org, acm@muc.de

> > The whole point of fontification is to provide visual hints about
> > the semantic structure of source code. If cc-mode can't do that
> > reliably, my preference would be for it to not do it at all.
> > Fontification of a type-using expression shouldn't change if I move
> > the definition of that type from one file to another.

> I think we agree.  Except that for me, it should also not try if it
> cannot do it quickly enough, not only reliably enough.

Quickly enough and reliably enough are both desirable, but they compete
with each other.  Reliably enough is a lot easier to measure; quickly
enough depends on the machine, the degree of optimisation, and above
all, the user's expectations.

> > IMHO, we should rely on LSP to figure out what symbols are types, and if 
> > a LSP isn't available, we shouldn't try to guess.

"Shouldn't try to guess" means taking a great deal of
font-lock-type-faces out of CC Mode.  I don't honestly think the end
result would be any better than what we have at the moment.

> I was talking about what to do (or not to do) with our existing
> regexp- and "syntax"-based fontifications.  I still remember the days
> when CC Mode handled that well enough without being a snail it
> frequently is now, and that was on a machine about 10 times slower
> than the one I use nowadays.

Those old versions had masses of fontification bugs in them.  People
wrote bug reports about them and they got fixed.  Those fixes frequently
involved a loss of speed.  :-(

There have also been several bug reports about unusual buffers getting
fontified at the speed of continental drift, and fixing those has
usually led to a little slowdown for ordinary buffers.  I'm thinking,
for example, about bug #25706, where a 4 MB file took nearly an hour to
scroll through on my machine.  After the fix, it took around 86 seconds.

> The C language didn't change too much since then, at least not the
> flavor I frequently edit.

There are two places where CC Mode can be slow: font locking large areas
of text, and keeping up with somebody typing quickly.  Which of these
bothers you the most?  I have plans for speeding up one of these.

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 18:22                                 ` Alan Mackenzie
@ 2021-06-09 18:36                                   ` Eli Zaretskii
  2021-06-09 18:51                                     ` Daniel Colascione
  2021-06-09 21:03                                     ` Alan Mackenzie
  2021-06-09 19:05                                   ` Daniel Colascione
  1 sibling, 2 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-09 18:36 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: rudalics, dancol, monnier, rms, emacs-devel

> Date: Wed, 9 Jun 2021 18:22:57 +0000
> Cc: Daniel Colascione <dancol@dancol.org>, monnier@iro.umontreal.ca,
>   rudalics@gmx.at, emacs-devel@gnu.org, rms@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> > I think we agree.  Except that for me, it should also not try if it
> > cannot do it quickly enough, not only reliably enough.
> 
> Quickly and reliably enough are desirable things, but in competition
> with each other.  Reliably enough is a lot easier to measure, quickly
> enough depends on the machine, the degree of optimisation, and above
> all, the user's expectations.

That's why we had (and still have) font-lock-maximum-decoration: so
that users could control the tradeoff.  Unfortunately, support for
that variable is all but absent nowadays, because of the widespread
mistaken assumption that font-lock is fast enough in all modes.
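
As a minimal sketch of that tradeoff, the variable can be set per mode.
The levels below are illustrative assumptions, and a mode that does not
define several decoration levels will simply ignore them:

```elisp
;; Illustrative values only: level 1 is the lightest decoration and t
;; means the maximum available.
(setq font-lock-maximum-decoration
      '((c-mode . 1)      ; minimal fontification in C buffers
        (c++-mode . 2)    ; a middle level in C++ buffers
        (t . t)))         ; maximum decoration everywhere else
```

The value is consulted when Font Lock is set up in a buffer, so it
should be in place before the file is visited.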

> > > IMHO, we should rely on LSP to figure out what symbols are types, and if 
> > > a LSP isn't available, we shouldn't try to guess.
> 
> "Shouldn't try to guess" means taking a great deal of
> font-lock-type-faces out of CC Mode.  I don't honestly think the end
> result would be any better than what we have at the moment.

You don't think it will be better for what reason?

> > I was talking about what to do (or not to do) with our existing
> > regexp- and "syntax"-based fontifications.  I still remember the days
> > when CC Mode handled that well enough without being a snail it
> > frequently is now, and that was on a machine about 10 times slower
> > than the one I use nowadays.
> 
> Those old versions had masses of fontification bugs in them.

I don't remember bumping into those bugs.  Or maybe they were not
important enough to affect my UX.  Slow redisplay, by contrast, hits
me _every_day_, especially if I need to work with an unoptimized
build.  From where I stand, the balance between performance and
accuracy has shifted for the worse, unfortunately.

> People wrote bug reports about them and they got fixed.  Those fixes
> frequently involved a loss of speed.  :-(

If there's no way of fixing a bug without adversely affecting speed,
we should add user options to control those "fixes", so that people
could choose the balance that fits them.  Sometimes Emacs could itself
decide whether to invoke the "slow" code.  For example, it makes no
sense for users of C to be "punished" because we want more accurate
fontification of C++ sources.

> There have also been several bug reports about unusual buffers getting
> fontified at the speed of continental drift, and fixing those has
> usually led to a little slowdown for ordinary buffers.  I'm thinking,
> for example, about bug #25706, where a 4 MB file took nearly an hour to
> scroll through on my machine.  After the fix, it took around 86 seconds.

Once again, a pathological use case should not punish the usual ones;
if the punishment is too harsh, there should be a way to disable the
support for pathological cases for those who never hit them.

> > The C language didn't change too much since then, at least not the
> > flavor I frequently edit.
> 
> There are two places where CC Mode can be slow: font locking large areas
> of text, and keeping up with somebody typing quickly.  Which of these
> bothers you the most?  I have plans for speeding up one of these.

Both, I guess.  Though the former is probably more prominent, since
I'm not really such a fast typist, but I do happen to scroll through
source quite a lot.



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 18:36                                   ` Eli Zaretskii
@ 2021-06-09 18:51                                     ` Daniel Colascione
  2021-06-09 19:04                                       ` Eli Zaretskii
                                                         ` (2 more replies)
  2021-06-09 21:03                                     ` Alan Mackenzie
  1 sibling, 3 replies; 206+ messages in thread
From: Daniel Colascione @ 2021-06-09 18:51 UTC (permalink / raw)
  To: Eli Zaretskii, Alan Mackenzie; +Cc: rudalics, monnier, rms, emacs-devel



On June 9, 2021 11:37:17 AM Eli Zaretskii <eliz@gnu.org> wrote:

>> Date: Wed, 9 Jun 2021 18:22:57 +0000
>> Cc: Daniel Colascione <dancol@dancol.org>, monnier@iro.umontreal.ca,
>> rudalics@gmx.at, emacs-devel@gnu.org, rms@gnu.org
>> From: Alan Mackenzie <acm@muc.de>
>>
>>> I think we agree.  Except that for me, it should also not try if it
>>> cannot do it quickly enough, not only reliably enough.
>>
>> Quickly and reliably enough are desirable things, but in competition
>> with each other.  Reliably enough is a lot easier to measure, quickly
>> enough depends on the machine, the degree of optimisation, and above
>> all, the user's expectations.
>
> That's why we had (and still have) font-lock-maximum-decoration: so
> that users could control the tradeoff.  Unfortunately, support for
> that variable is all but absent nowadays, because of the widespread
> mistaken assumption that font-lock is fast enough in all modes.

It should be fast enough for all modes. This isn't 1985. Computers in 
general are *several orders* of magnitude faster than needed to do real 
time syntax highlighting in general. Other editors don't seem to struggle.  
Tree sitter is very fast. If regular editing is stuttering because of 
fontification, we have bad data structures, algorithms, or architectures 
--- that is, bugs. And we shouldn't add user options to paper over bugs. 
That's ridiculous. I can't believe we really want to propose a "please make 
syntax highlighting wrong" user option.





^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 18:51                                     ` Daniel Colascione
@ 2021-06-09 19:04                                       ` Eli Zaretskii
  2021-06-09 20:07                                       ` chad
  2021-06-09 20:17                                       ` Dmitry Gutov
  2 siblings, 0 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-09 19:04 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: acm, emacs-devel, monnier, rms, rudalics

> From: Daniel Colascione <dancol@dancol.org>
> CC: <monnier@iro.umontreal.ca>, <rudalics@gmx.at>, <emacs-devel@gnu.org>, <rms@gnu.org>
> Date: Wed, 09 Jun 2021 11:51:28 -0700
> 
> > That's why we had (and still have) font-lock-maximum-decoration: so
> > that users could control the tradeoff.  Unfortunately, support for
> > that variable is all but absent nowadays, because of the widespread
> > mistaken assumption that font-lock is fast enough in all modes.
> 
> It should be fast enough for all modes. This isn't 1985. Computers in 
> general are *several orders* of magnitude faster than needed to do real 
> time syntax highlighting in general.

I'm all for speeding it up, but the fact is, it isn't always fast
enough, especially in large files/buffers.  As long as it isn't fast
enough, that variable has its place, IMO.

> Other editors don't seem to struggle.  

Do you happen to know why?  Maybe we could use some of the ideas.

> Tree sitter is very fast.

But we don't use it.  I hope we will some day.

> If regular editing is stuttering because of 
> fontification, we have bad data structures, algorithms, or architectures 
> --- that is, bugs. And we shouldn't add user options to paper over bugs. 

I disagree.  These aren't "normal" bugs, these are design bugs, or
maybe even limitations of the methods we use for fontifications.  Such
issues sometimes take time to replace with better ones, and in the
meantime we need to provide reasonably responsive editing.

> That's ridiculous. I can't believe we really want to propose a "please make 
> syntax highlighting wrong" user option.

Not "wrong", just "less granular".  There's no single "right" here.



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 18:22                                 ` Alan Mackenzie
  2021-06-09 18:36                                   ` Eli Zaretskii
@ 2021-06-09 19:05                                   ` Daniel Colascione
  2021-06-09 19:11                                     ` Eli Zaretskii
  2021-06-09 20:20                                     ` Alan Mackenzie
  1 sibling, 2 replies; 206+ messages in thread
From: Daniel Colascione @ 2021-06-09 19:05 UTC (permalink / raw)
  To: Alan Mackenzie, Eli Zaretskii; +Cc: rudalics, monnier, rms, emacs-devel



On June 9, 2021 11:23:04 AM Alan Mackenzie <acm@muc.de> wrote:

> Hello, Eli.
>
> On Tue, Jun 08, 2021 at 21:25:49 +0300, Eli Zaretskii wrote:
>>> From: Daniel Colascione <dancol@dancol.org>
>>> Date: Tue, 8 Jun 2021 11:11:21 -0700
>>> Cc: rudalics@gmx.at, emacs-devel@gnu.org, rms@gnu.org, acm@muc.de
>
>>> The whole point of fontification is to provide visual hints about
>>> the semantic structure of source code. If cc-mode can't do that
>>> reliably, my preference would be for it to not do it at all.
>>> Fontification of a type-using expression shouldn't change if I move
>>> the definition of that type from one file to another.
>
>> I think we agree.  Except that for me, it should also not try if it
>> cannot do it quickly enough, not only reliably enough.
>
> Quickly and reliably enough are desirable things, but in competition
> with each other.  Reliably enough is a lot easier to measure, quickly
> enough depends on the machine, the degree of optimisation, and above
> all, the user's expectations.
>
>>> IMHO, we should rely on LSP to figure out what symbols are types, and if
>>> a LSP isn't available, we shouldn't try to guess.
>
> "Shouldn't try to guess" means taking a great deal of
> font-lock-type-faces out of CC Mode.  I don't honestly think the end
> result would be any better than what we have at the moment.


I think it would be better in fact. The whole point of fontification is to 
provide visual clues about the function of a word in a buffer. If I can't 
rely on font lock type face actually distinguishing types from non-types, 
what's the point? If fontification isn't reliable, it's not syntax 
highlighting, but instead a kewl rainbow effect.

ISTM we can only correctly do fontification of type references with the 
help of LSP. Without LSP support, I'd rather we not try to get it right, 
sometimes get it wrong, and make font-lock-type-face unreliable.  (We can 
correctly fontify declarations and definitions I think.)





^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 19:05                                   ` Daniel Colascione
@ 2021-06-09 19:11                                     ` Eli Zaretskii
  2021-06-09 20:20                                     ` Alan Mackenzie
  1 sibling, 0 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-09 19:11 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: acm, emacs-devel, monnier, rms, rudalics

> From: Daniel Colascione <dancol@dancol.org>
> CC: <monnier@iro.umontreal.ca>, <rudalics@gmx.at>, <emacs-devel@gnu.org>, <rms@gnu.org>
> Date: Wed, 09 Jun 2021 12:05:27 -0700
> 
> ISTM we can only correctly do fontification of type references with the 
> help of LSP.

Patches are welcome to integrate LSP support, so that it could be the
main means of fontifying buffers.

> Without LSP support, I'd rather we not try to get it right,
> sometimes get it wrong, and make font-lock-type-face unreliable.
> (We can correctly fontify declarations and definitions I think.)

If we cannot do a reasonably good job in that case, then perhaps we
should indeed refrain from fontifying types.



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 18:51                                     ` Daniel Colascione
  2021-06-09 19:04                                       ` Eli Zaretskii
@ 2021-06-09 20:07                                       ` chad
  2021-06-10  6:43                                         ` Eli Zaretskii
  2021-06-09 20:17                                       ` Dmitry Gutov
  2 siblings, 1 reply; 206+ messages in thread
From: chad @ 2021-06-09 20:07 UTC (permalink / raw)
  To: Daniel Colascione
  Cc: Richard Stallman, EMACS development team, martin rudalics,
	Stefan Monnier, Alan Mackenzie, Eli Zaretskii

[-- Attachment #1: Type: text/plain, Size: 1354 bytes --]

On Wed, Jun 9, 2021 at 11:56 AM Daniel Colascione <dancol@dancol.org> wrote:

> It should be fast enough for all modes. This isn't 1985. Computers in
> general are *several orders* of magnitude faster than needed to do real
> time syntax highlighting in general. Other editors don't seem to
> struggle.
> Tree sitter is very fast. If regular editing is stuttering because of
> fontification, we have bad data structures, algorithms, or architectures
> --- that is, bugs. And we shouldn't add user options to paper over bugs.
> That's ridiculous. I can't believe we really want to propose a "please
> make
> syntax highlighting wrong" user option.
>

I'm all for keeping context in mind, and I think that part of that is Eli's
unusual circumstances: running unoptimised builds with extra checking
enabled. I don't know what his particular hardware is like, but my laptop
is a medium-spec i5 from ~4 generations back running debian inside a
lightweight VM, and I can both scroll from top to bottom of src/xdisp.c and
open the file and immediately Esc-> to the end without (being aware of?)
font-lock falling behind.

Are other people having much worse experiences than this? Is there some
other situation where Emacs developers are frequently seeing problems? I
don't do anything with C++ anymore, and I haven't bothered setting up LSP
here.

Thanks
~Chad

[-- Attachment #2: Type: text/html, Size: 1835 bytes --]

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 18:51                                     ` Daniel Colascione
  2021-06-09 19:04                                       ` Eli Zaretskii
  2021-06-09 20:07                                       ` chad
@ 2021-06-09 20:17                                       ` Dmitry Gutov
  2 siblings, 0 replies; 206+ messages in thread
From: Dmitry Gutov @ 2021-06-09 20:17 UTC (permalink / raw)
  To: Daniel Colascione, Eli Zaretskii, Alan Mackenzie
  Cc: rudalics, emacs-devel, monnier, rms

On 09.06.2021 21:51, Daniel Colascione wrote:
> And we shouldn't add user options to paper over bugs. That's ridiculous. 
> I can't believe we really want to propose a "please make syntax 
> highlighting wrong" user option.

If it's possible to add a user option to disable or enable the 
fontification of type references in CC Mode, and if its nil value would 
disable the additional parsing logic required to get that "mostly 
right", the result could make both Eli happy with increased performance, 
and you (together with a number of other users) happier with more 
predictable, yet less ambitious syntax highlighting.

And one could then optionally add TreeSitter on top of that.



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 19:05                                   ` Daniel Colascione
  2021-06-09 19:11                                     ` Eli Zaretskii
@ 2021-06-09 20:20                                     ` Alan Mackenzie
  2021-06-09 20:36                                       ` Stefan Monnier
  2021-06-10  2:21                                       ` Daniel Colascione
  1 sibling, 2 replies; 206+ messages in thread
From: Alan Mackenzie @ 2021-06-09 20:20 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: rudalics, Eli Zaretskii, monnier, rms, emacs-devel

Hello, Daniel.

On Wed, Jun 09, 2021 at 12:05:27 -0700, Daniel Colascione wrote:

> On June 9, 2021 11:23:04 AM Alan Mackenzie <acm@muc.de> wrote:

> > Hello, Eli.

> > On Tue, Jun 08, 2021 at 21:25:49 +0300, Eli Zaretskii wrote:
> >>> From: Daniel Colascione <dancol@dancol.org>
> >>> Date: Tue, 8 Jun 2021 11:11:21 -0700
> >>> Cc: rudalics@gmx.at, emacs-devel@gnu.org, rms@gnu.org, acm@muc.de

> >>> The whole point of fontification is to provide visual hints about
> >>> the semantic structure of source code. If cc-mode can't do that
> >>> reliably, my preference would be for it to not do it at all.
> >>> Fontification of a type-using expression shouldn't change if I move
> >>> the definition of that type from one file to another.

> >> I think we agree.  Except that for me, it should also not try if it
> >> cannot do it quickly enough, not only reliably enough.

> > Quickly and reliably enough are desirable things, but in competition
> > with each other.  Reliably enough is a lot easier to measure, quickly
> > enough depends on the machine, the degree of optimisation, and above
> > all, the user's expectations.

> >>> IMHO, we should rely on LSP to figure out what symbols are types, and if
> >>> an LSP isn't available, we shouldn't try to guess.

> > "Shouldn't try to guess" means taking a great deal of
> > font-lock-type-faces out of CC Mode.  I don't honestly think the end
> > result would be any better than what we have at the moment.



> I think it would be better in fact. The whole point of fontification is to 
> provide visual clues about the function of a word in a buffer.

That's one of the points.  Another point is to provide colour, thus
giving the eye some pattern to orient around.  I think its most important
function is to point out comments, thus making things like

    if (foo)
      bar (); /* comment about bar
    else
      baz (); /* comment about baz */
    
undangerous.  For that case, fine distinctions about types are
irrelevant.
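
To make the hazard concrete, here is a minimal sketch of that fragment
wrapped in a compilable function.  The names run, bar_calls and
baz_calls are hypothetical and exist only for this illustration.  The
first comment is never closed, so it extends to the terminator of the
second one and the whole else branch disappears:

```c
#include <assert.h>

/* Hypothetical counters, for illustration only.  */
int bar_calls = 0;
int baz_calls = 0;

void bar (void) { bar_calls++; }
void baz (void) { baz_calls++; }

/* The comment opened after bar () is unterminated, so the compiler
   sees only "if (foo) bar ();" in the body below.  */
void
run (int foo)
{
  if (foo)
    bar (); /* comment about bar
  else
    baz (); /* comment about baz */
}
```

Calling run (0) does nothing at all: baz is never reached.  GCC's
-Wcomment (part of -Wall) also flags the nested comment opener, but
comment fontification shows the problem while the code is being typed.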

> If I can't rely on font lock type face actually distinguishing types
> from non-types, what's the point?

Because the information about types, though imperfect, is nevertheless
highly useful.

> If fontification isn't reliable, it's not syntax highlighting, but
> instead a kewl rainbow effect.

Now you seem to be saying that either font lock has to be 100% right, or
it's wholly useless.  Is that a fair summary of your position?  If so, do
you disable font lock mode for CC Mode and other modes which can't
guarantee perfect font locking?

> ISTM we can only correctly do fontification of type references with the 
> help of LSP.

I don't think it would be sensible to try to do it otherwise.

> Without LSP support, I'd rather we not try to get it right, sometimes
> get it wrong, and make font-lock-type-face unreliable.  (We can
> correctly fontify declarations and definitions I think.)

That's a rather negative way of putting things, which is a bit indefinite
and wishy-washy.  You could instead try to specify which tokens should get
font-lock-type-face and which shouldn't, thus giving something concrete
to discuss.  I think this will be difficult to do well, and may lead to
the result which I alluded to above.

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 20:20                                     ` Alan Mackenzie
@ 2021-06-09 20:36                                       ` Stefan Monnier
  2021-06-10  7:01                                         ` Daniel Colascione
  2021-06-10  2:21                                       ` Daniel Colascione
  1 sibling, 1 reply; 206+ messages in thread
From: Stefan Monnier @ 2021-06-09 20:36 UTC (permalink / raw)
  To: Alan Mackenzie
  Cc: Daniel Colascione, Eli Zaretskii, rudalics, emacs-devel, rms

> That's a rather negative way of putting things, which is a bit indefinite
> and wishy-washy.  You could instead try to specify which tokens should get
> font-lock-type-face and which shouldn't, thus giving something concrete
> to discuss.  I think this will be difficult to do well, and may lead to
> the result which I alluded to above.

It has to be said also that C/C++ is quite unusual in that knowing which
identifier is a type is necessary for correct parsing.  If it weren't
so, we could reliably highlight types not based on their name but based
on their location in the syntax.
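
A minimal sketch of the kind of ambiguity meant here (the names T,
as_cast and as_product are made up for this illustration): the token
sequence "(T) * p" parses as a cast of a dereference or as a
multiplication, depending only on whether T currently names a type.

```c
#include <assert.h>

typedef int T;            /* At file scope, T names a type.  */

int
as_cast (void)
{
  int v = 7;
  int *p = &v;
  return (T) * p;         /* T is a type: cast of *p to int.  */
}

int
as_product (void)
{
  int T = 6;              /* Shadows the typedef: T is now a variable.  */
  int p = 7;
  return (T) * p;         /* T is a variable: 6 times 7.  */
}
```

The two return statements are textually identical, so no purely
positional rule can tell which of them contains a type.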

I think an approach like that of tree-sitter should be able (at least in
theory) to give reasonably good highlighting of types based on their
position (tho sadly not in those cases where the syntax is ambiguous).

I don't have a good intuition of how often ambiguities come into play in
real code, nor how much work would be needed to disambiguate most cases
(without relying on discovery of the corresponding type declarations).

If ambiguities are rare enough and/or easy enough to disambiguate
via some simple/local heuristic, then maybe CC-mode could try to
highlight types based on their location rather than based on
their identifiers.  This would make it more stable (not dependent on
the order in which chunks are highlighted) and maybe more reliable.
But I suspect that it's not easy to do that kind of parsing, short of
doing a full parse like tree-sitter does.


        Stefan




^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 18:36                                   ` Eli Zaretskii
  2021-06-09 18:51                                     ` Daniel Colascione
@ 2021-06-09 21:03                                     ` Alan Mackenzie
  2021-06-10  2:21                                       ` Daniel Colascione
                                                         ` (2 more replies)
  1 sibling, 3 replies; 206+ messages in thread
From: Alan Mackenzie @ 2021-06-09 21:03 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rudalics, dancol, monnier, rms, emacs-devel

Hello, Eli.

On Wed, Jun 09, 2021 at 21:36:44 +0300, Eli Zaretskii wrote:
> > Date: Wed, 9 Jun 2021 18:22:57 +0000
> > Cc: Daniel Colascione <dancol@dancol.org>, monnier@iro.umontreal.ca,
> >   rudalics@gmx.at, emacs-devel@gnu.org, rms@gnu.org
> > From: Alan Mackenzie <acm@muc.de>

> > > I think we agree.  Except that for me, it should also not try if it
> > > cannot do it quickly enough, not only reliably enough.

> > Quickly and reliably enough are desirable things, but in competition
> > with each other.  Reliably enough is a lot easier to measure, quickly
> > enough depends on the machine, the degree of optimisation, and above
> > all, the user's expectations.

> That's why we had (and still have) font-lock-maximum-decoration: so
> that users could control the tradeoff.  Unfortunately, support for
> that variable is all but absent nowadays, because of the widespread
> mistaken assumption that font-lock is fast enough in all modes.

That variable is still supported by CC Mode (with the exception of AWK
Mode, where it surely is not needed).

Another possibility would be to replace accurate auxiliary functionality
with rough and ready facilities.  In a scroll through xdisp.c, fontifying
as we go, the following three functions are taking around 30% of the
run-time:

(i) c-bs-at-toplevel-p, which determines whether or not a brace is at the
  top level.
(ii) c-determine-limit, c-determine-+ve-limit, which determine search
  limits approximately ARG non-literal characters before or after point.

By replacing these accurate functions with rough ones, the fontification
would be right most of the time, but a mess at other times (for example,
when there are big comments near point).  (i) is more important for C++
than C, but still makes a difference in C.
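
As a rough illustration of the trade-off in (ii), and emphatically not
CC Mode's actual algorithm, the sketch below contrasts a "rough" limit
with an "accurate" one.  All names (rough_limit, accurate_limit,
scan_code) are hypothetical, and only block comments are treated as
literals; strings and line comments are ignored for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Rough limit: step back N characters, counting comment text too.  */
size_t
rough_limit (size_t point, size_t n)
{
  return point > n ? point - n : 0;
}

/* Scan BUF from 0 towards END, counting characters outside block
   comments.  Stop once WANT such characters have been seen.  Store the
   count in *SEEN and return the position where the scan stopped.  */
static size_t
scan_code (const char *buf, size_t end, size_t want, size_t *seen)
{
  int in_comment = 0;
  size_t i = 0;

  assert (buf != NULL);
  *seen = 0;
  while (i < end && *seen < want)
    {
      if (!in_comment && i + 1 < end && buf[i] == '/' && buf[i + 1] == '*')
        {
          in_comment = 1;
          i += 2;
        }
      else if (in_comment && i + 1 < end && buf[i] == '*' && buf[i + 1] == '/')
        {
          in_comment = 0;
          i += 2;
        }
      else
        {
          if (!in_comment)
            ++*seen;
          i++;
        }
    }
  return i;
}

/* Accurate limit: the position before POINT such that exactly N
   non-comment characters lie between it and POINT (0 if there are
   not that many).  Needs two scans from the start of the buffer.  */
size_t
accurate_limit (const char *buf, size_t point, size_t n)
{
  size_t total, seen;

  scan_code (buf, point, (size_t) -1, &total);
  if (total <= n)
    return 0;
  return scan_code (buf, point, total - n, &seen);
}
```

With a comment just before point, the rough version stops inside the
comment while the accurate one skips over it, at the price of scanning
the text to know where the comments are.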

If we were to try this, I think a user toggle would be needed.

> > > > IMHO, we should rely on LSP to figure out what symbols are types, and if 
> > > > an LSP isn't available, we shouldn't try to guess.

> > "Shouldn't try to guess" means taking a great deal of
> > font-lock-type-faces out of CC Mode.  I don't honestly think the end
> > result would be any better than what we have at the moment.

> You don't think it will be better for what reason?

Because many users will still want at least the basic types (int, double,
unsigned long, ....) fontified, leading to the very mess Daniel would
like to avoid.   Declarations with basic types tend to be interleaved
with those using project defined types.

> > > I was talking about what to do (or not to do) with our existing
> > > regexp- and "syntax"-based fontifications.  I still remember the days
> > > when CC Mode handled that well enough without being the snail it
> > > frequently is now, and that was on a machine about 10 times slower
> > > than the one I use nowadays.

> > Those old versions had masses of fontification bugs in them.

> I don't remember bumping into those bugs.  Or maybe they were not
> important enough to affect my UX.  Slow redisplay, by contrast, hits
> me _every_day_, especially if I need to work with an unoptimized
> build.  From where I stand, the balance between performance and
> accuracy has shifted for the worse, unfortunately.

OK.  My above suggestion might give ~50% increase in fontification speed.

> > People wrote bug reports about them and they got fixed.  Those fixes
> > frequently involved a loss of speed.  :-(

> If there's no way of fixing a bug without adversely affecting speed,
> we should add user options to control those "fixes", so that people
> could choose the balance that fits them.

I think this would be a bad thing.  There are no (or very few) similar
user options in CC Mode at the moment, and an option to fix or not fix a
bug seems a strange idea, and would make the code quite a bit more
complicated.

> Sometimes Emacs could itself decide whether to invoke the "slow" code.
> For example, it makes no sense for users of C to be "punished" because
> we want more accurate fontification of C++ sources.

There is some truth in this imputation, yes.

> > There have also been several bug reports about unusual buffers
> > getting fontified at the speed of continental drift, and fixing those
> > has usually led to a little slowdown for ordinary buffers.  I'm
> > thinking, for example, about bug #25706, where a 4 MB file took
> > nearly an hour to scroll through on my machine.  After the fix, it
> > took around 86 seconds.

> Once again, a pathological use case should not punish the usual ones;
> if the punishment is too harsh, there should be a way to disable the
> support for pathological cases for those who never hit them.

The punishment is rarely too harsh for a single bug.  But a lot of 2%s,
3%s or 5%s add up over time.  If we were to outlaw a "3% fix", then many
bugs would just be unsolvable.
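
To put a number on "add up", here is a small sketch.  The function name
cumulative_slowdown and the figure of 20 fixes are assumptions chosen
for illustration, not measurements from CC Mode.  It compounds a fixed
percentage cost over a number of fixes:

```c
#include <assert.h>

/* Factor by which run time grows after FIXES bug fixes, each of which
   adds PERCENT percent to the run time of the version before it.  */
double
cumulative_slowdown (int fixes, double percent)
{
  double factor = 1.0;

  for (int i = 0; i < fixes; i++)
    factor *= 1.0 + percent / 100.0;
  return factor;
}
```

Under those assumptions, cumulative_slowdown (20, 3.0) comes to roughly
1.81, i.e. twenty "3% fixes" cost about 81% in total, not 60%.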

> > > The C language didn't change too much since then, at least not the
> > > flavor I frequently edit.

> > There are two places where CC Mode can be slow: font locking large areas
> > of text, and keeping up with somebody typing quickly.  Which of these
> > bothers you the most?  I have plans for speeding up one of these.

> Both, I guess.  Though the former is probably more prominent, since
> I'm not really such a fast typist, but I do happen to scroll through
> source quite a lot.

Thanks.  I'll try to come up with speedups in the coming weeks (and
months).

Do you have fast-but-imprecise-scrolling enabled?  That can reduce the
pain.

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 20:20                                     ` Alan Mackenzie
  2021-06-09 20:36                                       ` Stefan Monnier
@ 2021-06-10  2:21                                       ` Daniel Colascione
  1 sibling, 0 replies; 206+ messages in thread
From: Daniel Colascione @ 2021-06-10  2:21 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: rudalics, Eli Zaretskii, monnier, rms, emacs-devel



On June 9, 2021 1:20:32 PM Alan Mackenzie <acm@muc.de> wrote:

> Hello, Daniel.
>
> On Wed, Jun 09, 2021 at 12:05:27 -0700, Daniel Colascione wrote:
>
>> On June 9, 2021 11:23:04 AM Alan Mackenzie <acm@muc.de> wrote:
>
>>> Hello, Eli.
>
>>> On Tue, Jun 08, 2021 at 21:25:49 +0300, Eli Zaretskii wrote:
>>>>> From: Daniel Colascione <dancol@dancol.org>
>>>>> Date: Tue, 8 Jun 2021 11:11:21 -0700
>>>>> Cc: rudalics@gmx.at, emacs-devel@gnu.org, rms@gnu.org, acm@muc.de
>
>>>>> The whole point of fontification is to provide visual hints about
>>>>> the semantic structure of source code. If cc-mode can't do that
>>>>> reliably, my preference would be for it to not do it at all.
>>>>> Fontification of a type-using expression shouldn't change if I move
>>>>> the definition of that type from one file to another.
>
>>>> I think we agree.  Except that for me, it should also not try if it
>>>> cannot do it quickly enough, not only reliably enough.
>
>>> Quickly and reliably enough are desirable things, but in competition
>>> with each other.  Reliably enough is a lot easier to measure, quickly
>>> enough depends on the machine, the degree of optimisation, and above
>>> all, the user's expectations.
>
>>>>> IMHO, we should rely on LSP to figure out what symbols are types, and if
>>>>> an LSP isn't available, we shouldn't try to guess.
>
>>> "Shouldn't try to guess" means taking a great deal of
>>> font-lock-type-faces out of CC Mode.  I don't honestly think the end
>>> result would be any better than what we have at the moment.
>
>
>
>> I think it would be better in fact. The whole point of fontification is to
>> provide visual clues about the function of a word in a buffer.
>
> That's one of the points.  Another point is to provide colour, thus
> giving the eye some pattern to orient around.  I think its most important
> function is to point out comments, thus making things like
>
>    if (foo)
>      bar (); /* comment about bar
>    else
>      baz (); /* comment about baz */
>
> undangerous.  For that case, fine distinctions about types are
> irrelevant.
>
>> If I can't rely on font lock type face actually distinguishing types
>> from non-types, what's the point?
>
> Because the information about types, though imperfect, is nevertheless
> highly useful.
>
>> If fontification isn't reliable, it's not syntax highlighting, but
>> instead a kewl rainbow effect.
>
> Now you seem to be saying that either font lock has to be 100% right, or
> it's wholly useless.  Is that a fair summary of your position?  If so, do
> you disable font lock mode for CC Mode and other modes which can't
> guarantee perfect font locking?
>
>> ISTM we can only correctly do fontification of type references with the
>> help of LSP.
>
> I don't think it would be sensible to try to do it otherwise.
>
>> Without LSP support, I'd rather we not try to get it right, sometimes
>> get it wrong, and make font-lock-type-face unreliable.  (We can
>> correctly fontify declarations and definitions I think.)
>
> That's a rather negative way of putting things, which is a bit indefinite
> and wishy-washy.  You could instead try to specify which tokens should get
> font-lock-type-face and which shouldn't, thus giving something concrete
> to discuss.  I think this will be difficult to do well, and may lead to
> the result which I alluded to above.

Sure. To be more precise: what I propose is that we not apply 
font-lock-type-face to a symbol when the only reason we think it is a type 
is that it has been entered into cc-mode's table of dynamically discovered 
types for the current buffer.


>
> --
> Alan Mackenzie (Nuremberg, Germany).






^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 21:03                                     ` Alan Mackenzie
@ 2021-06-10  2:21                                       ` Daniel Colascione
  2021-06-10  6:55                                         ` Eli Zaretskii
  2021-06-10  6:39                                       ` Eli Zaretskii
  2021-06-10 15:16                                       ` Ergus
  2 siblings, 1 reply; 206+ messages in thread
From: Daniel Colascione @ 2021-06-10  2:21 UTC (permalink / raw)
  To: Alan Mackenzie, Eli Zaretskii; +Cc: rudalics, monnier, rms, emacs-devel



On June 9, 2021 2:03:07 PM Alan Mackenzie <acm@muc.de> wrote:

> Hello, Eli.
>
> On Wed, Jun 09, 2021 at 21:36:44 +0300, Eli Zaretskii wrote:
>>> Date: Wed, 9 Jun 2021 18:22:57 +0000
>>> Cc: Daniel Colascione <dancol@dancol.org>, monnier@iro.umontreal.ca,
>>> rudalics@gmx.at, emacs-devel@gnu.org, rms@gnu.org
>>> From: Alan Mackenzie <acm@muc.de>
>
>>>> I think we agree.  Except that for me, it should also not try if it
>>>> cannot do it quickly enough, not only reliably enough.
>
>>> Quickly and reliably enough are desirable things, but in competition
>>> with each other.  Reliably enough is a lot easier to measure, quickly
>>> enough depends on the machine, the degree of optimisation, and above
>>> all, the user's expectations.
>
>> That's why we had (and still have) font-lock-maximum-decoration: so
>> that users could control the tradeoff.  Unfortunately, support for
>> that variable is all but absent nowadays, because of the widespread
>> mistaken assumption that font-lock is fast enough in all modes.
>
> That variable is still supported by CC Mode (with the exception of AWK
> Mode, where it surely is not needed).
>
> Another possibility would be to replace accurate auxiliary functionality
> with rough and ready facilities.  In a scroll through xdisp.c, fontifying
> as we go, the following three functions are taking around 30% of the
> run-time:
>
> (i) c-bs-at-toplevel-p, which determines whether or not a brace is at the
>  top level.
> (ii) c-determine-limit, c-determine-+ve-limit, which determine search
>  limits approximately ARG non-literal characters before or after point.



>
> By replacing these accurate functions with rough ones, the fontification
> would be right most of the time, but a mess at other times (for example,
> when there are big comments near point).  (i) is more important for C++
> than C, but still makes a difference in C.


Another option is adding core support to speed up these operations. I don't 
think we should be sacrificing correctness for speed.

>
>
> If we were to try this, I think a user toggle would be needed.
>
>>>>> IMHO, we should rely on LSP to figure out what symbols are types, and if
>>>>> an LSP isn't available, we shouldn't try to guess.
>
>>> "Shouldn't try to guess" means taking a great deal of
>>> font-lock-type-faces out of CC Mode.  I don't honestly think the end
>>> result would be any better than what we have at the moment.
>
>> You don't think it will be better for what reason?
>
> Because many users will still want at least the basic types (int, double,
> unsigned long, ....) fontified, leading to the very mess Daniel would
> like to avoid.   Declarations with basic types tend to be interleaved
> with those using project defined types.
>
>>>> I was talking about what to do (or not to do) with our existing
>>>> regexp- and "syntax"-based fontifications.  I still remember the days
>>>> when CC Mode handled that well enough without being the snail it
>>>> frequently is now, and that was on a machine about 10 times slower
>>>> than the one I use nowadays.
>
>>> Those old versions had masses of fontification bugs in them.
>
>> I don't remember bumping into those bugs.  Or maybe they were not
>> important enough to affect my UX.  Slow redisplay, by contrast, hits
>> me _every_day_, especially if I need to work with an unoptimized
>> build.  From where I stand, the balance between performance and
>> accuracy has shifted for the worse, unfortunately.
>
> OK.  My above suggestion might give ~50% increase in fontification speed.
>
>>> People wrote bug reports about them and they got fixed.  Those fixes
>>> frequently involved a loss of speed.  :-(
>
>> If there's no way of fixing a bug without adversely affecting speed,
>> we should add user options to control those "fixes", so that people
>> could choose the balance that fits them.
>
> I think this would be a bad thing.  There are no (or very few) similar
> user options in CC Mode at the moment, and an option to fix or not fix a
> bug seems a strange idea, and would make the code quite a bit more
> complicated.
>
>> Sometimes Emacs could itself decide whether to invoke the "slow" code.
>> For example, it makes no sense for users of C to be "punished" because
>> we want more accurate fontification of C++ sources.
>
> There is some truth in this imputation, yes.
>
>>> There have also been several bug reports about unusual buffers
>>> getting fontified at the speed of continental drift, and fixing those
>>> has usually led to a little slowdown for ordinary buffers.  I'm
>>> thinking, for example, about bug #25706, where a 4 MB file took
>>> nearly an hour to scroll through on my machine.  After the fix, it
>>> took around 86 seconds.
>
>> Once again, a pathological use case should not punish the usual ones;
>> if the punishment is too harsh, there should be a way to disable the
>> support for pathological cases for those who never hit them.
>
> The punishment is rarely too harsh for a single bug.  But a lot of 2%s,
> 3%s or 5%s add up over time.  If we were to outlaw a "3% fix", then many
> bugs would just be unsolvable.
>
>>>> The C language didn't change too much since then, at least not the
>>>> flavor I frequently edit.
>
>>> There are two places where CC Mode can be slow: font locking large areas
>>> of text, and keeping up with somebody typing quickly.  Which of these
>>> bothers you the most?  I have plans for speeding up one of these.
>
>> Both, I guess.  Though the former is probably more prominent, since
>> I'm not really such a fast typist, but I do happen to scroll through
>> source quite a lot.
>
> Thanks.  I'll try to come up with speedups in the coming weeks (and
> months).
>
> Do you have fast-but-imprecise-scrolling enabled?  That can reduce the
> pain.
>
> --
> Alan Mackenzie (Nuremberg, Germany).






^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 21:03                                     ` Alan Mackenzie
  2021-06-10  2:21                                       ` Daniel Colascione
@ 2021-06-10  6:39                                       ` Eli Zaretskii
  2021-06-10 16:46                                         ` Alan Mackenzie
  2021-06-10 15:16                                       ` Ergus
  2 siblings, 1 reply; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-10  6:39 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: rudalics, dancol, monnier, rms, emacs-devel

> Date: Wed, 9 Jun 2021 21:03:03 +0000
> Cc: dancol@dancol.org, monnier@iro.umontreal.ca, rudalics@gmx.at,
>   emacs-devel@gnu.org, rms@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> > That's why we had (and still have) font-lock-maximum-decoration: so
> > that users could control the tradeoff.  Unfortunately, support for
> > that variable is all but absent nowadays, because of the widespread
> > mistaken assumption that font-lock is fast enough in all modes.
> 
> That variable is still supported by CC Mode (with the exception of AWK
> Mode, where it surely is not needed).

Does it make a difference, performance-wise?  If not (which is what
ISTR), then that variable isn't really "supported", because supporting
it means that different values of it cause tangible differences in
performance.

> Another possibility would be to replace accurate auxiliary functionality
> with rough and ready facilities.  In a scroll through xdisp.c, fontifying
> as we go, the following three functions are taking around 30% of the
> run-time:
> 
> (i) c-bs-at-toplevel-p, which determines whether or not a brace is at the
>   top level.
> (ii) c-determine-limit, c-determine-+ve-limit, which determine search
>   limits approximately ARG non-literal characters before or after point.
> 
> By replacing these accurate functions with rough ones, the fontification
> would be right most of the time, but a mess at other times (for example,
> when there are big comments near point).  (i) is more important for C++
> than C, but still makes a difference in C.
> 
> If we were to try this, I think a user toggle would be needed.

How about making font-lock-maximum-decoration control that as well?

> > > "Shouldn't try to guess" means taking a great deal of
> > > font-lock-type-faces out of CC Mode.  I don't honestly think the end
> > > result would be any better than what we have at the moment.
> 
> > You don't think it will be better for what reason?
> 
> Because many users will still want at least the basic types (int, double,
> unsigned long, ....) fontified

I'm not sure.  Can you explain why I would care too much about the
basic types (or types in general) standing out?

> > If there's no way of fixing a bug without adversely affecting speed,
> > we should add user options to control those "fixes", so that people
> > could choose the balance that fits them.
> 
> I think this would be a bad thing.  There are no (or very few) similar
> user options in CC Mode at the moment, and an option to fix or not fix a
> bug seems a strange idea

It depends on the bug.  If the bug causes Emacs to infloop or work
very slowly, then sure, no toggle for the fix would make sense.  But I
was talking about "bugs" that cause inaccurate or incorrect
fontifications, and those are much "softer".  At least IMO such "bugs"
are tolerable if they are rare enough, especially if fixing them hurts
redisplay performance and Emacs responsiveness in general.

Don't forget that the display code invokes fontifications also when it
does internal layout calculations whose results are not immediately
shown (or even not at all).  When that happens, some command not
directly related to display could be adversely affected.  So one idea
would be to turn off these expensive parts in those cases.

> > Once again, a pathological use case should not punish the usual ones;
> > if the punishment is too harsh, there should be a way to disable the
> > support for pathological cases for those who never hit them.
> 
> The punishment is rarely too harsh for a single bug.  But a lot of 2%s,
> 3%s or 5%s add up over time.  If we were to outlaw a "3% fix", then many
> bugs would just be unsolvable.

Once again: what kind of "bugs" are those?  If they only cause
imperfect faces, I'm not sure it's unthinkable to disable them, given
some optional value of a user knob.

> Do you have fast-but-imprecise-scrolling enabled?

No.  That's a separate issue, and influences all the modes, even those
where font-lock is light-weight.



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 20:07                                       ` chad
@ 2021-06-10  6:43                                         ` Eli Zaretskii
  0 siblings, 0 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-10  6:43 UTC (permalink / raw)
  To: chad; +Cc: rms, emacs-devel, rudalics, monnier, acm, dancol

> From: chad <yandros@gmail.com>
> Date: Wed, 9 Jun 2021 13:07:17 -0700
> Cc: Richard Stallman <rms@gnu.org>,
>  EMACS development team <emacs-devel@gnu.org>,
>  martin rudalics <rudalics@gmx.at>, Stefan Monnier <monnier@iro.umontreal.ca>,
>  Alan Mackenzie <acm@muc.de>, Eli Zaretskii <eliz@gnu.org>
> 
> I'm all for keeping context in mind, and I think that part of that is Eli's unusual circumstances: running
> unoptimised builds with extra checking enabled. I don't know what his particular hardware is like, but my
> laptop is a medium-spec i5 from ~4 generations back running debian inside a lightweight VM, and I can both
> scroll from top to bottom of src/xdisp.c and open the file and immediately Esc-> to the end without (being
> aware of?) font-lock falling behind. 

Make a C file that's 10 copies of xdisp.c one after the other, and
repeat the experiment.  Then try the same with Emacs 23 to see the
regression.

My machine is a Core i7, albeit an old model of it.  But it still can
run circles around the one Richard Stallman uses, or the one Stefan
said he was using.



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10  2:21                                       ` Daniel Colascione
@ 2021-06-10  6:55                                         ` Eli Zaretskii
  2021-06-10  6:58                                           ` Daniel Colascione
  0 siblings, 1 reply; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-10  6:55 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: acm, emacs-devel, monnier, rms, rudalics

> From: Daniel Colascione <dancol@dancol.org>
> Date: Wed, 09 Jun 2021 19:21:23 -0700
> Cc: rudalics@gmx.at, monnier@iro.umontreal.ca, rms@gnu.org, emacs-devel@gnu.org
> 
> > By replacing these accurate functions with rough ones, the fontification
> > would be right most of the time, but a mess at other times (for example,
> > when there are big comments near point).  (i) is more important for C++
> > than C, but still makes a difference in C.
> 
> Another option is adding core support to speed up these operations. I don't 
> think we should be sacrificing correctness for speed.

If speeding that up is feasible, sure, that's a better alternative.
Sacrificing correctness is a kind-of retreat, justified only when a
better solution is not at hand.



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10  6:55                                         ` Eli Zaretskii
@ 2021-06-10  6:58                                           ` Daniel Colascione
  2021-06-10  7:19                                             ` Eli Zaretskii
  0 siblings, 1 reply; 206+ messages in thread
From: Daniel Colascione @ 2021-06-10  6:58 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: acm, emacs-devel, monnier, rms, rudalics



On June 9, 2021 11:55:45 PM Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Daniel Colascione <dancol@dancol.org>
>> Date: Wed, 09 Jun 2021 19:21:23 -0700
>> Cc: rudalics@gmx.at, monnier@iro.umontreal.ca, rms@gnu.org, emacs-devel@gnu.org
>>
>>> By replacing these accurate functions with rough ones, the fontification
>>> would be right most of the time, but a mess at other times (for example,
>>> when there are big comments near point).  (i) is more important for C++
>>> than C, but still makes a difference in C.
>>
>> Another option is adding core support to speed up these operations. I don't
>> think we should be sacrificing correctness for speed.
>
> If speeding that up is feasible, sure, that's a better alternative.
> Sacrificing correctness is a kind-of retreat, justified only when a
> better solution is not at hand.

Sure. But I started this thread not because cc-mode was slow, but because 
specific design choices led to inconsistent fontification. It'd be a shame 
for it to result in changes that made cc-mode even more inconsistent.





^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 20:36                                       ` Stefan Monnier
@ 2021-06-10  7:01                                         ` Daniel Colascione
  2021-06-10  7:21                                           ` Eli Zaretskii
  0 siblings, 1 reply; 206+ messages in thread
From: Daniel Colascione @ 2021-06-10  7:01 UTC (permalink / raw)
  To: Stefan Monnier, Alan Mackenzie; +Cc: rudalics, Eli Zaretskii, rms, emacs-devel



On June 9, 2021 1:36:42 PM Stefan Monnier <monnier@iro.umontreal.ca> wrote:

>> That's a rather negative way of putting things, which is a bit indefinite
>> and wishy-washy.  You could instead try to specify which tokens should get
>> font-lock-type-face and which shouldn't, thus giving something concrete
>> to discuss.  I think this will be difficult to do well, and may lead to
>> the result which I alluded to above.
>
> It has to be said also that C/C++ is quite unusual in that knowing which
> identifier is a type is necessary for correct parsing.  If it weren't
> so, we could reliably highlight types not based on their name but based
> on their location in the syntax.
>
> I think an approach like that of tree-sitter should be able (at least in
> theory) to give reasonably good highlighting of types based on their
> position (tho sadly not in those cases where the syntax is ambiguous).

The model I've had in mind for dealing with parse ambiguity is an 
incremental GLR parser generating a parse forest, pruning the forest by 
constraint solving on ad-hoc language specific constraints, then picking 
one of the remaining parse trees incrementally to fontify.





^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10  6:58                                           ` Daniel Colascione
@ 2021-06-10  7:19                                             ` Eli Zaretskii
  0 siblings, 0 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-10  7:19 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: acm, emacs-devel, monnier, rms, rudalics

> From: Daniel Colascione <dancol@dancol.org>
> CC: <acm@muc.de>, <rudalics@gmx.at>, <monnier@iro.umontreal.ca>, <rms@gnu.org>, <emacs-devel@gnu.org>
> Date: Wed, 09 Jun 2021 23:58:40 -0700
> 
> > If speeding that up is feasible, sure, that's a better alternative.
> > Sacrificing correctness is a kind-of retreat, justified only when a
> > better solution is not at hand.
> 
> Sure. But I started this thread not because cc-mode was slow, but because 
> specific design choices led to inconsistent fontification. It'd be a shame 
> for it to result in changes that made cc-mode even more inconsistent.

Yes, there are two sub-threads here, about two different aspects of CC
Mode's fontifications.  Not unheard of in our discussions ;-)

From my POV, I'd like both of these issues to be fixed at some future
time.



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10  7:01                                         ` Daniel Colascione
@ 2021-06-10  7:21                                           ` Eli Zaretskii
  0 siblings, 0 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-10  7:21 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: acm, emacs-devel, monnier, rms, rudalics

> From: Daniel Colascione <dancol@dancol.org>
> Date: Thu, 10 Jun 2021 00:01:38 -0700
> Cc: rudalics@gmx.at, Eli Zaretskii <eliz@gnu.org>, rms@gnu.org,
>  emacs-devel@gnu.org
> 
> The model I've had in mind for dealing with parse ambiguity is an 
> incremental GLR parser generating a parse forest, pruning the forest by 
> constraint solving on ad-hoc language specific constraints, then picking 
> one of the remaining parse trees incrementally to fontify.

I'm not an expert in this area: is this different from what
tree-sitter does?  If so, what are the main differences?



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 21:03                                     ` Alan Mackenzie
  2021-06-10  2:21                                       ` Daniel Colascione
  2021-06-10  6:39                                       ` Eli Zaretskii
@ 2021-06-10 15:16                                       ` Ergus
  2021-06-10 15:34                                         ` Óscar Fuentes
                                                           ` (2 more replies)
  2 siblings, 3 replies; 206+ messages in thread
From: Ergus @ 2021-06-10 15:16 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Eli Zaretskii, rudalics, dancol, monnier, rms, emacs-devel

Hi:

Sorry to bother, but just to clarify the conclusions because I lost some
messages:

1) What is finally the most desirable/long path/future feature? I mean,
finally what is preferred by the developers to support in the future?

lsp or tree-sitter?

2) Alan, some time ago there was an issue related to indentation,
where the proper fix replaced some regexps with iterative solutions. It
seems that this happens relatively often for complex solutions.

Do you think that there is some missing/needed common-use
function/API/feature that we could implement on the C side to improve
such iterative solutions?

  Maybe some vectorized "magic" functions that return pre-processed
vectors or low level data structures, and so avoid lisp loops, object
constructors, the back-and-forth overhead between lisp and C, and
stressing the GC?

3) Eli/Stefan, do you think there are any missing features in the low
level API that may simplify/improve integration with LSP or tree-sitter
in the future?

For things like font-lock/display engine I only consider to do as much
as possible in the C side to improve performance. And reduce as much as
possible interacting with the lisp side... Do you think that it may be
possible?

Best,
Ergus.

  
  

On Wed, Jun 09, 2021 at 09:03:03PM +0000, Alan Mackenzie wrote:
>Hello, Eli.
>
>On Wed, Jun 09, 2021 at 21:36:44 +0300, Eli Zaretskii wrote:
>> > Date: Wed, 9 Jun 2021 18:22:57 +0000
>> > Cc: Daniel Colascione <dancol@dancol.org>, monnier@iro.umontreal.ca,
>> >   rudalics@gmx.at, emacs-devel@gnu.org, rms@gnu.org
>> > From: Alan Mackenzie <acm@muc.de>
>
>> > > I think we agree.  Except that for me, it should also not try if it
>> > > cannot do it quickly enough, not only reliably enough.
>
>> > Quickly and reliably enough are desirable things, but in competition
>> > with each other.  Reliably enough is a lot easier to measure, quickly
>> > enough depends on the machine, the degree of optimisation, and above
>> > all, the user's expectations.
>
>> That's why we had (and still have) font-lock-maximum-decoration: so
>> that users could control the tradeoff.  Unfortunately, support for
>> that variable is all but absent nowadays, because of the widespread
>> mistaken assumption that font-lock is fast enough in all modes.
>
>That variable is still supported by CC Mode (with the exception of AWK
>Mode, where it surely is not needed).
>
>Another possibility would be to replace accurate auxiliary functionality
>with rough and ready facilities.  In a scroll through xdisp.c, fontifying
>as we go, the following three functions are taking around 30% of the
>run-time:
>
>(i) c-bs-at-toplevel-p, which determines whether or not a brace is at the
>  top level.
>(ii) c-determine-limit, c-determine-+ve-limit, which determine search
>  limits approximately ARG non-literal characters before or after point.
>
>By replacing these accurate functions with rough ones, the fontification
>would be right most of the time, but a mess at other times (for example,
>when there are big comments near point).  (i) is more important for C++
>than C, but still makes a difference in C.
>
>If we were to try this, I think a user toggle would be needed.
>
>> > > > IMHO, we should rely on LSP to figure out what symbols are types, and if
>> > > > a LSP isn't available, we shouldn't try to guess.
>
>> > "Shouldn't try to guess" means taking a great deal of
>> > font-lock-type-faces out of CC Mode.  I don't honestly think the end
>> > result would be any better than what we have at the moment.
>
>> You don't think it will be better for what reason?
>
>Because many users will still want at least the basic types (int, double,
>unsigned long, ....) fontified, leading to the very mess Daniel would
>like to avoid.   Declarations with basic types tend to be interleaved
>with those using project defined types.
>
>> > > I was talking about what to do (or not to do) with our existing
>> > > regexp- and "syntax"-based fontifications.  I still remember the days
>> > > when CC Mode handled that well enough without being a snail it
>> > > frequently is now, and that was on a machine about 10 times slower
>> > > than the one I use nowadays.
>
>> > Those old versions had masses of fontification bugs in them.
>
>> I don't remember bumping into those bugs.  Or maybe they were not
>> important enough to affect my UX.  Slow redisplay, by contrast, hits
>> me _every_day_, especially if I need to work with an unoptimized
>> build.  From where I stand, the balance between performance and
>> accuracy has shifted for the worse, unfortunately.
>
>OK.  My above suggestion might give ~50% increase in fontification speed.
>
>> > People wrote bug reports about them and they got fixed.  Those fixes
>> > frequently involved a loss of speed.  :-(
>
>> If there's no way of fixing a bug without adversely affecting speed,
>> we should add user options to control those "fixes", so that people
>> could choose the balance that fits them.
>
>I think this would be a bad thing.  There are no (or very few) similar
>user options in CC Mode at the moment, and an option to fix or not fix a
>bug seems a strange idea, and would make the code quite a bit more
>complicated.
>
>> Sometimes Emacs could itself decide whether to invoke the "slow" code.
>> For example, it makes no sense for users of C to be "punished" because
>> we want more accurate fontification of C++ sources.
>
>There is some truth in this imputation, yes.
>
>> > There have also been several bug reports about unusual buffers
>> > getting fontified at the speed of continental drift, and fixing those
>> > has usually led to a little slowdown for ordinary buffers.  I'm
>> > thinking, for example, about bug #25706, where a 4 MB file took
>> > nearly an hour to scroll through on my machine.  After the fix, it
>> > took around 86 seconds.
>
>> Once again, a pathological use case should not punish the usual ones;
>> if the punishment is too harsh, there should be a way to disable the
>> support for pathological cases for those who never hit them.
>
>The punishment is rarely too harsh for a single bug.  But a lot of 2%s,
>3%s or 5%s add up over time.  If we were to outlaw a "3% fix", then many
>bugs would just be unsolvable.
>
>> > > The C language didn't change too much since then, at least not the
>> > > flavor I frequently edit.
>
>> > There are two places where CC Mode can be slow: font locking large areas
>> > of text, and keeping up with somebody typing quickly.  Which of these
>> > bothers you the most?  I have plans for speeding up one of these.
>
>> Both, I guess.  Though the former is probably more prominent, since
>> I'm not really such a fast typist, but I do happen to scroll through
>> source quite a lot.
>
>Thanks.  I'll try to come up with speedups in the coming weeks (and
>months).
>
>Do you have fast-but-imprecise-scrolling enabled?  That can reduce the
>pain.
>
>-- 
>Alan Mackenzie (Nuremberg, Germany).
>



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 15:16                                       ` Ergus
@ 2021-06-10 15:34                                         ` Óscar Fuentes
  2021-06-10 19:06                                           ` Ergus
  2021-06-10 15:59                                         ` Jim Porter
  2021-06-10 21:02                                         ` Stefan Monnier
  2 siblings, 1 reply; 206+ messages in thread
From: Óscar Fuentes @ 2021-06-10 15:34 UTC (permalink / raw)
  To: emacs-devel

Ergus <spacibba@aol.com> writes:

> For things like font-lock/display engine I only consider to do as much
> as possible in the C side to improve performance. And reduce as much as
> possible interacting with the lisp side... Do you think that it may be
> possible?

Before going this route, we need to check if native-comp is enough of an
improvement and, if it isn't, try to improve it.




^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 15:16                                       ` Ergus
  2021-06-10 15:34                                         ` Óscar Fuentes
@ 2021-06-10 15:59                                         ` Jim Porter
  2021-06-10 21:02                                         ` Stefan Monnier
  2 siblings, 0 replies; 206+ messages in thread
From: Jim Porter @ 2021-06-10 15:59 UTC (permalink / raw)
  To: Ergus, Alan Mackenzie
  Cc: rms, emacs-devel, rudalics, monnier, Eli Zaretskii, dancol

On 6/10/2021 8:16 AM, Ergus wrote:
> 1) What is finally the most desirable/long path/future feature? I mean,
> finally what is preferred by the developers to support in the future?
> 
> lsp or tree-sitter?

Elsewhere in the thread, I and a few others discussed this briefly. The 
solution other editors use (and which I think is ideal) is to start with 
a base that does its best purely by looking at the syntax of the file, 
and then augment that with LSP. For Emacs and CC-mode, this could mean 
continuing to use the current implementation, or switching to something 
built on tree-sitter. Then on top of that, Emacs can consult LSP for 
more-accurate information. I'm not sure whether this means LSP would 
take over entirely or if it would merely augment the base-level 
syntactic highlighting. Figuring that out would probably require doing 
some experiments to see what the best solution for Emacs would look like.

One of the main benefits of continuing to have some form of (non-LSP) 
syntactic highlighting is that it works for everyone. Even if you don't 
have an LSP server installed, you may want to edit a source file in a 
particular language. Your LSP server of choice may also lack full 
semantic highlighting support (it's a pretty new feature, as I 
understand it). Having a reasonably-correct baseline that works 
everywhere is nice, and hopefully there are no plans to get rid of that.

LSP *may* also be too slow in some situations (though this is just a 
guess). For example, when editing a file over TRAMP, the LSP server runs 
on the remote side; if the network is slow, this could result in delayed 
fontification while editing, which reduces the usefulness of 
fontification. In addition, a new checkout of a large project won't have 
any cached LSP information, so analyzing the code enough to generate 
semantic highlighting may take some time. These might not actually be 
problems, but they do make me a bit skeptical about the performance of a 
purely LSP-based fontification system.

- Jim



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10  6:39                                       ` Eli Zaretskii
@ 2021-06-10 16:46                                         ` Alan Mackenzie
  2021-06-10 17:01                                           ` Eli Zaretskii
  2021-06-10 21:06                                           ` Stefan Monnier
  0 siblings, 2 replies; 206+ messages in thread
From: Alan Mackenzie @ 2021-06-10 16:46 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rudalics, dancol, monnier, rms, emacs-devel

Hello, Eli.

On Thu, Jun 10, 2021 at 09:39:06 +0300, Eli Zaretskii wrote:
> > Date: Wed, 9 Jun 2021 21:03:03 +0000
> > Cc: dancol@dancol.org, monnier@iro.umontreal.ca, rudalics@gmx.at,
> >   emacs-devel@gnu.org, rms@gnu.org
> > From: Alan Mackenzie <acm@muc.de>

> > > That's why we had (and still have) font-lock-maximum-decoration: so
> > > that users could control the tradeoff.  Unfortunately, support for
> > > that variable is all but absent nowadays, because of the widespread
> > > mistaken assumption that font-lock is fast enough in all modes.

> > That variable is still supported by CC Mode (with the exception of AWK
> > Mode, where it surely is not needed).

> Does it make a difference, performance-wise?  If not (which is what
> ISTR), then that variable isn't really "supported", because supporting
> it means that different values of it cause tangible differences in
> performance.

Yes, it does make a difference.  On my machine, the times to scroll
through xdisp.c with my favourite benchmark for
font-lock-maximum-decoration set to 3, 2, 1 are 23s, 7.5s, 5.5s.
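
A measurement of this kind can be sketched roughly as follows.  This is
only a minimal illustration, not the benchmark used for the figures
above: the command name `my-time-scroll-through-buffer' is hypothetical,
and the absolute numbers will depend on the machine, the build and the
window size.

```elisp
;; Hypothetical helper; an assumption for illustration, not the
;; benchmark that produced the timings quoted above.
(defun my-time-scroll-through-buffer ()
  "Scroll from the start of the buffer to its end and report the time taken."
  (interactive)
  (goto-char (point-min))
  (font-lock-flush)                     ; discard cached fontification
  (redisplay t)
  (let ((start (float-time)))
    (condition-case nil
        (while t
          (scroll-up)                   ; signals `end-of-buffer' at the end
          (redisplay t))                ; force display, hence fontification
      (end-of-buffer nil))
    (message "Scrolled through %s in %.2fs"
             (buffer-name) (- (float-time) start))))
```

Running it with M-x in a C buffer, once for each value of
font-lock-maximum-decoration, should give comparable relative timings.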

> > Another possibility would be to replace accurate auxiliary functionality
> > with rough and ready facilities.  In a scroll through xdisp.c, fontifying
> > as we go, the following three functions are taking around 30% of the
> > run-time:

> > (i) c-bs-at-toplevel-p, which determines whether or not a brace is at the
> >   top level.
> > (ii) c-determine-limit, c-determine-+ve-limit, which determine search
> >   limits approximately ARG non-literal characters before or after point.

> > By replacing these accurate functions with rough ones, the fontification
> > would be right most of the time, but a mess at other times (for example,
> > when there are big comments near point).  (i) is more important for C++
> > than C, but still makes a difference in C.

> > If we were to try this, I think a user toggle would be needed.

> How about making font-lock-maximum-decoration control that as well?

Maybe.  It seems, though, that f-l-max-decoration is primarily about the
degree of fontification applied, not its accuracy.

> > > > "Shouldn't try to guess" means taking a great deal of
> > > > font-lock-type-faces out of CC Mode.  I don't honestly think the end
> > > > result would be any better than what we have at the moment.

> > > You don't think it will be better for what reason?

> > Because many users will still want at least the basic types (int, double,
> > unsigned long, ....) fontified

> I'm not sure.  Can you explain why would I care too much about the
> basic types (or types in general) standing out?

Well, I care for my own personal use, because the type fontifications
help optically to separate the different parts of a function without
needing to look too hard.  The coloured bits are the variable
declarations, to a zeroth order approximation.  I suspect different users
have very different needs here.  Doesn't RMS run with font lock switched
off (or is that just a rumour)?

> > > If there's no way of fixing a bug without adversely affecting speed,
> > > we should add user options to control those "fixes", so that people
> > > could choose the balance that fits them.

> > I think this would be a bad thing.  There are no (or very few) similar
> > user options in CC Mode at the moment, and an option to fix or not fix a
> > bug seems a strange idea

> It depends on the bug.  If the bug causes Emacs to infloop or work
> very slowly, then sure, no toggle for the fix would make sense.  But I
> was talking about "bugs" that cause inaccurate or incorrect
> fontifications, and those are much "softer".  At least IMO such "bugs"
> are tolerable if they are rare enough, especially if fixing them hurts
> redisplay performance and Emacs responsiveness in general.

> Don't forget that the display code invokes fontifications also when it
> does internal layout calculations whose results are not immediately
> shown (or even not at all).  When that happens, some command not
> directly related to display could be adversely affected.  So one idea
> would be to turn off these expensive parts in those cases.

That would be difficult.  Frequently a bug fix involves extensive code
changes rather than simply a block of code one could put an `if' around.

> > > Once again, a pathological use case should not punish the usual ones;
> > > if the punishment is too harsh, there should be a way to disable the
> > > support for pathological cases for those who never hit them.

> > The punishment is rarely too harsh for a single bug.  But a lot of
> > 2%s, 3%s or 5%s add up over time.  If we were to outlaw a "3% fix",
> > then many bugs would just be unsolvable.

> Once again: what kind of "bugs" are those?

They're not of any particular kind.  Any bug fix could slow CC Mode down
marginally.  Some have been known to speed it up.

> If they only cause imperfect faces, I'm not sure it's unthinkable to
> disable them, given some optional value of a user knob.

Well, I've fixed around 550 bugs in CC Mode in the last 20 years.
Identifying and reversing a subset of these to revert the performance
would be difficult.

> > Do you have fast-but-imprecise-scrolling enabled?

> No.  That's a separate issue, and influences all the modes, even those
> where font-lock is light-weight.

You could set it buffer locally in c-mode-common-hook, for example.  It
won't solve the basic problem, but it might brighten your day up.
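
A minimal sketch of that setting, assuming it goes in the init file; the
function name `my-c-fast-scrolling' is made up for the example:

```elisp
;; Hypothetical function name; the variable and the hook are the ones
;; named in the paragraph above.
(defun my-c-fast-scrolling ()
  "Enable `fast-but-imprecise-scrolling' in the current buffer only."
  (setq-local fast-but-imprecise-scrolling t))

;; `c-mode-common-hook' runs for every CC Mode language (C, C++, Java, ...).
(add-hook 'c-mode-common-hook #'my-c-fast-scrolling)
```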

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 16:46                                         ` Alan Mackenzie
@ 2021-06-10 17:01                                           ` Eli Zaretskii
  2021-06-10 17:07                                             ` Daniel Colascione
  2021-06-10 21:06                                           ` Stefan Monnier
  1 sibling, 1 reply; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-10 17:01 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: rudalics, dancol, monnier, rms, emacs-devel

> Date: Thu, 10 Jun 2021 16:46:11 +0000
> Cc: dancol@dancol.org, monnier@iro.umontreal.ca, rudalics@gmx.at,
>   emacs-devel@gnu.org, rms@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> > > That variable is still supported by CC Mode (with the exception of AWK
> > > Mode, where it surely is not needed).
> 
> > Does it make a difference, performance-wise?  If not (which is what
> > ISTR), then that variable isn't really "supported", because supporting
> > it means that different values of it cause tangible differences in
> > performance.
> 
> Yes, it does make a difference.  On my machine, the times to scroll
> through xdisp.c with my favourite benchmark for
> font-lock-maximum-decoration set to 3, 2, 1 are 23s, 7.5s, 5.5s.

Then I suggest to set it to 2 by default.
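
Independently of what the default ends up being, a user can already pick
the level per mode.  A minimal sketch, assuming it is evaluated in the
init file before any C buffer is visited:

```elisp
;; Level 2 for C buffers only; every other mode keeps the maximum level.
;; Each element has the form (MAJOR-MODE . LEVEL); the key t is the default.
(setq font-lock-maximum-decoration
      '((c-mode . 2)
        (t . t)))
```

The value is consulted when font-lock is set up in a buffer, so buffers
that are already open need the mode to be re-enabled to pick it up.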



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 17:01                                           ` Eli Zaretskii
@ 2021-06-10 17:07                                             ` Daniel Colascione
  2021-06-10 17:22                                               ` Eli Zaretskii
                                                                 ` (2 more replies)
  0 siblings, 3 replies; 206+ messages in thread
From: Daniel Colascione @ 2021-06-10 17:07 UTC (permalink / raw)
  To: Eli Zaretskii, Alan Mackenzie; +Cc: rudalics, monnier, rms, emacs-devel



On June 10, 2021 10:01:49 AM Eli Zaretskii <eliz@gnu.org> wrote:

>> Date: Thu, 10 Jun 2021 16:46:11 +0000
>> Cc: dancol@dancol.org, monnier@iro.umontreal.ca, rudalics@gmx.at,
>> emacs-devel@gnu.org, rms@gnu.org
>> From: Alan Mackenzie <acm@muc.de>
>>
>>>> That variable is still supported by CC Mode (with the exception of AWK
>>>> Mode, where it surely is not needed).
>>
>>> Does it make a difference, performance-wise?  If not (which is what
>>> ISTR), then that variable isn't really "supported", because supporting
>>> it means that different values of it cause tangible differences in
>>> performance.
>>
>> Yes, it does make a difference.  On my machine, the times to scroll
>> through xdisp.c with my favourite benchmark for
>> font-lock-maximum-decoration set to 3, 2, 1 are 23s, 7.5s, 5.5s.
>
> Then I suggest to set it to 2 by default.

Performance is reasonable most of the time. If it weren't, we'd see rampant 
complaints. Emacs should default to maximum fontification. If it doesn't, 
most users won't even know they can get more.







^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 17:07                                             ` Daniel Colascione
@ 2021-06-10 17:22                                               ` Eli Zaretskii
  2021-06-10 17:33                                                 ` Daniel Colascione
                                                                   ` (2 more replies)
  2021-06-10 17:26                                               ` Óscar Fuentes
  2021-06-10 17:39                                               ` andrés ramírez
  2 siblings, 3 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-10 17:22 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: acm, emacs-devel, monnier, rms, rudalics

> From: Daniel Colascione <dancol@dancol.org>
> Date: Thu, 10 Jun 2021 10:07:52 -0700
> Cc: rudalics@gmx.at, monnier@iro.umontreal.ca, rms@gnu.org, emacs-devel@gnu.org
> 
> > Then I suggest to set it to 2 by default.
> 
> Performance is reasonable most of the time.

Not IME.



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 17:07                                             ` Daniel Colascione
  2021-06-10 17:22                                               ` Eli Zaretskii
@ 2021-06-10 17:26                                               ` Óscar Fuentes
  2021-06-10 17:39                                               ` andrés ramírez
  2 siblings, 0 replies; 206+ messages in thread
From: Óscar Fuentes @ 2021-06-10 17:26 UTC (permalink / raw)
  To: emacs-devel

Daniel Colascione <dancol@dancol.org> writes:

>>> Yes, it does make a difference.  On my machine, the times to scroll
>>> through xdisp.c with my favourite benchmark for
>>> font-lock-maximum-decoration set to 3, 2, 1 are 23s, 7.5s, 5.5s.
>>
>> Then I suggest to set it to 2 by default.
>
> Performance is reasonable most of the time. If it weren't, we'd see
> rampant complaints. Emacs should default to maximum fontification. If
> it doesn't, most users won't even know they can get more.

Yes.

And it is remarkable that a thread about incorrect fontification could
yield a change on the defaults that guarantees even more incorrect
fontification :-)




^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 17:22                                               ` Eli Zaretskii
@ 2021-06-10 17:33                                                 ` Daniel Colascione
  2021-06-10 17:39                                                   ` Eli Zaretskii
  2021-06-10 17:40                                                 ` Óscar Fuentes
  2021-06-11 16:11                                                 ` Alan Mackenzie
  2 siblings, 1 reply; 206+ messages in thread
From: Daniel Colascione @ 2021-06-10 17:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: acm, rudalics, monnier, rms, emacs-devel

On 6/10/21 10:22 AM, Eli Zaretskii wrote:

>> From: Daniel Colascione <dancol@dancol.org>
>> Date: Thu, 10 Jun 2021 10:07:52 -0700
>> Cc: rudalics@gmx.at, monnier@iro.umontreal.ca, rms@gnu.org, emacs-devel@gnu.org
>>
>>> Then I suggest to set it to 2 by default.
>> Performance is reasonable most of the time.
> Not IME.

Is it true that you run at -O0 and extra checking enabled?




^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 17:33                                                 ` Daniel Colascione
@ 2021-06-10 17:39                                                   ` Eli Zaretskii
  0 siblings, 0 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-10 17:39 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: acm, rudalics, monnier, rms, emacs-devel

> Cc: acm@muc.de, emacs-devel@gnu.org, monnier@iro.umontreal.ca, rms@gnu.org,
>  rudalics@gmx.at
> From: Daniel Colascione <dancol@dancol.org>
> Date: Thu, 10 Jun 2021 10:33:56 -0700
> 
> >> Performance is reasonable most of the time.
> > Not IME.
> 
> Is it true that you run at -O0 and extra checking enabled?

Sometimes, yes.  But mostly, no.  My long-term production sessions are
usually a released Emacs compiled with the default options, which
means -O2 and no --enable-checking.  I do use --with-wide-int, but
that incurs only a 30% slowdown.



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 17:07                                             ` Daniel Colascione
  2021-06-10 17:22                                               ` Eli Zaretskii
  2021-06-10 17:26                                               ` Óscar Fuentes
@ 2021-06-10 17:39                                               ` andrés ramírez
  2 siblings, 0 replies; 206+ messages in thread
From: andrés ramírez @ 2021-06-10 17:39 UTC (permalink / raw)
  To: Daniel Colascione
  Cc: rms, emacs-devel, rudalics, monnier, Alan Mackenzie, Eli Zaretskii

Hi. Daniel. Hi. Guys.
    >> Then I suggest to set it to 2 by default.

    Daniel> Performance is reasonable most of the time. If it weren't, we'd see rampant
    Daniel> complaints. Emacs should default to maximum fontification. If it doesn't, most users
    Daniel> won't even know they can get more.

On my SBC-opiplus2e I have it set to nil. And It is very slow. But I am
aware that people using slow devices are not the majority. That's one of
the reasons I sometimes fire up emacs23 (the speed of light emacs).

GC and timers should also add some weight to the slowness. Take into
account also that on phones this slowness costs battery life. Again, not
all people have emacs on their phones.

Best Regards








^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 17:22                                               ` Eli Zaretskii
  2021-06-10 17:33                                                 ` Daniel Colascione
@ 2021-06-10 17:40                                                 ` Óscar Fuentes
  2021-06-10 17:44                                                   ` Eli Zaretskii
  2021-06-11 16:11                                                 ` Alan Mackenzie
  2 siblings, 1 reply; 206+ messages in thread
From: Óscar Fuentes @ 2021-06-10 17:40 UTC (permalink / raw)
  To: emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Daniel Colascione <dancol@dancol.org>
>> Date: Thu, 10 Jun 2021 10:07:52 -0700
>> Cc: rudalics@gmx.at, monnier@iro.umontreal.ca, rms@gnu.org, emacs-devel@gnu.org
>> 
>> > Then I suggest to set it to 2 by default.
>> 
>> Performance is reasonable most of the time.
>
> Not IME.

But your use case is not representative, is it? Using a debug build
with checks enabled has a large impact on performance.

BTW, from time to time I use a 2011 netbook with an Atom CPU and have no
complaints while working with 30k lines-long machine-generated (read:
code-dense, comment-sparse, almost no whitespace) C++ files.




^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 17:40                                                 ` Óscar Fuentes
@ 2021-06-10 17:44                                                   ` Eli Zaretskii
  0 siblings, 0 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-10 17:44 UTC (permalink / raw)
  To: Óscar Fuentes; +Cc: emacs-devel

> From: Óscar Fuentes <ofv@wanadoo.es>
> Date: Thu, 10 Jun 2021 19:40:56 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> From: Daniel Colascione <dancol@dancol.org>
> >> Date: Thu, 10 Jun 2021 10:07:52 -0700
> >> Cc: rudalics@gmx.at, monnier@iro.umontreal.ca, rms@gnu.org, emacs-devel@gnu.org
> >> 
> >> > Then I suggest to set it to 2 by default.
> >> 
> >> Performance is reasonable most of the time.
> >
> > Not IME.
> 
> But your use case is not representative, is it? Using a debug build
> with checks enabled has a large impact on performance.

See my other message: you have an inaccurate impression about my use
cases.

And I don't really agree that debug builds are uninteresting: if they
are so slow, it means our fontification is borderline even on
relatively fast machines.

> BTW, from time to time I use a 2011 netbook with an Atom CPU and have no
> complaints while working with 30k lines-long machine-generated (read:
> code-dense, comment-sparse, almost no whitespace) C++ files.

Exactly.  So why cannot we have the same level of performance, and
need to rely on compiler optimizations even on a fast i7?



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 15:34                                         ` Óscar Fuentes
@ 2021-06-10 19:06                                           ` Ergus
  2021-06-10 19:28                                             ` Eli Zaretskii
  0 siblings, 1 reply; 206+ messages in thread
From: Ergus @ 2021-06-10 19:06 UTC (permalink / raw)
  To: Óscar Fuentes; +Cc: emacs-devel

On Thu, Jun 10, 2021 at 05:34:31PM +0200, Óscar Fuentes wrote:
>Ergus <spacibba@aol.com> writes:
>
>> For things like font-lock/display engine I only consider to do as much
>> as possible in the C side to improve performance. And reduce as much as
>> possible interacting with the lisp side... Do you think that it may be
>> possible?
>
>Before going this route, we need to check if native-comp is enough of an
>improvement and, if it isn't, try to improve it.
>
>
I work very extensively with JIT compilers and similar tools, and with
different architectures (ARM, ePIC, Intel).

In my experience, JIT compilation improves performance very
significantly compared to bytecode. For similar code, the running time
can be one or even two orders of magnitude better.

BUT

When translating from high-level languages (my experience: CPython and
Julia), it takes much more effort, optimization and time to improve the
compiler enough to reach the same order of performance as similar C
code. Just creating a low-level "intrinsic" or "binding" saves time and
relies on the many other optimizations the C compiler already has.

E.g. AoS vs SoA, vectorization, parallelization and similar
optimizations are very easy to do at the low level (or to hint the
compiler into doing). But it is extremely hard to teach a high-level
compiler to do them, basically because of the data structures and types
we use in high-level languages.

That's why it is so common in Python to use libraries like Pandas or
NumPy, and why more and more Python packages are just interfaces to C
libraries. Julia, on the other hand, provides C primitives for
everything and has primitive data types to give more hints to the
compiler... but even with that, in real code it can't compare to
Python+NumPy. Other languages like PHP have a very good compiler that
has been improved for decades, but even so, they have moved a lot of
their functionality to C code with some bindings.

Font locking is a dynamic feature that affects responsiveness and must
run in the background constantly, so it can never be too fast.
Responsiveness is a common complaint from new users coming from other
editors. Language syntaxes are also expected to become more complex
over time.

In our case, just accessing the buffer content and passing it directly
to tree-sitter in C will be almost trivial at the low level, without
type conversions or extra copies. Likewise, when we receive the output,
processing it with the JSON library we already link against will be
orders of magnitude simpler and faster than converting it to Lisp
objects, stressing the GC with temporary objects we don't really need,
and then iterating at the Lisp level.

Sadly, not all architectures are supported by libgccjit either, so a
low-level call-preprocess solution will work faster for everyone,
especially for those who still use Emacs 23 because it feels faster.




* Re: cc-mode fontification feels random
  2021-06-10 19:06                                           ` Ergus
@ 2021-06-10 19:28                                             ` Eli Zaretskii
  2021-06-10 21:56                                               ` Ergus
  0 siblings, 1 reply; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-10 19:28 UTC (permalink / raw)
  To: Ergus; +Cc: ofv, emacs-devel

> Date: Thu, 10 Jun 2021 21:06:22 +0200
> From: Ergus <spacibba@aol.com>
> Cc: emacs-devel@gnu.org
> 
> >Before going this route, we need to check if native-comp is enough of an
> >improvement and, if it isn't, try to improve it.
> >
> >
> I work very extensively with jit compilers and similar and with
> different architectures (ARM, ePIC, Intel).

Our native-compilation feature is not really JIT.  Once a Lisp file
was native-compiled, it is loaded from a file and used without any JIT
step.




* Re: cc-mode fontification feels random
  2021-06-10 15:16                                       ` Ergus
  2021-06-10 15:34                                         ` Óscar Fuentes
  2021-06-10 15:59                                         ` Jim Porter
@ 2021-06-10 21:02                                         ` Stefan Monnier
  2021-06-11 20:21                                           ` Ergus
  2 siblings, 1 reply; 206+ messages in thread
From: Stefan Monnier @ 2021-06-10 21:02 UTC (permalink / raw)
  To: Ergus; +Cc: Alan Mackenzie, Eli Zaretskii, rudalics, dancol, rms, emacs-devel

> 1) What is finally the most desirable/long path/future feature?
> I mean, finally what is preferred by the developers to support in the future?
>
> lsp or tree-sitter?
      ^^
     and


-- Stefan





* Re: cc-mode fontification feels random
  2021-06-10 16:46                                         ` Alan Mackenzie
  2021-06-10 17:01                                           ` Eli Zaretskii
@ 2021-06-10 21:06                                           ` Stefan Monnier
  2021-06-11  6:14                                             ` Eli Zaretskii
  1 sibling, 1 reply; 206+ messages in thread
From: Stefan Monnier @ 2021-06-10 21:06 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Eli Zaretskii, dancol, rudalics, emacs-devel, rms

> Well, I've fixed around 550 bugs in CC Mode in the last 20 years.
> Identifying and reversing a subset of these to revert the performance
> would be difficult.

Clearly  what would work better is to have a clear "test case" where the
performance is poor.  Then we could investigate what is the cause of
this particular problem and see how to fix this (and hopefully other
similar) circumstance.


        Stefan





* Re: cc-mode fontification feels random
  2021-06-10 19:28                                             ` Eli Zaretskii
@ 2021-06-10 21:56                                               ` Ergus
  0 siblings, 0 replies; 206+ messages in thread
From: Ergus @ 2021-06-10 21:56 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: ofv, emacs-devel

On Thu, Jun 10, 2021 at 10:28:17PM +0300, Eli Zaretskii wrote:
>> Date: Thu, 10 Jun 2021 21:06:22 +0200
>> From: Ergus <spacibba@aol.com>
>> Cc: emacs-devel@gnu.org
>>
>> >Before going this route, we need to check if native-comp is enough of an
>> >improvement and, if it isn't, try to improve it.
>> >
>> >
>> I work very extensively with jit compilers and similar and with
>> different architectures (ARM, ePIC, Intel).
>
>Our native-compilation feature is not really JIT.  Once a Lisp file
>was native-compiled, it is loaded from a file and used without any JIT
>step.
>
Yes, I know, but it relies on libgccjit. Whether the compilation is
done all at once or dynamically, it will generate similar native code
anyway.

In any case, the optimizations it can generate are very limited,
because of the limited information about types and alignment, the
complexity of the data structures, and the fact that Lisp types are
dynamic.

In his blog, Andrea actually describes some of the ideas he has for
optimizing the compiler. But that's still very limited, and it will
require a lot of work and probably some small modifications to the
Elisp syntax to make it optimal... something that will take many years
and a lot of discussion.




* Re: cc-mode fontification feels random
  2021-06-10 21:06                                           ` Stefan Monnier
@ 2021-06-11  6:14                                             ` Eli Zaretskii
  0 siblings, 0 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-11  6:14 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: acm, dancol, emacs-devel, rms, rudalics

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Eli Zaretskii <eliz@gnu.org>,  dancol@dancol.org,  rudalics@gmx.at,
>   emacs-devel@gnu.org,  rms@gnu.org
> Date: Thu, 10 Jun 2021 17:06:31 -0400
> 
> > Well, I've fixed around 550 bugs in CC Mode in the last 20 years.
> > Identifying and reversing a subset of these to revert the performance
> > would be difficult.
> 
> Clearly  what would work better is to have a clear "test case" where the
> performance is poor.  Then we could investigate what is the cause of
> this particular problem and see how to fix this (and hopefully other
> similar) circumstance.

We already have, and we already did.  (It was even mentioned in this
discussion.)




* Re: cc-mode fontification feels random
  2021-06-10 17:22                                               ` Eli Zaretskii
  2021-06-10 17:33                                                 ` Daniel Colascione
  2021-06-10 17:40                                                 ` Óscar Fuentes
@ 2021-06-11 16:11                                                 ` Alan Mackenzie
  2021-06-11 17:53                                                   ` Eli Zaretskii
  2 siblings, 1 reply; 206+ messages in thread
From: Alan Mackenzie @ 2021-06-11 16:11 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rudalics, Daniel Colascione, emacs-devel, monnier, rms

Hello, Eli.

On Thu, Jun 10, 2021 at 20:22:50 +0300, Eli Zaretskii wrote:
> > From: Daniel Colascione <dancol@dancol.org>
> > Date: Thu, 10 Jun 2021 10:07:52 -0700
> > Cc: rudalics@gmx.at, monnier@iro.umontreal.ca, rms@gnu.org, emacs-devel@gnu.org

> > > Then I suggest to set it to 2 by default.

> > Performance is reasonable most of the time.

> Not IME.

I have measured CC Mode's scrolling performance using:

(defmacro time-it (&rest forms)
  "Time the running of a sequence of forms using `float-time'.
Call like this: \"M-: (time-it (foo ...) (bar ...) ...)\"."
  `(let ((start (float-time)))
    ,@forms
    (- (float-time) start)))

together with

M-: (time-it (scroll-up-window) (sit-for 0))

on regions of text which are not yet fontified.  My window has 65 lines
of buffer text.  Starting at the middle of xdisp.c, I see the following
timings for the first few scrolls:

   0.026s, 0.025s, 0.026s, 0.078s, 0.026s, 0.027s.

That is, with the exception of the fourth timing, the scroll operation
takes a little over 1/40 second.

This is in an Emacs-28 compiled with default optimisation, on a 4
year-old first generation Ryzen machine.

For me personally, this scrolling speed, in conjunction with
fast-but-imprecise-scrolling, is acceptable.  I also accept there are
people with slower machines.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: cc-mode fontification feels random
  2021-06-11 16:11                                                 ` Alan Mackenzie
@ 2021-06-11 17:53                                                   ` Eli Zaretskii
  2021-06-11 18:02                                                     ` Daniel Colascione
  2021-06-11 18:34                                                     ` Alan Mackenzie
  0 siblings, 2 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-11 17:53 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: rudalics, dancol, emacs-devel, monnier, rms

> Date: Fri, 11 Jun 2021 16:11:19 +0000
> Cc: Daniel Colascione <dancol@dancol.org>, rudalics@gmx.at,
>   monnier@iro.umontreal.ca, rms@gnu.org, emacs-devel@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> I have measured CC Mode's scrolling performance using:
> 
> (defmacro time-it (&rest forms)
>   "Time the running of a sequence of forms using `float-time'.
> Call like this: \"M-: (time-it (foo ...) (bar ...) ...)\"."
>   `(let ((start (float-time)))
>     ,@forms
>     (- (float-time) start)))
> 
> together with
> 
> M-: (time-it (scroll-up-window) (sit-for 0))
> 
> on regions of text which are not yet fontified.  My window has 65 lines
> of buffer text.  Starting at the middle of xdisp.c, I see the following
> timings for the first few scrolls:
> 
>    0.026s, 0.025s, 0.026s, 0.078s, 0.026s, 0.027s.
> 
> That is, with the exception of the fourth timing, the scroll operation
> takes a little over 1/40 second.
> 
> This is in an Emacs-28 compiled with default optimisation, on a 4
> year-old first generation Ryzen machine.
> 
> For me personally, this scrolling speed, in conjunction with
> fast-but-imprecise-scrolling, is acceptable.  I also accept there are
> people with slower machines.

I suggest to compare these times with Emacs 23 to see how we
regressed.




* Re: cc-mode fontification feels random
  2021-06-11 17:53                                                   ` Eli Zaretskii
@ 2021-06-11 18:02                                                     ` Daniel Colascione
  2021-06-11 18:22                                                       ` Eli Zaretskii
  2021-06-11 18:42                                                       ` Stefan Monnier
  2021-06-11 18:34                                                     ` Alan Mackenzie
  1 sibling, 2 replies; 206+ messages in thread
From: Daniel Colascione @ 2021-06-11 18:02 UTC (permalink / raw)
  To: Eli Zaretskii, Alan Mackenzie; +Cc: rudalics, emacs-devel, monnier, rms

On 6/11/21 10:53 AM, Eli Zaretskii wrote:

>> Date: Fri, 11 Jun 2021 16:11:19 +0000
>> Cc: Daniel Colascione <dancol@dancol.org>, rudalics@gmx.at,
>>    monnier@iro.umontreal.ca, rms@gnu.org, emacs-devel@gnu.org
>> From: Alan Mackenzie <acm@muc.de>
>>
>> I have measured CC Mode's scrolling performance using:
>>
>> (defmacro time-it (&rest forms)
>>    "Time the running of a sequence of forms using `float-time'.
>> Call like this: \"M-: (time-it (foo ...) (bar ...) ...)\"."
>>    `(let ((start (float-time)))
>>      ,@forms
>>      (- (float-time) start)))
>>
>> together with
>>
>> M-: (time-it (scroll-up-window) (sit-for 0))
>>
>> on regions of text which are not yet fontified.  My window has 65 lines
>> of buffer text.  Starting at the middle of xdisp.c, I see the following
>> timings for the first few scrolls:
>>
>>     0.026s, 0.025s, 0.026s, 0.078s, 0.026s, 0.027s.
>>
>> That is, with the exception of the fourth timing, the scroll operation
>> takes a little over 1/40 second.
>>
>> This is in an Emacs-28 compiled with default optimisation, on a 4
>> year-old first generation Ryzen machine.
>>
>> For me personally, this scrolling speed, in conjunction with
>> fast-but-imprecise-scrolling, is acceptable.  I also accept there are
>> people with slower machines.
> I suggest to compare these times with Emacs 23 to see how we
> regressed.


Regression is acceptable in exchange for correctness so long as absolute 
performance is adequate. We're not using 80486s anymore.





* Re: cc-mode fontification feels random
  2021-06-11 18:02                                                     ` Daniel Colascione
@ 2021-06-11 18:22                                                       ` Eli Zaretskii
  2021-06-11 18:28                                                         ` Daniel Colascione
  2021-06-11 18:47                                                         ` Alan Mackenzie
  2021-06-11 18:42                                                       ` Stefan Monnier
  1 sibling, 2 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-11 18:22 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: acm, emacs-devel, monnier, rms, rudalics

> Cc: rudalics@gmx.at, monnier@iro.umontreal.ca, rms@gnu.org,
>  emacs-devel@gnu.org
> From: Daniel Colascione <dancol@dancol.org>
> Date: Fri, 11 Jun 2021 11:02:34 -0700
> 
> >>     0.026s, 0.025s, 0.026s, 0.078s, 0.026s, 0.027s.
> >>
> >> That is, with the exception of the fourth timing, the scroll operation
> >> takes a little over 1/40 second.
> >>
> >> This is in an Emacs-28 compiled with default optimisation, on a 4
> >> year-old first generation Ryzen machine.
> >>
> >> For me personally, this scrolling speed, in conjunction with
> >> fast-but-imprecise-scrolling, is acceptable.  I also accept there are
> >> people with slower machines.
> > I suggest to compare these times with Emacs 23 to see how we
> > regressed.
> 
> Regression is acceptable in exchange for correctness so long as absolute 
> performance is adequate. We're not using 80486s anymore.

Here are my times using an optimized build of Emacs 27.2 on a 3.4GHz
Core i7 box:

  0.015625
  0.03125
  0.015625
  0.046875
  0.09375
  0.0625
  0.015625
  0.03125
  0.015625
  0.03125
  0.015625
  0.03125

You consider this to be adequate performance for a single
window-scroll?  (I don't have an optimized build of Emacs 28, but
there's no reason to believe it is faster; quite the opposite.)

And here's the top part of the profile while running the above
benchmark:

  - redisplay_internal (C function)                                 159  65%
   - jit-lock-function                                              158  65%
    - jit-lock-fontify-now                                          158  65%
     - jit-lock--run-functions                                      158  65%
      - run-hook-wrapped                                            158  65%
       - #<compiled -0x1ffffffff8a67860>                            158  65%
        - font-lock-fontify-region                                  157  65%
         - c-font-lock-fontify-region                               157  65%
          - font-lock-default-fontify-region                        146  60%
           - font-lock-fontify-keywords-region                      143  59%
            - c-font-lock-declarations                               97  40%
             - c-find-decl-spots                                     97  40%
              - #<compiled -0x1ffffffff94b65d0>                      73  30%
               - c-forward-decl-or-cast-1                            38  15%
                - c-forward-type                                     22   9%
                 - c-check-qualified-type                             7   2%

We can stick our heads in the sand as much as we want, but facts are
stubborn things.
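
For anyone wanting to reproduce a profile of this kind, here is a
minimal sketch. It assumes the figures came from Emacs's built-in
profiler (the message does not say so explicitly), and
`my-with-cpu-profile' is a hypothetical helper name, not an existing
Emacs macro.

```elisp
(require 'profiler)

(defmacro my-with-cpu-profile (&rest body)
  "Run BODY under the CPU profiler, then display the report.
Hypothetical helper; assumes the profiler is not already running."
  (declare (indent 0) (debug t))
  `(progn
     (profiler-start 'cpu)
     (unwind-protect
         (progn ,@body)
       ;; Take the report before stopping, so the collected log is
       ;; still available.
       (profiler-report)
       (profiler-stop))))
```

It could be used as, for example,
M-: (my-with-cpu-profile (dotimes (_ 12) (scroll-up) (sit-for 0)))
in a C buffer whose text below point has not been fontified yet.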




* Re: cc-mode fontification feels random
  2021-06-11 18:22                                                       ` Eli Zaretskii
@ 2021-06-11 18:28                                                         ` Daniel Colascione
  2021-06-11 19:12                                                           ` Alan Mackenzie
  2021-06-11 19:23                                                           ` Eli Zaretskii
  2021-06-11 18:47                                                         ` Alan Mackenzie
  1 sibling, 2 replies; 206+ messages in thread
From: Daniel Colascione @ 2021-06-11 18:28 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: acm, emacs-devel, monnier, rms, rudalics

On 6/11/21 11:22 AM, Eli Zaretskii wrote:

>> Cc: rudalics@gmx.at, monnier@iro.umontreal.ca, rms@gnu.org,
>>   emacs-devel@gnu.org
>> From: Daniel Colascione <dancol@dancol.org>
>> Date: Fri, 11 Jun 2021 11:02:34 -0700
>>
>>>>      0.026s, 0.025s, 0.026s, 0.078s, 0.026s, 0.027s.
>>>>
>>>> That is, with the exception of the fourth timing, the scroll operation
>>>> takes a little over 1/40 second.
>>>>
>>>> This is in an Emacs-28 compiled with default optimisation, on a 4
>>>> year-old first generation Ryzen machine.
>>>>
>>>> For me personally, this scrolling speed, in conjunction with
>>>> fast-but-imprecise-scrolling, is acceptable.  I also accept there are
>>>> people with slower machines.
>>> I suggest to compare these times with Emacs 23 to see how we
>>> regressed.
>> Regression is acceptable in exchange for correctness so long as absolute
>> performance is adequate. We're not using 80486s anymore.
> Here are my times using an optimized build of Emacs 27.2 on a 3.4GHz
> Core i7 box:
>
>    0.015625
>    0.03125
>    0.015625
>    0.046875
>    0.09375
>    0.0625
>    0.015625
>    0.03125
>    0.015625
>    0.03125
>    0.015625
>    0.03125
>
> You consider this to be adequate performance for a single
> window-scroll?  (I don't have an optimized build of Emacs 28, but
> there's no reason to believe it is faster; quite the opposite.)

native-comp?

>
> And here's the top part of the profile while running the above
> benchmark:
>
>    - redisplay_internal (C function)                                 159  65%
>     - jit-lock-function                                              158  65%
>      - jit-lock-fontify-now                                          158  65%
>       - jit-lock--run-functions                                      158  65%
>        - run-hook-wrapped                                            158  65%
>         - #<compiled -0x1ffffffff8a67860>                            158  65%
>          - font-lock-fontify-region                                  157  65%
>           - c-font-lock-fontify-region                               157  65%
>            - font-lock-default-fontify-region                        146  60%
>             - font-lock-fontify-keywords-region                      143  59%
>              - c-font-lock-declarations                               97  40%
>               - c-find-decl-spots                                     97  40%
>                - #<compiled -0x1ffffffff94b65d0>                      73  30%
>                 - c-forward-decl-or-cast-1                            38  15%
>                  - c-forward-type                                     22   9%
>                   - c-check-qualified-type                             7   2%
>
> We can stick our heads in the sand as much as we want, but facts are
> stubborn things.

Hrm. That doesn't seem consistent with Alan's report that we spend a ton 
of time doing work like deciding whether a brace occurs at top-level. My 
question stands: what core facilities can we add to accelerate cc-mode's 
parsing here? There's got to be some efficiency we can gain here.





* Re: cc-mode fontification feels random
  2021-06-11 17:53                                                   ` Eli Zaretskii
  2021-06-11 18:02                                                     ` Daniel Colascione
@ 2021-06-11 18:34                                                     ` Alan Mackenzie
  1 sibling, 0 replies; 206+ messages in thread
From: Alan Mackenzie @ 2021-06-11 18:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rudalics, dancol, emacs-devel, monnier, rms

Hello, Eli.

On Fri, Jun 11, 2021 at 20:53:10 +0300, Eli Zaretskii wrote:
> > Date: Fri, 11 Jun 2021 16:11:19 +0000
> > Cc: Daniel Colascione <dancol@dancol.org>, rudalics@gmx.at,
> >   monnier@iro.umontreal.ca, rms@gnu.org, emacs-devel@gnu.org
> > From: Alan Mackenzie <acm@muc.de>

> > I have measured CC Mode's scrolling performance using:

> > (defmacro time-it (&rest forms)
> >   "Time the running of a sequence of forms using `float-time'.
> > Call like this: \"M-: (time-it (foo ...) (bar ...) ...)\"."
> >   `(let ((start (float-time)))
> >     ,@forms
> >     (- (float-time) start)))

> > together with

> > M-: (time-it (scroll-up-window) (sit-for 0))

> > on regions of text which are not yet fontified.  My window has 65 lines
> > of buffer text.  Starting at the middle of xdisp.c, I see the following
> > timings for the first few scrolls:

> >    0.026s, 0.025s, 0.026s, 0.078s, 0.026s, 0.027s.

> > That is, with the exception of the fourth timing, the scroll operation
> > takes a little over 1/40 second.

> > This is in an Emacs-28 compiled with default optimisation, on a 4
> > year-old first generation Ryzen machine.

> > For me personally, this scrolling speed, in conjunction with
> > fast-but-imprecise-scrolling, is acceptable.  I also accept there are
> > people with slower machines.

> I suggest to compare these times with Emacs 23 to see how we
> regressed.

OK, on emacs-23.3 -Q, otherwise exactly the same circumstances,  I get
these timings:

    0.0093s, 0.0089s, 0.0084s, 0.0144s, 0.0094s, 0.0084s.

So the difference is around a factor of 3, perhaps a little more.  "Half
an order of magnitude" perhaps sums it up best.
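
As a check on that arithmetic, here is a small sketch which computes
the slowdown from the two series of timings quoted above. The names
`my-emacs-28-times', `my-emacs-23-times', `my-median' and
`my-slowdown-summary' are hypothetical, introduced only for this
illustration.

```elisp
(require 'cl-lib)

;; The two series of scroll timings (in seconds) quoted above.
(defconst my-emacs-28-times '(0.026 0.025 0.026 0.078 0.026 0.027))
(defconst my-emacs-23-times '(0.0093 0.0089 0.0084 0.0144 0.0094 0.0084))

(defun my-median (numbers)
  "Return the median of the non-empty list NUMBERS."
  (let* ((sorted (sort (copy-sequence numbers) #'<))
         (n (length sorted))
         (mid (/ n 2)))
    (if (cl-oddp n)
        (nth mid sorted)
      (/ (+ (nth (1- mid) sorted) (nth mid sorted)) 2.0))))

(defun my-slowdown-summary (new old)
  "Compare the timing lists NEW and OLD, which must be equally long.
Return a plist with the ratio of the totals and the median of the
per-scroll ratios."
  (let ((ratios (cl-mapcar #'/ new old)))
    (list :total-ratio (/ (apply #'+ new) (apply #'+ old))
          :median-ratio (my-median ratios))))
```

Evaluating (my-slowdown-summary my-emacs-28-times my-emacs-23-times)
gives a ratio of totals of about 3.5 and a median per-scroll ratio a
little under 3.0. The slow fourth scroll pulls the total up, which is
why the median is probably the fairer summary here.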

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: cc-mode fontification feels random
  2021-06-11 18:02                                                     ` Daniel Colascione
  2021-06-11 18:22                                                       ` Eli Zaretskii
@ 2021-06-11 18:42                                                       ` Stefan Monnier
  2021-06-11 19:31                                                         ` Eli Zaretskii
  2021-06-11 19:48                                                         ` Eli Zaretskii
  1 sibling, 2 replies; 206+ messages in thread
From: Stefan Monnier @ 2021-06-11 18:42 UTC (permalink / raw)
  To: Daniel Colascione
  Cc: Eli Zaretskii, Alan Mackenzie, rudalics, rms, emacs-devel

>>> M-: (time-it (scroll-up-window) (sit-for 0))
>>>
>>> on regions of text which are not yet fontified.  My window has 65 lines
>>> of buffer text.  Starting at the middle of xdisp.c, I see the following
>>> timings for the first few scrolls:
>>>
>>>     0.026s, 0.025s, 0.026s, 0.078s, 0.026s, 0.027s.
>>>
>>> That is, with the exception of the fourth timing, the scroll operation
>>> takes a little over 1/40 second.

FWIW, see below my measurements using Emacs's `master` with the compile
options I happen to use (i.e. extra checks and -Og) on my almost still
new Librem mini (which was running at ~4GHz during that time, so I'd
expect it to be about twice as fast as what I'd see with most of my
other machines).

I used pretty much your above test, except I started it at BOB of
xdisp.c and used:

    M-: (dotimes (_ 700)
          (message "%S" (benchmark-elapse (scroll-up) (sit-for 0)))
          (sleep-for 0.05))

As you can see, the speed doesn't get noticeably worse as we go further
into the file (the first few screenfuls were a bit faster, but after
that it's a wash).
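
To compare runs without reading the numbers off the *Messages* buffer,
the same loop could collect the timings into a list and summarize
them. This is only a sketch: `my-scroll-timings' and
`my-timing-summary' are hypothetical names, not existing Emacs
functions, and unlike the loop above this one stops quietly when
`scroll-up' reaches the end of the buffer.

```elisp
(require 'benchmark)

(defun my-scroll-timings (count)
  "Scroll up COUNT times and return the elapsed time of each scroll.
Hypothetical helper; stops early at the end of the buffer."
  (let (times)
    (condition-case nil
        (dotimes (_ count)
          (push (benchmark-elapse (scroll-up) (sit-for 0)) times)
          (sleep-for 0.05))
      (end-of-buffer nil))
    (nreverse times)))

(defun my-timing-summary (times)
  "Return a plist with the minimum, maximum and mean of TIMES.
TIMES is a non-empty list of durations in seconds."
  (list :min (apply #'min times)
        :max (apply #'max times)
        :mean (/ (apply #'+ times) (float (length times)))))
```

For example, starting at the beginning of xdisp.c:
M-: (my-timing-summary (my-scroll-timings 700))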

Eli, do you see similar results?
Would you say that this shows the slow behaviors that bother you?

E.g. there used to be a time where I found CC-mode unusably slow in some
cases, but these were typically while editing rather than while
scrolling (i.e. even simple buffer modifications incurred delays
measured in seconds).

FWIW, I ran this same test with `sm-c-mode` (which should handle `xdisp.c`
about as well as CC-mode, but solves an easier problem since it doesn't
try to handle as much of C as CC-mode does (e.g. no support for K&R, no
highlighting of types), nor does it try to handle C++, Java, ...), and
most of the times for it are between 0.02 and 0.04.


        Stefan


0.064421217
0.053839483
0.044885893
0.043997597
0.092963631
0.042693633
0.099381721
0.128002545
0.094545212
0.163745893
0.105989807
0.055903411
0.10900201
0.111409377
0.194541712
0.153313625
0.109133194
0.245912877
0.283898722
0.322253988
0.193199986
0.118792661
0.199273251
0.130073771
0.196526213
0.115176187
0.172246163
0.126954015
0.167790303
0.109683224
0.127960426
0.171211834
0.114826649
0.114200165
0.189255059
0.116494687
0.130894337
0.096504964
0.159958961
0.109482149
0.114501549
0.108971213
0.177497445
0.108309423
0.105783162
0.168977695
0.100549815
0.103706478
0.092696073
0.10432707
0.173879201
0.175646549
0.111500935
0.118432586
0.12011896
0.172456591
0.177120433
0.104136632
0.099530361
0.127795086
0.176916521
0.138550313
0.114789114
0.17856252
0.118454075
0.117133346
0.101272965
0.099138115
0.116056804
0.126078027
0.163253319
0.127315341
0.185968183
0.124788531
0.182340263
0.130805359
0.121452585
0.125351387
0.139440851
0.107537455
0.186858364
0.120384479
0.133356353
0.131290683
0.168943064
0.136992182
0.12346563
0.112871744
0.122782849
0.106021947
0.262331903
0.118295096
0.185145874
0.118002528
0.24931801
0.104444512
0.120716476
0.167382408
0.11458813
0.125722018
0.098804093
0.179202455
0.12640851
0.174556734
0.11220414
0.109073496
0.111259698
0.11418513
0.108025927
0.122940442
0.191836234
0.113417345
0.120711433
0.172513071
0.114420954
0.106913074
0.120929181
0.112327071
0.115024723
0.112698933
0.117357841
0.171556781
0.108914295
0.122564707
0.171817164
0.124725584
0.116682097
0.110775186
0.189281251
0.123835457
0.116927855
0.122824897
0.195963401
0.127717141
0.142261624
0.209577271
0.13124328
0.105018838
0.140045227
0.117403158
0.170725313
0.11485384
0.09650258
0.110668479
0.117975569
0.113766316
0.112954986
0.174354914
0.112653174
0.127833658
0.180525967
0.108714222
0.114321764
0.181837745
0.105400609
0.116630508
0.118542553
0.110567673
0.110128366
0.118041019
0.118595549
0.115326382
0.109436946
0.115400399
0.111347021
0.18042566
0.118994131
0.115646883
0.104489214
0.130443576
0.115561413
0.11330043
0.170368134
0.119400101
0.110526952
0.114555681
0.112566447
0.115888525
0.113044462
0.188394244
0.119813541
0.126508385
0.108936934
0.188695379
0.128201612
0.097573221
0.204059025
0.122495487
0.116058655
0.201060241
0.212312514
0.190197327
0.149310586
0.260126393
0.115149485
0.125418184
0.18942006
0.118149107
0.117835293
0.172673811
0.119002821
0.126055033
0.187659521
0.185205005
0.182849187
0.116862462
0.113111974
0.12542143
0.175057205
0.126864337
0.176218651
0.105942454
0.191953882
0.109533068
0.126686414
0.12604197
0.110572607
0.169785167
0.192896603
0.124740572
0.105335305
0.181733158
0.126450975
0.193657901
0.109094171
0.121084347
0.119585141
0.18067882
0.124754366
0.121194971
0.113421604
0.199118707
0.120581752
0.123201428
0.112947635
0.199405214
0.118820273
0.194467066
0.139457159
0.122085324
0.207810103
0.127238785
0.142071442
0.135402281
0.185030134
0.117510442
0.130970326
0.203497039
0.112685073
0.123192423
0.114474405
0.117449097
0.119929349
0.178402479
0.044960255
0.184778022
0.13186773
0.113406861
0.121064466
0.121199285
0.17902352
0.126698071
0.121117545
0.120106224
0.105877834
0.122465264
0.119232435
0.122804551
0.181922471
0.108515085
0.137086941
0.183930017
0.115787167
0.11794999
0.121208862
0.1163856
0.112712585
0.125896637
0.116050806
0.122970697
0.209042021
0.114536011
0.12732074
0.11918999
0.126965367
0.114274393
0.110505228
0.124278297
0.126557099
0.139104688
0.187700593
0.148332242
0.122385495
0.119986772
0.13254469
0.11980965
0.120371393
0.118032327
0.125577788
0.116801037
0.134561984
0.123288516
0.203589458
0.133222843
0.120893941
0.115931088
0.055410411
0.189834458
0.122465816
0.113808715
0.125036054
0.130117908
0.118056582
0.122033541
0.116559544
0.125301083
0.116939394
0.111072544
0.058279055
0.18299224
0.109533422
0.127404332
0.049588377
0.126764598
0.123352779
0.178826006
0.142064223
0.123598934
0.135688938
0.116330035
0.132189803
0.11364705
0.123380271
0.122618636
0.121231604
0.124962892
0.127782382
0.115393903
0.127666529
0.128069211
0.127825324
0.202362599
0.148214603
0.047154273
0.207692422
0.140212978
0.131902642
0.126609117
0.131795201
0.146280119
0.132221744
0.15205663
0.149419624
0.12882288
0.143127792
0.127806696
0.108093882
0.127447566
0.125514061
0.151355249
0.142197844
0.128111287
0.126984641
0.111681458
0.059230937
0.18387953
0.131424016
0.127260813
0.123185942
0.12301305
0.198837465
0.201908502
0.118353592
0.116802308
0.220584087
0.122908136
0.143131345
0.195682054
0.056502461
0.120201008
0.127099372
0.111206286
0.120740443
0.139805891
0.130569691
0.121373414
0.128916776
0.116291152
0.129381268
0.12844324
0.132286855
0.127196939
0.119538936
0.119440131
0.055392152
0.133164942
0.123128391
0.119713239
0.122640955
0.140901944
0.232109835
0.05099387
0.157270089
0.120189717
0.149400334
0.148006771
0.135561395
0.114432766
0.124214831
0.127588578
0.133017473
0.127939599
0.129795445
0.124383374
0.130871816
0.059958365
0.13643085
0.116147862
0.126365698
0.126586554
0.073479631
0.113455843
0.138131483
0.122755288
0.130340758
0.123803689
0.133688313
0.133731208
0.058815443
0.129348146
0.21755063
0.042720055
0.205209919
0.053552988
0.135913777
0.129090985
0.126755467
0.054468177
0.143547192
0.127561568
0.125057471
0.135675527
0.153656282
0.135994205
0.124501244
0.126975785
0.212048008
0.052969663
0.208492798
0.136217596
0.128715781
0.130466549
0.133027079
0.142198028
0.139497757
0.107914384
0.157040902
0.132118866
0.059025412
0.231755541
0.072431096
0.237608949
0.132376274
0.128392519
0.048447206
0.143444946
0.216714502
0.113765814
0.138750278
0.054166292
0.125345491
0.123151466
0.130740236
0.130040994
0.132336243
0.131015359
0.120283476
0.0819551
0.12897466
0.119582391
0.099782304
0.146355992
0.071737352
0.12601306
0.119875996
0.149615954
0.164011924
0.113920676
0.148488657
0.204961462
0.046083269
0.132240597
0.120290724
0.133797299
0.118957392
0.050409076
0.132897864
0.130272527
0.123380237
0.124661648
0.117979764
0.051353461
0.137747574
0.135186988
0.144030045
0.139358571
0.143437272
0.150285771
0.123336112
0.13059152
0.134459129
0.057900639
0.128294124
0.130709936
0.142835662
0.124570338
0.157248237
0.05272827
0.159828425
0.131068281
0.129289379
0.159432997
0.050178936
0.137730101
0.139991591
0.117379933
0.152119491
0.130505849
0.131793548
0.133941495
0.132393249
0.064910953
0.135175145
0.137056085
0.128050285
0.124205486
0.054973603
0.137880713
0.13144247
0.130749717
0.132145462
0.129195918
0.133067051
0.050073898
0.137713601
0.127667884
0.066216357
0.218338173
0.137968321
0.063958844
0.134524174
0.129759992
0.132279705
0.05465503
0.136361192
0.137618427
0.135622113
0.135047758
0.076815814
0.134360997
0.131905126
0.135704933
0.132362084
0.058676068
0.131205555
0.138812341
0.129555152
0.137330679
0.054380999
0.240273459
0.062407346
0.150339943
0.141908294
0.07518077
0.12682164
0.152640432
0.147469145
0.054359045
0.157950419
0.246097952
0.061191826
0.228765419
0.145952982
0.053982302
0.137021757
0.138045217
0.123689763
0.069250831
0.125857663
0.117612075
0.138906825
0.064224531
0.14983241
0.142864541
0.141848561
0.12703107
0.134363931
0.129297234
0.143293068
0.05942274
0.151750642
0.129740556
0.141618794
0.157558284
0.15051779
0.130591822
0.147420673
0.129570717
0.066815203
0.127132384
0.129291855
0.237577516
0.150971169
0.133464307
0.136805642
0.137268469
0.138594698
0.058999963
0.151746548
0.148502547
0.126773309
0.079724232
0.134307193
0.164472372
0.159711969
0.148037259
0.14977967
0.16937488
0.048476567
0.154464051
0.041739919
let: End of buffer





* Re: cc-mode fontification feels random
  2021-06-11 18:22                                                       ` Eli Zaretskii
  2021-06-11 18:28                                                         ` Daniel Colascione
@ 2021-06-11 18:47                                                         ` Alan Mackenzie
  2021-06-11 19:32                                                           ` Eli Zaretskii
  1 sibling, 1 reply; 206+ messages in thread
From: Alan Mackenzie @ 2021-06-11 18:47 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rudalics, Daniel Colascione, emacs-devel, monnier, rms

Hello, Eli.

On Fri, Jun 11, 2021 at 21:22:56 +0300, Eli Zaretskii wrote:
> > Cc: rudalics@gmx.at, monnier@iro.umontreal.ca, rms@gnu.org,
> >  emacs-devel@gnu.org
> > From: Daniel Colascione <dancol@dancol.org>
> > Date: Fri, 11 Jun 2021 11:02:34 -0700

> > >>     0.026s, 0.025s, 0.026s, 0.078s, 0.026s, 0.027s.

> > >> That is, with the exception of the fourth timing, the scroll operation
> > >> takes a little over 1/40 second.

> > >> This is in an Emacs-28 compiled with default optimisation, on a 4
> > >> year-old first generation Ryzen machine.

> > >> For me personally, this scrolling speed, in conjunction with
> > >> fast-but-imprecise-scrolling, is acceptable.  I also accept there are
> > >> people with slower machines.
> > > I suggest to compare these times with Emacs 23 to see how we
> > > regressed.

> > Regression is acceptable in exchange for correctness so long as absolute 
> > performance is adequate. We're not using 80486s anymore.

> Here are my times using an optimized build of Emacs 27.2 on a 3.4GHz
> Core i7 box:

How many buffer lines were in your window?

>   0.015625
>   0.03125
>   0.015625
>   0.046875
>   0.09375
>   0.0625
>   0.015625
>   0.03125
>   0.015625
>   0.03125
>   0.015625
>   0.03125

> You consider this to be adequate performance for a single
> window-scroll?  (I don't have an optimized build of Emacs 28, but
> there's no reason to believe it is faster; quite the opposite.)

What does adequate mean?  With those timings, the font-locking would keep
up with an auto-repeated C-v at around 30 repetitions per second.
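
The 30-per-second estimate can be checked against the twelve timings
quoted above.  A minimal sketch in C, assuming those samples are
representative of steady scrolling; the names scroll_times,
mean_seconds and sustainable_rate are invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Per-scroll timings in seconds, copied from the quoted Emacs 27.2 run. */
static const double scroll_times[] = {
    0.015625, 0.03125, 0.015625, 0.046875, 0.09375, 0.0625,
    0.015625, 0.03125, 0.015625, 0.03125, 0.015625, 0.03125
};
static const size_t scroll_count = sizeof scroll_times / sizeof scroll_times[0];

/* Arithmetic mean of n samples (n must be positive). */
double mean_seconds(const double *samples, size_t n)
{
    assert(n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += samples[i];
    return sum / (double)n;
}

/* Scrolls per second sustainable if every scroll costs the mean time. */
double sustainable_rate(const double *samples, size_t n)
{
    return 1.0 / mean_seconds(samples, n);
}
```

sustainable_rate (scroll_times, scroll_count) comes out at about 29.5
scrolls per second (0.40625 s / 12, about 0.034 s per scroll).  The
slowest single sample, 0.09375 s, would on its own correspond to under
11 per second.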

[ .... ]

> We can stick our heads in the sand as much as we want, but facts are
> stubborn things.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: cc-mode fontification feels random
  2021-06-11 18:28                                                         ` Daniel Colascione
@ 2021-06-11 19:12                                                           ` Alan Mackenzie
  2021-06-11 19:23                                                           ` Eli Zaretskii
  1 sibling, 0 replies; 206+ messages in thread
From: Alan Mackenzie @ 2021-06-11 19:12 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: rudalics, Eli Zaretskii, emacs-devel, monnier, rms

Hello, Daniel.

On Fri, Jun 11, 2021 at 11:28:18 -0700, Daniel Colascione wrote:

[ .... ]

> native-comp?

Native compilation speeds up CC Mode only marginally.  On basically the
same benchmark, it was 13% faster with N.C.

> Hrm. That doesn't seem consistent with Alan's report that we spend a ton 
> of time doing work like deciding whether a brace occurs at top-level. My 
> question stands: what core facilities can we add to accelerate cc-mode's 
> parsing here? There's got to be some efficiency we can gain here.

My gut feeling, not really backed up by much, is that only something like
LSP is really going to help.  There's nothing particularly inefficient in
CC Mode's fontification, it just does a very thorough job.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: cc-mode fontification feels random
  2021-06-11 18:28                                                         ` Daniel Colascione
  2021-06-11 19:12                                                           ` Alan Mackenzie
@ 2021-06-11 19:23                                                           ` Eli Zaretskii
  1 sibling, 0 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-11 19:23 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: acm, emacs-devel, monnier, rms, rudalics

> Cc: acm@muc.de, rudalics@gmx.at, monnier@iro.umontreal.ca, rms@gnu.org,
>  emacs-devel@gnu.org
> From: Daniel Colascione <dancol@dancol.org>
> Date: Fri, 11 Jun 2021 11:28:18 -0700
> 
> >    0.015625
> >    0.03125
> >    0.015625
> >    0.046875
> >    0.09375
> >    0.0625
> >    0.015625
> >    0.03125
> >    0.015625
> >    0.03125
> >    0.015625
> >    0.03125
> >
> > You consider this to be adequate performance for a single
> > window-scroll?  (I don't have an optimized build of Emacs 28, but
> > there's no reason to believe it is faster; quite the opposite.)
> 
> native-comp?

No (it's Emacs 27).  But Alan already timed the native and non-native
versions in Emacs 28, and didn't find any tangible difference.  In
fact, the byte-compiled code was slightly faster.

> > 	- font-lock-fontify-region                                  157  65%
> > 	 - c-font-lock-fontify-region                               157  65%
> > 	  - font-lock-default-fontify-region                        146  60%
> > 	   - font-lock-fontify-keywords-region                      143  59%
> > 	    - c-font-lock-declarations                               97  40%
> > 	     - c-find-decl-spots                                     97  40%
> > 	      - #<compiled -0x1ffffffff94b65d0>                      73  30%
> > 	       - c-forward-decl-or-cast-1                            38  15%
> > 		- c-forward-type                                     22   9%
> > 		 - c-check-qualified-type                             7   2%
> >
> > We can stick our heads in the sand as much as we want, but facts are
> > stubborn things.
> 
> Hrm. That doesn't seem consistent with Alan's report that we spend a ton 
> of time doing work like deciding whether a brace occurs at top-level.

Maybe we should produce profiles on different systems and compare
them, so that we are sure the data is solid and repeatable?

Or maybe Alan was talking about Emacs 28, where something has changed
considerably?

> My question stands: what core facilities can we add to accelerate
> cc-mode's parsing here? There's got to be some efficiency we can
> gain here.

I don't think we have an answer to that.  Alan, do you have some
suggestions?

If we don't have anything we already figured out, I guess the answer
should be found by studying the code of the hot spots and looking for
optimization opportunities or algorithmic changes.




* Re: cc-mode fontification feels random
  2021-06-11 18:42                                                       ` Stefan Monnier
@ 2021-06-11 19:31                                                         ` Eli Zaretskii
  2021-06-11 19:57                                                           ` Stefan Monnier
  2021-06-11 20:06                                                           ` Alan Mackenzie
  2021-06-11 19:48                                                         ` Eli Zaretskii
  1 sibling, 2 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-11 19:31 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: acm, dancol, emacs-devel, rms, rudalics

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Eli Zaretskii <eliz@gnu.org>,  Alan Mackenzie <acm@muc.de>,
>   rudalics@gmx.at,  rms@gnu.org,  emacs-devel@gnu.org
> Date: Fri, 11 Jun 2021 14:42:31 -0400
> 
> I used pretty much your above test, except I started it at BOB of
> xdisp.c and used:
> 
>     M-: (dotimes (_ 700)
>           (message "%S" (benchmark-elapse (scroll-up) (sit-for 0)))
>           (sleep-for 0.05))
> 
> As you can see, the speed doesn't get noticeably worse as we go further
> into the file (the first few screenfuls were a bit faster, but after
> that it's a wash).

Can you produce a profile for that?

> Eli, do you see similar results?

Will Emacs 27.2 do?  If you must see results from an optimized build
of Emacs 28, I'll have to build one first.

> Would you say that this shows the slow behaviors that bother you?

Of course.  100 msec for a single window-scroll is awfully slow.
Especially since the display code itself takes only a fraction of
that time.

> E.g. there used to be a time where I found CC-mode unusably slow in some
> cases, but these were typically while editing rather than while
> scrolling (i.e. even simple buffer modifications incurred delays
> measured in seconds).

Yes, there are other use cases, but even this simple benchmark already
shows that we have a serious problem, IMO.  Compare this with Emacs 23
or with Emacs 28 in Fundamental mode.

> FWIW, I ran this same test with `sm-c-mode` (which should handle `xdisp.c`
> about as well as CC-mode, but solves an easier problem since it doesn't
> try to handle as much of C as CC-mode does (e.g. no support for K&R, no
> highlighting of types), nor does it try to handle C++, Java, ...), and
> most of the times for it are between 0.02 and 0.04.

That is much better, but still too slow, IMO.  Think: it's the time
that it takes us to fontify a single windowful, only a couple of
dozens of lines.  Why does it take so long?




* Re: cc-mode fontification feels random
  2021-06-11 18:47                                                         ` Alan Mackenzie
@ 2021-06-11 19:32                                                           ` Eli Zaretskii
  2021-06-11 19:46                                                             ` Alan Mackenzie
  0 siblings, 1 reply; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-11 19:32 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: rudalics, dancol, emacs-devel, monnier, rms

> Date: Fri, 11 Jun 2021 18:47:37 +0000
> Cc: Daniel Colascione <dancol@dancol.org>, rudalics@gmx.at,
>   monnier@iro.umontreal.ca, rms@gnu.org, emacs-devel@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> How many buffer lines were in your window?

34.  It was in "emacs -Q".




* Re: cc-mode fontification feels random
  2021-06-11 19:32                                                           ` Eli Zaretskii
@ 2021-06-11 19:46                                                             ` Alan Mackenzie
  2021-06-11 19:50                                                               ` Eli Zaretskii
  0 siblings, 1 reply; 206+ messages in thread
From: Alan Mackenzie @ 2021-06-11 19:46 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rudalics, dancol, emacs-devel, monnier, rms

Hello, Eli.

On Fri, Jun 11, 2021 at 22:32:39 +0300, Eli Zaretskii wrote:
> > Date: Fri, 11 Jun 2021 18:47:37 +0000
> > Cc: Daniel Colascione <dancol@dancol.org>, rudalics@gmx.at,
> >   monnier@iro.umontreal.ca, rms@gnu.org, emacs-devel@gnu.org
> > From: Alan Mackenzie <acm@muc.de>

> > How many buffer lines were in your window?

> 34.  It was in "emacs -Q".

Thanks.  I didn't know emacs -Q on a GUI always gave the same window
height.  On my tty, I get 65 lines.

So, given your windows are about half the height of mine, our timings
were broadly comparable.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: cc-mode fontification feels random
  2021-06-11 18:42                                                       ` Stefan Monnier
  2021-06-11 19:31                                                         ` Eli Zaretskii
@ 2021-06-11 19:48                                                         ` Eli Zaretskii
  1 sibling, 0 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-11 19:48 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: acm, dancol, emacs-devel, rms, rudalics

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Eli Zaretskii <eliz@gnu.org>,  Alan Mackenzie <acm@muc.de>,
>  rudalics@gmx.at,  rms@gnu.org,  emacs-devel@gnu.org
> Date: Fri, 11 Jun 2021 14:42:31 -0400
> 
> Eli, do you see similar results?

With Emacs 27.2 built with -O2, I see the same times as for Alan's
benchmark: between 30 and 50 msec per scroll.  With Emacs 28 built
with -O0, I see times that are roughly double what you show: an
average of 200 msec per scroll.  A factor of 2 relative to a -Og
build is expected, so I think our results are consistent.




* Re: cc-mode fontification feels random
  2021-06-11 19:46                                                             ` Alan Mackenzie
@ 2021-06-11 19:50                                                               ` Eli Zaretskii
  0 siblings, 0 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-11 19:50 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: rudalics, dancol, emacs-devel, monnier, rms

> Date: Fri, 11 Jun 2021 19:46:10 +0000
> Cc: dancol@dancol.org, rudalics@gmx.at, monnier@iro.umontreal.ca, rms@gnu.org,
>   emacs-devel@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> > > How many buffer lines were in your window?
> 
> > 34.  It was in "emacs -Q".
> 
> Thanks.  I didn't know emacs -Q on a GUI always gave the same window
> height.  On my tty, I get 65 lines.
> 
> So, given your windows are about half the height of mine, our timings
> were broadly comparable.

??? Your window was twice as high, but your times are 30% shorter.
How do you conclude that the times are comparable?
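
One way to make the window-height comparison concrete is to divide
each per-scroll time by the number of lines fontified.  A minimal
sketch in C, assuming 0.026 s per scroll at 65 lines for one setup and
the mean of the twelve Emacs 27.2 timings quoted earlier in the thread
(0.40625 s / 12, about 0.034 s) at 34 lines for the other.  The
function names are made up for this illustration, and the two runs
were on different machines and Emacs versions, so this is arithmetic
on the quoted figures only, not a controlled measurement.

```c
#include <assert.h>

/* Fontification cost per visible buffer line for one scroll. */
double seconds_per_line(double seconds_per_scroll, int window_lines)
{
    assert(window_lines > 0);
    return seconds_per_scroll / (double)window_lines;
}

/* How many times costlier per line setup A is than setup B. */
double per_line_ratio(double secs_a, int lines_a, double secs_b, int lines_b)
{
    return seconds_per_line(secs_a, lines_a) / seconds_per_line(secs_b, lines_b);
}
```

per_line_ratio (0.40625 / 12, 34, 0.026, 65) gives roughly 2.5, i.e.
under these assumptions the 34-line run costs about two and a half
times as much per line as the 65-line run.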




* Re: cc-mode fontification feels random
  2021-06-11 19:31                                                         ` Eli Zaretskii
@ 2021-06-11 19:57                                                           ` Stefan Monnier
  2021-06-11 23:25                                                             ` Ergus
  2021-06-12  6:38                                                             ` Eli Zaretskii
  2021-06-11 20:06                                                           ` Alan Mackenzie
  1 sibling, 2 replies; 206+ messages in thread
From: Stefan Monnier @ 2021-06-11 19:57 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dancol, acm, rudalics, rms, emacs-devel

> Will Emacs 27.2 do?  If you must see results from an optimized build
> of Emacs 28, I'll have to build one first.

As mentioned, mine was not an optimized build, on the contrary.

>> FWIW, I ran this same test with `sm-c-mode` (which should handle `xdisp.c`
>> about as well as CC-mode, but solves an easier problem since it doesn't
>> try to handle as much of C as CC-mode does (e.g. no support for K&R, no
>> highlighting of types), nor does it try to handle C++, Java, ...), and
>> most of the times for it are between 0.02 and 0.04.
>
> That is much better, but still too slow, IMO.  Think: it's the time
> that it takes us to fontify a single windowful, only a couple of
> dozens of lines.  Why does it take so long?

For comparison, here are the timings for lisp/subr.el: they actually seem
slightly slower than what I got with xdisp.c when using sm-c-mode.


        Stefan


0.075539393
0.030856317
0.040824289
0.029961978
0.012222597
0.020277377
0.08354889
0.027791121
0.040834603
0.029304419
0.040401518
0.042230931
0.041249748
0.041759172
0.022928028
0.088205301
0.01791448
0.039915906
0.041701691
0.036885006
0.037948645
0.039082061
0.03723824
0.090796121
0.021685216
0.040341389
0.041098352
0.012256459
0.038770756
0.047087185
0.036423884
0.04461722
0.082821279
0.02936458
0.038498799
0.029450549
0.039748644
0.037981817
0.041704413
0.03614839
0.040829019
0.03792122
0.088297379
0.031529965
0.038930449
0.035313203
0.040872462
0.040254486
0.043807937
0.037344524
0.041937701
0.086986891
0.02128011
0.038679053
0.037372497
0.042372958
0.045191831
0.026552158
0.038718167
0.040198771
0.086453442
0.020748667
0.036524354
0.038769191
0.036234863
0.0399449
0.040732675
0.039041865
0.037608296
0.078606241
0.022010691
0.03774944
0.028604627
0.040171841
0.039866605
0.035715879
0.041613829
0.035701447
0.037601563
0.085249827
0.018101252
0.041692999
0.033016519
0.037679106
0.039894138
0.036513263
0.04271547
0.038203434
0.089595139
0.022256597
0.040981642
0.037780354
0.036986214
0.033088927
0.03626288
0.037085366
0.023201762
0.088236647
0.019961394
0.033811429
0.040647559
0.034390619
0.039764122
0.022225068
0.026550511
0.037590894
0.085617853
0.022528
0.040158828
0.039360065
0.015918947
let: End of buffer





* Re: cc-mode fontification feels random
  2021-06-11 19:31                                                         ` Eli Zaretskii
  2021-06-11 19:57                                                           ` Stefan Monnier
@ 2021-06-11 20:06                                                           ` Alan Mackenzie
  2021-06-12  6:44                                                             ` Eli Zaretskii
  1 sibling, 1 reply; 206+ messages in thread
From: Alan Mackenzie @ 2021-06-11 20:06 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rudalics, dancol, emacs-devel, Stefan Monnier, rms

Hello, Eli.

On Fri, Jun 11, 2021 at 22:31:45 +0300, Eli Zaretskii wrote:
> > From: Stefan Monnier <monnier@iro.umontreal.ca>
> > Cc: Eli Zaretskii <eliz@gnu.org>,  Alan Mackenzie <acm@muc.de>,
> >   rudalics@gmx.at,  rms@gnu.org,  emacs-devel@gnu.org
> > Date: Fri, 11 Jun 2021 14:42:31 -0400

[ .... ]

> > Would you say that this shows the slow behaviors that bother you?

> Of course.  100 msec for a single window-scroll is awfully slow.
> Especially since the display code itself takes only a fraction of
> that time.

> > E.g. there used to be a time where I found CC-mode unusably slow in
> > some cases, but these were typically while editing rather than while
> > scrolling (i.e. even simple buffer modifications incurred delays
> > measured in seconds).

> Yes, there are other use cases, but even this simple benchmark already
> shows that we have a serious problem, IMO.  Compare this with Emacs 23
> or with Emacs 28 in Fundamental mode.

Why do we have a problem?  If the time taken to fontify a window is less
than the auto-repeat time (the two times are close on a modern machine),
this is surely not a problem for somebody with such a machine.  It could
be a problem for somebody with a slower machine, or running an
unoptimised Emacs.

> > FWIW, I ran this same test with `sm-c-mode` (which should handle `xdisp.c`
> > about as well as CC-mode, but solves an easier problem since it doesn't
> > try to handle as much of C as CC-mode does (e.g. no support for K&R, no
> > highlighting of types), nor does it try to handle C++, Java, ...), and
> > most of the times for it are between 0.02 and 0.04.

> That is much better, but still too slow, IMO.  Think: it's the time
> that it takes us to fontify a single windowful, only a couple of
> dozens of lines.  Why does it take so long?

It does a very thorough job.  For example, one bug fix from many years
ago that I remember involved the fontification of foo in the following:

        ....
        int bar;
    } foo;

What face should foo have?  To answer that, you've got to go back over
the brace expression to see what's there.  If it's

    struct foo
    {
        int baz;
        ....

, we need font-lock-variable-name-face for foo.  On the other hand, if we
have

    typedef struct foo
    {
        int baz;
        ....

, we need font-lock-type-face.  Before the bug fix, foo just got variable
name face.  scan-lists backward over the brace expression takes time,
particularly for something the size of struct frame or even bigger.
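
The two cases can be put side by side in a single compilable file.
This is only an illustrative sketch: the tags and identifiers are
renamed (foo_v, foo_t, foo_variable, foo_type and sum_bars are all
made up here) so that both declarations can coexist, and the elided
members are replaced by just baz and bar.

```c
#include <assert.h>

/* Case 1: no typedef, so the identifier after the closing brace
   declares a variable (font-lock-variable-name-face). */
struct foo_v
{
    int baz;
    int bar;
} foo_variable;

/* Case 2: typedef, so the identifier after the closing brace
   names a type (font-lock-type-face). */
typedef struct foo_t
{
    int baz;
    int bar;
} foo_type;

int sum_bars(void)
{
    /* foo_type can appear wherever a type is expected... */
    foo_type local = { .baz = 1, .bar = 2 };
    assert(sizeof(foo_type) == sizeof(struct foo_t));

    /* ...whereas foo_variable is an object that can be assigned to. */
    foo_variable.bar = 40;
    return foo_variable.bar + local.bar;
}
```

The text after the closing brace is identical in both cases; only the
keyword in front of the opening struct, arbitrarily far back, decides
which face is right.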

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: cc-mode fontification feels random
  2021-06-10 21:02                                         ` Stefan Monnier
@ 2021-06-11 20:21                                           ` Ergus
  2021-06-11 20:27                                             ` Stefan Monnier
  0 siblings, 1 reply; 206+ messages in thread
From: Ergus @ 2021-06-11 20:21 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: Alan Mackenzie, Eli Zaretskii, rudalics, dancol, rms, emacs-devel

On Thu, Jun 10, 2021 at 05:02:09PM -0400, Stefan Monnier wrote:
>> 1) What is finally the most desirable/long path/future feature?
>> I mean, finally what is preferred by the developers to support in the future?
>>
>> lsp or tree-sitter?
>      ^^
>     and
>
>
>-- Stefan
>
From what I know about tree-sitter, the library does not ship with the
parsers. Usually they have to be distributed with each program, one
parser per language. They even live in different GitHub repositories.

In a Rust application (like the Helix editor) that's not an issue because
cargo handles it. But for Emacs I don't know how the technical and
legal issues with the dependencies can be solved.

Are we going to add their source code to emacs? In systems like mine,
tree-sitter is a package, but the parsers are not.






* Re: cc-mode fontification feels random
  2021-06-11 20:21                                           ` Ergus
@ 2021-06-11 20:27                                             ` Stefan Monnier
  2021-06-11 20:37                                               ` Daniel Colascione
  0 siblings, 1 reply; 206+ messages in thread
From: Stefan Monnier @ 2021-06-11 20:27 UTC (permalink / raw)
  To: Ergus; +Cc: Alan Mackenzie, Eli Zaretskii, rudalics, dancol, rms, emacs-devel

> From what I know about tree-sitter, the library does not ship with the
> parsers.

Of course not; how could it?  There's a never-ending stream of
programming languages out there.

I don't see why you think that's a problem.


        Stefan





* Re: cc-mode fontification feels random
  2021-06-11 20:27                                             ` Stefan Monnier
@ 2021-06-11 20:37                                               ` Daniel Colascione
  2021-06-11 20:52                                                 ` Stefan Monnier
  0 siblings, 1 reply; 206+ messages in thread
From: Daniel Colascione @ 2021-06-11 20:37 UTC (permalink / raw)
  To: Stefan Monnier, Ergus
  Cc: Alan Mackenzie, Eli Zaretskii, emacs-devel, rms, rudalics

On 6/11/21 1:27 PM, Stefan Monnier wrote:

>> From what I know about tree-sitter, the library does not ship with the
>> parsers.
> Of course not; how could it?  There's a never-ending stream of
> programming languages out there.
>
> I don't see why you think that's a problem.

It's not just licensing.

Another problem with stock tree-sitter is that it makes Emacs less 
self-hosting. Tree-sitter grammars are written in JavaScript. You don't 
need JavaScript to use a grammar, but you do need JavaScript to 
customize a grammar. In addition, Tree-sitter compiles these JavaScript 
grammars to C. To use a customized grammar, an Emacs user would have to 
run node.js (or equally capable JS environment), generate C code, 
compile that C code, and load it into Emacs as a module. That's a big 
departure from the traditional approach to Emacs customization.

These technical choices on the part of the Tree-sitter people are 
unfortunate. I'd prefer an elisp reimplementation of the Tree-sitter 
algorithms, but I doubt we're going to get that any time soon.

Maybe Tree-sitter could be changed to generate an elisp parser and 
compile parsers in a lightweight JS environment like Duktape.





* Re: cc-mode fontification feels random
  2021-06-11 20:37                                               ` Daniel Colascione
@ 2021-06-11 20:52                                                 ` Stefan Monnier
  2021-06-12  6:46                                                   ` Eli Zaretskii
  2021-06-12  8:47                                                   ` Daniele Nicolodi
  0 siblings, 2 replies; 206+ messages in thread
From: Stefan Monnier @ 2021-06-11 20:52 UTC (permalink / raw)
  To: Daniel Colascione
  Cc: Ergus, Alan Mackenzie, Eli Zaretskii, rudalics, rms, emacs-devel

> Another problem with stock tree-sitter is that it makes Emacs less
> self-hosting. Tree-sitter grammars are written in JavaScript.

Yes, there are some technical disadvantages to tree-sitter, indeed.
None of them make it unusable, but they do make it less convenient for
ELisp hackers and Emacs users.  So it's not a perfect solution, but
I don't think that should mean we don't want it in our toolbox.


        Stefan


PS: I think we can expect 99% of Emacs users to have a JavaScript engine
already installed (in the form of a web browser), and with native
compilation Emacs now comes with a runtime dependency on (a substantial
chunk of) GCC, so the extra dependencies introduced by tree-sitter seem
quite workable for the Emacs end-user.  The ELisp hackers working on the
major mode who want to tweak the grammar will suffer more, though.





* Re: cc-mode fontification feels random
  2021-06-11 19:57                                                           ` Stefan Monnier
@ 2021-06-11 23:25                                                             ` Ergus
  2021-06-11 23:52                                                               ` Óscar Fuentes
  2021-06-12  5:20                                                               ` Theodor Thornhill
  2021-06-12  6:38                                                             ` Eli Zaretskii
  1 sibling, 2 replies; 206+ messages in thread
From: Ergus @ 2021-06-11 23:25 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, dancol, acm, rudalics, rms, emacs-devel

Going a bit further into this, and reconsidering tree-sitter:

There is already a tree-sitter module package with some interesting
functionality. (I know Eli didn't like some details of its
implementation.)

But maybe it is a good time to try disabling the cc-mode font-locking
(I don't actually know if it is possible to do that), and repeating the
scrolling benchmark with only tree-sitter-mode and tree-sitter-hl-mode
enabled?

Just to see how it compares and how much of that approach is useful.



On Fri, Jun 11, 2021 at 03:57:10PM -0400, Stefan Monnier wrote:
>> Will Emacs 27.2 do?  If you must see results from an optimized build
>> of Emacs 28, I'll have to build one first.
>
>As mentioned, mine was not an optimized build, on the contrary.
>
>>> FWIW, I ran this same test with `sm-c-mode` (which should handle `xdisp.c`
>>> about as well as CC-mode, but solves an easier problem since it doesn't
>>> try to handle as much of C as CC-mode does (e.g. no support for K&R, no
>>> highlighting of types), nor does it try to handle C++, Java, ...), and
>>> most of the times for it are between 0.02 and 0.04.
>>
>> That is much better, but still too slow, IMO.  Think: it's the time
>> that it takes us to fontify a single windowful, only a couple of
>> dozens of lines.  Why does it take so long?
>
>For comparison, here are the timings for lisp/subr.el: they actually seem
>slightly slower than what I got with xdisp.c when using sm-c-mode.
>
>
>        Stefan
>
>
>[ .... ]
>
>




* Re: cc-mode fontification feels random
  2021-06-11 23:25                                                             ` Ergus
@ 2021-06-11 23:52                                                               ` Óscar Fuentes
  2021-06-12  1:08                                                                 ` Ergus
  2021-06-12  6:50                                                                 ` Eli Zaretskii
  2021-06-12  5:20                                                               ` Theodor Thornhill
  1 sibling, 2 replies; 206+ messages in thread
From: Óscar Fuentes @ 2021-06-11 23:52 UTC (permalink / raw)
  To: emacs-devel

Ergus <spacibba@aol.com> writes:

> Going a bit more into this. And reconsidering tree-sitter.
>
> There is already a tree-sitter module package with some interesting
> functionality. (I know Eli didn't like some details of its
> implementation.)
>
> But maybe it is a good time to try to disable the cc-mode font-locking
> (I don't actually know if it is possible to do that), and repeat the
> scrolling benchmark only with the tree-sitter-mode and
> tree-sitter-hl-mode enabled?
>
> Just to see how it compares and how much of that approach is useful?

More easily, you can use some of the editors that already use
tree-sitter for fontification of C/C++ and do the PgDn test (which looks
like a rather silly test to me, because who navigates large files by
holding PgDn and why Emacs should support that terrible use case well?)
This would provide a valuable comparison point for little effort.

Although I'm more interested in accuracy, but it seems that the thread
was effectively and hopelessly hijacked :-/




^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 23:52                                                               ` Óscar Fuentes
@ 2021-06-12  1:08                                                                 ` Ergus
  2021-06-12  3:20                                                                   ` Stefan Monnier
  2021-06-12  6:58                                                                   ` Eli Zaretskii
  2021-06-12  6:50                                                                 ` Eli Zaretskii
  1 sibling, 2 replies; 206+ messages in thread
From: Ergus @ 2021-06-12  1:08 UTC (permalink / raw)
  To: Óscar Fuentes; +Cc: emacs-devel

On Sat, Jun 12, 2021 at 01:52:12AM +0200, Óscar Fuentes wrote:
>Ergus <spacibba@aol.com> writes:
>
>> Going a bit more into this. And reconsidering tree-sitter.
>>
>> As there is already a tree-sitter module package with some interesting
>> functionalities. (I know Eli didn't like some details in it's
>> implementation)
>>
>> But maybe it is a good time to try to disable the cc-mode font-locking
>> (I don't actually know if it is possible to do that), and repeat the
>> scrolling benchmark only with the tree-sitter-mode and
>> tree-sitter-hl-mode enabled?
>>
>> Just to see how it compares and how much of that approach is useful?
>
>More easily, you can use some of the editors that already use
>tree-sitter for fontification of C/C++ and do the PgDn test 

We have all the lisp machine overhead in the middle. So doing this will
be like comparing apples with pears.

>(which looks
>like a rather silly test to me, because who navigates large files by
>holding PgDn and why Emacs should support that terrible use case well?)

The scrolling test is there because during scrolling we call redisplay,
font-lock and some hooks. So it is the easiest way to measure all the
syntax highlighting in action.

>This would provide a valuable comparison point for little effort.
>

I don't think so. The tree-sitter mode does not require special effort
to install. And comparing emacs vs emacs is more realistic and useful
IMHO (neovim redisplay is ridiculously fast).

But any way just to start: tree-sitter parses all the text in xdisp.c,
(in my machine), in 0.12 seconds from scratch and re-parses it (reusing
the tree) 10 times faster; in 0.008 ~ 0.01 seconds.

If we take into account that we don't need to re-parse the whole file
(buffer), but only the modified regions (which can be specified through
the API), then the times are ridiculously small.

In this case the parse is mostly already done, so scrolling won't need
to parse the text to add the highlighting... so maybe we need something
else to measure the impact (maybe something that modifies the text)

BTW: Eli was concerned about the extra copy of the buffer text to send
it to tree-sitter. In this case the time to memcopy an array with all
xdisp text is ~0.00085 seconds.

Any way if we don't want the copy we can use
ts_parser_set_included_ranges to exclude the gap and pass the text
pointer directly without any copy.
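
To make the gap idea concrete, here is a minimal sketch of just the
range computation, assuming a simplified gap buffer. The names
gap_buffer, byte_range and text_ranges are hypothetical; this is not
Emacs's buffer layout and not tree-sitter's own range type (which, as
far as I know, also carries row/column positions). The call to
ts_parser_set_included_ranges itself is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified gap buffer: real text lives in
   [0, gap_start) and [gap_end, size); bytes inside the gap are junk.  */
struct gap_buffer {
    const char *data;
    size_t size;       /* total allocated bytes, gap included */
    size_t gap_start;  /* first byte of the gap */
    size_t gap_end;    /* first byte after the gap */
};

/* Hypothetical half-open byte range [start, end).  */
struct byte_range {
    size_t start;
    size_t end;
};

/* Fill OUT (room for 2 entries) with the ranges that hold real text and
   return how many were written: 0, 1 or 2.  Empty ranges are skipped.  */
size_t text_ranges(const struct gap_buffer *buf, struct byte_range out[2])
{
    size_t n = 0;

    assert(buf->gap_start <= buf->gap_end && buf->gap_end <= buf->size);

    if (buf->gap_start > 0) {          /* text before the gap */
        out[n].start = 0;
        out[n].end = buf->gap_start;
        n++;
    }
    if (buf->gap_end < buf->size) {    /* text after the gap */
        out[n].start = buf->gap_end;
        out[n].end = buf->size;
        n++;
    }
    return n;
}
```

One caveat with this approach: the offsets a parser reports would then
be positions in the raw array, gap included, so they would still need
to be mapped back to buffer positions.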

>Although I'm more interested on accuracy, but it seems that the thread
>was effectively and hopelessly hijacked :-/
>
>
To improve accuracy we need to improve the parsing OR add more work to
cc-mode. So that's why we are looking for alternatives.

There is already some interesting material showing tree-sitter in
action:

https://www.youtube.com/watch?v=ZwibVdNtFjs
https://www.youtube.com/watch?v=oSrXK8ovBfQ

Where you can actually see that the accuracy should also improve (and
probably some navigation commands)




^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12  1:08                                                                 ` Ergus
@ 2021-06-12  3:20                                                                   ` Stefan Monnier
  2021-06-12 11:07                                                                     ` Ergus
  2021-06-12  6:58                                                                   ` Eli Zaretskii
  1 sibling, 1 reply; 206+ messages in thread
From: Stefan Monnier @ 2021-06-12  3:20 UTC (permalink / raw)
  To: Ergus; +Cc: Óscar Fuentes, emacs-devel

> But any way just to start: tree-sitter parses all the text in xdisp.c,
> (in my machine), in 0.12 seconds from scratch and re-parses it (reusing
> the tree) 10 times faster; in 0.008 ~ 0.01 seconds.

Do those times include passing the result of the parse to ELisp and
processing it by applying text-properties for highlighting or is it just
the time for tree-sitter to do the actual parse for itself?


        Stefan




^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 23:25                                                             ` Ergus
  2021-06-11 23:52                                                               ` Óscar Fuentes
@ 2021-06-12  5:20                                                               ` Theodor Thornhill
  2021-06-12 13:40                                                                 ` Stefan Monnier
  1 sibling, 1 reply; 206+ messages in thread
From: Theodor Thornhill @ 2021-06-12  5:20 UTC (permalink / raw)
  To: Ergus, Stefan Monnier
  Cc: Eli Zaretskii, dancol, acm, rudalics, rms, emacs-devel

Ergus <spacibba@aol.com> writes:

> Going a bit more into this. And reconsidering tree-sitter.
>
> As there is already a tree-sitter module package with some interesting
> functionalities. (I know Eli didn't like some details in it's
> implementation)

This module is used by csharp-mode, in its own
`csharp-tree-sitter-mode`, and uses these packages from melpa:

 - tree-sitter-mode
 - tree-sitter-langs
 - tree-sitter-indent

This bug (https://github.com/emacs-csharp/csharp-mode/issues/164) is an
even simpler test of the performance of CC Mode, which Alan _is_
addressing right now, but should be interesting given this thread.
csharp-mode grinds to a halt here, but csharp-tree-sitter-mode handles
this perfectly.

As for correctness there is no comparison.  The tree-sitter variant
covers things that aren't even possible using the CC Mode variant, like
string interpolation, complicated nested generics, preprocessor
directives and much, much more.


@Stefan - I'm not sure I understand what you mean by troublesome for
elisp hackers.  These grammars have a Lisp-like DSL, and are pretty
usable through C-M-x and defvars, see:
https://github.com/emacs-csharp/csharp-mode/blob/master/csharp-tree-sitter.el#L44.

IME it's not the same as normal elisp hacking, but it's good
enough.  That's only an opinion though.

I don't understand why this shouldn't be doable? There could be a
nongnu-ELPA package with defined grammars, and a way to download and
compile parsers.

As a side note, it is also easy to design structural editing, like
`delete-defun` `change-in-string`, `beginning-and-end-of-defun` etc.

Just take a small look at
https://github.com/emacs-csharp/csharp-mode/blob/master/csharp-tree-sitter.el.

These 410 lines cover way more than what CC Mode does atm.  It would be
*great* to move the tree sitter part to emacs.

Just my two cents :)

Theodor Thornhill



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 19:57                                                           ` Stefan Monnier
  2021-06-11 23:25                                                             ` Ergus
@ 2021-06-12  6:38                                                             ` Eli Zaretskii
  2021-06-12 13:44                                                               ` Stefan Monnier
  1 sibling, 1 reply; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-12  6:38 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: acm, dancol, emacs-devel, rms, rudalics

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: dancol@dancol.org,  acm@muc.de,  rudalics@gmx.at,  rms@gnu.org,
>   emacs-devel@gnu.org
> Date: Fri, 11 Jun 2021 15:57:10 -0400
> 
> > Will Emacs 27.2 do?  If you must see results from an optimized build
> > of Emacs 28, I'll have to build one first.
> 
> As mentioned, mine was not an optimized build, on the contrary.

Well, using -Og _is_ optimizing, albeit less than -O2 does.



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 20:06                                                           ` Alan Mackenzie
@ 2021-06-12  6:44                                                             ` Eli Zaretskii
  2021-06-12  8:00                                                               ` Daniel Colascione
  0 siblings, 1 reply; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-12  6:44 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: rudalics, dancol, emacs-devel, monnier, rms

> Date: Fri, 11 Jun 2021 20:06:30 +0000
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>, dancol@dancol.org,
>   rudalics@gmx.at, rms@gnu.org, emacs-devel@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> Why do we have a problem?  If the time taken to fontify a window is less
> than the auto-repeat time (the two times are close on a modern machine),
> this is surely not a problem for somebody with such a machine.  It could
> be a problem for somebody with a slower machine, or running an
> unoptimised Emacs.

It is a problem given how much the current fast machines can do during
that time.  At 3 GHz, 30 msec of CPU time is equivalent to 100 million
machine instructions.
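
As a rough check of that figure, a minimal sketch of the arithmetic
follows. The function name cycles_in is hypothetical, and reading
cycles as instructions assumes roughly one instruction retired per
cycle, which is only an order-of-magnitude approximation.

```c
#include <stdint.h>

/* Clock cycles elapsed in MSEC milliseconds at HZ cycles per second.
   Dividing first keeps the intermediate value small.  */
uint64_t cycles_in(uint64_t hz, uint64_t msec)
{
    return hz / 1000 * msec;
}
```

With hz = 3000000000 and msec = 30 this gives 90000000 cycles, which
rounds to the "100 million" quoted above.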

> > That is much better, but still too slow, IMO.  Think: it's the time
> > that it takes us to fontify a single windowful, only a couple of
> > dozens of lines.  Why does it take so long?
> 
> It does a very thorough job.


AFAIU, sm-c-mode doesn't.  And it still takes tens of milliseconds.

> For example, one bug fix from many years
> ago that I remember involved the fontification of foo in the following:
> 
>         ....
>         int bar;
>     } foo;
> 
> What face should foo have?  To answer that, you've got to go back over
> the brace expression to see what's there.  If it's
> 
>     struct foo
>     {
>         int baz;
>         ....
> 
> , we need font-lock-variable-name-face for foo.  On the other hand, if we
> have
> 
>     typedef struct foo
>     {
>         int baz;
>         ....
> 
> , we need font-lock-type-face.  Before the bug fix, foo just got variable
> name face.  scan-lists backward over the brace expression takes time,
> particularly for something the size of struct frame or even bigger.

We should either find a way of making this analysis faster, or give up
on fontifying these two cases differently.  It is IMO unacceptable
that redisplay is slowed down so much by mode-specific fontifications.
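
For reference, the two cases Alan describes can be written out in full
as below. This is only a sketch: the struct tags, member names and
helper functions are made up for illustration, and the two trailing
identifiers are given different names so that both cases fit in one
file.

```c
/* Case 1: no `typedef', so the identifier after the closing brace
   declares a variable (font-lock-variable-name-face).  */
struct frame_a {
    int baz;
    int bar;
} foo_var;

/* Case 2: with `typedef', the identifier after the closing brace
   names a type (font-lock-type-face).  */
typedef struct frame_b {
    int baz;
    int bar;
} foo_type;

/* Only valid because foo_type is a type name.  */
foo_type make_foo(int baz, int bar)
{
    foo_type f = { baz, bar };
    return f;
}

/* Only valid because foo_var is an object.  */
int *foo_var_baz(void)
{
    return &foo_var.baz;
}
```

The text from the closing brace onwards is identical in shape in both
cases; the only distinguishing token sits before the opening brace,
which is why a fontifier looking at the last line has to scan
backwards over the whole brace expression.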



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 20:52                                                 ` Stefan Monnier
@ 2021-06-12  6:46                                                   ` Eli Zaretskii
  2021-06-12  8:03                                                     ` Daniel Colascione
  2021-06-12  8:47                                                   ` Daniele Nicolodi
  1 sibling, 1 reply; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-12  6:46 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: spacibba, rms, emacs-devel, rudalics, acm, dancol

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Ergus <spacibba@aol.com>,  Alan Mackenzie <acm@muc.de>,  Eli Zaretskii
>  <eliz@gnu.org>,  rudalics@gmx.at,  rms@gnu.org,  emacs-devel@gnu.org
> Date: Fri, 11 Jun 2021 16:52:37 -0400
> 
> > Another problem with stock tree-sitter is that it makes Emacs less
> > self-hosting. Tree-sitter grammars are written in JavaScript.
> 
> Yes, there are some technical disadvantages to tree-sitter, indeed.
> None of them make it unusable, but they do make it less convenient for
> ELisp hackers and Emacs users.  So it's not a perfect solution, but
> I don't think that should mean we don't want it in our toolbox.

I agree that these issues shouldn't prevent us from trying to use TS,
at least as an option.



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 23:52                                                               ` Óscar Fuentes
  2021-06-12  1:08                                                                 ` Ergus
@ 2021-06-12  6:50                                                                 ` Eli Zaretskii
  1 sibling, 0 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-12  6:50 UTC (permalink / raw)
  To: Óscar Fuentes; +Cc: emacs-devel

> From: Óscar Fuentes <ofv@wanadoo.es>
> Date: Sat, 12 Jun 2021 01:52:12 +0200
> 
> More easily, you can use some of the editors that already use
> tree-sitter for fontification of C/C++ and do the PgDn test

I don't think that would teach us much, due to stark differences in
architectural design.  AFAIK, those other editors don't even implement
buffer text similar enough to how we do that, and that has significant
influence on the efficiency.

> (which looks like a rather silly test to me, because who navigates
> large files by holding PgDn and why Emacs should support that
> terrible use case well?)

We use it because it's easy, and because it measures the time it takes
to fontify a single window, not necessarily because this is what Emacs
users do most of the time.  If you want to suggest other use cases,
feel free, we can add them to the suite.  I'm quite sure the results
will be similar, modulo the redisplay optimizations.



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12  1:08                                                                 ` Ergus
  2021-06-12  3:20                                                                   ` Stefan Monnier
@ 2021-06-12  6:58                                                                   ` Eli Zaretskii
  2021-06-12 11:01                                                                     ` Ergus
  2021-06-12 14:00                                                                     ` Stefan Monnier
  1 sibling, 2 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-12  6:58 UTC (permalink / raw)
  To: Ergus; +Cc: ofv, emacs-devel

> Date: Sat, 12 Jun 2021 03:08:44 +0200
> From: Ergus <spacibba@aol.com>
> Cc: emacs-devel@gnu.org
> 
> BTW: Eli was concerned about the extra copy of the buffer text to send
> it to tree-sitter. In this case the time to memcopy an array with all
> xdisp text is ~0.00085 seconds.

If the intent is to use buffer-(sub)string, then you forget the price
of consing.  That would trigger frequent GC cycles, which will all but
kill the otherwise fast performance.

> Any way if we don't want the copy we can use
> ts_parser_set_included_ranges to exclude the gap and pass the text
> pointer directly without any copy.

I hope someone will try that and report the results.

The other design issue with TS integration is that I'd like it to plug
into the JIT font-lock interface of the display engine, so that we
don't unnecessarily fontify parts of the buffer that won't be
displayed, and always do fontify the parts that will be.  I don't
really care if TS actually processes a much larger chunk of text, if
it does that quickly enough, but processing the resulting faces will
take time on the Emacs side, and that is better avoided.  More
importantly, integration into JIT font-lock machinery means we don't
need to use other hooks, which is a step back, since using such hooks
for fontification was already shown to have serious problems in pre-21
Emacs: they don't always catch all the changes which require
re-fontification.
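
The display-driven scheme described here can be sketched as a toy
model, assuming a fixed chunk size and a plain flag per chunk. All the
names below (lazy_fontifier, ensure_fontified, invalidate, CHUNK) are
hypothetical; the real JIT font-lock code tracks this with text
properties rather than an array, so treat this as an illustration of
the idea only.

```c
#include <stdbool.h>
#include <stddef.h>

#define CHUNK   4   /* hypothetical chunk size, tiny for illustration */
#define NCHUNKS 8   /* toy buffer of CHUNK * NCHUNKS positions */

/* Toy model: one "fontified" flag per fixed-size chunk.  */
struct lazy_fontifier {
    bool fontified[NCHUNKS];
    size_t work_done;   /* chunks actually fontified so far */
};

/* Called when [start, end) is about to be displayed: fontify only the
   overlapping chunks that have not been done yet.  */
void ensure_fontified(struct lazy_fontifier *lf, size_t start, size_t end)
{
    if (start >= end)
        return;
    size_t first = start / CHUNK;
    size_t last = (end - 1) / CHUNK;
    for (size_t i = first; i <= last && i < NCHUNKS; i++) {
        if (!lf->fontified[i]) {
            lf->fontified[i] = true;   /* stand-in for the real work */
            lf->work_done++;
        }
    }
}

/* Called after an edit in [start, end): the affected chunks are redone
   the next time they are displayed, not immediately.  */
void invalidate(struct lazy_fontifier *lf, size_t start, size_t end)
{
    if (start >= end)
        return;
    size_t first = start / CHUNK;
    size_t last = (end - 1) / CHUNK;
    for (size_t i = first; i <= last && i < NCHUNKS; i++)
        lf->fontified[i] = false;
}
```

The point of the design is that the cost is paid per displayed chunk:
text that never reaches a window is never processed, and an edit only
marks chunks as stale instead of triggering work on the spot.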



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12  6:44                                                             ` Eli Zaretskii
@ 2021-06-12  8:00                                                               ` Daniel Colascione
  2021-06-12  8:08                                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 206+ messages in thread
From: Daniel Colascione @ 2021-06-12  8:00 UTC (permalink / raw)
  To: Eli Zaretskii, Alan Mackenzie; +Cc: rudalics, emacs-devel, monnier, rms



On June 11, 2021 11:45:04 PM Eli Zaretskii <eliz@gnu.org> wrote:

>> Date: Fri, 11 Jun 2021 20:06:30 +0000
>> Cc: Stefan Monnier <monnier@iro.umontreal.ca>, dancol@dancol.org,
>> rudalics@gmx.at, rms@gnu.org, emacs-devel@gnu.org
>> From: Alan Mackenzie <acm@muc.de>
>>
>> Why do we have a problem?  If the time taken to fontify a window is less
>> than the auto-repeat time (the two times are close on a modern machine),
>> this is surely not a problem for somebody with such a machine.  It could
>> be a problem for somebody with a slower machine, or running an
>> unoptimised Emacs.
>
> It is a problem given how much the current fast machines can do during
> that time.  At 3 GHz, 30 msec of CPU time is equivalent to 100 million
> machine instructions.


And if you count electrons, the numbers are in the trillions. So what? Who 
cares how many machine instructions it is? What matters is the latency.
>
>
>>> That is much better, but still too slow, IMO.  Think: it's the time
>>> that it takes us to fontify a single windowful, only a couple of
>>> dozens of lines.  Why does it take so long?
>>
>> It does a very thorough job.
>
>
> AFAIU, sm-c-mode doesn't.  And it still takes tens of milliseconds.
>
>> For example, one bug fix from many years
>> ago that I remember involved the fontification of foo in the following:
>>
>> ....
>> int bar;
>> } foo;
>>
>> What face should foo have?  To answer that, you've got to go back over
>> the brace expression to see what's there.  If it's
>>
>> struct foo
>> {
>> int baz;
>> ....
>>
>> , we need font-lock-variable-name-face for foo.  On the other hand, if we
>> have
>>
>> typedef struct foo
>> {
>> int baz;
>> ....
>>
>> , we need font-lock-type-face.  Before the bug fix, foo just got variable
>> name face.  scan-lists backward over the brace expression takes time,
>> particularly for something the size of struct frame or even bigger.
>
> We should either find a way of making this analysis faster, or give up
> on fontifying these two cases differently.  It is IMO unacceptable
> that redisplay is slowed down so much by mode-specific fontifications.

This is a great example of where incorrect fontification diminishes the 
utility of syntax highlighting more generally. If I can't trust the color 
of a symbol to distinguish a variable declaration from a type declaration, 
why bother fontifying as either? Maybe you'd be more interested in a basic 
c-mode that fontified only comments and strings, and that very quickly, but 
I wouldn't want that.






^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12  6:46                                                   ` Eli Zaretskii
@ 2021-06-12  8:03                                                     ` Daniel Colascione
  2021-06-12  8:13                                                       ` Eli Zaretskii
  2021-06-12 13:51                                                       ` Stefan Monnier
  0 siblings, 2 replies; 206+ messages in thread
From: Daniel Colascione @ 2021-06-12  8:03 UTC (permalink / raw)
  To: Eli Zaretskii, Stefan Monnier; +Cc: acm, spacibba, emacs-devel, rms, rudalics



On June 11, 2021 11:46:18 PM Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Stefan Monnier <monnier@iro.umontreal.ca>
>> Cc: Ergus <spacibba@aol.com>,  Alan Mackenzie <acm@muc.de>,  Eli Zaretskii
>> <eliz@gnu.org>,  rudalics@gmx.at,  rms@gnu.org,  emacs-devel@gnu.org
>> Date: Fri, 11 Jun 2021 16:52:37 -0400
>>
>>> Another problem with stock tree-sitter is that it makes Emacs less
>>> self-hosting. Tree-sitter grammars are written in JavaScript.
>>
>> Yes, there are some technical disadvantages to tree-sitter, indeed.
>> None of them make it unusable, but they do make it less convenient for
>> ELisp hackers and Emacs users.  So it's not a perfect solution, but
>> I don't think that should mean we don't want it in our toolbox.
>
> I agree that these issues shouldn't prevent us from trying to use TS,
> at least as an option.


Sure, but it'd be nice to package TS in such a way that it becomes more 
idiomatically lispy, at least if TS becomes the primary fontification 
engine for some modes. At the very least, it should be possible for users 
to apply ad hoc fontification on top of whatever TS supports. And how could 
something like TS work with, say, bison and flex files without fully 
general multi-mode support (which we also lack)?







^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12  8:00                                                               ` Daniel Colascione
@ 2021-06-12  8:08                                                                 ` Eli Zaretskii
  2021-06-12  9:31                                                                   ` Alan Mackenzie
  0 siblings, 1 reply; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-12  8:08 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: acm, emacs-devel, monnier, rms, rudalics

> From: Daniel Colascione <dancol@dancol.org>
> CC: <monnier@iro.umontreal.ca>, <rudalics@gmx.at>, <rms@gnu.org>, <emacs-devel@gnu.org>
> Date: Sat, 12 Jun 2021 01:00:22 -0700
> 
> > It is a problem given how much the current fast machines can do during
> > that time.  At 3 GHz, 30 msec of CPU time is equivalent to 100 million
> > machine instructions.
> 
> And if you count electrons, the numbers are in the trillions. So what? Who 
> cares how many machine instructions it is? What matters is the latency.

I'm saying that, given how much these machines can do in 30 msec, it
doesn't sound right that we cannot refontify 35 lines of text with all
that processing power.  It tells me that our code is either very
inefficient or does a lot of unnecessary processing (or both).

Alan thought that the performance we have is acceptable.  The numbers
I mentioned would hopefully convince him otherwise.

> > We should either find a way of making this analysis faster, or give up
> > on fontifying these two cases differently.  It is IMO unacceptable
> > that redisplay is slowed down so much by mode-specific fontifications.
> 
> This is a great example of where incorrect fontification diminishes the 
> utility of syntax highlighting more generally. If I can't trust the color 
> of a symbol to distinguish a variable declaration from a type declaration, 
> why bother fontifying as either?

I think we are saying the same, just in different words.

Do you agree that slowing down redisplay so much due to fontification
is unacceptable?



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12  8:03                                                     ` Daniel Colascione
@ 2021-06-12  8:13                                                       ` Eli Zaretskii
  2021-06-12 13:51                                                       ` Stefan Monnier
  1 sibling, 0 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-12  8:13 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: spacibba, rms, emacs-devel, rudalics, monnier, acm

> From: Daniel Colascione <dancol@dancol.org>
> CC: <spacibba@aol.com>, <acm@muc.de>, <rudalics@gmx.at>, <rms@gnu.org>, <emacs-devel@gnu.org>
> Date: Sat, 12 Jun 2021 01:03:03 -0700
> 
> >> Yes, there are some technical disadvantages to tree-sitter, indeed.
> >> None of them make it unusable, but they do make it less convenient for
> >> ELisp hackers and Emacs users.  So it's not a perfect solution, but
> >> I don't think that should mean we don't want it in our toolbox.
> >
> > I agree that these issues shouldn't prevent us from trying to use TS,
> > at least as an option.
> 
> Sure, but it'd be nice to package TS in such a way that it becomes more 
> idiomatically lispy, at least if TS becomes the primary fontification 
> engine for some modes. At the very least, it should be possible for users 
> to apply ad hoc fontification on top of whatever TS supports.

I agree.  Do you consider these goals impractical for some reason?  If
not, then (assuming we otherwise like the results of using TS in
Emacs) we could work towards those goals as followup.

> And how could something like TS work with, say, bison and flex files
> without fully general multi-mode support (which we also lack)?

Good question.  Shouldn't limiting TS to the relevant portions of
buffer text provide the solution?  If not, perhaps we should ask the
TS folks what they suggest.



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 20:52                                                 ` Stefan Monnier
  2021-06-12  6:46                                                   ` Eli Zaretskii
@ 2021-06-12  8:47                                                   ` Daniele Nicolodi
  2021-06-12  8:57                                                     ` tomas
  2021-06-12 14:04                                                     ` Stefan Monnier
  1 sibling, 2 replies; 206+ messages in thread
From: Daniele Nicolodi @ 2021-06-12  8:47 UTC (permalink / raw)
  To: emacs-devel

On 11/06/2021 22:52, Stefan Monnier wrote:
> PS: I think we can expect 99% of Emacs users have a Javascript engine
> already installed (in the form of a web browser),

The JS engine in a web browser and node are two very different beasts.
The main problem with node is that it is very hard to get it to work in
a self contained way that does not involve downloading JS packages from
the network. I also anticipate some resistance in the Emacs community on
depending on the JS ecosystem where licensing is much more "liberal"
than within the GNU project.

Cheers,
Dan



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12  8:47                                                   ` Daniele Nicolodi
@ 2021-06-12  8:57                                                     ` tomas
  2021-06-12 14:04                                                     ` Stefan Monnier
  1 sibling, 0 replies; 206+ messages in thread
From: tomas @ 2021-06-12  8:57 UTC (permalink / raw)
  To: Daniele Nicolodi; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1239 bytes --]

On Sat, Jun 12, 2021 at 10:47:28AM +0200, Daniele Nicolodi wrote:
> On 11/06/2021 22:52, Stefan Monnier wrote:
> > PS: I think we can expect 99% of Emacs users have a Javascript engine
> > already installed (in the form of a web browser),
> 
> The JS engine in a web browser and node are two very different beasts.

I try to keep both of them at a safe distance, FWIW.

> The main problem with node is that it is very hard to get it to work in
> a self contained way that does not involve downloading JS packages from
> the network.

Yep. This is one of the reasons. Watching with horror some npm build
process (gotta do that from time to time to earn my beans) has borne
the clear decision: for me, just... no. Not in my free time, not as
a voluntary project.

>              I also anticipate some resistance in the Emacs community on
> depending on the JS ecosystem where licensing is much more "liberal"
> than within the GNU project.

Those "liberal" licenses could easily be re-licensed to GPL. A possible
advantage is that we get to enjoy the loud whining about how that's
unfair, while the whiners are usually fine with other actors taking the
software proprietary ;-)

(I know, I know)

Cheers
 - t

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12  8:08                                                                 ` Eli Zaretskii
@ 2021-06-12  9:31                                                                   ` Alan Mackenzie
  0 siblings, 0 replies; 206+ messages in thread
From: Alan Mackenzie @ 2021-06-12  9:31 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rudalics, Daniel Colascione, emacs-devel, monnier, rms

Hello, Eli.

On Sat, Jun 12, 2021 at 11:08:30 +0300, Eli Zaretskii wrote:
> > From: Daniel Colascione <dancol@dancol.org>
> > CC: <monnier@iro.umontreal.ca>, <rudalics@gmx.at>, <rms@gnu.org>, <emacs-devel@gnu.org>
> > Date: Sat, 12 Jun 2021 01:00:22 -0700

> > > It is a problem given how much the current fast machines can do during
> > > that time.  At 3 GHz, 30 msec of CPU time is equivalent to 100 million
> > > machine instructions.

> > And if you count electrons, the numbers are in the trillions. So what? Who 
> > cares how many machine instructions it is? What matters is the latency.

> I'm saying that, given how much these machines can do in 30 msec, it
> doesn't sound right that we cannot refontify 35 lines of text with all
> that processing power.  It tells me that our code is either very
> inefficient or does a lot of unnecessary processing (or both).

Or, due to the quirks of the CC Mode languages, it simply needs that
much processing to do an accurate job (or all three).

> Alan thought that the performance we have is acceptable.  The numbers
> I mentioned would hopefully convince him otherwise.

I think the performance is fully acceptable to a normal user on a 3.4
GHz modern machine.  If the processing power is available, why not make
use of it?

> > > We should either find a way of making this analysis faster, or give up
> > > on fontifying these two cases differently.  It is IMO unacceptable
> > > that redisplay is slowed down so much by mode-specific fontifications.

As mentioned already, we have the facility of setting
font-lock-maximum-decoration to 2, which triples fontification speed.
This comes at the cost of accuracy.  A lot of the bug reports I've
fielded over the years have been about fontification inaccuracies.

> > This is a great example of where incorrect fontification diminishes the 
> > utility of syntax highlighting more generally. If I can't trust the color 
> > of a symbol to distinguish a variable declaration from a type declaration, 
> > why bother fontifying as either?

> I think we are saying the same, just in different words.

> Do you agree that slowing down redisplay so much due to fontification
> is unacceptable?

I think I would answer that on a modern machine (certainly from the last
5 years), using a normally optimised Emacs build, the fontification
isn't slowed down.  On a somewhat slower machine, it could become
unacceptable, but there are acceptable workarounds (setting
font-lock-maximum-decoration, or enabling fast-but-imprecise-scrolling,
or enabling deferred fontification).  On a much slower machine, the
above doesn't hold, no.

I can't agree that we should expect C Mode to fontify with around the
same amount of processing as Emacs Lisp Mode.  This isn't reasonable.

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12  6:58                                                                   ` Eli Zaretskii
@ 2021-06-12 11:01                                                                     ` Ergus
  2021-06-12 11:25                                                                       ` Eli Zaretskii
  2021-06-12 14:00                                                                     ` Stefan Monnier
  1 sibling, 1 reply; 206+ messages in thread
From: Ergus @ 2021-06-12 11:01 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: ofv, emacs-devel

On Sat, Jun 12, 2021 at 09:58:58AM +0300, Eli Zaretskii wrote:
>> Date: Sat, 12 Jun 2021 03:08:44 +0200
>> From: Ergus <spacibba@aol.com>
>> Cc: emacs-devel@gnu.org
>>
>> BTW: Eli was concerned about the extra copy of the buffer text to send
>> it to tree-sitter. In this case the time to memcopy an array with all
>> xdisp text is ~0.00085 seconds.
>
>If the intent is to use buffer-(sub)string, then you forget the price
>of consing.  That would trigger frequent GC cycles, which will all but
>kill the otherwise fast performance.
>
>> Any way if we don't want the copy we can use
>> ts_parser_set_included_ranges to exclude the gap and pass the text
>> pointer directly without any copy.
>
>I hope someone will try that and report the results.
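One possible way to avoid the copy, sketched under assumptions: tree-sitter's TSInput lets the parser pull text through a read callback instead of receiving one contiguous string, so a callback could hand out the text before and after the gap as two chunks.  The C model below is neither Emacs nor tree-sitter code.  GapBuffer, gap_buffer_text_bytes and gap_buffer_read are hypothetical names, and the callback is simplified (the real one also receives a row/column position).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a gap buffer: the text lives in
   [0, gap_start) and [gap_end, size) of DATA; the bytes in between
   are unused.  */
typedef struct {
  const char *data;
  uint32_t gap_start;   /* first byte of the gap */
  uint32_t gap_end;     /* first byte after the gap */
  uint32_t size;        /* total allocated bytes, gap included */
} GapBuffer;

/* Number of text bytes, i.e. excluding the gap.  */
static uint32_t
gap_buffer_text_bytes (const GapBuffer *b)
{
  return b->size - (b->gap_end - b->gap_start);
}

/* Return a pointer to the contiguous run of text that starts at the
   logical byte offset BYTE_INDEX, and store its length in *BYTES_READ.
   At or past the end of the text, store 0 and return an empty string.  */
static const char *
gap_buffer_read (const GapBuffer *b, uint32_t byte_index,
                 uint32_t *bytes_read)
{
  assert (b->gap_start <= b->gap_end && b->gap_end <= b->size);
  if (byte_index < b->gap_start)
    {
      /* Before the gap: hand out everything up to the gap.  */
      *bytes_read = b->gap_start - byte_index;
      return b->data + byte_index;
    }
  if (byte_index < gap_buffer_text_bytes (b))
    {
      /* After the gap: skip over it when computing the address.  */
      uint32_t physical = byte_index + (b->gap_end - b->gap_start);
      *bytes_read = b->size - physical;
      return b->data + physical;
    }
  *bytes_read = 0;
  return "";
}
```

A parser driving such a callback would ask for offset 0, receive the text before the gap, then ask for the next offset and receive the rest, so no text is copied or consed.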
>
>The other design issue with TS integration is that I'd like it to plug
>into the JIT font-lock interface of the display engine, so that we
>don't unnecessarily fontify parts of the buffer that won't be
>displayed, and always do fontify the parts that will be. 

If I understand our cc-mode functionality correctly (and much of it, such
as indentation and code navigation, we don't want to lose), then the
"right" way to use tree-sitter (maybe Alan wants to give a more precise
technical description) is probably not only to fontify, but also to use
the tree to attach contextual information to the text (something that I
think cc-mode does), and then let font-lock do the magic.

The tree-sitter tree is basically contextual information.  If, for
example, we have processed the whole buffer and already have the tree,
then scrolling won't need to parse anything.  Adding or removing text is
a localized modification, so with the previous tree we can re-parse only
the modified region.  The choice is then whether we propertize the text
of the whole buffer, just the visible region, or "on demand".

This will save us from the hard parsing in cc-mode to fontify "on the
fly".
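To make the "re-parse only the modified region" idea concrete, here is a minimal C model of what an edit does to the byte ranges of already-parsed nodes.  It is only a sketch: NodeRange, Edit, shift_position and apply_edit are hypothetical names, with the Edit fields modeled after the start_byte/old_end_byte/new_end_byte fields of tree-sitter's TSInputEdit.  Nodes before the edit are left alone, nodes after it are merely shifted, and only nodes touching the edit are flagged for re-parsing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Byte range of one parsed node (hypothetical model).  */
typedef struct {
  uint32_t start_byte;
  uint32_t end_byte;      /* exclusive */
  bool needs_reparse;
} NodeRange;

/* One buffer modification, expressed in byte offsets.  */
typedef struct {
  uint32_t start_byte;    /* where the edit begins */
  uint32_t old_end_byte;  /* end of the replaced text, before the edit */
  uint32_t new_end_byte;  /* end of the inserted text, after the edit */
} Edit;

/* Map a position at or after the replaced text to its new offset.  */
static uint32_t
shift_position (uint32_t pos, Edit edit)
{
  return edit.new_end_byte + (pos - edit.old_end_byte);
}

/* Update NODES for EDIT and return how many of them must be re-parsed.  */
static size_t
apply_edit (NodeRange *nodes, size_t count, Edit edit)
{
  assert (edit.start_byte <= edit.old_end_byte
          && edit.start_byte <= edit.new_end_byte);
  size_t dirty = 0;
  for (size_t i = 0; i < count; i++)
    {
      NodeRange *n = &nodes[i];
      if (n->end_byte < edit.start_byte)
        continue;               /* entirely before the edit: untouched */
      if (n->start_byte > edit.old_end_byte)
        {
          /* Entirely after the edit: same node, new offsets.  */
          n->start_byte = shift_position (n->start_byte, edit);
          n->end_byte = shift_position (n->end_byte, edit);
          continue;
        }
      /* Overlaps or touches the edit: widen it and mark it dirty.  */
      if (n->start_byte > edit.start_byte)
        n->start_byte = edit.start_byte;
      n->end_byte = (n->end_byte >= edit.old_end_byte
                     ? shift_position (n->end_byte, edit)
                     : edit.new_end_byte);
      n->needs_reparse = true;
      dirty++;
    }
  return dirty;
}
```

In this model, replacing one byte by two in the middle of a function dirties only the nodes that contain that byte; sibling statements before and after it keep their parse and, at most, get new offsets.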


> I don't
>really care if TS actually processes a much larger chunk of text, if
>it does that quickly enough, but processing the resulting faces will
>take time on the Emacs side, and that is better avoided. 

But then we won't get all the contextual information we need for
indentation, code navigation or code folding, right?

So we would still be under-using tree-sitter, which can provide
functionality we already have in cc-mode and that we might also like to
have in other, more "limited" modes.

> More
>importantly, integration into JIT font-lock machinery means we don't
>need to use other hooks, which is a step back, since using such hooks
>for fontification was already shown to have serious problems in pre-21
>Emacs: they don't always catch all the changes which require
>re-fontification.
>
I see two approaches here:

1) add the tree-sitter properties/faces to the buffer text (fully or
partially on the visible regions)

2) use the tree-sitter information directly from the tree and add the
visible properties from there.

The second one will require a more complete API of tree-sitter
functions exposed to elisp, but in my opinion it is worth it for accuracy,
speed and simplicity (a single API to rule them all), and for supporting
languages we don't currently handle, like Rust or C++ newer than C++11.

+

Remember that TS has partial parsing (specifying the regions to parse),
re-parsing (using a previous tree for the same buffer as a hint, which
reduces parse times sharply), and even a tree comparison function that
reports the ranges that differ from the "hint" tree, so we know what
needs to be updated.

Plus all the navigation functions, like finding parent or child nodes,
handling parse errors, iterating over nodes and so on.




^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12  3:20                                                                   ` Stefan Monnier
@ 2021-06-12 11:07                                                                     ` Ergus
  0 siblings, 0 replies; 206+ messages in thread
From: Ergus @ 2021-06-12 11:07 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Óscar Fuentes, emacs-devel

On Fri, Jun 11, 2021 at 11:20:58PM -0400, Stefan Monnier wrote:
>> But any way just to start: tree-sitter parses all the text in xdisp.c,
>> (in my machine), in 0.12 seconds from scratch and re-parses it (reusing
>> the tree) 10 times faster; in 0.008 ~ 0.01 seconds.
>


>or is it just
>the time for tree-sitter to do the actual parse for itself?
>

This one. 

It was just a 5-minute benchmark:

https://github.com/Ergus/tree-sitter-benchmark


>
>        Stefan
>



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12 11:01                                                                     ` Ergus
@ 2021-06-12 11:25                                                                       ` Eli Zaretskii
  2021-06-12 15:04                                                                         ` Ergus
  0 siblings, 1 reply; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-12 11:25 UTC (permalink / raw)
  To: Ergus; +Cc: ofv, emacs-devel

> Date: Sat, 12 Jun 2021 13:01:03 +0200
> From: Ergus <spacibba@aol.com>
> Cc: ofv@wanadoo.es, emacs-devel@gnu.org
> 
> If I understand something about our cc-mode functionalities (and many of
> those functionalities we don't want to lose like indentation and code
> navigation). Probably the "right" way to use tree-sitter (maybe Alan
> wants to give a more precise technical description) is not only fontify but
> use the tree information to add contextual information to the text
> (something that I think cc-mode does.) And then let font-lock do the
> magic.
> 
> The tree-sitter tree is basically contextual information, and (for
> example) if we have processed the whole buffer and we already have the
> tree, then scrolling won't need to parse anything, adding or removing
> text is a localized modification, so with the previous tree we can
> re-parse only the modified region. The choice may be then if we
> propertize the text of the whole buffer or just the visible region OR if
> we want to "propertize on demand".
> 
> This will save us from the hard parsing in cc-mode to fontify "on the
> fly".

I'm not sure I understand what you are suggesting.  Can you describe
your suggestion in terms of 'face' text properties and the 'fontified'
property, and explain how those should fit into the existing redisplay
mechanisms?

> > I don't
> >really care if TS actually processes a much larger chunk of text, if
> >it does that quickly enough, but processing the resulting faces will
> >take time on the Emacs side, and that is better avoided. 
> 
> But then we won't get all the contextual information we need for
> indentation, code navigation or fold the code right?

Why not?

> I see two approaches here:
> 
> 1) add the tree-sitter properties/faces to the buffer text (fully or
> partially on the visible regions)
> 
> 2) use the tree-sitter information directly from the tree and add the
> visible properties from there.
> 
> This second one will require a more complete api of tree-sitter
> functions exposed to elisp, but in my opinion it worth it in accuracy,
> speed and simplicity (a single API to rule them all). And to support
> many languages we don't actually have like rust or the fancy C++ > 11. 

Why can't we have both?  The information you are talking about, which
is needed by Emacs features other than fontification, can be used by
those other Emacs features when needed.  You seem to be saying that
these two alternatives are mutually-exclusive, but you didn't explain
why.



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12  5:20                                                               ` Theodor Thornhill
@ 2021-06-12 13:40                                                                 ` Stefan Monnier
  2021-06-12 15:56                                                                   ` Theodor Thornhill
  0 siblings, 1 reply; 206+ messages in thread
From: Stefan Monnier @ 2021-06-12 13:40 UTC (permalink / raw)
  To: Theodor Thornhill
  Cc: Ergus, Eli Zaretskii, dancol, acm, rudalics, rms, emacs-devel

> @Stefan - I'm not sure I understand what you mean by troublesome for
> elisp hackers.  These grammars have a lisp-like DSL, which is pretty
> usable through C-M-x and defvars, see:
> https://github.com/emacs-csharp/csharp-mode/blob/master/csharp-tree-sitter.el#L44.

AFAIK the grammar itself is still written in Javascript.

> IME it's not the same as normal elisp hacking, but it's good
> enough.  That's only an opinion though.

The disadvantages I see for ELisp hackers are just technical hurdles
that can be overcome with extra tooling.  I'm not particularly worried
about them, indeed.

> These 410 lines cover way more than what CC Mode is atm.  It would be
> *great* to move the tree sitter part to emacs.

Agreed.  Maybe a first step would be to get copyright assignments and
include the tree sitter module in GNU ELPA?


        Stefan




^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12  6:38                                                             ` Eli Zaretskii
@ 2021-06-12 13:44                                                               ` Stefan Monnier
  2021-06-12 14:14                                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 206+ messages in thread
From: Stefan Monnier @ 2021-06-12 13:44 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dancol, acm, rudalics, rms, emacs-devel

Eli Zaretskii [2021-06-12 09:38:09] wrote:
>> From: Stefan Monnier <monnier@iro.umontreal.ca>
>> Cc: dancol@dancol.org,  acm@muc.de,  rudalics@gmx.at,  rms@gnu.org,
>>   emacs-devel@gnu.org
>> Date: Fri, 11 Jun 2021 15:57:10 -0400
>> > Will Emacs 27.2 do?  If you must see results from an optimized build
>> > of Emacs 28, I'll have to build one first.
>> As mentioned, mine was not an optimized build, on the contrary.
> Well, using -Og _is_ optimizing, albeit less than -O2 does.

Ah, I see what you mean.  For me the first step of optimizing is to
disable the extra checks, whereas this was built with

    --enable-checking --enable-check-lisp-object-type


-- Stefan




^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12  8:03                                                     ` Daniel Colascione
  2021-06-12  8:13                                                       ` Eli Zaretskii
@ 2021-06-12 13:51                                                       ` Stefan Monnier
  1 sibling, 0 replies; 206+ messages in thread
From: Stefan Monnier @ 2021-06-12 13:51 UTC (permalink / raw)
  To: Daniel Colascione
  Cc: Eli Zaretskii, spacibba, acm, rudalics, rms, emacs-devel

> Sure, but it'd be nice to package TS in such a way that it becomes more
> idiomatically lispy, at least if TS becomes the primary fontification engine
> for some modes. At the very least, it should be possible for users to apply
> ad hoc fontification on top of whatever TS supports. And how could something
> like TS work with, say, bison and flex files without fully general
> multi-mode support (which we also lack)?

While I don't think you can compose compiled tree-sitter grammars, the
source grammars can easily be composed, so tree-sitter is perfectly able
to handle "multi-mode" buffers.


        Stefan




^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12  6:58                                                                   ` Eli Zaretskii
  2021-06-12 11:01                                                                     ` Ergus
@ 2021-06-12 14:00                                                                     ` Stefan Monnier
  2021-06-12 14:20                                                                       ` Eli Zaretskii
  1 sibling, 1 reply; 206+ messages in thread
From: Stefan Monnier @ 2021-06-12 14:00 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Ergus, ofv, emacs-devel

> The other design issue with TS integration is that I'd like it to plug
> into the JIT font-lock interface of the display engine, so that we
> don't unnecessarily fontify parts of the buffer that won't be
> displayed, and always do fontify the parts that will be.

Hm... AFAIK that's already what emacs-tree-sitter does.


        Stefan




^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12  8:47                                                   ` Daniele Nicolodi
  2021-06-12  8:57                                                     ` tomas
@ 2021-06-12 14:04                                                     ` Stefan Monnier
  1 sibling, 0 replies; 206+ messages in thread
From: Stefan Monnier @ 2021-06-12 14:04 UTC (permalink / raw)
  To: Daniele Nicolodi; +Cc: emacs-devel

Daniele Nicolodi [2021-06-12 10:47:28] wrote:
> On 11/06/2021 22:52, Stefan Monnier wrote:
>> PS: I think we can expect 99% of Emacs users have a Javascript engine
>> already installed (in the form of a web browser),
> The JS engine in a web browser and node are two very different beasts.
> The main problem with node is that it is very hard to get it to work in
> a self contained way that does not involve downloading JS packages from
> the network. I also anticipate some resistance in the Emacs community on
> depending on the JS ecosystem where licensing is much more "liberal"
> than within the GNU project.

I prefer to look at it as "how will we do it" than "what problems may
prevent us from doing it".  Otherwise, we'll never get there.


        Stefan




^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12 13:44                                                               ` Stefan Monnier
@ 2021-06-12 14:14                                                                 ` Eli Zaretskii
  0 siblings, 0 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-12 14:14 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: acm, dancol, emacs-devel, rms, rudalics

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: dancol@dancol.org,  acm@muc.de,  rudalics@gmx.at,  rms@gnu.org,
>   emacs-devel@gnu.org
> Date: Sat, 12 Jun 2021 09:44:10 -0400
> 
> > Well, using -Og _is_ optimizing, albeit less than -O2 does.
> 
> Ah, I see what you mean.  For me the first step of optimizing is to
> disable the extra checks, whereas this was built with
> 
>     --enable-checking --enable-check-lisp-object-type

I think the effect of these on redisplay speed is exaggerated.



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12 14:00                                                                     ` Stefan Monnier
@ 2021-06-12 14:20                                                                       ` Eli Zaretskii
  2021-06-12 14:33                                                                         ` Stefan Monnier
  0 siblings, 1 reply; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-12 14:20 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: ofv, spacibba, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Ergus <spacibba@aol.com>,  ofv@wanadoo.es,  emacs-devel@gnu.org
> Date: Sat, 12 Jun 2021 10:00:26 -0400
> 
> > The other design issue with TS integration is that I'd like it to plug
> > into the JIT font-lock interface of the display engine, so that we
> > don't unnecessarily fontify parts of the buffer that won't be
> > displayed, and always do fontify the parts that will be.
> 
> Hm... AFAIK that's already what emacs-tree-sitter does.

Can you point me to the code which does that?



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12 14:20                                                                       ` Eli Zaretskii
@ 2021-06-12 14:33                                                                         ` Stefan Monnier
  2021-06-12 15:06                                                                           ` Eli Zaretskii
  0 siblings, 1 reply; 206+ messages in thread
From: Stefan Monnier @ 2021-06-12 14:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: spacibba, ofv, emacs-devel

Eli Zaretskii [2021-06-12 17:20:36] wrote:

>> From: Stefan Monnier <monnier@iro.umontreal.ca>
>> Cc: Ergus <spacibba@aol.com>,  ofv@wanadoo.es,  emacs-devel@gnu.org
>> Date: Sat, 12 Jun 2021 10:00:26 -0400
>> 
>> > The other design issue with TS integration is that I'd like it to plug
>> > into the JIT font-lock interface of the display engine, so that we
>> > don't unnecessarily fontify parts of the buffer that won't be
>> > displayed, and always do fontify the parts that will be.
>> 
>> Hm... AFAIK that's already what emacs-tree-sitter does.
>
> Can you point me to the code which does that?

The code is in `tree-sitter-hl.el`, where they define
`tree-sitter-hl-mode` which is enabled by `tree-sitter-hl--setup`
where they

    (add-function :override (local 'font-lock-fontify-region-function)
                  #'tree-sitter-hl--highlight-region)
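Why overriding the region function is enough to get "fontify only what is displayed" can be shown with a small C model of the jit-lock idea; it is a sketch, not the Emacs implementation.  Buffer, display_region, buffer_changed and example_highlight_region are hypothetical names: the fontified array stands in for the `fontified` text property, and the function pointer stands in for `font-lock-fontify-region-function`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BUFFER_SIZE 64

/* A highlighter fills FACES for the region [BEG, END).  */
typedef void (*FontifyRegionFn) (unsigned char *faces, size_t beg, size_t end);

typedef struct {
  unsigned char faces[BUFFER_SIZE];  /* models the `face' property */
  bool fontified[BUFFER_SIZE];       /* models the `fontified' property */
  FontifyRegionFn fontify_region;    /* the pluggable region function */
  size_t calls;                      /* how often the highlighter ran */
} Buffer;

/* Stand-in highlighter: gives every position in the region face 1.  */
static void
example_highlight_region (unsigned char *faces, size_t beg, size_t end)
{
  for (size_t i = beg; i < end; i++)
    faces[i] = 1;
}

/* Called when [BEG, END) is about to be displayed: run the highlighter
   only on the stretches that are not yet fontified.  */
static void
display_region (Buffer *b, size_t beg, size_t end)
{
  assert (beg <= end && end <= BUFFER_SIZE);
  size_t pos = beg;
  while (pos < end)
    {
      if (b->fontified[pos])
        {
          pos++;
          continue;
        }
      size_t run_end = pos;
      while (run_end < end && !b->fontified[run_end])
        run_end++;
      b->fontify_region (b->faces, pos, run_end);
      b->calls++;
      for (size_t i = pos; i < run_end; i++)
        b->fontified[i] = true;
      pos = run_end;
    }
}

/* Called after a modification of [BEG, END): the next display of that
   text must fontify it again.  */
static void
buffer_changed (Buffer *b, size_t beg, size_t end)
{
  assert (beg <= end && end <= BUFFER_SIZE);
  for (size_t i = beg; i < end; i++)
    b->fontified[i] = false;
}
```

In this model the highlighter never sees text that is not displayed, and redisplaying unchanged text costs no highlighter call at all; which parser produces the faces is irrelevant to the mechanism.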


-- Stefan




^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12 11:25                                                                       ` Eli Zaretskii
@ 2021-06-12 15:04                                                                         ` Ergus
  2021-06-12 15:16                                                                           ` Eli Zaretskii
  0 siblings, 1 reply; 206+ messages in thread
From: Ergus @ 2021-06-12 15:04 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: ofv, emacs-devel

On Sat, Jun 12, 2021 at 02:25:45PM +0300, Eli Zaretskii wrote:
>> Date: Sat, 12 Jun 2021 13:01:03 +0200
>> From: Ergus <spacibba@aol.com>
>> Cc: ofv@wanadoo.es, emacs-devel@gnu.org
>>
>> If I understand something about our cc-mode functionalities (and many of
>> those functionalities we don't want to lose like indentation and code
>> navigation). Probably the "right" way to use tree-sitter (maybe Alan
>> wants to give a more precise technical description) is not only fontify but
>> use the tree information to add contextual information to the text
>> (something that I think cc-mode does.) And then let font-lock do the
>> magic.
>>
>> The tree-sitter tree is basically contextual information, and (for
>> example) if we have processed the whole buffer and we already have the
>> tree, then scrolling won't need to parse anything, adding or removing
>> text is a localized modification, so with the previous tree we can
>> re-parse only the modified region. The choice may be then if we
>> propertize the text of the whole buffer or just the visible region OR if
>> we want to "propertize on demand".
>>
>> This will save us from the hard parsing in cc-mode to fontify "on the
>> fly".
>
>I'm not sure I understand what you are suggesting.  Can you describe
>your suggestion in terms of 'face' text properties and the 'fontified'
>property, and explain how those should fit into the existing redisplay
>mechanisms?
>
cc-mode has something similar to the tree-sitter properties: the
information we get from c-syntactic-context or c-langelem-sym.

I don't actually know where cc-mode stores this information now.

But right now it is set on the text only for the regions (the visible
ones) that are parsed on demand, which is why it affects commands like
scrolling.  So there are two operations: 1) parsing, and 2) setting these
properties on the text (or wherever they are stored).

On the other hand, when we want things like c-defun-name-and-limits, we
also search on the fly, with functions like c-declaration-limits-1 or
c-go-list-backward, which try to recognize or find the contextual
information.

With tree-sitter, by contrast:

Suppose we have a buffer like:

int main()
{
	int i = 5;

	return 0;
}
  

The tree-sitter parser returns a tree that may be represented like:

(translation_unit
  (function_definition
    type: (primitive_type)
    declarator: (function_declarator
                  declarator: (identifier)
                  parameters: (parameter_list))
    body: (compound_statement
            (declaration
              type: (primitive_type)
              declarator: (init_declarator
                            declarator: (identifier)
                            value: (number_literal)))
            (return_statement (number_literal)))))

This tree can be traversed, accessed and recalculated very fast.  After
a change it can be updated even faster, section by section, if we know
the rest hasn't changed.

Suppose we have a visible region (say we only see the line "int i = 5;"
because our screen is very small for this example).

As we know where that line starts in the buffer, we can find the
nearest node that extends over this region using functions like:

ts_node_first_child_for_byte
ts_node_descendant_for_byte_range
ts_node_named_descendant_for_byte_range

The design choice comes here.

1) We can iterate over (or traverse) the "useful" subtree to convert
that information into text properties directly (using
ts_tree_cursor_current_field_id).

But if I remember correctly, that could have some implications for
redisplay, right?  Even when we modify properties that are not visible
or that belong to an outer node.

2) We never convert the tree information into properties (as we know
them in the text now), but just use the ts_tree_cursor_* set of
functions to access the information and tell the display engine which
faces to use.

So on the Lisp side, instead of accessing information stored in
properties, we just call a wrapper around the tree-sitter C functions.

----

The first approach is probably simpler to implement, but less
efficient: translating between C and Lisp types and constantly adding
properties on every update is extra work on the Lisp side.

This may be optimized a bit using, for example,
ts_tree_get_changed_ranges.

The second approach may require a bit more work, but it will solve
indentation and code navigation for all modes with a common pattern and
a single API, while the display engine could access all the information
directly, from C to C.

The key difference shows in basic commands like up-list:

1) with the first approach, it will search the buffer for text property
changes, use syntax-ppss and so on;

2) with the second one, it will just call ts_node_parent and go to
ts_node_start_byte.
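The second variant can be modeled in a few lines of C.  This is a sketch on a hand-built tree, not the tree-sitter API: Node, node_add_child, node_descendant_for_byte_range and up_list_position are hypothetical stand-ins for what ts_node_descendant_for_byte_range, ts_node_parent and ts_node_start_byte would provide on a real tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CHILDREN 4

/* Hypothetical syntax-tree node with a parent link.  */
typedef struct Node Node;
struct Node {
  const char *type;
  uint32_t start_byte;
  uint32_t end_byte;            /* exclusive */
  Node *parent;
  Node *children[MAX_CHILDREN];
  size_t child_count;
};

static void
node_add_child (Node *parent, Node *child)
{
  assert (parent->child_count < MAX_CHILDREN);
  child->parent = parent;
  parent->children[parent->child_count++] = child;
}

/* Smallest node under ROOT that spans [START, END), or NULL if ROOT
   itself does not span it.  */
static Node *
node_descendant_for_byte_range (Node *root, uint32_t start, uint32_t end)
{
  if (start < root->start_byte || end > root->end_byte)
    return NULL;
  Node *node = root;
  for (;;)
    {
      Node *next = NULL;
      for (size_t i = 0; i < node->child_count; i++)
        {
          Node *c = node->children[i];
          if (c->start_byte <= start && end <= c->end_byte)
            {
              next = c;
              break;
            }
        }
      if (!next)
        return node;            /* no child spans the range */
      node = next;
    }
}

/* Model of `up-list': the byte position where the node enclosing the
   innermost node at POS starts.  */
static uint32_t
up_list_position (Node *root, uint32_t pos)
{
  Node *node = node_descendant_for_byte_range (root, pos, pos);
  if (!node)
    return pos;                 /* POS is outside the tree: stay put */
  return node->parent ? node->parent->start_byte : node->start_byte;
}
```

With the byte ranges of the example function above (declaration at bytes 14..24, its literal at 22..23), moving up from the literal lands on byte 14 without scanning any buffer text.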


>> > I don't
>> >really care if TS actually processes a much larger chunk of text, if
>> >it does that quickly enough, but processing the resulting faces will
>> >take time on the Emacs side, and that is better avoided.
>>
>> But then we won't get all the contextual information we need for
>> indentation, code navigation or fold the code right?
>
>Why not?
>
Translating that information may also be a lot of work.

>> I see two approaches here:
>>
>> 1) add the tree-sitter properties/faces to the buffer text (fully or
>> partially on the visible regions)
>>
>> 2) use the tree-sitter information directly from the tree and add the
>> visible properties from there.
>>
>> This second one will require a more complete api of tree-sitter
>> functions exposed to elisp, but in my opinion it worth it in accuracy,
>> speed and simplicity (a single API to rule them all). And to support
>> many languages we don't actually have like rust or the fancy C++ > 11.
>
>Why can't we have both?  The information you are talking about, which
>is needed by Emacs features other than fontification, can be used by
>those other Emacs features when needed.  You seem to be saying that
>these two alternatives are mutually-exclusive, but you didn't explain
>why.
>
They are not exclusive, but redundant.  If we use the current
infrastructure, we will spend a lot of time translating properties and
contextual information, and keeping parts of them from becoming
outdated.  Navigation and indentation will continue to be based on
properties that we need to set and update all the time, matching them
one by one.

Basically we will be duplicating information that is already in the
tree, creating many list objects, overloading the GC, and so on.  So
potentially we will save only the parsing time.

The first approach can work with a very primitive API to handle and
iterate over the tree-sitter tree.  The second one will require cursors,
finders and some other features of the tree-sitter API; it will surely
improve performance, but it replaces a lot of the work Lisp is doing now.

The second approach will probably please the C developers more than
the Lisp ones.




^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12 14:33                                                                         ` Stefan Monnier
@ 2021-06-12 15:06                                                                           ` Eli Zaretskii
  2021-06-12 15:46                                                                             ` Stefan Monnier
  0 siblings, 1 reply; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-12 15:06 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: ofv, spacibba, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: spacibba@aol.com,  ofv@wanadoo.es,  emacs-devel@gnu.org
> Date: Sat, 12 Jun 2021 10:33:59 -0400
> 
> The code is in `tree-sitter-hl.el`, where they define
> `tree-sitter-hl-mode` which is enabled by `tree-sitter-hl--setup`
> where they
> 
>     (add-function :override (local 'font-lock-fontify-region-function)
>                   #'tree-sitter-hl--highlight-region)

I've seen that, but it's full of FIXMEs that basically tell me this is
incomplete and perhaps even kludgey?

I don't really understand why the workarounds are needed (nor why
font-lock-keywords would need to still be supported with TS).

And I cannot say I'm happy with the uses of buffer-substring and the
many conversions between character positions and byte positions.

Maybe these could be cleaned up?
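For reference, the cost behind those conversions: in a UTF-8 style encoding a character position cannot be turned into a byte position by arithmetic alone, some scan over the text is needed.  The C sketch below shows the naive linear version; utf8_is_char_start, utf8_char_to_byte and utf8_byte_to_char are hypothetical helpers using 0-based offsets, not the code behind Emacs's position-bytes and byte-to-position.

```c
#include <assert.h>
#include <stddef.h>

/* True if BYTE starts a UTF-8 character, i.e. it is not a 10xxxxxx
   continuation byte.  */
static int
utf8_is_char_start (unsigned char byte)
{
  return (byte & 0xC0) != 0x80;
}

/* Byte offset of the character with 0-based index CHAR_INDEX in TEXT
   (LEN bytes of valid UTF-8), or LEN if the index is past the end.  */
static size_t
utf8_char_to_byte (const char *text, size_t len, size_t char_index)
{
  size_t chars = 0;
  for (size_t i = 0; i < len; i++)
    if (utf8_is_char_start ((unsigned char) text[i]))
      {
        if (chars == char_index)
          return i;
        chars++;
      }
  return len;
}

/* Number of characters that start before BYTE_OFFSET in TEXT.  */
static size_t
utf8_byte_to_char (const char *text, size_t len, size_t byte_offset)
{
  assert (byte_offset <= len);
  size_t chars = 0;
  for (size_t i = 0; i < byte_offset; i++)
    if (utf8_is_char_start ((unsigned char) text[i]))
      chars++;
  return chars;
}
```

Each call is linear in the distance scanned, so converting back and forth for every node of a large region adds up; keeping positions in a single unit for as long as possible avoids most of that work.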



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12 15:04                                                                         ` Ergus
@ 2021-06-12 15:16                                                                           ` Eli Zaretskii
  2021-06-12 15:23                                                                             ` Ergus
  0 siblings, 1 reply; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-12 15:16 UTC (permalink / raw)
  To: Ergus; +Cc: ofv, emacs-devel

> Date: Sat, 12 Jun 2021 17:04:02 +0200
> From: Ergus <spacibba@aol.com>
> Cc: ofv@wanadoo.es, emacs-devel@gnu.org
> 
> >> I see two approaches here:
> >>
> >> 1) add the tree-sitter properties/faces to the buffer text (fully or
> >> partially on the visible regions)
> >>
> >> 2) use the tree-sitter information directly from the tree and add the
> >> visible properties from there.
> >>
> >> This second one will require a more complete api of tree-sitter
> >> functions exposed to elisp, but in my opinion it worth it in accuracy,
> >> speed and simplicity (a single API to rule them all). And to support
> >> many languages we don't actually have like rust or the fancy C++ > 11.
> >
> >Why can't we have both?  The information you are talking about, which
> >is needed by Emacs features other than fontification, can be used by
> >those other Emacs features when needed.  You seem to be saying that
> >these two alternatives are mutually-exclusive, but you didn't explain
> >why.
> >
> They are not exclusive, but redundant. If we use the current
> infrastructure then we will spend a lot of time translating properties
> and contextual information.

That depends on what you mean by "current infrastructure".

> And avoiding to have part of them outdated. Navigation and
> indentation will continue to be based on properties we need to set
> and update all the time to make the match one by one.
> 
> Basically we will be duplicating the information that is already in the
> tree. Creating many list objects, overloading the gc, and so on. So we
> potentially will save only the parsing time.

Why would we do a silly thing like that?

> The first one may work with a very primitive api to handle and iterate
> the tree-sitter tree. The second one will require to use cursors,
> finders and some other features from the tree-sitter API; improving
> performance for sure but replacing a lot of the work lisp is doing now.
> 
> The second approach will probably make happy the C developers more than
> the Lisp ones.

So where's the dilemma?



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12 15:16                                                                           ` Eli Zaretskii
@ 2021-06-12 15:23                                                                             ` Ergus
  2021-06-12 15:35                                                                               ` Eli Zaretskii
  0 siblings, 1 reply; 206+ messages in thread
From: Ergus @ 2021-06-12 15:23 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: ofv, emacs-devel

On Sat, Jun 12, 2021 at 06:16:02PM +0300, Eli Zaretskii wrote:
>> Date: Sat, 12 Jun 2021 17:04:02 +0200
>> From: Ergus <spacibba@aol.com>
>> Cc: ofv@wanadoo.es, emacs-devel@gnu.org
>>
>> >> I see two approaches here:
>> >>
>> >> 1) add the tree-sitter properties/faces to the buffer text (fully or
>> >> partially on the visible regions)
>> >>
>> >> 2) use the tree-sitter information directly from the tree and add the
>> >> visible properties from there.
>> >>
>> >> This second one will require a more complete api of tree-sitter
>> >> functions exposed to elisp, but in my opinion it worth it in accuracy,
>> >> speed and simplicity (a single API to rule them all). And to support
>> >> many languages we don't actually have like rust or the fancy C++ > 11.
>> >
>> >Why can't we have both?  The information you are talking about, which
>> >is needed by Emacs features other than fontification, can be used by
>> >those other Emacs features when needed.  You seem to be saying that
>> >these two alternatives are mutually-exclusive, but you didn't explain
>> >why.
>> >
>> They are not exclusive, but redundant. If we use the current
>> infrastructure then we will spend a lot of time translating properties
>> and contextual information.
>
>That depends on what you mean by "current infrastructure".
>
Properties, properties navigation.

>> And avoiding to have part of them outdated. Navigation and
>> indentation will continue to be based on properties we need to set
>> and update all the time to make the match one by one.
>>
>> Basically we will be duplicating the information that is already in the
>> tree. Creating many list objects, overloading the gc, and so on. So we
>> potentially will save only the parsing time.
>
>Why would we do a silly thing like that?
>
To convert the tree into Lisp objects that we can use with Lisp
functions (to check, read, compare, and so on).

>> The first one may work with a very primitive api to handle and iterate
>> the tree-sitter tree. The second one will require to use cursors,
>> finders and some other features from the tree-sitter API; improving
>> performance for sure but replacing a lot of the work lisp is doing now.
>>
>> The second approach will probably make happy the C developers more than
>> the Lisp ones.
>
>So where's the dilemma?
>
For me there is none, but Lisp developers may not like relying so much
on an external library.




* Re: cc-mode fontification feels random
  2021-06-12 15:23                                                                             ` Ergus
@ 2021-06-12 15:35                                                                               ` Eli Zaretskii
  0 siblings, 0 replies; 206+ messages in thread
From: Eli Zaretskii @ 2021-06-12 15:35 UTC (permalink / raw)
  To: Ergus; +Cc: ofv, emacs-devel

> Date: Sat, 12 Jun 2021 17:23:44 +0200
> From: Ergus <spacibba@aol.com>
> Cc: ofv@wanadoo.es, emacs-devel@gnu.org
> 
> >> They are not exclusive, but redundant. If we use the current
> >> infrastructure then we will spend a lot of time translating properties
> >> and contextual information.
> >
> >That depends on what you mean by "current infrastructure".
> >
> Properties, properties navigation.

If you mean the special properties used by CC Mode, then we are not
restricted to using them.  We can invent new ones, if needed.  Or use
something other than text properties, if that makes sense.  IOW, I
don't see why this would be something we need to bother about at this
point.

> >> And avoiding to have part of them outdated. Navigation and
> >> indentation will continue to be based on properties we need to set
> >> and update all the time to make the match one by one.
> >>
> >> Basically we will be duplicating the information that is already in the
> >> tree. Creating many list objects, overloading the gc, and so on. So we
> >> potentially will save only the parsing time.
> >
> >Why would we do a silly thing like that?
> >
> to convert the tree into some lisp objects we can use with lisp
> functions (to check, read, compare and so on)
> 
> >> The first one may work with a very primitive api to handle and iterate
> >> the tree-sitter tree. The second one will require to use cursors,
> >> finders and some other features from the tree-sitter API; improving
> >> performance for sure but replacing a lot of the work lisp is doing now.
> >>
> >> The second approach will probably make happy the C developers more than
> >> the Lisp ones.
> >
> >So where's the dilemma?
> >
> For me none, but lisp developers may not like to rely so much in an
> external library.

We could have accessor functions exposed to Lisp, if that's needed.

Again, I don't see why this should bother us now.  We have enough
means to solve these problems.




* Re: cc-mode fontification feels random
  2021-06-12 15:06                                                                           ` Eli Zaretskii
@ 2021-06-12 15:46                                                                             ` Stefan Monnier
  0 siblings, 0 replies; 206+ messages in thread
From: Stefan Monnier @ 2021-06-12 15:46 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: spacibba, ofv, emacs-devel

> I've seen that, but it's full of FIXMEs that basically tell me this is
> incomplete and perhaps even kludgey?

I haven't looked in detail at how it works, but w.r.t. its interaction
with font-lock and jit-lock it seems sane.

> I don't really understand why the workarounds are needed (nor why
> font-lock-keywords would need to still be supported with TS).

`font-lock-keywords` is (ab)used by several other packages, like
hi-lock, so a major mode that uses font-lock but sets it up in a way
that ignores `font-lock-keywords` introduces problems.

Maybe instead of overriding `font-lock-fontify-region-function` it would
be better to use a single entry in `font-lock-keywords` which calls something
like `tree-sitter-hl--highlight-region`, but these are minor details
that don't affect the general approach.

> And I cannot say I'm happy with the uses of buffer-substring and the
> many conversions between character positions and byte positions.
> Maybe these could be cleaned up?

I'm pretty sure the code (and the authors) would welcome help making it
cleaner, yes.


        Stefan





* Re: cc-mode fontification feels random
  2021-06-12 13:40                                                                 ` Stefan Monnier
@ 2021-06-12 15:56                                                                   ` Theodor Thornhill
  2021-06-12 16:59                                                                     ` Ergus
  2021-06-12 17:25                                                                     ` Stefan Monnier
  0 siblings, 2 replies; 206+ messages in thread
From: Theodor Thornhill @ 2021-06-12 15:56 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Ergus, Eli Zaretskii, dancol, acm, rudalics, emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> @Stefan - I'm not sure I understand what you mean by troublesome for
>> elisp hackers.  These grammars have a lisp-like dsl, and is pretty
>> usable through C-M-x and defvars, see:
>> https://github.com/emacs-csharp/csharp-mode/blob/master/csharp-tree-sitter.el#L44.
>
> AFAIK the grammar itself is still written in Javascript.
>

Yeah, but compiled parsers can be supplied through CI or something like that.


[...]
>
> Agreed.  Maybe a first step would be to get copyright assignments and
> include the tree sitter module in GNU ELPA?
>

If I read some of these mails correctly it seems like that wouldn't be
possible due to interest from some of the parties involved in the main
package.  I don't know the details on that, though.  And Eli seems
unhappy with what's there.

As for making a little more concrete proposal for how to move forward,
would this be something like what we want?

- create/use c or rust bindings
- create an elisp-layer for interaction with the parse tree
- hook fontification and indentation into that elisp-layer

It feels like the elisp-layer will be the easiest part.  I'm not really
well versed in where to look in the c code of emacs for where and how to
link this, so some pointers would be nice.

It looks like most people agree that tree sitter support is wanted, so
maybe it's time to start doing it?  I can surely have a stab at it, but
I'd like some guidance for how to proceed best - if it's wanted, that
is.

--
Theodor




* Re: cc-mode fontification feels random
  2021-06-12 15:56                                                                   ` Theodor Thornhill
@ 2021-06-12 16:59                                                                     ` Ergus
  2021-06-12 17:51                                                                       ` Theodor Thornhill
  2021-06-12 17:25                                                                     ` Stefan Monnier
  1 sibling, 1 reply; 206+ messages in thread
From: Ergus @ 2021-06-12 16:59 UTC (permalink / raw)
  To: Theodor Thornhill
  Cc: Stefan Monnier, Eli Zaretskii, dancol, acm, rudalics, emacs-devel

On Sat, Jun 12, 2021 at 05:56:34PM +0200, Theodor Thornhill wrote:
>Stefan Monnier <monnier@iro.umontreal.ca> writes:
>
>>> @Stefan - I'm not sure I understand what you mean by troublesome for
>>> elisp hackers.  These grammars have a lisp-like dsl, and is pretty
>>> usable through C-M-x and defvars, see:
>>> https://github.com/emacs-csharp/csharp-mode/blob/master/csharp-tree-sitter.el#L44.
>>
>> AFAIK the grammar itself is still written in Javascript.
>>
>
>Yeah, but compiled parsers can be supplied through CI or something like that.
>
>
>[...]
>>
>> Agreed.  Maybe a first step would be to get copyright assignments and
>> include the tree sitter module in GNU ELPA?
>>
>
>If I read some of these mails correctly it seems like that wouldn't be
>possible due to interest from some of the parties involved in the main
>package.  I don't know the details on that, though.  And Eli seems
>unhappy with what's there.
>
>As for making a little more concrete proposal for how to move forward,
>would this be something like what we want?
>
>- create/use c or rust bindings

Hi: 

Eli and the others will give better info for sure, but just to start
(and also they may correct my ideas):

First, a "mode-local" initialization of the parser is needed, based on
the major mode (as explained in the TS doc). The parser should probably
be stored somewhere in the "mode" to avoid duplicating parsers for the
same language. This would probably run once per mode (so it could
perfectly well live on the Lisp side) and would be a wrapper that calls:

ts_parser_new
ts_parser_set_language

After that, on the C side, I think that all we need is in buffer.{h,c}:
we pass current_buffer->text->beg (or similar) directly to
ts_parser_parse_string or ts_parser_parse_string_encoding.

Here we must exclude the gap region, maybe with
ts_parser_set_included_ranges (all the information about the gap seems
to be available as macros in buffer.h).
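An alternative to excluding the gap, sketched here under assumptions:
tree-sitter's ts_parser_parse takes a TSInput whose read callback hands
the text over chunk by chunk, so the two halves around the gap can be
returned separately without building a flat string. The names below
(struct gap_buffer, gap_read) are hypothetical stand-ins, not the
macros from buffer.h, and the callback only mirrors the shape of
TSInput.read (its TSPoint argument is dropped so the block compiles
without the tree-sitter headers).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an Emacs buffer: the text lives in one
   allocation with a gap of unused bytes somewhere in the middle.  */
struct gap_buffer {
  const char *beg;    /* start of the allocation */
  size_t gap_start;   /* byte offset at which the gap begins */
  size_t gap_size;    /* number of unused bytes in the gap */
  size_t text_size;   /* bytes of real text, gap excluded */
};

/* Return a pointer to the contiguous chunk of text that starts at the
   logical byte offset BYTE_INDEX and store its length in *BYTES_READ.
   A length of 0 signals the end of the text.  */
static const char *
gap_read (void *payload, uint32_t byte_index, uint32_t *bytes_read)
{
  const struct gap_buffer *b = payload;
  assert (b->gap_start <= b->text_size);

  if (byte_index >= b->text_size)
    {
      *bytes_read = 0;
      return "";
    }
  if (byte_index < b->gap_start)
    {
      /* Chunk before the gap: stop where the gap begins.  */
      *bytes_read = (uint32_t) (b->gap_start - byte_index);
      return b->beg + byte_index;
    }
  /* Chunk after the gap: skip over the unused bytes.  */
  *bytes_read = (uint32_t) (b->text_size - byte_index);
  return b->beg + byte_index + b->gap_size;
}
```

With this shape the parser never sees the gap at all, so no ranges
need to be recomputed when the gap moves.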

Once we have a tree we associate it with the buffer it belongs to. And
then comes the rest.

>- create an elisp-layer for interaction with the parse tree

Basically we need to expose some of the tree-sitter functions, but it is
better to handle as much as we can on the C side, using simpler data
types and handling entire regions with the ts_tree_cursor_*
functions. But of course, some of them will be needed for other
functionality.

I don't know whether we can manage font-locking from C, but I think
text properties can be set from there.

So the next step is just to traverse the visible region of the tree and
convert the info into text properties.

Here a sort of translation will be needed between the grammar's symbols
(see ts_language_symbol_count) and font-lock faces.
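Such a translation could be as simple as a lookup table. The sketch
below is only an illustration: the node type names are the ones I
believe the C grammar uses, the pairing with faces is my own guess at a
sensible default, and face_for_node_type is a hypothetical helper. A
real table would more likely be indexed by symbol id and sized with
ts_language_symbol_count.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct face_rule {
  const char *node_type;  /* node type name as reported by the grammar */
  const char *face;       /* name of the font-lock face to apply */
};

/* Illustrative table only; a real one would come from the major mode.  */
static const struct face_rule c_face_rules[] = {
  { "comment",         "font-lock-comment-face" },
  { "string_literal",  "font-lock-string-face" },
  { "primitive_type",  "font-lock-type-face" },
  { "type_identifier", "font-lock-type-face" },
};

/* Return the face name for NODE_TYPE, or NULL when the node type has
   no rule and should keep the default face.  */
static const char *
face_for_node_type (const char *node_type)
{
  assert (node_type != NULL);
  size_t n = sizeof c_face_rules / sizeof c_face_rules[0];
  for (size_t i = 0; i < n; i++)
    if (strcmp (c_face_rules[i].node_type, node_type) == 0)
      return c_face_rules[i].face;
  return NULL;
}
```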

>- hook fontification and indentation into that elisp-layer
>

If I understood what Eli wants to prevent, then if we set the properties
and faces in step 2, these hooks may not be needed.

In most cases we will need to call ts_parser_parse_string somewhere in
`after-change-functions` (or maybe earlier, I don't know), passing it the
old tree and getting the differences from the new one with
ts_tree_get_changed_ranges.

This returns something much smaller than the tree, so maybe we can
convert it into a Lisp list and use it in font-lock on the Lisp side if
we can't handle most of it in C.
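One detail worth keeping in mind: as far as I know, tree-sitter wants
the old tree to be told about the edit (ts_tree_edit with a
TSInputEdit) before it is passed back to the parser. A minimal sketch
of building that description from the three values an after-change
function receives follows. struct byte_edit and edit_from_change are
hypothetical names, only the byte fields of TSInputEdit are mirrored,
and the arguments are assumed to be zero-based byte offsets already.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the byte fields of tree-sitter's TSInputEdit;
   the row/column points are left out to keep the sketch small.  */
struct byte_edit {
  uint32_t start_byte;    /* where the change begins */
  uint32_t old_end_byte;  /* end of the replaced text, before the change */
  uint32_t new_end_byte;  /* end of the inserted text, after the change */
};

/* BEG and END delimit the changed text after the edit and OLD_LEN is
   the length of the text that was there before: the same three values
   an after-change function receives.  All are assumed to be zero-based
   byte offsets; Emacs reports character positions, so a real
   implementation would have to convert them first.  */
static struct byte_edit
edit_from_change (uint32_t beg, uint32_t end, uint32_t old_len)
{
  assert (beg <= end);
  struct byte_edit e;
  e.start_byte = beg;
  e.old_end_byte = beg + old_len;
  e.new_end_byte = end;
  return e;
}
```

For example, inserting three bytes at offset 10 gives (10, 10, 13), and
deleting five bytes at the same place gives (10, 15, 10).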

>It feels like the elisp-layer will be the easiest part.  I'm not really
>well versed in where to look in the c code of emacs for where and how to
>link this, so some pointers would be nice.
>
>It looks like most people agree that tree sitter support is wanted, so
>maybe it's time to start doing it?  I can surely have a stab at it, but
>I'd like some guidance for how to proceed best - if it's wanted, that
>is.
>
>--
>Theodor
>




* Re: cc-mode fontification feels random
  2021-06-12 15:56                                                                   ` Theodor Thornhill
  2021-06-12 16:59                                                                     ` Ergus
@ 2021-06-12 17:25                                                                     ` Stefan Monnier
  2021-06-12 17:53                                                                       ` Theodor Thornhill
                                                                                         ` (2 more replies)
  1 sibling, 3 replies; 206+ messages in thread
From: Stefan Monnier @ 2021-06-12 17:25 UTC (permalink / raw)
  To: Theodor Thornhill
  Cc: Ergus, Eli Zaretskii, dancol, acm, rudalics, emacs-devel

>> Agreed.  Maybe a first step would be to get copyright assignments and
>> include the tree sitter module in GNU ELPA?
> If I read some of these mails correctly it seems like that wouldn't be
> possible due to interest from some of the parties involved in the main
> package.  I don't know the details on that, though.

Before we start a parallel effort, we definitely should make every effort
to get copyright assignments for the existing code.  Maybe we can't take
the package as-is because some contributors won't accept to sign the
paperwork, but we can probably get paperwork for a significant fraction
of the code.

That would already help reduce duplicated efforts.

This is very important, not just to reduce the amount of work, but also
to avoid alienating interested parties.

> And Eli seems unhappy with what's there.

That doesn't mean we have to start over from scratch.

> As for making a little more concrete proposal for how to move forward,
> would this be something like what we want?
> - create/use c or rust bindings

I think we'd want to link to the C API of tree-sitter.
There's no point going through Rust at this point, AFAICT.

> - create an elisp-layer for interaction with the parse tree
> - hook fontification and indentation into that elisp-layer

Sounds about right.


        Stefan





* Re: cc-mode fontification feels random
  2021-06-09  8:34                         ` martin rudalics
  2021-06-09 13:14                           ` `open-paren-in-column-0-is-defun-start` (was: cc-mode fontification feels random) Stefan Monnier
@ 2021-06-12 17:29                           ` João Távora
  1 sibling, 0 replies; 206+ messages in thread
From: João Távora @ 2021-06-12 17:29 UTC (permalink / raw)
  To: martin rudalics
  Cc: Richard Stallman, emacs-devel, Stefan Monnier, Alan Mackenzie,
	Eli Zaretskii, Daniel Colascione


On Wed, Jun 9, 2021, 09:34 martin rudalics <rudalics@gmx.at> wrote:

> I do not like, for example, that inserting a quotation mark somewhere
> into a Lisp buffer, with some delay repaints the entire rest of the
> buffer just to undo that when I insert the closing quotation mark.
>

Since recently, that shouldn't happen anymore unless you wait a relatively
long time. That time is configurable. Search for "antiblink". I added the
feature and am interested in knowing if it's not performing as it should.

Alternatively, you can also try a parenthesis pairing solution such as
electric-pair-mode.

João



* Re: cc-mode fontification feels random
  2021-06-12 16:59                                                                     ` Ergus
@ 2021-06-12 17:51                                                                       ` Theodor Thornhill
  0 siblings, 0 replies; 206+ messages in thread
From: Theodor Thornhill @ 2021-06-12 17:51 UTC (permalink / raw)
  To: Ergus; +Cc: Stefan Monnier, Eli Zaretskii, dancol, acm, rudalics, emacs-devel



> Hi: 
>
> Eli and the others will give better info for sure, but just to start
> (and also they may correct my ideas):
>
> First there is needed a "mode-local" initialization for the parser based
> on the major mode (as explained in the TS doc). The parser probably must
> be stored somewhere in the "mode" to avoid parser duplication for the

[...]

Thanks for the input!  Will probably prove invaluable down the line :)
I was hoping it would be as "simple" as this.

--
Theo




* Re: cc-mode fontification feels random
  2021-06-12 17:25                                                                     ` Stefan Monnier
@ 2021-06-12 17:53                                                                       ` Theodor Thornhill
  2021-06-12 17:54                                                                       ` Ergus
  2021-06-12 18:02                                                                       ` Daniel Colascione
  2 siblings, 0 replies; 206+ messages in thread
From: Theodor Thornhill @ 2021-06-12 17:53 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Ergus, Eli Zaretskii, dancol, acm, rudalics, emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>>> Agreed.  Maybe a first step would be to get copyright assignments and
>>> include the tree sitter module in GNU ELPA?
>> If I read some of these mails correctly it seems like that wouldn't be
>> possible due to interest from some of the parties involved in the main
>> package.  I don't know the details on that, though.
>
> Before we start a parallel effort, we definitely should make every effort
> to get copyright assignments for the existing code.  Maybe we can't take
> the package as-is because some contributors won't accept to sign the
> paperwork, but we can probably get paperwork for a significant fraction
> of the code.

Sure - I can open an issue and see where we're at.

>
>> And Eli seems unhappy with what's there.
>
> That doesn't mean we have to start over from scratch.
>

No, absolutely.

>> As for making a little more concrete proposal for how to move forward,
>> would this be something like what we want?
>> - create/use c or rust bindings
>
> I think we'd want to link to the C API of tree-sitter.
> There's no point going through Rust at this point, AFAICT.
>
>> - create an elisp-layer for interaction with the parse tree
>> - hook fontification and indentation into that elisp-layer
>
> Sounds about right.
>

Ok good!

--
Theo




* Re: cc-mode fontification feels random
  2021-06-12 17:25                                                                     ` Stefan Monnier
  2021-06-12 17:53                                                                       ` Theodor Thornhill
@ 2021-06-12 17:54                                                                       ` Ergus
  2021-06-12 18:02                                                                       ` Daniel Colascione
  2 siblings, 0 replies; 206+ messages in thread
From: Ergus @ 2021-06-12 17:54 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: Theodor Thornhill, Eli Zaretskii, dancol, acm, rudalics, emacs-devel

On Sat, Jun 12, 2021 at 01:25:14PM -0400, Stefan Monnier wrote:
>>> Agreed.  Maybe a first step would be to get copyright assignments and
>>> include the tree sitter module in GNU ELPA?
>> If I read some of these mails correctly it seems like that wouldn't be
>> possible due to interest from some of the parties involved in the main
>> package.  I don't know the details on that, though.
>
>Before we start a parallel effort, we definitely should make every effort
>to get copyright assignments for the existing code.  Maybe we can't take
>the package as-is because some contributors won't accept to sign the
>paperwork, but we can probably get paperwork for a significant fraction
>of the code.
>
>That would already help reduce duplicated efforts.
>
>This is very important, not just to reduce the amount of work, but also
>to avoid alienating interested parties.
>
I agree, but it looks like Eli wants a different approach for the
calling, and part of the performance issues come from font-lock, Lisp
hooks, and translations back and forth, which a package implemented
with modules still does at some level.

Will you write to the authors on GH? There are only 17 contributors, so
not a crazy number of copyrights to get.

What worries me is that managing copyright is usually a never-ending
problem. We are still waiting for use-package to get all of its
assignments. And every time we say we will gather copyrights, there is
a dead time and the topic is forgotten after a while.

The package was designed to be an external feature, so it may use some
"inefficient" solutions (Lisp calls from C, substring, font-lock init
functions) that could be cleaned up and improved by accessing internal C
code directly. That will require a deeper knowledge of both the package
and the Emacs C code, and I don't know how available the developers will
be to do so.



>> And Eli seems unhappy with what's there.
>
>That doesn't mean we have to start over from scratch.
>
That's true. But the approach implemented with modules may be very
different from one implemented internally in Emacs, right?

>> As for making a little more concrete proposal for how to move forward,
>> would this be something like what we want?
>> - create/use c or rust bindings
>
>I think we'd want to link to the C API of tree-sitter.
>There's no point going through Rust at this point, AFAICT.
>
>> - create an elisp-layer for interaction with the parse tree
>> - hook fontification and indentation into that elisp-layer
>
>Sounds about right.
>
>
>        Stefan
>
>




* Re: cc-mode fontification feels random
  2021-06-12 17:25                                                                     ` Stefan Monnier
  2021-06-12 17:53                                                                       ` Theodor Thornhill
  2021-06-12 17:54                                                                       ` Ergus
@ 2021-06-12 18:02                                                                       ` Daniel Colascione
  2021-06-12 18:39                                                                         ` Ergus
  2 siblings, 1 reply; 206+ messages in thread
From: Daniel Colascione @ 2021-06-12 18:02 UTC (permalink / raw)
  To: Stefan Monnier, Theodor Thornhill
  Cc: acm, Ergus, emacs-devel, Eli Zaretskii, rudalics



On June 12, 2021 10:25:19 AM Stefan Monnier <monnier@iro.umontreal.ca> wrote:

>>> Agreed.  Maybe a first step would be to get copyright assignments and
>>> include the tree sitter module in GNU ELPA?
>> If I read some of these mails correctly it seems like that wouldn't be
>> possible due to interest from some of the parties involved in the main
>> package.  I don't know the details on that, though.
>
> Before we start a parallel effort, we definitely should make every effort
> to get copyright assignments for the existing code.  Maybe we can't take
> the package as-is because some contributors won't accept to sign the
> paperwork, but we can probably get paperwork for a significant fraction
> of the code.
>
> That would already help reduce duplicated efforts.
>
> This is very important, not just to reduce the amount of work, but also
> to avoid alienating interested parties.
>
>> And Eli seems unhappy with what's there.
>
> That doesn't mean we have to start over from scratch.
>
>> As for making a little more concrete proposal for how to move forward,
>> would this be something like what we want?
>> - create/use c or rust bindings
>
> I think we'd want to link to the C API of tree-sitter.
> There's no point going through Rust at this point, AFAICT.
>
>> - create an elisp-layer for interaction with the parse tree
>> - hook fontification and indentation into that elisp-layer
>
> Sounds about right.
>
>
>        Stefan

It's very important that the actual parsers be modules, at least 
optionally. It must be possible to customize and develop on a running 
Emacs, without a restart. To do that, if we stick with a model where 
generated parsers are in C, we must unload and reload compiled code. I am 
convinced we can make the module interface efficient enough for this to 
work without measurable overhead.







* Re: cc-mode fontification feels random
  2021-06-12 18:02                                                                       ` Daniel Colascione
@ 2021-06-12 18:39                                                                         ` Ergus
  0 siblings, 0 replies; 206+ messages in thread
From: Ergus @ 2021-06-12 18:39 UTC (permalink / raw)
  To: Daniel Colascione
  Cc: Stefan Monnier, Theodor Thornhill, Eli Zaretskii, acm, rudalics,
	emacs-devel

On Sat, Jun 12, 2021 at 11:02:36AM -0700, Daniel Colascione wrote:
>
>
>On June 12, 2021 10:25:19 AM Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>
>>>>Agreed.  Maybe a first step would be to get copyright assignments and
>>>>include the tree sitter module in GNU ELPA?
>>>If I read some of these mails correctly it seems like that wouldn't be
>>>possible due to interest from some of the parties involved in the main
>>>package.  I don't know the details on that, though.
>>
>>Before we start a parallel effort, we definitely should make every effort
>>to get copyright assignments for the existing code.  Maybe we can't take
>>the package as-is because some contributors won't accept to sign the
>>paperwork, but we can probably get paperwork for a significant fraction
>>of the code.
>>
>>That would already help reduce duplicated efforts.
>>
>>This is very important, not just to reduce the amount of work, but also
>>to avoid alienating interested parties.
>>
>>>And Eli seems unhappy with what's there.
>>
>>That doesn't mean we have to start over from scratch.
>>
>>>As for making a little more concrete proposal for how to move forward,
>>>would this be something like what we want?
>>>- create/use c or rust bindings
>>
>>I think we'd want to link to the C API of tree-sitter.
>>There's no point going through Rust at this point, AFAICT.
>>
>>>- create an elisp-layer for interaction with the parse tree
>>>- hook fontification and indentation into that elisp-layer
>>
>>Sounds about right.
>>
>>
>>       Stefan
>
>It's very important that the actual parsers be modules, at least 
>optionally. It must be possible to customize and develop on a running 
>Emacs, without a restart. To do that, if we stick with a model where 
>generated parsers are in C, we must unload and reload compiled code. I 
>am convinced we can make the module interface efficient enough for 
>this to work without measurable overhead.
>
Yes, of course. Once we have the internal infrastructure, the parsers
should be modules that autocompile during installation (like
vterm).

If we make the infrastructure simple, those modules won't even require
any Lisp code, just some instructions to download and compile the
shared object somewhere with a proper name?
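To make the "proper name" part concrete: generated parsers export an
entry point named tree_sitter_<language>, so the infrastructure only
needs the language name to know which symbol to look up in the shared
object. A rough sketch follows. parser_symbol_name is a hypothetical
helper, and mapping dashes to underscores is my assumption about how
multi-word grammar names are spelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Write the entry-point symbol for language LANG into OUT, which has
   room for OUT_SIZE bytes.  Return 1 on success and 0 if the name does
   not fit.  Dashes in LANG are mapped to underscores, which is an
   assumption about how grammar names become C identifiers.  */
static int
parser_symbol_name (const char *lang, char *out, size_t out_size)
{
  assert (lang != NULL && out != NULL);
  int n = snprintf (out, out_size, "tree_sitter_%s", lang);
  if (n < 0 || (size_t) n >= out_size)
    return 0;
  for (char *p = out; *p != '\0'; p++)
    if (*p == '-')
      *p = '_';
  return 1;
}
```

The resulting string is what would be handed to the dynamic loader
after opening the compiled parser.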





end of thread, other threads:[~2021-06-12 18:39 UTC | newest]

Thread overview: 206+ messages (download: mbox.gz / follow: Atom feed)
2021-06-04  3:16 cc-mode fontification feels random Daniel Colascione
2021-06-04  6:10 ` Eli Zaretskii
2021-06-04  7:10   ` Theodor Thornhill
2021-06-04 10:08     ` João Távora
2021-06-04 10:39       ` Eli Zaretskii
2021-06-04 10:59         ` Philipp
2021-06-04 11:05           ` João Távora
2021-06-04 11:22             ` Eli Zaretskii
2021-06-04 12:44               ` Dmitry Gutov
2021-06-04 13:46               ` João Távora
2021-06-04 14:11                 ` Eli Zaretskii
2021-06-04 11:18           ` Eli Zaretskii
2021-06-04 16:43       ` Jim Porter
     [not found]         ` <83k0n9l9pv.fsf@gnu.org>
2021-06-04 19:41           ` Jim Porter
2021-06-04 19:53             ` Eli Zaretskii
2021-06-04 20:05               ` Jim Porter
2021-06-04 20:11                 ` Joost Kremers
2021-06-05  6:51                   ` Eli Zaretskii
2021-06-05 10:14                     ` Joost Kremers
2021-06-05 11:31                       ` Eli Zaretskii
2021-06-05 12:12                         ` Joost Kremers
2021-06-05 13:23                     ` Stefan Monnier
2021-06-05 17:08                       ` Óscar Fuentes
2021-06-05 17:31                         ` Stefan Monnier
2021-06-05 17:32                         ` Eli Zaretskii
2021-06-05 18:46                     ` João Távora
2021-06-05  6:41                 ` Eli Zaretskii
2021-06-05  9:32                   ` João Távora
2021-06-05  9:59                     ` Ergus
2021-06-05 11:29                       ` Eli Zaretskii
2021-06-05 11:55                         ` Daniel Colascione
2021-06-05 12:27                           ` Eli Zaretskii
2021-06-05 17:59                             ` Jim Porter
2021-06-05 18:56                               ` Daniel Martín
2021-06-05 12:43                         ` Ergus
2021-06-05 13:59                       ` Remote GUI Emacs really works (was: cc-mode fontification feels random) Óscar Fuentes
2021-06-05 11:25                     ` cc-mode fontification feels random Eli Zaretskii
2021-06-05  9:46                   ` Ergus
2021-06-05 11:27                     ` Eli Zaretskii
2021-06-04 20:14               ` Yuri Khan
2021-06-04 10:25     ` Eli Zaretskii
2021-06-04 10:05   ` Daniel Colascione
2021-06-04 10:22     ` Eli Zaretskii
2021-06-04 10:34       ` João Távora
2021-06-04 10:43         ` Eli Zaretskii
2021-06-04 18:25         ` Stefan Monnier
2021-06-04 18:36           ` Daniel Colascione
2021-06-04 19:11             ` Eli Zaretskii
2021-06-04 19:16               ` Daniel Colascione
2021-06-04 19:26                 ` Eli Zaretskii
2021-06-04 19:33                   ` Daniel Colascione
2021-06-04 19:51                     ` Eli Zaretskii
2021-06-05  0:29             ` Stefan Monnier
2021-06-05  6:32               ` Eli Zaretskii
2021-06-04 19:07           ` Eli Zaretskii
2021-06-04 19:26             ` Daniel Colascione
2021-06-04 19:32               ` Eli Zaretskii
2021-06-04 10:41       ` Eli Zaretskii
2021-06-04 10:42 ` Ergus
2021-06-04 15:54 ` Alan Mackenzie
2021-06-04 18:30   ` Daniel Colascione
2021-06-06 11:37     ` Alan Mackenzie
2021-06-06 11:57       ` Eli Zaretskii
2021-06-06 12:27         ` Alan Mackenzie
2021-06-06 12:44           ` Eli Zaretskii
2021-06-06 14:19             ` Alan Mackenzie
2021-06-06 17:06               ` Eli Zaretskii
2021-06-06 17:44       ` Stefan Monnier
2021-06-06 18:00         ` Eli Zaretskii
2021-06-06 18:18           ` Stefan Monnier
2021-06-06 18:33             ` Daniel Colascione
2021-06-06 20:24               ` Stefan Monnier
2021-06-06 20:27                 ` Daniel Colascione
2021-06-06 20:38                   ` Stefan Monnier
2021-06-06 19:03             ` Eli Zaretskii
2021-06-06 20:28               ` Stefan Monnier
2021-06-07  7:35                 ` martin rudalics
2021-06-07 13:20                   ` Stefan Monnier
2021-06-07 13:37                     ` Eli Zaretskii
2021-06-08  0:06                       ` Daniel Colascione
2021-06-08 15:16                       ` Stefan Monnier
2021-06-07 15:58                     ` martin rudalics
2021-06-08  4:01                     ` Richard Stallman
2021-06-08 15:29                       ` Stefan Monnier
2021-06-08 15:52                         ` Eli Zaretskii
2021-06-08 16:36                           ` Stefan Monnier
2021-06-08 18:11                             ` Daniel Colascione
2021-06-08 18:25                               ` Eli Zaretskii
2021-06-08 18:28                                 ` Daniel Colascione
2021-06-08 18:54                                   ` Eli Zaretskii
2021-06-09 18:22                                 ` Alan Mackenzie
2021-06-09 18:36                                   ` Eli Zaretskii
2021-06-09 18:51                                     ` Daniel Colascione
2021-06-09 19:04                                       ` Eli Zaretskii
2021-06-09 20:07                                       ` chad
2021-06-10  6:43                                         ` Eli Zaretskii
2021-06-09 20:17                                       ` Dmitry Gutov
2021-06-09 21:03                                     ` Alan Mackenzie
2021-06-10  2:21                                       ` Daniel Colascione
2021-06-10  6:55                                         ` Eli Zaretskii
2021-06-10  6:58                                           ` Daniel Colascione
2021-06-10  7:19                                             ` Eli Zaretskii
2021-06-10  6:39                                       ` Eli Zaretskii
2021-06-10 16:46                                         ` Alan Mackenzie
2021-06-10 17:01                                           ` Eli Zaretskii
2021-06-10 17:07                                             ` Daniel Colascione
2021-06-10 17:22                                               ` Eli Zaretskii
2021-06-10 17:33                                                 ` Daniel Colascione
2021-06-10 17:39                                                   ` Eli Zaretskii
2021-06-10 17:40                                                 ` Óscar Fuentes
2021-06-10 17:44                                                   ` Eli Zaretskii
2021-06-11 16:11                                                 ` Alan Mackenzie
2021-06-11 17:53                                                   ` Eli Zaretskii
2021-06-11 18:02                                                     ` Daniel Colascione
2021-06-11 18:22                                                       ` Eli Zaretskii
2021-06-11 18:28                                                         ` Daniel Colascione
2021-06-11 19:12                                                           ` Alan Mackenzie
2021-06-11 19:23                                                           ` Eli Zaretskii
2021-06-11 18:47                                                         ` Alan Mackenzie
2021-06-11 19:32                                                           ` Eli Zaretskii
2021-06-11 19:46                                                             ` Alan Mackenzie
2021-06-11 19:50                                                               ` Eli Zaretskii
2021-06-11 18:42                                                       ` Stefan Monnier
2021-06-11 19:31                                                         ` Eli Zaretskii
2021-06-11 19:57                                                           ` Stefan Monnier
2021-06-11 23:25                                                             ` Ergus
2021-06-11 23:52                                                               ` Óscar Fuentes
2021-06-12  1:08                                                                 ` Ergus
2021-06-12  3:20                                                                   ` Stefan Monnier
2021-06-12 11:07                                                                     ` Ergus
2021-06-12  6:58                                                                   ` Eli Zaretskii
2021-06-12 11:01                                                                     ` Ergus
2021-06-12 11:25                                                                       ` Eli Zaretskii
2021-06-12 15:04                                                                         ` Ergus
2021-06-12 15:16                                                                           ` Eli Zaretskii
2021-06-12 15:23                                                                             ` Ergus
2021-06-12 15:35                                                                               ` Eli Zaretskii
2021-06-12 14:00                                                                     ` Stefan Monnier
2021-06-12 14:20                                                                       ` Eli Zaretskii
2021-06-12 14:33                                                                         ` Stefan Monnier
2021-06-12 15:06                                                                           ` Eli Zaretskii
2021-06-12 15:46                                                                             ` Stefan Monnier
2021-06-12  6:50                                                                 ` Eli Zaretskii
2021-06-12  5:20                                                               ` Theodor Thornhill
2021-06-12 13:40                                                                 ` Stefan Monnier
2021-06-12 15:56                                                                   ` Theodor Thornhill
2021-06-12 16:59                                                                     ` Ergus
2021-06-12 17:51                                                                       ` Theodor Thornhill
2021-06-12 17:25                                                                     ` Stefan Monnier
2021-06-12 17:53                                                                       ` Theodor Thornhill
2021-06-12 17:54                                                                       ` Ergus
2021-06-12 18:02                                                                       ` Daniel Colascione
2021-06-12 18:39                                                                         ` Ergus
2021-06-12  6:38                                                             ` Eli Zaretskii
2021-06-12 13:44                                                               ` Stefan Monnier
2021-06-12 14:14                                                                 ` Eli Zaretskii
2021-06-11 20:06                                                           ` Alan Mackenzie
2021-06-12  6:44                                                             ` Eli Zaretskii
2021-06-12  8:00                                                               ` Daniel Colascione
2021-06-12  8:08                                                                 ` Eli Zaretskii
2021-06-12  9:31                                                                   ` Alan Mackenzie
2021-06-11 19:48                                                         ` Eli Zaretskii
2021-06-11 18:34                                                     ` Alan Mackenzie
2021-06-10 17:26                                               ` Óscar Fuentes
2021-06-10 17:39                                               ` andrés ramírez
2021-06-10 21:06                                           ` Stefan Monnier
2021-06-11  6:14                                             ` Eli Zaretskii
2021-06-10 15:16                                       ` Ergus
2021-06-10 15:34                                         ` Óscar Fuentes
2021-06-10 19:06                                           ` Ergus
2021-06-10 19:28                                             ` Eli Zaretskii
2021-06-10 21:56                                               ` Ergus
2021-06-10 15:59                                         ` Jim Porter
2021-06-10 21:02                                         ` Stefan Monnier
2021-06-11 20:21                                           ` Ergus
2021-06-11 20:27                                             ` Stefan Monnier
2021-06-11 20:37                                               ` Daniel Colascione
2021-06-11 20:52                                                 ` Stefan Monnier
2021-06-12  6:46                                                   ` Eli Zaretskii
2021-06-12  8:03                                                     ` Daniel Colascione
2021-06-12  8:13                                                       ` Eli Zaretskii
2021-06-12 13:51                                                       ` Stefan Monnier
2021-06-12  8:47                                                   ` Daniele Nicolodi
2021-06-12  8:57                                                     ` tomas
2021-06-12 14:04                                                     ` Stefan Monnier
2021-06-09 19:05                                   ` Daniel Colascione
2021-06-09 19:11                                     ` Eli Zaretskii
2021-06-09 20:20                                     ` Alan Mackenzie
2021-06-09 20:36                                       ` Stefan Monnier
2021-06-10  7:01                                         ` Daniel Colascione
2021-06-10  7:21                                           ` Eli Zaretskii
2021-06-10  2:21                                       ` Daniel Colascione
2021-06-08 18:11                             ` Eli Zaretskii
2021-06-08 21:25                               ` Stefan Monnier
2021-06-09  3:39                         ` Richard Stallman
2021-06-09  8:34                         ` martin rudalics
2021-06-09 13:14                           ` `open-paren-in-column-0-is-defun-start` (was: cc-mode fontification feels random) Stefan Monnier
2021-06-09 15:15                             ` Yuri Khan
2021-06-09 15:16                               ` Yuri Khan
2021-06-12 17:29                           ` cc-mode fontification feels random João Távora
2021-06-07 12:08                 ` Eli Zaretskii
2021-06-08 15:22                   ` Stefan Monnier
2021-06-08 15:46                     ` Eli Zaretskii
2021-06-05 20:25   ` Dmitry Gutov
2021-06-06 11:53     ` Alan Mackenzie
2021-06-06 17:08       ` Dmitry Gutov
