cc-mode fontification feels random

unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed

* cc-mode fontification feels random
@ 2021-06-04  3:16 Daniel Colascione
  2021-06-04  6:10 ` Eli Zaretskii
                   ` (3 more replies)
  0 siblings, 4 replies; 274+ messages in thread
From: Daniel Colascione @ 2021-06-04  3:16 UTC (permalink / raw)
  To: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 621 bytes --]

As long as I can remember, cc-mode fontification has felt totally 
random, with actual faces depending on happenstance of previously-parsed 
types, luck of the draw in jit-lock chunking, and so on. Is there any 
*general* way that we can make fontification more robust and consistent?

For years and years now, I've been thinking we just need more 
deterministic parser-and-based mode support, and I still think that, but 
on a realistic level, that doesn't seem to be coming any time soon.

In the meantime, is there any general approach we might be able to use 
to get stuff like the attached to stop happening?

[-- Attachment #2: types.png --]
[-- Type: image/png, Size: 33446 bytes --]

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04  3:16 cc-mode fontification feels random Daniel Colascione
@ 2021-06-04  6:10 ` Eli Zaretskii
  2021-06-04  7:10   ` Theodor Thornhill
  2021-06-04 10:05   ` Daniel Colascione
  2021-06-04 10:42 ` Ergus
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-04  6:10 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: emacs-devel

> From: Daniel Colascione <dancol@dancol.org>
> Date: Thu, 3 Jun 2021 20:16:53 -0700
> 
> As long as I can remember, cc-mode fontification has felt totally 
> random, with actual faces depending on happenstance of previously-parsed 
> types, luck of the draw in jit-lock chunking, and so on. Is there any 
> *general* way that we can make fontification more robust and consistent?
> 
> For years and years now, I've been thinking we just need more 
> deterministic parser-and-based mode support, and I still think that, but 
> on a realistic level, that doesn't seem to be coming any time soon.

Full agreement.  And not only for C and C-like languages, IMO.

See

  https://lists.gnu.org/archive/html/emacs-devel/2020-01/msg00059.html

See also Eglot and LSP.

Patches more than welcome, I think having this (whether tree-sitter or
some other similar technology) in core is long overdue.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04  6:10 ` Eli Zaretskii
@ 2021-06-04  7:10   ` Theodor Thornhill
  2021-06-04 10:08     ` João Távora
  2021-06-04 10:25     ` Eli Zaretskii
  2021-06-04 10:05   ` Daniel Colascione
  1 sibling, 2 replies; 274+ messages in thread
From: Theodor Thornhill @ 2021-06-04  7:10 UTC (permalink / raw)
  To: Eli Zaretskii, Daniel Colascione; +Cc: emacs-devel, ubolonton, joaotavora

>> As long as I can remember, cc-mode fontification has felt totally 
>> random, with actual faces depending on happenstance of previously-parsed 
>> types, luck of the draw in jit-lock chunking, and so on. Is there any 
>> *general* way that we can make fontification more robust and consistent?

Yes, tree-sitter.  Ubolonton has made a tremendous package implementing
this for emacs.  It is used in csharp-mode already, with success.  At
least for the fontification.  There are still some kinks to work out in
the indentation part of the mode.

In C#-mode we use tree sitter for:

- Fontification
- Indentation

There is also a normal CC mode version, which is enabled by default.  So
you need to install the third party packages as well as enabling
csharp-tree-sitter-mode.  You can try it out and see if it has some
benefits.  Performance wise the tree-sitter mode is leagues above the CC
mode one.  Also one benefit is that it is extremely easy to define
these grammars.

> See also Eglot and LSP.

LSP-mode supports the semantic fontification from lsp servers, which
usually uses tree-sitter.  Examples for this is Rust, F# and others.
Eglot does not yet support this, though I believe there is an issue
somewhere for it.

>
> Patches more than welcome, I think having this (whether tree-sitter or
> some other similar technology) in core is long overdue.

Pinging @Ubolonton and Joao, as they probably know way more than
me about this.

--
Theodor Thornhill

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04  6:10 ` Eli Zaretskii
  2021-06-04  7:10   ` Theodor Thornhill
@ 2021-06-04 10:05   ` Daniel Colascione
  2021-06-04 10:22     ` Eli Zaretskii
  1 sibling, 1 reply; 274+ messages in thread
From: Daniel Colascione @ 2021-06-04 10:05 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel


On 6/3/21 11:10 PM, Eli Zaretskii wrote:
>> From: Daniel Colascione <dancol@dancol.org>
>> Date: Thu, 3 Jun 2021 20:16:53 -0700
>>
>> As long as I can remember, cc-mode fontification has felt totally
>> random, with actual faces depending on happenstance of previously-parsed
>> types, luck of the draw in jit-lock chunking, and so on. Is there any
>> *general* way that we can make fontification more robust and consistent?
>>
>> For years and years now, I've been thinking we just need more
>> deterministic parser-and-based mode support, and I still think that, but
>> on a realistic level, that doesn't seem to be coming any time soon.
> Full agreement.  And not only for C and C-like languages, IMO.
>
> See
>
>    https://lists.gnu.org/archive/html/emacs-devel/2020-01/msg00059.html
>
> See also Eglot and LSP.
>
> Patches more than welcome, I think having this (whether tree-sitter or
> some other similar technology) in core is long overdue.

We could just vendor tree-sitter.




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04  7:10   ` Theodor Thornhill
@ 2021-06-04 10:08     ` João Távora
  2021-06-04 10:39       ` Eli Zaretskii
  2021-06-04 16:43       ` Jim Porter
  2021-06-04 10:25     ` Eli Zaretskii
  1 sibling, 2 replies; 274+ messages in thread
From: João Távora @ 2021-06-04 10:08 UTC (permalink / raw)
  To: Theodor Thornhill
  Cc: Eli Zaretskii, Daniel Colascione, ubolonton, emacs-devel

Theodor Thornhill <theo@thornhill.no> writes:

> Pinging @Ubolonton and Joao, as they probably know way more than
> me about this.

Here are my quick views on this:

- Eglot can add LSP fontification support, that doesn't seem hard.

- However, LSP support for fontification seems like it's potentially
  _less_ efficient than integrating something like tree-sitter as a C
  module in Emacs.  That's because the contents of the buffer and
  fontification results are continually transmitted back and forth via
  pipes and JSON format.

- Moreover, if one wishes 100% out-of-the-box support for LSP (this or
  any other feature), one needs to also distribute a capable server
  program.  For C/C++ this is potentially problematic due to licensing
  issues: the most capable such program for C/C++, is to the best of my
  limited knowldge, clangd.  There are others, though.

- The past few weeks I've been trying to get back to the long-stated
  goal of integrating Eglot into Emacs proper, as discussed some time
  ago.  The idea is to first let it be an independent extension much
  like it is now, then experiment with integrating its functionality
  directly in major modes, eventually evolving into an out-of-the-box,
  seamless
  "i-dont-even-know-that-LSP-is-being-leveraged-in-the-background"
  experience for documentation, definition-finding, diagnostics, etc.
  And also fontification, of course, but my gut feeling says that
  tree-sitter (or any other integrated parser) approach is more
  efficient and "tighter" for such a basic thing.

João





^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 10:05   ` Daniel Colascione
@ 2021-06-04 10:22     ` Eli Zaretskii
  2021-06-04 10:34       ` João Távora
  2021-06-04 10:41       ` Eli Zaretskii
  0 siblings, 2 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-04 10:22 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: emacs-devel

> From: Daniel Colascione <dancol@dancol.org>
> Date: Fri, 4 Jun 2021 03:05:53 -0700
> Cc: emacs-devel@gnu.org
> 
> We could just vendor tree-sitter.

Sorry, I don't understand what that means.

My problem is that I know of now package that integrates tree-sitter
into Emacs with architecture that makes sense to me.  The ones I saw
all send the entire buffer to tree-sitter using buffer-string (and
failing to encode it), which doesn't scale.

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04  7:10   ` Theodor Thornhill
  2021-06-04 10:08     ` João Távora
@ 2021-06-04 10:25     ` Eli Zaretskii
  1 sibling, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-04 10:25 UTC (permalink / raw)
  To: Theodor Thornhill; +Cc: ubolonton, dancol, joaotavora, emacs-devel

> From: Theodor Thornhill <theo@thornhill.no>
> Cc: emacs-devel@gnu.org, ubolonton@gmail.com, joaotavora@gmail.com
> Date: Fri, 04 Jun 2021 09:10:33 +0200
> 
> >> As long as I can remember, cc-mode fontification has felt totally 
> >> random, with actual faces depending on happenstance of previously-parsed 
> >> types, luck of the draw in jit-lock chunking, and so on. Is there any 
> >> *general* way that we can make fontification more robust and consistent?
> 
> Yes, tree-sitter.  Ubolonton has made a tremendous package implementing
> this for emacs.  It is used in csharp-mode already, with success.  At
> least for the fontification.  There are still some kinks to work out in
> the indentation part of the mode.

Not from my POV, see my other message.

I welcome patches submitted to the project with the goal of
integrating that into Emacs core.  Past discussions indicated to me
that authors of the existing packages are not interested in that
enough to modify the packages according to our suggestions.  Sorry to
be blunt.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 10:22     ` Eli Zaretskii
@ 2021-06-04 10:34       ` João Távora
  2021-06-04 10:43         ` Eli Zaretskii
  2021-06-04 18:25         ` Stefan Monnier
  2021-06-04 10:41       ` Eli Zaretskii
  1 sibling, 2 replies; 274+ messages in thread
From: João Távora @ 2021-06-04 10:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Daniel Colascione, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> My problem is that I know of now package that integrates tree-sitter
> into Emacs with architecture that makes sense to me.  The ones I saw
> all send the entire buffer to tree-sitter using buffer-string (and
> failing to encode it), which doesn't scale.

In this matter, the LSP approach may be more efficient, since it
transmits only changes/differences, and should (in principle) handle the
encoding troubles.

But I don't understand what's stopping these tree-sitter C modules (like
[1] and [2]) to have access to the buffer's contents directly and have
the best of both worlds.

João

[1]: https://github.com/karlotness/tree-sitter.el
[2]: https://github.com/ubolonton/emacs-tree-sitter






^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 10:08     ` João Távora
@ 2021-06-04 10:39       ` Eli Zaretskii
  2021-06-04 10:59         ` Philipp
  2021-06-04 16:43       ` Jim Porter
  1 sibling, 1 reply; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-04 10:39 UTC (permalink / raw)
  To: João Távora; +Cc: ubolonton, dancol, theo, emacs-devel

> From: João Távora <joaotavora@gmail.com>
> Cc: Eli Zaretskii <eliz@gnu.org>,  Daniel Colascione <dancol@dancol.org>,
>   emacs-devel@gnu.org,  ubolonton@gmail.com
> Date: Fri, 04 Jun 2021 11:08:48 +0100
> 
> - However, LSP support for fontification seems like it's potentially
>   _less_ efficient than integrating something like tree-sitter as a C
>   module in Emacs.  That's because the contents of the buffer and
>   fontification results are continually transmitted back and forth via
>   pipes and JSON format.

The communication of buffer contents to these agents/servers is indeed
one aspect of the existing packages (those I had time to look at) that
I personally am unhappy about.  Sending the whole buffer or its large
chunks down the wire as buffer-substring (which requires encoding to
be correct) is non-scalable, especially if it also requires conversion
to JSON.  A core feature cannot work that way, IMO.

Unfortunately, every discussion about the alternatives, at least those
in which I participated, ended with nothing, although I think a much
better solution is possible and even not too hard.

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 10:22     ` Eli Zaretskii
  2021-06-04 10:34       ` João Távora
@ 2021-06-04 10:41       ` Eli Zaretskii
  1 sibling, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-04 10:41 UTC (permalink / raw)
  To: dancol; +Cc: emacs-devel

> Date: Fri, 04 Jun 2021 13:22:38 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: emacs-devel@gnu.org
> 
> My problem is that I know of now package that integrates tree-sitter
                               ^^^
Sorry, should have been "no".

> into Emacs with architecture that makes sense to me.  The ones I saw
> all send the entire buffer to tree-sitter using buffer-string (and
> failing to encode it), which doesn't scale.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04  3:16 cc-mode fontification feels random Daniel Colascione
  2021-06-04  6:10 ` Eli Zaretskii
@ 2021-06-04 10:42 ` Ergus
  2021-06-04 15:54 ` Alan Mackenzie
  2021-08-30 18:50 ` [PATCH] " Alan Mackenzie
  3 siblings, 0 replies; 274+ messages in thread
From: Ergus @ 2021-06-04 10:42 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: emacs-devel

On Thu, Jun 03, 2021 at 08:16:53PM -0700, Daniel Colascione wrote:

>For years and years now, I've been thinking we just need more 
>deterministic parser-and-based mode support, and I still think that, 
>but on a realistic level, that doesn't seem to be coming any time 
>soon.
>

There is something going on with lsp and lsp-mode I think, but not in
vanilla for sure. And I am not aware of the actual status.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 10:34       ` João Távora
@ 2021-06-04 10:43         ` Eli Zaretskii
  2021-06-04 18:25         ` Stefan Monnier
  1 sibling, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-04 10:43 UTC (permalink / raw)
  To: João Távora; +Cc: dancol, emacs-devel

> From: João Távora <joaotavora@gmail.com>
> Cc: Daniel Colascione <dancol@dancol.org>,  emacs-devel@gnu.org
> Date: Fri, 04 Jun 2021 11:34:31 +0100
> 
> But I don't understand what's stopping these tree-sitter C modules (like
> [1] and [2]) to have access to the buffer's contents directly and have
> the best of both worlds.

Exactly.  I proposed that much in past discussions, but no one seemed
to be interested enough to pick up the gauntlet.  I still offer help
in making this happen to anyone who'd like to work on this (and needs
help).

TIA



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 10:39       ` Eli Zaretskii
@ 2021-06-04 10:59         ` Philipp
  2021-06-04 11:05           ` João Távora
  2021-06-04 11:18           ` Eli Zaretskii
  0 siblings, 2 replies; 274+ messages in thread
From: Philipp @ 2021-06-04 10:59 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: ubolonton, dancol, theo, João Távora, emacs-devel



> Am 04.06.2021 um 12:39 schrieb Eli Zaretskii <eliz@gnu.org>:
> 
>> From: João Távora <joaotavora@gmail.com>
>> Cc: Eli Zaretskii <eliz@gnu.org>,  Daniel Colascione <dancol@dancol.org>,
>>  emacs-devel@gnu.org,  ubolonton@gmail.com
>> Date: Fri, 04 Jun 2021 11:08:48 +0100
>> 
>> - However, LSP support for fontification seems like it's potentially
>>  _less_ efficient than integrating something like tree-sitter as a C
>>  module in Emacs.  That's because the contents of the buffer and
>>  fontification results are continually transmitted back and forth via
>>  pipes and JSON format.
> 
> The communication of buffer contents to these agents/servers is indeed
> one aspect of the existing packages (those I had time to look at) that
> I personally am unhappy about.  Sending the whole buffer or its large
> chunks down the wire as buffer-substring (which requires encoding to
> be correct) is non-scalable, especially if it also requires conversion
> to JSON.

How bad is is actually; are there good numbers on this?
A while ago, I tested this hypothesis by transferring the `buffer-string' of xdisp.c to a Go module.  This goes through a full UTF-8 encoding and makes three copies (first, to create the string object; then, to copy it to the module interface; lastly, to make a Go string out of it), and it still only took a few milliseconds.
Modern CPUs are very good at copying memory, so maybe we're optimizing the wrong thing here.  We definitely should have good benchmarks and profiling data before deciding what to optimize.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 10:59         ` Philipp
@ 2021-06-04 11:05           ` João Távora
  2021-06-04 11:22             ` Eli Zaretskii
  2021-06-04 11:18           ` Eli Zaretskii
  1 sibling, 1 reply; 274+ messages in thread
From: João Távora @ 2021-06-04 11:05 UTC (permalink / raw)
  To: Philipp; +Cc: Eli Zaretskii, Daniel Colascione, theo, ubolonton, emacs-devel

On Fri, Jun 4, 2021 at 11:59 AM Philipp <p.stephani2@gmail.com> wrote:

> > Am 04.06.2021 um 12:39 schrieb Eli Zaretskii <eliz@gnu.org>:
> >
> >> From: João Távora <joaotavora@gmail.com>
> >> Cc: Eli Zaretskii <eliz@gnu.org>,  Daniel Colascione <dancol@dancol.org>,
> >>  emacs-devel@gnu.org,  ubolonton@gmail.com
> >> Date: Fri, 04 Jun 2021 11:08:48 +0100
> >>
> >> - However, LSP support for fontification seems like it's potentially
> >>  _less_ efficient than integrating something like tree-sitter as a C
> >>  module in Emacs.  That's because the contents of the buffer and
> >>  fontification results are continually transmitted back and forth via
> >>  pipes and JSON format.
> >
> > The communication of buffer contents to these agents/servers is indeed
> > one aspect of the existing packages (those I had time to look at) that
> > I personally am unhappy about.  Sending the whole buffer or its large
> > chunks down the wire as buffer-substring (which requires encoding to
> > be correct) is non-scalable, especially if it also requires conversion
> > to JSON.
>
> How bad is is actually; are there good numbers on this?

Not from me.  Only gut feeling.  But I have seen latency from servers before.
That just depends on the server and its architecture, I guess.

 However there are reports of enormous latency on Emacs side when JSON
messages get very long and complex. Part of this related simply to JSON
parsing and allocation of lots of lisp objects.  My hunch is that
fontification of
a big and complex buffer would give rise to one of these big and complex
JSON messages.

João



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 10:59         ` Philipp
  2021-06-04 11:05           ` João Távora
@ 2021-06-04 11:18           ` Eli Zaretskii
  1 sibling, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-04 11:18 UTC (permalink / raw)
  To: Philipp; +Cc: ubolonton, dancol, theo, joaotavora, emacs-devel

> From: Philipp <p.stephani2@gmail.com>
> Date: Fri, 4 Jun 2021 12:59:45 +0200
> Cc: João Távora <joaotavora@gmail.com>,
>  ubolonton@gmail.com,
>  dancol@dancol.org,
>  theo@thornhill.no,
>  emacs-devel@gnu.org
> 
> > The communication of buffer contents to these agents/servers is indeed
> > one aspect of the existing packages (those I had time to look at) that
> > I personally am unhappy about.  Sending the whole buffer or its large
> > chunks down the wire as buffer-substring (which requires encoding to
> > be correct) is non-scalable, especially if it also requires conversion
> > to JSON.
> 
> How bad is is actually; are there good numbers on this?

It doesn't matter to me; we cannot go that way in core.  And there's
no reason, really.

> A while ago, I tested this hypothesis by transferring the `buffer-string' of xdisp.c to a Go module.  This goes through a full UTF-8 encoding and makes three copies (first, to create the string object; then, to copy it to the module interface; lastly, to make a Go string out of it), and it still only took a few milliseconds.
> Modern CPUs are very good at copying memory, so maybe we're optimizing the wrong thing here.  We definitely should have good benchmarks and profiling data before deciding what to optimize.

First, for LSP  this is not a memory copy.

Second, buffer-string (or buffer-substring) conses a Lisp string,
which increases memory pressure and GC.  Imagine doing this for many
buffers.  E.g., I have jit-lock-stealth enabled, so Emacs fontifies
buffers in the background whenever it is idle.

And third, why settle for an inferior solution that scales badly, when
a superior one is just around the corner?  I understand why we would
want to compromise if there were no alternatives, but why compromise
up front when a better alternative exists?  It makes no sense to me.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 11:05           ` João Távora
@ 2021-06-04 11:22             ` Eli Zaretskii
  2021-06-04 12:44               ` Dmitry Gutov
  2021-06-04 13:46               ` João Távora
  0 siblings, 2 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-04 11:22 UTC (permalink / raw)
  To: João Távora; +Cc: p.stephani2, dancol, theo, ubolonton, emacs-devel

> From: João Távora <joaotavora@gmail.com>
> Date: Fri, 4 Jun 2021 12:05:18 +0100
> Cc: Eli Zaretskii <eliz@gnu.org>, ubolonton@gmail.com, 
> 	Daniel Colascione <dancol@dancol.org>, theo@thornhill.no, emacs-devel <emacs-devel@gnu.org>
> 
> > > The communication of buffer contents to these agents/servers is indeed
> > > one aspect of the existing packages (those I had time to look at) that
> > > I personally am unhappy about.  Sending the whole buffer or its large
> > > chunks down the wire as buffer-substring (which requires encoding to
> > > be correct) is non-scalable, especially if it also requires conversion
> > > to JSON.
> >
> > How bad is is actually; are there good numbers on this?
> 
> Not from me.  Only gut feeling.  But I have seen latency from servers before.
> That just depends on the server and its architecture, I guess.
> 
>  However there are reports of enormous latency on Emacs side when JSON
> messages get very long and complex. Part of this related simply to JSON
> parsing and allocation of lots of lisp objects.  My hunch is that
> fontification of
> a big and complex buffer would give rise to one of these big and complex
> JSON messages.

Ask Dmitry about performance problems with native JSON support, and
the effort we invested (a year ago?) into optimizing UTF-8 encoding of
strings, to squeeze every last percent of performance.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 11:22             ` Eli Zaretskii
@ 2021-06-04 12:44               ` Dmitry Gutov
  2021-06-04 13:46               ` João Távora
  1 sibling, 0 replies; 274+ messages in thread
From: Dmitry Gutov @ 2021-06-04 12:44 UTC (permalink / raw)
  To: Eli Zaretskii, João Távora
  Cc: p.stephani2, dancol, theo, ubolonton, emacs-devel

On 04.06.2021 14:22, Eli Zaretskii wrote:

> Ask Dmitry about performance problems with native JSON support, and
> the effort we invested (a year ago?) into optimizing UTF-8 encoding of
> strings, to squeeze every last percent of performance.

About a year ago, yes (bug#31138 plus some follow-ups).

With string encoding taken care of, IIUC the current bottleneck is in 
parsing: Lisp object allocation which still has to happen on the current 
thread (some way to use parallel heaps could help with that).

And to get all of the highlightings for the current buffer, we will need 
to parse the response JSON document, probably also fairly large.

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 11:22             ` Eli Zaretskii
  2021-06-04 12:44               ` Dmitry Gutov
@ 2021-06-04 13:46               ` João Távora
  2021-06-04 14:11                 ` Eli Zaretskii
  1 sibling, 1 reply; 274+ messages in thread
From: João Távora @ 2021-06-04 13:46 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, dancol, theo, ubolonton, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> a big and complex buffer would give rise to one of these big and complex
>> JSON messages.
> Ask Dmitry about performance problems with native JSON support, and
> the effort we invested (a year ago?) into optimizing UTF-8 encoding of
> strings, to squeeze every last percent of performance.

As I remember, the biggest bottleneck was parsing and allocating Lisp
objects.  Commonly, it means parsing a big JSON message even if you're
only interested in a fraction of it (and this happens in LSP when
e.g. some servers decide to serve up huge buckets of diagnostics
unrelated to the current file being edited, for instance).  The json.c
parser is faster, but ultimately borks here, too.  

My idea at the time was to develop a technique to only parse the bits of
JSON we're interested in, which dramatically improved performance.  I
had a prototype for json.el lying around (can't seem to find it) based
on lazy evaluation.  If I remember correctly, Dmitry proposed another
technique based on a "path/selector language", which can also work but
is not quite so elegant IMO.

Of course, this is only useful if the starting assumption of much
useless JSON garbage is indeed true.  And I don't get a lot of bug
reports in Eglot about big-and-slow JSON, so it's been off the radar for
a while.

And again, for fontification, this point is probably moot if we're going
to integrate tree-sitter directly with direct access to the buffer
(which just makes sense).

João

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 13:46               ` João Távora
@ 2021-06-04 14:11                 ` Eli Zaretskii
  0 siblings, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-04 14:11 UTC (permalink / raw)
  To: João Távora; +Cc: p.stephani2, dancol, theo, ubolonton, emacs-devel

> From: João Távora <joaotavora@gmail.com>
> Cc: p.stephani2@gmail.com,  ubolonton@gmail.com,  dancol@dancol.org,
>   theo@thornhill.no,  emacs-devel@gnu.org
> Date: Fri, 04 Jun 2021 14:46:12 +0100
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> a big and complex buffer would give rise to one of these big and complex
> >> JSON messages.
> > Ask Dmitry about performance problems with native JSON support, and
> > the effort we invested (a year ago?) into optimizing UTF-8 encoding of
> > strings, to squeeze every last percent of performance.
> 
> As I remember, the biggest bottleneck was parsing and allocating Lisp
> objects.

But that's exactly the problem: the packages I've seen try to solve
this on the Lisp level, and that just has got to involve consing of
Lisp objects, so there's no way around that problem with this
approach.

By contrast, fast access to buffer text is on the C level, similar to
what we do with regexp search, and doesn't require any Lisp objects as
intermediates.

The other problem with the integration of this packages into Emacs
(again, those few packages that I took a good enough look at) is that
they don't plug themselves into the JIT lock mechanism triggered by
redisplay, and instead use all kinds of hooks to put text properties
on buffer text (and turn off font-lock for that to work).  That's
another aspect of IMO poor integration into the Emacs core, probably
again because of the desire to stay away of C and the innards of the
display engine.

> Commonly, it means parsing a big JSON message even if you're
> only interested in a fraction of it (and this happens in LSP when
> e.g. some servers decide to serve up huge buckets of diagnostics
> unrelated to the current file being edited, for instance).  The json.c
> parser is faster, but ultimately borks here, too.  
> 
> My idea at the time was to develop a technique to only parse the bits of
> JSON we're interested in, which dramatically improved performance.

I think this is a separate issue.  I guess if the percentage of
"garbage" is large, then this will indeed be a win, but it must come
with some overhead (to figure out what is "garbage"), so it isn't
going to produce significant speedup with milder amounts of "garbage".

And this is only relevant if the protocol is based on JSON.

> And again, for fontification, this point is probably moot if we're going
> to integrate tree-sitter directly with direct access to the buffer
> (which just makes sense).

Only if someone does the job.

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04  3:16 cc-mode fontification feels random Daniel Colascione
  2021-06-04  6:10 ` Eli Zaretskii
  2021-06-04 10:42 ` Ergus
@ 2021-06-04 15:54 ` Alan Mackenzie
  2021-06-04 18:30   ` Daniel Colascione
  2021-06-05 20:25   ` Dmitry Gutov
  2021-08-30 18:50 ` [PATCH] " Alan Mackenzie
  3 siblings, 2 replies; 274+ messages in thread
From: Alan Mackenzie @ 2021-06-04 15:54 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: emacs-devel

Hello, Daniel.

On Thu, Jun 03, 2021 at 20:16:53 -0700, Daniel Colascione wrote:
> As long as I can remember, cc-mode fontification has felt totally 
> random, .....

Hmmm.  It is anything but totally random.

> ..... with actual faces depending on happenstance of previously-parsed
> types, .....

Whether a type is recognised as such depends on that, yes.  It's hard to
think of a better way without having the resources of a compiler,
particularly for ill-behaved languages like C++.

> ..... luck of the draw in jit-lock chunking, .....

That should be a thing of the past, much effort having been put into
eradicating such errors.  That is one of the main reasons for the
relative slowness of CC Mode, as compared with, say, Emacs Lisp Mode.

> ..... and so on.

And so on???

> Is there any *general* way that we can make fontification more robust
> and consistent?

Like other people have said on the thread, rewriting CC Mode to use an
LSP parser.

Less drastically, it would be possible to fix the specific bug you
allude to, by the user making a list of types and configuring CC Mode
with them, rather than attempting to recognise such types.  This feels
as though it would be tedious to use, though.

> For years and years now, I've been thinking we just need more 
> deterministic parser-and-based mode support, and I still think that, but 
> on a realistic level, that doesn't seem to be coming any time soon.

What does "parser-and-based" mean?

> In the meantime, is there any general approach we might be able to use 
> to get stuff like the attached to stop happening?

Probably none that we'd like.  Fontifying types only at their point of
declaration would be one, but I don't think people would want that.  My
impression is that the approach taken by CC Mode, like that of most
language modes in Emacs, has pretty much reached the limits of what's
possible, and it is unreasonable to expect perfect fontification (and
indentation) from languages like C++ in all cases.

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 10:08     ` João Távora
  2021-06-04 10:39       ` Eli Zaretskii
@ 2021-06-04 16:43       ` Jim Porter
       [not found]         ` <83k0n9l9pv.fsf@gnu.org>
  1 sibling, 1 reply; 274+ messages in thread
From: Jim Porter @ 2021-06-04 16:43 UTC (permalink / raw)
  To: emacs-devel; +Cc: Eli Zaretskii, Daniel Colascione, ubolonton

On 6/4/2021 3:08 AM, João Távora wrote:
> - However, LSP support for fontification seems like it's potentially
>    _less_ efficient than integrating something like tree-sitter as a C
>    module in Emacs.  That's because the contents of the buffer and
>    fontification results are continually transmitted back and forth via
>    pipes and JSON format.

I imagine these potential performance issues would also be exacerbated 
by editing over TRAMP. Currently, the latest development builds of Eglot 
work nicely with TRAMP files, but having to send fontification results 
back to the local Emacs instance could be a problem over slow connections.

Having something built into Emacs (as much as possible) would also have 
the benefit of allowing users to read a properly-fontified source file 
even for languages they haven't installed tools for. For example, I 
might want to read a C# source file occasionally, despite not having a 
C# compiler/LSP server.

- Jim

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 10:34       ` João Távora
  2021-06-04 10:43         ` Eli Zaretskii
@ 2021-06-04 18:25         ` Stefan Monnier
  2021-06-04 18:36           ` Daniel Colascione
  2021-06-04 19:07           ` Eli Zaretskii
  1 sibling, 2 replies; 274+ messages in thread
From: Stefan Monnier @ 2021-06-04 18:25 UTC (permalink / raw)
  To: João Távora; +Cc: Eli Zaretskii, Daniel Colascione, emacs-devel

> But I don't understand what's stopping these tree-sitter C modules (like
> [1] and [2]) to have access to the buffer's contents directly and have
> the best of both worlds.

I think it's a direct result of them being "modules": the API doesn't
let modules access a buffer's content directly, so it's more efficient
copy the content via `buffer-substring` and toss it on the other side
than having to use something like `char-after`.


        Stefan




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 15:54 ` Alan Mackenzie
@ 2021-06-04 18:30   ` Daniel Colascione
  2021-06-06 11:37     ` Alan Mackenzie
  2021-06-05 20:25   ` Dmitry Gutov
  1 sibling, 1 reply; 274+ messages in thread
From: Daniel Colascione @ 2021-06-04 18:30 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

On 6/4/21 8:54 AM, Alan Mackenzie wrote:
>> Is there any *general* way that we can make fontification more robust
>> and consistent?
> Like other people have said on the thread, rewriting CC Mode to use an
> LSP parser.
>
> Less drastically, it would be possible to fix the specific bug you
> allude to, by the user making a list of types and configuring CC Mode
> with them, rather than attempting to recognise such types.  This feels
> as though it would be tedious to use, though.

I understand that cc-mode can't always get it right. It's only 
asymptotically omniscient. :-) Some deficiencies in highlighting are 
bound to happen.

What's striking to me is the inconsistency in the highlighting. None of 
the types in the std::variant declaration in my screenshot is special. 
They're all declared in the same file as the std::variant typedef. So 
why is PrimitiveType fontified while the others aren't?

FWIW, fontification is correct and consistent when I set 
font-lock-support-mode to nil, so this really does look like another 
case of getting unlucky with jit-lock block divisions.

Yes, I'm sure that this particular problem is caused by some bug, and 
with the right repro, we can quickly isolate and fix it. But this kind 
of seemingly-inexplicable inconsistent highlighting has been happening 
for years and years now. There's something fundamental about the way 
cc-mode is written that makes bugs like this keep popping up. Is there 
some internal abstraction we can add, some algorithmic test suite we can 
write, that would make this whole class of bug less likely?
>> For years and years now, I've been thinking we just need more
>> deterministic parser-and-based mode support, and I still think that, but
>> on a realistic level, that doesn't seem to be coming any time soon.
> What does "parser-and-based" mean?

I'd meant to type "parser-and-ast" I think.

>> In the meantime, is there any general approach we might be able to use
>> to get stuff like the attached to stop happening?
> Probably none that we'd like.  Fontifying types only at their point of
> declaration would be one, but I don't think people would want that.  My
> impression is that the approach taken by CC Mode, like that of most
> language modes in Emacs, has pretty much reached the limits of what's
> possible, and it is unreasonable to expect perfect fontification (and
> indentation) from languages like C++ in all cases.
>



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 18:25         ` Stefan Monnier
@ 2021-06-04 18:36           ` Daniel Colascione
  2021-06-04 19:11             ` Eli Zaretskii
  2021-06-05  0:29             ` Stefan Monnier
  2021-06-04 19:07           ` Eli Zaretskii
  1 sibling, 2 replies; 274+ messages in thread
From: Daniel Colascione @ 2021-06-04 18:36 UTC (permalink / raw)
  To: Stefan Monnier, João Távora; +Cc: Eli Zaretskii, emacs-devel

On 6/4/21 11:25 AM, Stefan Monnier wrote:
>> But I don't understand what's stopping these tree-sitter C modules (like
>> [1] and [2]) to have access to the buffer's contents directly and have
>> the best of both worlds.
> I think it's a direct result of them being "modules": the API doesn't
> let modules access a buffer's content directly, so it's more efficient
> copy the content via `buffer-substring` and toss it on the other side
> than having to use something like `char-after`.

The problem is more fundamental than that. Internally, each buffer has a 
gap. External tools that operate on char arrays don't expect a gap. 
(They also don't expect to operate on Emacs internal coding, but that's 
another issue.) If we *did* grant direct buffer access via modules, we'd 
at least have to memcpy half (on average) the buffer to close the gap, 
then memcpy half the buffer (on average) to open the gap again when we 
began editing. If we're going to copy anyway, let's just copy via the 
buffer-substring interface. There's no reason that it has to be 
particularly inefficient.

Besides, memory copies are really, really, ridiculously fast. My system 
can cat from /dev/zero to /dev/null at ~18GB/sec. Copying a buffer's 
contents so we can give it to tree-sitter should be no issue at all.

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 18:25         ` Stefan Monnier
  2021-06-04 18:36           ` Daniel Colascione
@ 2021-06-04 19:07           ` Eli Zaretskii
  2021-06-04 19:26             ` Daniel Colascione
  1 sibling, 1 reply; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-04 19:07 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: dancol, joaotavora, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Eli Zaretskii <eliz@gnu.org>,  Daniel Colascione <dancol@dancol.org>,
>   emacs-devel@gnu.org
> Date: Fri, 04 Jun 2021 14:25:23 -0400
> 
> > But I don't understand what's stopping these tree-sitter C modules (like
> > [1] and [2]) to have access to the buffer's contents directly and have
> > the best of both worlds.
> 
> I think it's a direct result of them being "modules"

In the _real_ integration of those into Emacs, there's no reason for
them to be modules.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 18:36           ` Daniel Colascione
@ 2021-06-04 19:11             ` Eli Zaretskii
  2021-06-04 19:16               ` Daniel Colascione
  2021-06-05  0:29             ` Stefan Monnier
  1 sibling, 1 reply; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-04 19:11 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: emacs-devel, monnier, joaotavora

> Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org
> From: Daniel Colascione <dancol@dancol.org>
> Date: Fri, 4 Jun 2021 11:36:05 -0700
> 
> On 6/4/21 11:25 AM, Stefan Monnier wrote:
> >> But I don't understand what's stopping these tree-sitter C modules (like
> >> [1] and [2]) to have access to the buffer's contents directly and have
> >> the best of both worlds.
> > I think it's a direct result of them being "modules": the API doesn't
> > let modules access a buffer's content directly, so it's more efficient
> > copy the content via `buffer-substring` and toss it on the other side
> > than having to use something like `char-after`.
> 
> The problem is more fundamental than that. Internally, each buffer has a 
> gap. External tools that operate on char arrays don't expect a gap. 
> (They also don't expect to operate on Emacs internal coding, but that's 
> another issue.) If we *did* grant direct buffer access via modules, we'd 
> at least have to memcpy half (on average) the buffer to close the gap, 
> then memcpy half the buffer (on average) to open the gap again when we 
> began editing.

I see no reason for copying, nor for making these tools aware of the
gap.  At least tree-sitter allows the application to provide a
function through which tree-sitter will access the edited text.  It
should be simple to write such a function, because on the C level we
always know where the gap is.

> Besides, memory copies are really, really, ridiculously fast. My system 
> can cat from /dev/zero to /dev/null at ~18GB/sec. Copying a buffer's 
> contents so we can give it to tree-sitter should be no issue at all.

Why copy at all? all these libraries need is access to buffer text.
We can just give it to them.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 19:11             ` Eli Zaretskii
@ 2021-06-04 19:16               ` Daniel Colascione
  2021-06-04 19:26                 ` Eli Zaretskii
  0 siblings, 1 reply; 274+ messages in thread
From: Daniel Colascione @ 2021-06-04 19:16 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, monnier, joaotavora



On June 4, 2021 12:11:35 PM Eli Zaretskii <eliz@gnu.org> wrote:

>> Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org
>> From: Daniel Colascione <dancol@dancol.org>
>> Date: Fri, 4 Jun 2021 11:36:05 -0700
>>
>> On 6/4/21 11:25 AM, Stefan Monnier wrote:
>>>> But I don't understand what's stopping these tree-sitter C modules (like
>>>> [1] and [2]) to have access to the buffer's contents directly and have
>>>> the best of both worlds.
>>> I think it's a direct result of them being "modules": the API doesn't
>>> let modules access a buffer's content directly, so it's more efficient
>>> copy the content via `buffer-substring` and toss it on the other side
>>> than having to use something like `char-after`.
>>
>> The problem is more fundamental than that. Internally, each buffer has a
>> gap. External tools that operate on char arrays don't expect a gap.
>> (They also don't expect to operate on Emacs internal coding, but that's
>> another issue.) If we *did* grant direct buffer access via modules, we'd
>> at least have to memcpy half (on average) the buffer to close the gap,
>> then memcpy half the buffer (on average) to open the gap again when we
>> began editing.
>
> I see no reason for copying, nor for making these tools aware of the
> gap.  At least tree-sitter allows the application to provide a
> function through which tree-sitter will access the edited text.  It
> should be simple to write such a function, because on the C level we
> always know where the gap is.

So you propose providing a "char get_buffer_char(size_t POS)" function? 
That *is* copying If you run that over all values of POS, all you've done 
is make a slow and shitty memcpy.

So you want to amortize the call over several characters? Okay. Now you've 
reinvented buffer-substring.

>
>
>> Besides, memory copies are really, really, ridiculously fast. My system
>> can cat from /dev/zero to /dev/null at ~18GB/sec. Copying a buffer's
>> contents so we can give it to tree-sitter should be no issue at all.
>
> Why copy at all? all these libraries need is access to buffer text.
> We can just give it to them.

Because any kind of "access" to the buffer that doesn't expose the gap is 
going to be a copy anyway.





^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 19:07           ` Eli Zaretskii
@ 2021-06-04 19:26             ` Daniel Colascione
  2021-06-04 19:32               ` Eli Zaretskii
  0 siblings, 1 reply; 274+ messages in thread
From: Daniel Colascione @ 2021-06-04 19:26 UTC (permalink / raw)
  To: Eli Zaretskii, Stefan Monnier; +Cc: joaotavora, emacs-devel



On June 4, 2021 12:09:32 PM Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Stefan Monnier <monnier@iro.umontreal.ca>
>> Cc: Eli Zaretskii <eliz@gnu.org>,  Daniel Colascione <dancol@dancol.org>,
>> emacs-devel@gnu.org
>> Date: Fri, 04 Jun 2021 14:25:23 -0400
>>
>>> But I don't understand what's stopping these tree-sitter C modules (like
>>> [1] and [2]) to have access to the buffer's contents directly and have
>>> the best of both worlds.
>>
>> I think it's a direct result of them being "modules"
>
> In the _real_ integration of those into Emacs, there's no reason for
> them to be modules.

Eh. There's a benefit to keeping components loosely coupled






^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 19:16               ` Daniel Colascione
@ 2021-06-04 19:26                 ` Eli Zaretskii
  2021-06-04 19:33                   ` Daniel Colascione
  0 siblings, 1 reply; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-04 19:26 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: emacs-devel, monnier, joaotavora

> From: Daniel Colascione <dancol@dancol.org>
> CC: <monnier@iro.umontreal.ca>, <joaotavora@gmail.com>, <emacs-devel@gnu.org>
> Date: Fri, 04 Jun 2021 12:16:47 -0700
> 
> > I see no reason for copying, nor for making these tools aware of the
> > gap.  At least tree-sitter allows the application to provide a
> > function through which tree-sitter will access the edited text.  It
> > should be simple to write such a function, because on the C level we
> > always know where the gap is.
> 
> So you propose providing a "char get_buffer_char(size_t POS)" function? 
> That *is* copying If you run that over all values of POS, all you've done 
> is make a slow and shitty memcpy.

What do you think tree-sitter does with the fast copy you hand to it?
doesn't it walk it one character at a time?

And if you studied the tree-sitter's internals, and it uses
get_buffer_char as a means of copying text into its own buffer, then
perhaps we could ask tree-sitter developers to avoid the copy and use
the text directly.

> So you want to amortize the call over several characters? Okay. Now you've 
> reinvented buffer-substring.

buffer-substring is not just a copy of a chunk of text, it's much
more.  Even if eventually we need to use a memory copy, that'll run
circles around buffer-substring, and will avoid triggering GC.

> Because any kind of "access" to the buffer that doesn't expose the gap is 
> going to be a copy anyway.

The regexp routines aren't.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 19:26             ` Daniel Colascione
@ 2021-06-04 19:32               ` Eli Zaretskii
  0 siblings, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-04 19:32 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: emacs-devel, monnier, joaotavora

> From: Daniel Colascione <dancol@dancol.org>
> CC: <joaotavora@gmail.com>, <emacs-devel@gnu.org>
> Date: Fri, 04 Jun 2021 12:26:28 -0700
> 
> >> I think it's a direct result of them being "modules"
> >
> > In the _real_ integration of those into Emacs, there's no reason for
> > them to be modules.
> 
> Eh. There's a benefit to keeping components loosely coupled

That's an advantage, but we need to weigh it against the
disadvantages.  Maybe eventually we will decide it's worth making that
a module, but my point is that it isn't a restriction we cannot lift.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 19:26                 ` Eli Zaretskii
@ 2021-06-04 19:33                   ` Daniel Colascione
  2021-06-04 19:51                     ` Eli Zaretskii
  0 siblings, 1 reply; 274+ messages in thread
From: Daniel Colascione @ 2021-06-04 19:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, monnier, joaotavora



On June 4, 2021 12:26:47 PM Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Daniel Colascione <dancol@dancol.org>
>> CC: <monnier@iro.umontreal.ca>, <joaotavora@gmail.com>, <emacs-devel@gnu.org>
>> Date: Fri, 04 Jun 2021 12:16:47 -0700
>>
>>> I see no reason for copying, nor for making these tools aware of the
>>> gap.  At least tree-sitter allows the application to provide a
>>> function through which tree-sitter will access the edited text.  It
>>> should be simple to write such a function, because on the C level we
>>> always know where the gap is.
>>
>> So you propose providing a "char get_buffer_char(size_t POS)" function?
>> That *is* copying If you run that over all values of POS, all you've done
>> is make a slow and shitty memcpy.
>
> What do you think tree-sitter does with the fast copy you hand to it?
> doesn't it walk it one character at a time?
>
> And if you studied the tree-sitter's internals, and it uses
> get_buffer_char as a means of copying text into its own buffer, then
> perhaps we could ask tree-sitter developers to avoid the copy and use
> the text directly.

Teaching TS to use a generic cursor interface would be great.
>
>
>> So you want to amortize the call over several characters? Okay. Now you've
>> reinvented buffer-substring.
>
> buffer-substring is not just a copy of a chunk of text, it's much
> more.

The variant without text properties doesn't do much.

> Even if eventually we need to use a memory copy, that'll run
> circles around buffer-substring, and will avoid triggering GC.

Sure. I'm not opposed to adding an API that's basically a more efficient 
buffer substring for C callers. I'm just pointing out that the idea of 
giving TS "direct access" to a buffer without any copy at all doesn't make 
a lot of sense.


>
>
>> Because any kind of "access" to the buffer that doesn't expose the gap is
>> going to be a copy anyway.
>
> The regexp routines aren't.

The regexp routines have Emacs specific knowledge. My argument doesn't 
apply to code we can customize for Emacs.






^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
       [not found]         ` <83k0n9l9pv.fsf@gnu.org>
@ 2021-06-04 19:41           ` Jim Porter
  2021-06-04 19:53             ` Eli Zaretskii
  0 siblings, 1 reply; 274+ messages in thread
From: Jim Porter @ 2021-06-04 19:41 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: ubolonton, dancol, theo, joaotavora, emacs-devel

(Note: re-adding emacs-devel here, since I posted through Gmane and
attempted to eliminate dupe messages by posting only to the Gmane
mirror and not mailing the list directly. That was backwards, and I
should have removed the Gmane mirror, or perhaps just ignored the
issue and let the mailing list handle dupes.)

On Fri, Jun 4, 2021 at 12:18 PM Eli Zaretskii <eliz@gnu.org> wrote:
>
> > Cc: Eli Zaretskii <eliz@gnu.org>, Daniel Colascione <dancol@dancol.org>,
> >  ubolonton@gmail.com
> > From: Jim Porter <jporterbugs@gmail.com>
> > Date: Fri, 4 Jun 2021 09:43:26 -0700
> >
> > On 6/4/2021 3:08 AM, João Távora wrote:
> > > - However, LSP support for fontification seems like it's potentially
> > >    _less_ efficient than integrating something like tree-sitter as a C
> > >    module in Emacs.  That's because the contents of the buffer and
> > >    fontification results are continually transmitted back and forth via
> > >    pipes and JSON format.
> >
> > I imagine these potential performance issues would also be exacerbated
> > by editing over TRAMP.
>
> Why?  Fontification is always local, even if the files you edit are on
> a remote host.

The way I understand this particular hypothetical is that Eglot would
be responsible for asking the LSP server for syntax highlighting and
would then do the necessary work to tell Emacs how to fontify the
buffer. Currently, the way Eglot works for remote files is that it
runs the LSP server on the remote host via TRAMP. That works out
nicely right now, but if we wanted to get the syntax highlighting from
the (remote) LSP server to the (local) Emacs instance, that data would
have to go through TRAMP. I'm not sure how much data we're talking
about here, but if there are performance concerns about doing this
locally via pipes, it would be exacerbated by going through a slow
network.

- Jim

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 19:33                   ` Daniel Colascione
@ 2021-06-04 19:51                     ` Eli Zaretskii
  0 siblings, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-04 19:51 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: emacs-devel, monnier, joaotavora

> From: Daniel Colascione <dancol@dancol.org>
> CC: <monnier@iro.umontreal.ca>, <joaotavora@gmail.com>, <emacs-devel@gnu.org>
> Date: Fri, 04 Jun 2021 12:33:25 -0700
> 
> > What do you think tree-sitter does with the fast copy you hand to it?
> > doesn't it walk it one character at a time?
> >
> > And if you studied the tree-sitter's internals, and it uses
> > get_buffer_char as a means of copying text into its own buffer, then
> > perhaps we could ask tree-sitter developers to avoid the copy and use
> > the text directly.
> 
> Teaching TS to use a generic cursor interface would be great.

I don't remember if I looked at how it does it now, but are you sure
it doesn't already know how to do that?  Sounds like a natural thing
to me, but maybe I'm missing something.

> > buffer-substring is not just a copy of a chunk of text, it's much
> > more.
> 
> The variant without text properties doesn't do much.

It allocates memory!  For a large buffer (think xdisp.c) that is best
avoided.  I hope if we need to memcpy, we could at least use a pointer
to a buffer allocated by the parser library, so we won't need to.

> > Even if eventually we need to use a memory copy, that'll run
> > circles around buffer-substring, and will avoid triggering GC.
> 
> Sure. I'm not opposed to adding an API that's basically a more efficient 
> buffer substring for C callers. I'm just pointing out that the idea of 
> giving TS "direct access" to a buffer without any copy at all doesn't make 
> a lot of sense.

If it can use that wisely, I don't see why it wouldn't make sense.  If
it cannot, then I agree.  But still, I'd rather not give up from the
get-go and use buffer-substring just because it's there, I'd try
looking for something more scalable and less Lisp-consing.

Also, I hope we could arrange the copying to be driven by the display
engine through the JIT font-lock machinery, rather than sending the
entire buffer or its large parts.

> >> Because any kind of "access" to the buffer that doesn't expose the gap is
> >> going to be a copy anyway.
> >
> > The regexp routines aren't.
> 
> The regexp routines have Emacs specific knowledge.

I mean the way regexp routines use the buffer text as a C string (as 2
C strings, actually).  That doesn't use any Emacs specific knowledge
except the gap, and even the latter is largely solved by the caller.

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 19:41           ` Jim Porter
@ 2021-06-04 19:53             ` Eli Zaretskii
  2021-06-04 20:05               ` Jim Porter
  2021-06-04 20:14               ` Yuri Khan
  0 siblings, 2 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-04 19:53 UTC (permalink / raw)
  To: Jim Porter; +Cc: ubolonton, dancol, theo, joaotavora, emacs-devel

> From: Jim Porter <jporterbugs@gmail.com>
> Date: Fri, 4 Jun 2021 12:41:56 -0700
> Cc: joaotavora@gmail.com, theo@thornhill.no, dancol@dancol.org, 
> 	ubolonton@gmail.com, emacs-devel@gnu.org
> 
> Currently, the way Eglot works for remote files is that it
> runs the LSP server on the remote host via TRAMP.

Why does it do that?  Does the LSP server have to access the file
itself?  We have all the contents of that file locally in a buffer, so
we could hand it to LSP locally.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 19:53             ` Eli Zaretskii
@ 2021-06-04 20:05               ` Jim Porter
  2021-06-04 20:11                 ` Joost Kremers
  2021-06-05  6:41                 ` Eli Zaretskii
  2021-06-04 20:14               ` Yuri Khan
  1 sibling, 2 replies; 274+ messages in thread
From: Jim Porter @ 2021-06-04 20:05 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: ubolonton, Daniel Colascione, Theodor Thornhill,
	João Távora, emacs-devel

On Fri, Jun 4, 2021 at 12:53 PM Eli Zaretskii <eliz@gnu.org> wrote:
>
> > From: Jim Porter <jporterbugs@gmail.com>
> > Date: Fri, 4 Jun 2021 12:41:56 -0700
> > Cc: joaotavora@gmail.com, theo@thornhill.no, dancol@dancol.org,
> >       ubolonton@gmail.com, emacs-devel@gnu.org
> >
> > Currently, the way Eglot works for remote files is that it
> > runs the LSP server on the remote host via TRAMP.
>
> Why does it do that?  Does the LSP server have to access the file
> itself?  We have all the contents of that file locally in a buffer, so
> we could hand it to LSP locally.

I'm not an expert on the internals of LSP servers, but it's my
understanding that for a language server like clangd, it needs access
not just to the current file, but the entire source tree[1]. That
allows for things like completion of member function names of classes
defined in another file, etc. For clangd in particular, it might be
possible to run a local clangd that pulls from a remote index[2], but
I don't know if every LSP server has such capabilities.

Moreover, in my own usage of Eglot, I find it very convenient that it
runs the LSP server remotely. I often find myself files remotely over
TRAMP from a local machine with a minimal set of devtools. While I
could install all the LSP servers I need on all the machines I connect
from, it's less effort to rely on the fact that the machine that'll be
doing the compilation has all the devtools I need.

- Jim

[1] As well as instructions about how to *build* the source, contained
in `compile_commands.json'.
[2] https://clangd.llvm.org/remote-index.html

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 20:05               ` Jim Porter
@ 2021-06-04 20:11                 ` Joost Kremers
  2021-06-05  6:51                   ` Eli Zaretskii
  2021-06-05  6:41                 ` Eli Zaretskii
  1 sibling, 1 reply; 274+ messages in thread
From: Joost Kremers @ 2021-06-04 20:11 UTC (permalink / raw)
  To: emacs-devel


On Fri, Jun 04 2021, Jim Porter wrote:
> On Fri, Jun 4, 2021 at 12:53 PM Eli Zaretskii <eliz@gnu.org> wrote:
>> Why does it do that?  Does the LSP server have to access the file
>> itself?  We have all the contents of that file locally in a buffer, so
>> we could hand it to LSP locally.
>
> I'm not an expert on the internals of LSP servers, but it's my
> understanding that for a language server like clangd, it needs access
> not just to the current file, but the entire source tree[1].

And speaking from my experience with lsp-mode (not eglot) and Python, it needs
access to the entire virtual env so it can provide type information and
completions for built-in Python packages and for 3rd-party packages that you use
your code.


-- 
Joost Kremers
Life has its moments



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 19:53             ` Eli Zaretskii
  2021-06-04 20:05               ` Jim Porter
@ 2021-06-04 20:14               ` Yuri Khan
  1 sibling, 0 replies; 274+ messages in thread
From: Yuri Khan @ 2021-06-04 20:14 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: Jim Porter, ubolonton, theo, Emacs developers,
	João Távora, Daniel Colascione

On Sat, 5 Jun 2021 at 02:53, Eli Zaretskii <eliz@gnu.org> wrote:

> > Currently, the way Eglot works for remote files is that it
> > runs the LSP server on the remote host via TRAMP.
>
> Why does it do that?  Does the LSP server have to access the file
> itself?  We have all the contents of that file locally in a buffer, so
> we could hand it to LSP locally.

The contents of the file are not sufficient to parse it. A C file will
#include some headers. A Python program will import some modules. Some
of these (e.g. the standard library) will likely be installed both
locally and remotely, but might be different versions. Some
(first-party dependencies) will be resolvable from the source file.
Some (third-party dependencies) will be installed remotely but not
locally.

A useful pattern is to build a Docker container, mount the source tree
as a volume, install the toolchain and any third-party dependencies
into the container, and run the LSP server in there. This way, these
dependencies do not contaminate the developer’s machine, while still
being available to the LSP server. The container can also run
different versions of compilers, interpreters, etc. than are installed
on the developer’s machine.

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 18:36           ` Daniel Colascione
  2021-06-04 19:11             ` Eli Zaretskii
@ 2021-06-05  0:29             ` Stefan Monnier
  2021-06-05  6:32               ` Eli Zaretskii
  1 sibling, 1 reply; 274+ messages in thread
From: Stefan Monnier @ 2021-06-05  0:29 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: João Távora, Eli Zaretskii, emacs-devel

> The problem is more fundamental than that. Internally, each buffer has
> a gap. External tools that operate on char arrays don't expect a gap. (They

Yes, there's that as well.

> Besides, memory copies are really, really, ridiculously fast. My system can
>  cat from /dev/zero to /dev/null at ~18GB/sec. Copying a buffer's contents
> so we can give it to tree-sitter should be no issue at all.

Yes, beside the potential difficulty of giving direct access to the
buffer's content, there's the fact that the time needed to make a copy
will be dwarfed by the time needed by tree-sitter to parse it, turn it
into a tree, and for us to process the returned parse tree (unless we
copy a lot more than the part that tree-sitter parses, admittedly, but
presumably we shouldn't need to copy text at which tree-sitter won't
look).

        Stefan

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-05  0:29             ` Stefan Monnier
@ 2021-06-05  6:32               ` Eli Zaretskii
  0 siblings, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-05  6:32 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: dancol, joaotavora, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: João Távora <joaotavora@gmail.com>,  Eli
>  Zaretskii <eliz@gnu.org>,
>   emacs-devel@gnu.org
> Date: Fri, 04 Jun 2021 20:29:02 -0400
> 
> > Besides, memory copies are really, really, ridiculously fast. My system can
> >  cat from /dev/zero to /dev/null at ~18GB/sec. Copying a buffer's contents
> > so we can give it to tree-sitter should be no issue at all.
> 
> Yes, beside the potential difficulty of giving direct access to the
> buffer's content, there's the fact that the time needed to make a copy
> will be dwarfed by the time needed by tree-sitter to parse it, turn it
> into a tree, and for us to process the returned parse tree

Are you sure?  Tree-sitter advertises itself as being very fast in
that department.  Do we have any benchmark somewhere showing its
parsing speed?



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 20:05               ` Jim Porter
  2021-06-04 20:11                 ` Joost Kremers
@ 2021-06-05  6:41                 ` Eli Zaretskii
  2021-06-05  9:32                   ` João Távora
  2021-06-05  9:46                   ` Ergus
  1 sibling, 2 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-05  6:41 UTC (permalink / raw)
  To: Jim Porter; +Cc: ubolonton, dancol, theo, joaotavora, emacs-devel

> From: Jim Porter <jporterbugs@gmail.com>
> Date: Fri, 4 Jun 2021 13:05:40 -0700
> Cc: João Távora <joaotavora@gmail.com>, 
> 	Theodor Thornhill <theo@thornhill.no>, Daniel Colascione <dancol@dancol.org>, ubolonton@gmail.com, 
> 	emacs-devel@gnu.org
> 
> On Fri, Jun 4, 2021 at 12:53 PM Eli Zaretskii <eliz@gnu.org> wrote:
> >
> > > From: Jim Porter <jporterbugs@gmail.com>
> > > Date: Fri, 4 Jun 2021 12:41:56 -0700
> > > Cc: joaotavora@gmail.com, theo@thornhill.no, dancol@dancol.org,
> > >       ubolonton@gmail.com, emacs-devel@gnu.org
> > >
> > > Currently, the way Eglot works for remote files is that it
> > > runs the LSP server on the remote host via TRAMP.
> >
> > Why does it do that?  Does the LSP server have to access the file
> > itself?  We have all the contents of that file locally in a buffer, so
> > we could hand it to LSP locally.
> 
> I'm not an expert on the internals of LSP servers, but it's my
> understanding that for a language server like clangd, it needs access
> not just to the current file, but the entire source tree[1].

I see, thanks.

So is Emacs the only editor using LSP with remote files?  If other
editors support that, how do they solve this problem without incurring
delays?

> Moreover, in my own usage of Eglot, I find it very convenient that it
> runs the LSP server remotely. I often find myself files remotely over
> TRAMP from a local machine with a minimal set of devtools. While I
> could install all the LSP servers I need on all the machines I connect
> from, it's less effort to rely on the fact that the machine that'll be
> doing the compilation has all the devtools I need.

That sounds like a use case for running Emacs on the remote machine,
and only having the display on the local machine, like via X
forwarding or similar technology?



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 20:11                 ` Joost Kremers
@ 2021-06-05  6:51                   ` Eli Zaretskii
  2021-06-05 10:14                     ` Joost Kremers
                                       ` (2 more replies)
  0 siblings, 3 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-05  6:51 UTC (permalink / raw)
  To: Joost Kremers; +Cc: emacs-devel

> From: Joost Kremers <joostkremers@fastmail.fm>
> Date: Fri, 04 Jun 2021 22:11:06 +0200
> 
> > I'm not an expert on the internals of LSP servers, but it's my
> > understanding that for a language server like clangd, it needs access
> > not just to the current file, but the entire source tree[1].
> 
> And speaking from my experience with lsp-mode (not eglot) and Python, it needs
> access to the entire virtual env so it can provide type information and
> completions for built-in Python packages and for 3rd-party packages that you use
> your code.

That cannot be a mandatory requirement, right?  Because otherwise LSP
wouldn't be able to support editing of an unfinished project, where
not everything is laid out 100% yet.  The user will expect that some
completion cases could be inaccurate when not everything is coded yet,
but the user will NOT expect to see inaccurate "syntax highlighting"
or indentation, nor incorrect "show definition" and "show callers"
results for the code that was already written, and in particular for
the code in the file being edited.

Thus, I'd expect LSP to be able to deal with missing information,
which then means it shouldn't require access to the entire tree as a
prerequisite for useful functionality.

Am I missing something?

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-05  6:41                 ` Eli Zaretskii
@ 2021-06-05  9:32                   ` João Távora
  2021-06-05  9:59                     ` Ergus
  2021-06-05 11:25                     ` cc-mode fontification feels random Eli Zaretskii
  2021-06-05  9:46                   ` Ergus
  1 sibling, 2 replies; 274+ messages in thread
From: João Távora @ 2021-06-05  9:32 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Jim Porter, ubolonton, dancol, theo, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> That sounds like a use case for running Emacs on the remote machine,
> and only having the display on the local machine, like via X
> forwarding or similar technology?

Then why use TRAMP at all?  Anyway, this is just an aside, but running
the server remotely and also editing the files via TRAMP is still
popular and predates LSP by many years: Many use the SLIME or SLY Common
Lisp IDEs like that, pretty effectively.  Personally I prefer running an
Emacs on the remote machine, but there's clearly a share of users who
like to keep use one local Emacs for everything.

João

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-05  6:41                 ` Eli Zaretskii
  2021-06-05  9:32                   ` João Távora
@ 2021-06-05  9:46                   ` Ergus
  2021-06-05 11:27                     ` Eli Zaretskii
  1 sibling, 1 reply; 274+ messages in thread
From: Ergus @ 2021-06-05  9:46 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: Jim Porter, ubolonton, dancol, theo, joaotavora, emacs-devel

On Sat, Jun 05, 2021 at 09:41:15AM +0300, Eli Zaretskii wrote:
>
>I see, thanks.
>
>So is Emacs the only editor using LSP with remote files?  If other
>editors support that, how do they solve this problem without incurring
>delays?
>
I work with servers all the time and I have tried all kind of tools for
remote editing.

  So far the only other editor for remote files supporting completions
that just works and I am aware of; it is Visual Studio [code]
family. And actually, they use LSP protocol for that (completion and
indentation).

If I understand more or less how it works in VS it seems like the LSP
server runs locally (because it does not require any remote
modification/installation or so).

They do a kind of local mirror for completion (probably something
similar to sshfs to access all the unmodified files "on demand" and get
the best possible information) and they store a cache of the project in
the local filesystem to avoid recompiling everything the next time they
use the project..

Of course, there are some problems when the remote environment is not
available locally (missing modules, compilers, libraries). But in
general it is easier to install/modify locally than remotely (in our
tramp approach + lsp-mode or eglot; if clangd is not installed in the
remote server the user have nothing at all... and installing clangd in
every single remote system we use is not an option due to time,
permissions or resources.).

It seems that there are some heuristics there too, to reduce errors
exposes to the user and do the best possible, but in general it works
pretty well, specially for C/C++, jacascript, nodejs, and python.

>> Moreover, in my own usage of Eglot, I find it very convenient that it
>> runs the LSP server remotely. I often find myself files remotely over
>> TRAMP from a local machine with a minimal set of devtools. While I
>> could install all the LSP servers I need on all the machines I connect
>> from, it's less effort to rely on the fact that the machine that'll be
>> doing the compilation has all the devtools I need.
>
>That sounds like a use case for running Emacs on the remote machine,
>and only having the display on the local machine, like via X
>forwarding or similar technology?
>
Some time ago, one of my first questions in this mailing list was how to
run emacsserver on a remote machine and connect to that with the local
emacsclient. On that moment that was not supported.

The approach any way has some practical issues. 

1) We can't always open ports in the remote server; so we need at the
end to do a proxy throw ssh.

2) Many remote servers (for example login nodes in HPC servers) kill
the running processes if the user disconnects or if the process is in
the background.

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-05  9:32                   ` João Távora
@ 2021-06-05  9:59                     ` Ergus
  2021-06-05 11:29                       ` Eli Zaretskii
  2021-06-05 13:59                       ` Remote GUI Emacs really works (was: cc-mode fontification feels random) Óscar Fuentes
  2021-06-05 11:25                     ` cc-mode fontification feels random Eli Zaretskii
  1 sibling, 2 replies; 274+ messages in thread
From: Ergus @ 2021-06-05  9:59 UTC (permalink / raw)
  To: João Távora
  Cc: Eli Zaretskii, Jim Porter, ubolonton, dancol, theo, emacs-devel

On Sat, Jun 05, 2021 at 10:32:12AM +0100, Joï¿½o Tï¿½vora wrote:
>Eli Zaretskii <eliz@gnu.org> writes:
>
>> That sounds like a use case for running Emacs on the remote machine,
>> and only having the display on the local machine, like via X
>> forwarding or similar technology?
>
>Then why use TRAMP at all?  Anyway, this is just an aside, but running
>the server remotely and also editing the files via TRAMP is still
>popular and predates LSP by many years: Many use the SLIME or SLY Common
>Lisp IDEs like that, pretty effectively.  Personally I prefer running an
>Emacs on the remote machine, but there's clearly a share of users who
>like to keep use one local Emacs for everything.
>
>Joï¿½o
>
Usually running a remote emacs is extremely slow if using gui and
creates all kind of issues if the connection fails or hang.

When using tui there are also some issues due to terminfo in the remote
system; because the local TERM is informed to the remote system when ssh
connection starts, but if the remote system does not have terminfo for
that term, then it tries to do the best (use a default). In that case,
for normal uses it just works, even for vim and nano; but is seems like
emacs tries to use more advanced features or characters. Also a
connection hang is very problematic because emacs totally blocks and you
lost your changes. And packages like xclip doesn't work (as expected).

The other issue is also "resources", when using remote systems like
raspberry pi or permissions restricted environments, installing emacs
remotely is not always possible either.

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-05  6:51                   ` Eli Zaretskii
@ 2021-06-05 10:14                     ` Joost Kremers
  2021-06-05 11:31                       ` Eli Zaretskii
  2021-06-05 13:23                     ` Stefan Monnier
  2021-06-05 18:46                     ` João Távora
  2 siblings, 1 reply; 274+ messages in thread
From: Joost Kremers @ 2021-06-05 10:14 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On Sat, Jun 05 2021, Eli Zaretskii wrote:
> Thus, I'd expect LSP to be able to deal with missing information,
> which then means it shouldn't require access to the entire tree as a
> prerequisite for useful functionality.

Of course the LSP server won't just give up if there is missing information, but
its functionality will be reduced. For me, if my development environment were on
a remote machine and I couldn't (or don't want to) replicate that environment on
my local machine, running the LSP server locally would probably take away much
of the appeal of using one.

-- 
Joost Kremers
Life has its moments

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-05  9:32                   ` João Távora
  2021-06-05  9:59                     ` Ergus
@ 2021-06-05 11:25                     ` Eli Zaretskii
  1 sibling, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-05 11:25 UTC (permalink / raw)
  To: João Távora; +Cc: jporterbugs, ubolonton, dancol, theo, emacs-devel

> From: João Távora <joaotavora@gmail.com>
> Cc: Jim Porter <jporterbugs@gmail.com>,  theo@thornhill.no,
>   dancol@dancol.org,  ubolonton@gmail.com,  emacs-devel@gnu.org
> Date: Sat, 05 Jun 2021 10:32:12 +0100
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > That sounds like a use case for running Emacs on the remote machine,
> > and only having the display on the local machine, like via X
> > forwarding or similar technology?
> 
> Then why use TRAMP at all?

Because it could be the other way around: the local machine has the
full development environment available, while the remote one doesn't.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-05  9:46                   ` Ergus
@ 2021-06-05 11:27                     ` Eli Zaretskii
  0 siblings, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-05 11:27 UTC (permalink / raw)
  To: Ergus; +Cc: jporterbugs, ubolonton, theo, emacs-devel, joaotavora, dancol

> Date: Sat, 5 Jun 2021 11:46:39 +0200
> From: Ergus <spacibba@aol.com>
> Cc: Jim Porter <jporterbugs@gmail.com>, ubolonton@gmail.com,
> 	dancol@dancol.org, theo@thornhill.no, joaotavora@gmail.com,
> 	emacs-devel@gnu.org
> 
> >That sounds like a use case for running Emacs on the remote machine,
> >and only having the display on the local machine, like via X
> >forwarding or similar technology?
> >
> Some time ago, one of my first questions in this mailing list was how to
> run emacsserver on a remote machine and connect to that with the local
> emacsclient. On that moment that was not supported.

I didn't mean to suggest using emacsclient, I meant to suggest a
remote display via the X capabilities.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-05  9:59                     ` Ergus
@ 2021-06-05 11:29                       ` Eli Zaretskii
  2021-06-05 11:55                         ` Daniel Colascione
  2021-06-05 12:43                         ` Ergus
  2021-06-05 13:59                       ` Remote GUI Emacs really works (was: cc-mode fontification feels random) Óscar Fuentes
  1 sibling, 2 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-05 11:29 UTC (permalink / raw)
  To: Ergus; +Cc: jporterbugs, ubolonton, theo, emacs-devel, joaotavora, dancol

> Date: Sat, 5 Jun 2021 11:59:04 +0200
> From: Ergus <spacibba@aol.com>
> Cc: Eli Zaretskii <eliz@gnu.org>, Jim Porter <jporterbugs@gmail.com>,
> 	ubolonton@gmail.com, dancol@dancol.org, theo@thornhill.no,
> 	emacs-devel@gnu.org
> 
> Usually running a remote emacs is extremely slow if using gui and
> creates all kind of issues if the connection fails or hang.

And using Tramp with bad connections doesn't create any issues?



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-05 10:14                     ` Joost Kremers
@ 2021-06-05 11:31                       ` Eli Zaretskii
  2021-06-05 12:12                         ` Joost Kremers
  0 siblings, 1 reply; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-05 11:31 UTC (permalink / raw)
  To: Joost Kremers; +Cc: emacs-devel

> From: Joost Kremers <joostkremers@fastmail.fm>
> Cc: emacs-devel@gnu.org
> Date: Sat, 05 Jun 2021 12:14:42 +0200
> 
> Of course the LSP server won't just give up if there is missing information, but
> its functionality will be reduced. For me, if my development environment were on
> a remote machine and I couldn't (or don't want to) replicate that environment on
> my local machine, running the LSP server locally would probably take away much
> of the appeal of using one.

And the speedup doesn't matter?



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-05 11:29                       ` Eli Zaretskii
@ 2021-06-05 11:55                         ` Daniel Colascione
  2021-06-05 12:27                           ` Eli Zaretskii
  2021-06-05 12:43                         ` Ergus
  1 sibling, 1 reply; 274+ messages in thread
From: Daniel Colascione @ 2021-06-05 11:55 UTC (permalink / raw)
  To: Eli Zaretskii, Ergus
  Cc: jporterbugs, ubolonton, theo, joaotavora, emacs-devel

On 6/5/21 4:29 AM, Eli Zaretskii wrote:
>> Date: Sat, 5 Jun 2021 11:59:04 +0200
>> From: Ergus <spacibba@aol.com>
>> Cc: Eli Zaretskii <eliz@gnu.org>, Jim Porter <jporterbugs@gmail.com>,
>> 	ubolonton@gmail.com, dancol@dancol.org, theo@thornhill.no,
>> 	emacs-devel@gnu.org
>>
>> Usually running a remote emacs is extremely slow if using gui and
>> creates all kind of issues if the connection fails or hang.
> 
> And using Tramp with bad connections doesn't create any issues?

Fewer than running a remote Emacs: you don't interact with Tramp on each 
keystroke. There are tons of advantages to Tramp; it's a first-class 
feature and it's worth making language server support work properly with it.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-05 11:31                       ` Eli Zaretskii
@ 2021-06-05 12:12                         ` Joost Kremers
  0 siblings, 0 replies; 274+ messages in thread
From: Joost Kremers @ 2021-06-05 12:12 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On Sat, Jun 05 2021, Eli Zaretskii wrote:
>> From: Joost Kremers <joostkremers@fastmail.fm>
>> Cc: emacs-devel@gnu.org
>> Date: Sat, 05 Jun 2021 12:14:42 +0200
>> 
>> Of course the LSP server won't just give up if there is missing information,
>> but its functionality will be reduced. For me, if my development environment
>> were on a remote machine and I couldn't (or don't want to) replicate that
>> environment on my local machine, running the LSP server locally would
>> probably take away much of the appeal of using one.
>
> And the speedup doesn't matter?

I guess it would be a trade-off. Last time I had to run my code on a remote
machine I wasn't using lsp-mode yet, so I'm not speaking from experience here. I
just wanted to say that running an LSP server without the full development
environment makes the LSP server less capable, and if all the server does is
provide syntax highlighting and indentation, it might be less of a hassle to use
something else instead (e.g., elpy, or just plain old python-mode).

-- 
Joost Kremers
Life has its moments

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-05 11:55                         ` Daniel Colascione
@ 2021-06-05 12:27                           ` Eli Zaretskii
  2021-06-05 17:59                             ` Jim Porter
  0 siblings, 1 reply; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-05 12:27 UTC (permalink / raw)
  To: Daniel Colascione
  Cc: jporterbugs, spacibba, theo, ubolonton, emacs-devel, joaotavora

> Cc: joaotavora@gmail.com, jporterbugs@gmail.com, ubolonton@gmail.com,
>  theo@thornhill.no, emacs-devel@gnu.org
> From: Daniel Colascione <dancol@dancol.org>
> Date: Sat, 5 Jun 2021 04:55:21 -0700
> 
> On 6/5/21 4:29 AM, Eli Zaretskii wrote:
> >> Date: Sat, 5 Jun 2021 11:59:04 +0200
> >> From: Ergus <spacibba@aol.com>
> >> Cc: Eli Zaretskii <eliz@gnu.org>, Jim Porter <jporterbugs@gmail.com>,
> >> 	ubolonton@gmail.com, dancol@dancol.org, theo@thornhill.no,
> >> 	emacs-devel@gnu.org
> >>
> >> Usually running a remote emacs is extremely slow if using gui and
> >> creates all kind of issues if the connection fails or hang.
> > 
> > And using Tramp with bad connections doesn't create any issues?
> 
> Fewer than running a remote Emacs: you don't interact with Tramp on each 
> keystroke. There are tons of advantages to Tramp; it's a first-class 
> feature and it's worth making language server support work properly with it.

No argument that we need to support that properly.  This sub-thread
started when someone said it would probably be slow with Tramp in the
loop.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-05 11:29                       ` Eli Zaretskii
  2021-06-05 11:55                         ` Daniel Colascione
@ 2021-06-05 12:43                         ` Ergus
  1 sibling, 0 replies; 274+ messages in thread
From: Ergus @ 2021-06-05 12:43 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: joaotavora, jporterbugs, ubolonton, dancol, theo, emacs-devel

On Sat, Jun 05, 2021 at 02:29:15PM +0300, Eli Zaretskii wrote:
>> Date: Sat, 5 Jun 2021 11:59:04 +0200
>> From: Ergus <spacibba@aol.com>
>> Cc: Eli Zaretskii <eliz@gnu.org>, Jim Porter <jporterbugs@gmail.com>,
>> 	ubolonton@gmail.com, dancol@dancol.org, theo@thornhill.no,
>> 	emacs-devel@gnu.org
>>
>> Usually running a remote emacs is extremely slow if using gui and
>> creates all kind of issues if the connection fails or hang.
>
>And using Tramp with bad connections doesn't create any issues?

Yes, but much much less than forwarding X over ssh or even use emacs on
tty over ssh.

ex: Tramp is capable to reconnect when the connection fails for some
minutes, and don't exchange a so huge amount of information constantly
over the network, so a poor bandwidth almost does not affected either.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-05  6:51                   ` Eli Zaretskii
  2021-06-05 10:14                     ` Joost Kremers
@ 2021-06-05 13:23                     ` Stefan Monnier
  2021-06-05 17:08                       ` Óscar Fuentes
  2021-06-05 18:46                     ` João Távora
  2 siblings, 1 reply; 274+ messages in thread
From: Stefan Monnier @ 2021-06-05 13:23 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Joost Kremers, emacs-devel

> not everything is laid out 100% yet.  The user will expect that some
> completion cases could be inaccurate when not everything is coded yet,
> but the user will NOT expect to see inaccurate "syntax highlighting"
> or indentation, nor incorrect "show definition" and "show callers"
> results for the code that was already written, and in particular for
> the code in the file being edited.

I think that's where tree-sitter shines, because AFAIK it does not rely
on access to other files.

I think you'd expect a good LSP server to "degrade gracefully" and still
provide good info for indentation and syntax highlighting even if you
only have the one file and all the other files in the project
are missing.


        Stefan




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Remote GUI Emacs really works (was: cc-mode fontification feels random)
  2021-06-05  9:59                     ` Ergus
  2021-06-05 11:29                       ` Eli Zaretskii
@ 2021-06-05 13:59                       ` Óscar Fuentes
  1 sibling, 0 replies; 274+ messages in thread
From: Óscar Fuentes @ 2021-06-05 13:59 UTC (permalink / raw)
  To: emacs-devel

Ergus <spacibba@aol.com> writes:

> Usually running a remote emacs is extremely slow if using gui and
> creates all kind of issues if the connection fails or hang.

Use the right method: something based on the NX protocol, like x2go.

Until a year ago, I used that with no issues on ADSL lines with 60KBps
upstream bandwidth. Connection failures are a non-issue: the session is
kept live on the remote machine, you simply reconnect to it and
everything comes back as it was. The same mechanism allows you to
suspend and resume the session at your convenience.

Text-based applications such as Emacs work very well with this setup
over slow networks, just some tens of KBps are enough. As long as you
don't have too much latency (as it is often the case with cellular
networks) the experience is quite good.

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-05 13:23                     ` Stefan Monnier
@ 2021-06-05 17:08                       ` Óscar Fuentes
  2021-06-05 17:31                         ` Stefan Monnier
  2021-06-05 17:32                         ` Eli Zaretskii
  0 siblings, 2 replies; 274+ messages in thread
From: Óscar Fuentes @ 2021-06-05 17:08 UTC (permalink / raw)
  To: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> not everything is laid out 100% yet.  The user will expect that some
>> completion cases could be inaccurate when not everything is coded yet,
>> but the user will NOT expect to see inaccurate "syntax highlighting"
>> or indentation, nor incorrect "show definition" and "show callers"
>> results for the code that was already written, and in particular for
>> the code in the file being edited.
>
> I think that's where tree-sitter shines, because AFAIK it does not rely
> on access to other files.

I took a look at tree-sitter and, IIUC, it suffers from the same
limitations as CC mode: it gets the information provided by a parser.

For starts, in C++, being limited to the current file means that it is
unable to determine if Foo::bar is a type, a value or a function when
Foo is defined on a header file.

But most fundamentally, it is unable to determine what
Foo<whatever>::bar is even when it is defined on the current file.

If we are going to really modernize Emacs' programming language support
we need to provide more than parser-based syntax highlighting and
indentation. We need smart code completion, code hints, transformations,
etc. That means we need something like LSP. Tree-sitter migth be useful
for the languages not yet supported by LSP, though (but, if I got it
right, tree-sitter is implemented on Javascript, so it requires a JS
engine to work, maybe too much of a dependency for something that
doesn't add that much over what we have now.)

> I think you'd expect a good LSP server to "degrade gracefully" and still
> provide good info for indentation and syntax highlighting even if you
> only have the one file and all the other files in the project
> are missing.

As already mentioned elsewhere on this thread, an LSP server with access
to just the current file is severely handicapped. One thing is to miss
the information about some functions yet-to-be-written and another thing
entirely is to ignore everything not defined on the current file.

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-05 17:08                       ` Óscar Fuentes
@ 2021-06-05 17:31                         ` Stefan Monnier
  2021-06-05 17:32                         ` Eli Zaretskii
  1 sibling, 0 replies; 274+ messages in thread
From: Stefan Monnier @ 2021-06-05 17:31 UTC (permalink / raw)
  To: Óscar Fuentes; +Cc: emacs-devel

>> I think that's where tree-sitter shines, because AFAIK it does not rely
>> on access to other files.
>
> I took a look at tree-sitter and, IIUC, it suffers from the same
> limitations as CC mode: it gets the information provided by a parser.

To a large extent that's unavoidable.  AFAIU it should be able to do
a slightly better job in some cases by just trying out all possible
interpretations and only keeping those that make sense (from a purely
syntactic point of view).

> But most fundamentally, it is unable to determine what
> Foo<whatever>::bar is even when it is defined on the current file.

Indeed it's quite possible that there are also cases where tree-sitter
does a worse job than CC-mode, e.g. by not taking into account semantic
information that can be extracted from the current file.

> If we are going to really modernize Emacs' programming language
> support we need to provide more than parser-based syntax highlighting
> and indentation. [...] That means we need something like LSP.

I believe/hope this is obvious to everyone, yes.

> Tree-sitter migth be useful for the languages not yet supported by
> LSP, though

My impression is that tree-sitter might be useful for
syntax-highlighting and indentation.  I'm not sure how well those two
features are supported/handled by LSP servers and clients currently.

> (but, if I got it right, tree-sitter is implemented on Javascript,

AFAIK only the source grammars and the grammar-compiler is written in
Javascript: the parsing engine is written in C and exposed as
a C library.

>> I think you'd expect a good LSP server to "degrade gracefully" and still
>> provide good info for indentation and syntax highlighting even if you
>> only have the one file and all the other files in the project
>> are missing.
> As already mentioned elsewhere on this thread, an LSP server with access
> to just the current file is severely handicapped.

Of course.  The question is whether it can still provide a good enough
behavior in that case compared to tree-sitter.  If not, it might be
an argument in favor of using both LSP and tree-sitter.

        Stefan

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-05 17:08                       ` Óscar Fuentes
  2021-06-05 17:31                         ` Stefan Monnier
@ 2021-06-05 17:32                         ` Eli Zaretskii
  1 sibling, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-05 17:32 UTC (permalink / raw)
  To: Óscar Fuentes; +Cc: emacs-devel

> From: Óscar Fuentes <ofv@wanadoo.es>
> Date: Sat, 05 Jun 2021 19:08:13 +0200
> 
> If we are going to really modernize Emacs' programming language support
> we need to provide more than parser-based syntax highlighting and
> indentation. We need smart code completion, code hints, transformations,
> etc.

Yes, we need that, and much more.  But if we reject partial solutions
because they aren't 110% perfect with every PL out there, we get to
stay with what we have now, which is much worse.  So in my book
incremental improvements using contemporary technology are a win, even
if they don't get us all the way to the ultimate goal.  Let's not
discourage potential volunteers from taking up the job of bringing
stuff like tree-sitter to Emacs because it may not be perfect for some
demanding languages.

> That means we need something like LSP.

We need to try both these technologies, before we make the decision.
Each one of them has upsides and downsides, and it is therefore
unwise, IMO, to put all the eggs into a single basket.  Chances are we
will want to keep both solutions handy, because they can be
complementary.

> Tree-sitter migth be useful for the languages not yet supported by
> LSP, though (but, if I got it right, tree-sitter is implemented on
> Javascript, so it requires a JS engine to work, maybe too much of a
> dependency for something that doesn't add that much over what we
> have now.)

That's secondary, IMO.  If the main issues are solved satisfactorily,
I don't expect too much time to pass before someone comes up with a
way of producing the tree-sitter grammars in Emacs Lisp.

> > I think you'd expect a good LSP server to "degrade gracefully" and still
> > provide good info for indentation and syntax highlighting even if you
> > only have the one file and all the other files in the project
> > are missing.
> 
> As already mentioned elsewhere on this thread, an LSP server with access
> to just the current file is severely handicapped. One thing is to miss
> the information about some functions yet-to-be-written and another thing
> entirely is to ignore everything not defined on the current file.

Once again, my suggestion is not to require perfect solutions,
especially since what we have now is nowhere near perfection.

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-05 12:27                           ` Eli Zaretskii
@ 2021-06-05 17:59                             ` Jim Porter
  2021-06-05 18:56                               ` Daniel Martín
  0 siblings, 1 reply; 274+ messages in thread
From: Jim Porter @ 2021-06-05 17:59 UTC (permalink / raw)
  To: Eli Zaretskii, Daniel Colascione
  Cc: joaotavora, spacibba, theo, ubolonton, emacs-devel

On 6/5/2021 5:27 AM, Eli Zaretskii wrote:
>> Cc: joaotavora@gmail.com, jporterbugs@gmail.com, ubolonton@gmail.com,
>>   theo@thornhill.no, emacs-devel@gnu.org
>> From: Daniel Colascione <dancol@dancol.org>
>> Date: Sat, 5 Jun 2021 04:55:21 -0700
>>
>> Fewer than running a remote Emacs: you don't interact with Tramp on each
>> keystroke. There are tons of advantages to Tramp; it's a first-class
>> feature and it's worth making language server support work properly with it.
> 
> No argument that we need to support that properly.  This sub-thread
> started when someone said it would probably be slow with Tramp in the
> loop.

Just to clarify: it may turn out that communicating with a remote LSP 
server is fast enough for this purpose. However, if performance issues 
do crop up, they'll be more severe with TRAMP. Having used Eglot over 
TRAMP on a fast connection (within a LAN), nothing jumps out as 
annoyingly slow, although the things I use LSP for aren't 
latency-sensitive. It might be worth collecting some numbers on this to 
see how slow it really is.

(As mentioned elsewhere in the thread, I'm very happy with how Eglot 
works with remote files currently, since it means that my dev tools for 
a particular environment can live on the same system as my source code. 
Having to run the appropriate LSP server locally to edit remote files 
would be inconvenient, although I suppose I could tolerate it.)

Looking into this a bit more, I'm not actually 100% sure how much VSCode 
(or other LSP-aware editors) use LSP for syntax highlighting today. 
Semantic tokens are only available in the most recent specification of 
LSP (3.16)[1], so many LSP clients/servers likely wouldn't be using this 
yet. It might be helpful to see what they were doing prior to this; 
there may be some relatively non-invasive changes that could improve things.

Perhaps there's a way to use something like tree-sitter (or even cc-mode 
as it currently stands) to get 90% of the way there and then augment 
that with results from the LSP server. For example, to address the 
original post, it seems the main issue is that cc-mode doesn't know 
what's a type and what isn't. If we could get type information from the 
LSP server, then cc-mode could take that into account. In the example, 
we could even rely on the fact that `std::variant' takes types as 
arguments, so we know that the arguments are types (or the code is 
incorrect).

- Jim

[1] 
https://microsoft.github.io/language-server-protocol/specifications/specification-current/#textDocument_semanticTokens

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-05  6:51                   ` Eli Zaretskii
  2021-06-05 10:14                     ` Joost Kremers
  2021-06-05 13:23                     ` Stefan Monnier
@ 2021-06-05 18:46                     ` João Távora
  2 siblings, 0 replies; 274+ messages in thread
From: João Távora @ 2021-06-05 18:46 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Joost Kremers, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Joost Kremers <joostkremers@fastmail.fm>
>> Date: Fri, 04 Jun 2021 22:11:06 +0200
>> 
>> > I'm not an expert on the internals of LSP servers, but it's my
>> > understanding that for a language server like clangd, it needs access
>> > not just to the current file, but the entire source tree[1].
>> 
>> And speaking from my experience with lsp-mode (not eglot) and Python, it needs
>> access to the entire virtual env so it can provide type information and
>> completions for built-in Python packages and for 3rd-party packages that you use
>> your code.
>
> That cannot be a mandatory requirement, right?  Because otherwise LSP
> wouldn't be able to support editing of an unfinished project, where
> not everything is laid out 100% yet.  

You're mostly right.  Most good servers give some level of support even
if they can't make out the whole project.  And clangd is one of them, in
my experience.  It'd likely be able to fontify perfectly just by looking
at the file.  Of course, to be able to relate compilation units and
provide full completion they must understand the project and the linking
between units (unfortunately, this requires duplicating much of one's
makefile in a compile-commands.json or equivalent, though there are
tools that try to automate that).

But "seeing" the whole project isn't generally a problem as LSP usually
run in the same host where the project lives.  They don't see the
project through Emacs, they only see the "document" through Emacs, which
acts as the LSP client.  A "document" is similar to a
file-visiting-buffer.

João

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-05 17:59                             ` Jim Porter
@ 2021-06-05 18:56                               ` Daniel Martín
  0 siblings, 0 replies; 274+ messages in thread
From: Daniel Martín @ 2021-06-05 18:56 UTC (permalink / raw)
  To: Jim Porter
  Cc: Eli Zaretskii, Daniel Colascione, joaotavora, spacibba, theo,
	ubolonton, emacs-devel

Jim Porter <jporterbugs@gmail.com> writes:

> Looking into this a bit more, I'm not actually 100% sure how much
> VSCode (or other LSP-aware editors) use LSP for syntax highlighting
> today. Semantic tokens are only available in the most recent
> specification of LSP (3.16)[1], so many LSP clients/servers likely
> wouldn't be using this yet. It might be helpful to see what they were
> doing prior to this; there may be some relatively non-invasive changes
> that could improve things.

VSCode uses TextMate grammars[1] for syntax highlighting.  If the
language server supports it, it adds semantic highlighting on top of it.

I think TextMate grammars have more or less the same problems our
current syntax highlighting engine has: They are regexp-based, and
difficult to write and maintain.

Two editors that I think are already using Tree-sitter for syntax
highlighting are Atom and Neovim.

[1]: https://macromates.com/manual/en/language_grammars

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 15:54 ` Alan Mackenzie
  2021-06-04 18:30   ` Daniel Colascione
@ 2021-06-05 20:25   ` Dmitry Gutov
  2021-06-06 11:53     ` Alan Mackenzie
  1 sibling, 1 reply; 274+ messages in thread
From: Dmitry Gutov @ 2021-06-05 20:25 UTC (permalink / raw)
  To: Alan Mackenzie, Daniel Colascione; +Cc: emacs-devel

On 04.06.2021 18:54, Alan Mackenzie wrote:
> Whether a type is recognised as such depends on that, yes.  It's hard to
> think of a better way without having the resources of a compiler,
> particularly for ill-behaved languages like C+

Would it work much worse if you took the approach of not applying the 
highlighting when you frequently cannot be sure of what the type of the 
term is?

That would mean none of the types in brackets would be highlighted in 
the original example, but perhaps that is still better than the current 
result?

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-04 18:30   ` Daniel Colascione
@ 2021-06-06 11:37     ` Alan Mackenzie
  2021-06-06 11:57       ` Eli Zaretskii
  2021-06-06 17:44       ` Stefan Monnier
  0 siblings, 2 replies; 274+ messages in thread
From: Alan Mackenzie @ 2021-06-06 11:37 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: emacs-devel

Hello, Daniel.

On Fri, Jun 04, 2021 at 11:30:09 -0700, Daniel Colascione wrote:
> On 6/4/21 8:54 AM, Alan Mackenzie wrote:
> >> Is there any *general* way that we can make fontification more robust
> >> and consistent?
> > Like other people have said on the thread, rewriting CC Mode to use an
> > LSP parser.

> > Less drastically, it would be possible to fix the specific bug you
> > allude to, by the user making a list of types and configuring CC Mode
> > with them, rather than attempting to recognise such types.  This feels
> > as though it would be tedious to use, though.

> I understand that cc-mode can't always get it right. It's only 
> asymptotically omniscient. :-) Some deficiencies in highlighting are 
> bound to happen.

> What's striking to me is the inconsistency in the highlighting. None of 
> the types in the std::variant declaration in my screenshot is special. 
> They're all declared in the same file as the std::variant typedef. So 
> why is PrimitiveType fontified while the others aren't?

Because of the order various jit-lock chunks are fontified.  If the
chunk which establishes foo as a type is fontified first, subsequent
fontifications of foo will use font-lock-type-face.  Otherwise, not.

> FWIW, fontification is correct and consistent when I set 
> font-lock-support-mode to nil, so this really does look like another 
> case of getting unlucky with jit-lock block divisions.

Maybe an improvement might come from scanning the buffer for occurrences
of foo after foo has been recognised as a type and entered into the CC
Mode table.  That way, the lack of fontification on foo would be
temporary, at least provided your Emacs is configured to fontify
non-displayed bits of the buffer in the background (which it is by
default).

This might need enhanced support from jit-lock, such as some sort of
signal indicating a buffer has been completly fontified.  I haven't
thought this through, yet.

> Yes, I'm sure that this particular problem is caused by some bug, and
> with the right repro, we can quickly isolate and fix it. But this kind
> of seemingly-inexplicable inconsistent highlighting has been happening
> for years and years now.  There's something fundamental about the way
> cc-mode is written that makes bugs like this keep popping up. Is there
> some internal abstraction we can add, some algorithmic test suite we
> can write, that would make this whole class of bug less likely?

Well, "seemingly-inexplicable inconsistent highlighting" isn't much to
go on.  If this means "problems with types not getting fontified", then
see above.  Otherwise, particulars help.  It may well be that the ad-hoc
parsing method which CC Mode uses is no longer appropriate for the
modern languages it supports; that's what a lot of this thread has been
discussing.  By "internal abstraction" I think you might mean getting
information from a compiler, or building a partial compiler into CC
Mode.  This is surely possible in theory, but in practice?

[ .... ]

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-05 20:25   ` Dmitry Gutov
@ 2021-06-06 11:53     ` Alan Mackenzie
  2021-06-06 17:08       ` Dmitry Gutov
  0 siblings, 1 reply; 274+ messages in thread
From: Alan Mackenzie @ 2021-06-06 11:53 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Daniel Colascione, emacs-devel

Hello, Dmitry.

On Sat, Jun 05, 2021 at 23:25:41 +0300, Dmitry Gutov wrote:
> On 04.06.2021 18:54, Alan Mackenzie wrote:
> > Whether a type is recognised as such depends on that, yes.  It's hard to
> > think of a better way without having the resources of a compiler,
> > particularly for ill-behaved languages like C+

> Would it work much worse if you took the approach of not applying the 
> highlighting when you frequently cannot be sure of what the type of the 
> term is?

Cases of "not being sure" are common indeed.  The whole of CC Mode is
based on heuristics.

> That would mean none of the types in brackets would be highlighted in 
> the original example, but perhaps that is still better than the current 
> result?

That would mean adding complicated decision functions for "not being
sure".  If the fontification of types where they are used (as opposed to
being declared) were to become less common, people would notice and
complain too.

There's the idea I proposed in my post to Daniel C of today - when a
type is newly recognised, then go through the buffer fontifying
occurrences of it.  That would probably help a lot, possibly at the cost
of slowing the mode down a bit.

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-06 11:37     ` Alan Mackenzie
@ 2021-06-06 11:57       ` Eli Zaretskii
  2021-06-06 12:27         ` Alan Mackenzie
  2021-06-06 17:44       ` Stefan Monnier
  1 sibling, 1 reply; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-06 11:57 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: dancol, emacs-devel

> Date: Sun, 6 Jun 2021 11:37:47 +0000
> From: Alan Mackenzie <acm@muc.de>
> Cc: emacs-devel@gnu.org
> 
> > FWIW, fontification is correct and consistent when I set 
> > font-lock-support-mode to nil, so this really does look like another 
> > case of getting unlucky with jit-lock block divisions.
> 
> Maybe an improvement might come from scanning the buffer for occurrences
> of foo after foo has been recognised as a type and entered into the CC
> Mode table.  That way, the lack of fontification on foo would be
> temporary, at least provided your Emacs is configured to fontify
> non-displayed bits of the buffer in the background (which it is by
> default).
> 
> This might need enhanced support from jit-lock, such as some sort of
> signal indicating a buffer has been completly fontified.  I haven't
> thought this through, yet.

AFAIR, the way to tell JIT font-lock that a chunk of text was already
fontified is to set the 'fontified' property on that text.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-06 11:57       ` Eli Zaretskii
@ 2021-06-06 12:27         ` Alan Mackenzie
  2021-06-06 12:44           ` Eli Zaretskii
  0 siblings, 1 reply; 274+ messages in thread
From: Alan Mackenzie @ 2021-06-06 12:27 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dancol, emacs-devel

Hello, Eli.

On Sun, Jun 06, 2021 at 14:57:35 +0300, Eli Zaretskii wrote:
> > Date: Sun, 6 Jun 2021 11:37:47 +0000
> > From: Alan Mackenzie <acm@muc.de>
> > Cc: emacs-devel@gnu.org

> > > FWIW, fontification is correct and consistent when I set 
> > > font-lock-support-mode to nil, so this really does look like another 
> > > case of getting unlucky with jit-lock block divisions.

> > Maybe an improvement might come from scanning the buffer for occurrences
> > of foo after foo has been recognised as a type and entered into the CC
> > Mode table.  That way, the lack of fontification on foo would be
> > temporary, at least provided your Emacs is configured to fontify
> > non-displayed bits of the buffer in the background (which it is by
> > default).

> > This might need enhanced support from jit-lock, such as some sort of
> > signal indicating a buffer has been completly fontified.  I haven't
> > thought this through, yet.

> AFAIR, the way to tell JIT font-lock that a chunk of text was already
> fontified is to set the 'fontified' property on that text.

Sorry, I was unclear.  I was thinking of a signal from jit-lock to the
major mode, indicating that background fontification had been completed.
CC Mode could react to this by fontifying all occurrences in the buffer
of "newly found" types.  Or something like that.

Or maybe the fontification could be done immediately after parsing a new
type.  This might be a bit sluggish, but it might be OK.

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-06 12:27         ` Alan Mackenzie
@ 2021-06-06 12:44           ` Eli Zaretskii
  2021-06-06 14:19             ` Alan Mackenzie
  0 siblings, 1 reply; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-06 12:44 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: dancol, emacs-devel

> Date: Sun, 6 Jun 2021 12:27:05 +0000
> Cc: dancol@dancol.org, emacs-devel@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> > AFAIR, the way to tell JIT font-lock that a chunk of text was already
> > fontified is to set the 'fontified' property on that text.
> 
> Sorry, I was unclear.  I was thinking of a signal from jit-lock to the
> major mode, indicating that background fontification had been completed.
> CC Mode could react to this by fontifying all occurrences in the buffer
> of "newly found" types.  Or something like that.

Sorry, I don't understand (probably because I missed the beginning of
this discussion): what do you mean by "background fontification", and
what does it mean for that to have been "completed"?  I'm afraid we
are not on the same page wrt JIT font-lock related terminology.

> Or maybe the fontification could be done immediately after parsing a new
> type.

Parsing by whom? by CC Mode?  If so, CC Mode parsing is itself part of
fontification, AFAIU, and is invoked by the JIT font-lock machinery.
So I'm confused wrt what you are looking for.

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-06 12:44           ` Eli Zaretskii
@ 2021-06-06 14:19             ` Alan Mackenzie
  2021-06-06 17:06               ` Eli Zaretskii
  0 siblings, 1 reply; 274+ messages in thread
From: Alan Mackenzie @ 2021-06-06 14:19 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dancol, emacs-devel

Hello, Eli.

On Sun, Jun 06, 2021 at 15:44:38 +0300, Eli Zaretskii wrote:
> > Date: Sun, 6 Jun 2021 12:27:05 +0000
> > Cc: dancol@dancol.org, emacs-devel@gnu.org
> > From: Alan Mackenzie <acm@muc.de>

> > > AFAIR, the way to tell JIT font-lock that a chunk of text was already
> > > fontified is to set the 'fontified' property on that text.

> > Sorry, I was unclear.  I was thinking of a signal from jit-lock to the
> > major mode, indicating that background fontification had been completed.
> > CC Mode could react to this by fontifying all occurrences in the buffer
> > of "newly found" types.  Or something like that.

> Sorry, I don't understand (probably because I missed the beginning of
> this discussion): what do you mean by "background fontification", and
> what does it mean for that to have been "completed"?  I'm afraid we
> are not on the same page wrt JIT font-lock related terminology.

CC Mode maintains a simple table of a buffer's types, which it uses to
fontify the same types when they occur again in the buffer.  Daniel's
main problem was that with JIT fontification, the occurrences of foo get
"fontified" to default face before foo has been entered into the table.
This happens because jit-lock doesn't scan the buffer from (point-min).

By "background fontification" I meant stealth fontification (and should
have said so).  This is, sadly, disabled by default.  If it were to be
enabled again, I was envisaging some sort of signal from jit-lock stealth
fontification when the stealth had determined a buffer was completely
fontified.  Reacting to this signal, CC Mode could then fontify all the
types which the stealth had caused to be added to the CC Mode table.

I no longer think this is a good idea.

> > Or maybe the fontification could be done immediately after parsing a new
> > type.

> Parsing by whom? by CC Mode?

Yes.  By CC Mode's fontification detecting a symbol, foo, must be a type,
and entering it into its internal table.  I am thinking that immediately
following, CC Mode could scan the entire buffer and refontify occurrences
of foo which hadn't yet got font-lock-type-face.

> If so, CC Mode parsing is itself part of fontification, AFAIU, and is
> invoked by the JIT font-lock machinery.  So I'm confused wrt what you
> are looking for.

I was looking for jit stealth locking to detect when it had completely
fontified a buffer (i.e. the `fontified' property was on the entire
buffer) and do something like calling a major mode function.  As I said,
I don't think this is a good idea, any more.

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-06 14:19             ` Alan Mackenzie
@ 2021-06-06 17:06               ` Eli Zaretskii
  0 siblings, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-06 17:06 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: dancol, emacs-devel

> Date: Sun, 6 Jun 2021 14:19:00 +0000
> From: Alan Mackenzie <acm@muc.de>
> Cc: dancol@dancol.org, emacs-devel@gnu.org
> 
> By "background fontification" I meant stealth fontification (and should
> have said so).  This is, sadly, disabled by default.  If it were to be
> enabled again, I was envisaging some sort of signal from jit-lock stealth
> fontification when the stealth had determined a buffer was completely
> fontified.

When a buffer has been completely fontified by jit-lock stealth
fontification, that buffer no longer appears in
jit-lock-stealth-buffers.  Is that good enough?

But yes, since stealth fontifications are disabled by default, this
isn't the way to make CC mode fontifications more accurate.

> Yes.  By CC Mode's fontification detecting a symbol, foo, must be a type,
> and entering it into its internal table.  I am thinking that immediately
> following, CC Mode could scan the entire buffer and refontify occurrences
> of foo which hadn't yet got font-lock-type-face.

You could still do that, but it could be costly, and slow down
redisplay, no?



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-06 11:53     ` Alan Mackenzie
@ 2021-06-06 17:08       ` Dmitry Gutov
  0 siblings, 0 replies; 274+ messages in thread
From: Dmitry Gutov @ 2021-06-06 17:08 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Daniel Colascione, emacs-devel

Hi Alan,

On 06.06.2021 14:53, Alan Mackenzie wrote:

>> Would it work much worse if you took the approach of not applying the
>> highlighting when you frequently cannot be sure of what the type of the
>> term is?
> 
> Cases of "not being sure" are common indeed.  The whole of CC Mode is
> based on heuristics.

I would differentiate between approaches like

   need to parse around the callsite/usage site [of identifier]

and

   need to parse the identifier's definition itself

and, as far as Emacs major modes go, only used the first approach, plus 
perhaps some predefined/customizable list of built-ins.

Because it's pretty much a given that in a big enough project a lot of 
functions/classes/etc will be defined in files that the user will never 
visit in the current session.

>> That would mean none of the types in brackets would be highlighted in
>> the original example, but perhaps that is still better than the current
>> result?
> 
> That would mean adding complicated decision functions for "not being
> sure".  If the fontification of types where they are used (as opposed to
> being declared) were to become less common, people would notice and
> complain too.

Some might be relieved, too, seeing more stability of what is highlighed 
and what is not (and when).

> There's the idea I proposed in my post to Daniel C of today - when a
> type is newly recognised, then go through the buffer fontifying
> occurrences of it.  That would probably help a lot, possibly at the cost
> of slowing the mode down a bit.

What about the types that are defined in files you never visited?



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-06 11:37     ` Alan Mackenzie
  2021-06-06 11:57       ` Eli Zaretskii
@ 2021-06-06 17:44       ` Stefan Monnier
  2021-06-06 18:00         ` Eli Zaretskii
  1 sibling, 1 reply; 274+ messages in thread
From: Stefan Monnier @ 2021-06-06 17:44 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Daniel Colascione, emacs-devel

> Because of the order various jit-lock chunks are fontified.  If the
> chunk which establishes foo as a type is fontified first, subsequent
> fontifications of foo will use font-lock-type-face.  Otherwise, not.

The way this is handled in other modes is to keep a highwater mark of
the buffer position up to which the text has been scanned for type
definitions and then in the font-lock-keywords you start by scanning the
text between this mark and the text that needs to be fontified (and
then moving the mark, of course).

Of course, this presumes that text later in the buffer can't affect
highlighting of earlier text (e.g. a type definition has to come before
its first use).  And it can have other downsides (e.g. if you already do
the scan for highlighting itself, it means you now have to do the scan
twice (once to collect and once to highlight), and it also means that if
the user jumps to the end of the buffer you'll have to scan the whole
buffer before you can start highlighting the last screenful of text).

> Maybe an improvement might come from scanning the buffer for occurrences
> of foo after foo has been recognised as a type and entered into the CC
> Mode table.  That way, the lack of fontification on foo would be
> temporary, at least provided your Emacs is configured to fontify
> non-displayed bits of the buffer in the background (which it is by
> default).

Not since:

    commit d0483d25c034c38a8c6f0d718e9780c50e6ba03a
    Author: David Kastrup <dak@gnu.org>
    Date:   Sun Mar 4 08:41:08 2007 +0000

        * NEWS (fontification): Mention that the new default for
        jit-lock-stealth-time is now nil.

        * jit-lock.el (jit-lock-stealth-time): Change default to nil.
        Preserve 16 as default value for "seconds" when customizing.

    diff --git a/lisp/jit-lock.el b/lisp/jit-lock.el
    --- a/lisp/jit-lock.el
    +++ b/lisp/jit-lock.el
    @@ -77,9 +77,9 @@
    -(defcustom jit-lock-stealth-time 16
    +(defcustom jit-lock-stealth-time nil
       "*Time in seconds to wait before beginning stealth fontification.
     Stealth fontification occurs if there is no input within this time.
     If nil, stealth fontification is never performed.

> This might need enhanced support from jit-lock, such as some sort of
> signal indicating a buffer has been completly fontified.

Indeed, there's no way currently for font-lock to tell jit-lock that it
has decided to fontify a particular chunk without being requested to do
so (Eli suggests setting the `fontified` property, but this means that
all the clients of jit-lock have done their work, so it's only correct
to set it from font-lock if you run the other `jit-lock-functions` (or
if there are currently no other `jit-lock-functions`)),

The closest related functionality is that a jit-lock function
(e.g. `font-lock-fontify-region`) can return a value of the form
(jit-lock-bounds BEG . END) to indicate the region it actually
fontified (which should cover the region they were asked to fontify).

        Stefan

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-06 17:44       ` Stefan Monnier
@ 2021-06-06 18:00         ` Eli Zaretskii
  2021-06-06 18:18           ` Stefan Monnier
  0 siblings, 1 reply; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-06 18:00 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: acm, dancol, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Daniel Colascione <dancol@dancol.org>,  emacs-devel@gnu.org
> Date: Sun, 06 Jun 2021 13:44:06 -0400
> 
> > Because of the order various jit-lock chunks are fontified.  If the
> > chunk which establishes foo as a type is fontified first, subsequent
> > fontifications of foo will use font-lock-type-face.  Otherwise, not.
> 
> The way this is handled in other modes is to keep a highwater mark of
> the buffer position up to which the text has been scanned for type
> definitions and then in the font-lock-keywords you start by scanning the
> text between this mark and the text that needs to be fontified (and
> then moving the mark, of course).

So if the first windowful of a file that's displayed is at EOB,
fontification must go all the way back to BOB and start scanning
there, until it comes to the end?

> Indeed, there's no way currently for font-lock to tell jit-lock that it
> has decided to fontify a particular chunk without being requested to do
> so (Eli suggests setting the `fontified` property, but this means that
> all the clients of jit-lock have done their work, so it's only correct
> to set it from font-lock if you run the other `jit-lock-functions` (or
> if there are currently no other `jit-lock-functions`)),

By "other clients" you mean those which don't fontify, but instead
piggy-back jit-lock to do other jobs?



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-06 18:00         ` Eli Zaretskii
@ 2021-06-06 18:18           ` Stefan Monnier
  2021-06-06 18:33             ` Daniel Colascione
  2021-06-06 19:03             ` Eli Zaretskii
  0 siblings, 2 replies; 274+ messages in thread
From: Stefan Monnier @ 2021-06-06 18:18 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: acm, dancol, emacs-devel

> So if the first windowful of a file that's displayed is at EOB,
> fontification must go all the way back to BOB and start scanning
> there, until it comes to the end?

Yup.  The way to make it bearable is to make that scan be as simple and
fast as possible.

Note that `syntax-propertize` and `syntax-ppss` also work this way, so
it's already the case that when we start by displaying EOB we first have
to apply `syntax-propertize` over the whole buffer :-(

In theory, there are various cases (which depend on the specific
programming language under consideration) where we could avoid such
a scan, but it would introduce a lot of complexity so we don't bother.

>> Indeed, there's no way currently for font-lock to tell jit-lock that it
>> has decided to fontify a particular chunk without being requested to do
>> so (Eli suggests setting the `fontified` property, but this means that
>> all the clients of jit-lock have done their work, so it's only correct
>> to set it from font-lock if you run the other `jit-lock-functions` (or
>> if there are currently no other `jit-lock-functions`)),
>
> By "other clients" you mean those which don't fontify, but instead
> piggy-back jit-lock to do other jobs?

`grep jit-lock-register` in Emacs' bundled files gives
bug-reference-mode, glasses-mode, and goto-address-mode as packages
which use jit-lock.


        Stefan




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-06 18:18           ` Stefan Monnier
@ 2021-06-06 18:33             ` Daniel Colascione
  2021-06-06 20:24               ` Stefan Monnier
  2021-06-06 19:03             ` Eli Zaretskii
  1 sibling, 1 reply; 274+ messages in thread
From: Daniel Colascione @ 2021-06-06 18:33 UTC (permalink / raw)
  To: Stefan Monnier, Eli Zaretskii; +Cc: acm, emacs-devel

On 6/6/21 11:18 AM, Stefan Monnier wrote:
>> So if the first windowful of a file that's displayed is at EOB,
>> fontification must go all the way back to BOB and start scanning
>> there, until it comes to the end?
> Yup.  The way to make it bearable is to make that scan be as simple and
> fast as possible.
>
> Note that `syntax-propertize` and `syntax-ppss` also work this way, so
> it's already the case that when we start by displaying EOB we first have
> to apply `syntax-propertize` over the whole buffer :-(
>
> In theory, there are various cases (which depend on the specific
> programming language under consideration) where we could avoid such
> a scan, but it would introduce a lot of complexity so we don't bother.

I've been thinking of a new core facility for helping modes implement 
this kind of incremental buffer analysis. Basically, it works like this: 
fontification logically proceeds from bob to eob in fixed-size chunks. 
After each chunk, we checkpoint the state of the fontification engine in 
a text property. Whenever we modify the buffer, we invalidate chunks 
that the modification might have affected and proceed from the last 
known-valid checkpoint.

It's more subtle than it sounds though.

First, we need to support lookahead. Fontification of region [A, B) 
might do lookahead and depend on text in region [B, C). If it does, a 
modification occurs somewhere between B and C, we need to invalidate the 
[A, B) chunk. If we put the fontification-by-chunking code in core, we 
can track (via core magic) a high-water-mark of accessed buffer position 
for fontification of each chunk. This way, invalidation becomes 
automatically correct.

Second, writing fontification as some kind of callback with explicit 
checkpoint and restore support is annoying, and nobody's going to do 
that. If it were possible to write fontification programs as coroutines, 
we would keep mode fontification routines simply and declarative and 
automatically do both the chunking and the checkpointing.

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-06 18:18           ` Stefan Monnier
  2021-06-06 18:33             ` Daniel Colascione
@ 2021-06-06 19:03             ` Eli Zaretskii
  2021-06-06 20:28               ` Stefan Monnier
  1 sibling, 1 reply; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-06 19:03 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: acm, dancol, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: acm@muc.de,  dancol@dancol.org,  emacs-devel@gnu.org
> Date: Sun, 06 Jun 2021 14:18:15 -0400
> 
> > So if the first windowful of a file that's displayed is at EOB,
> > fontification must go all the way back to BOB and start scanning
> > there, until it comes to the end?
> 
> Yup.  The way to make it bearable is to make that scan be as simple and
> fast as possible.
> 
> Note that `syntax-propertize` and `syntax-ppss` also work this way, so
> it's already the case that when we start by displaying EOB we first have
> to apply `syntax-propertize` over the whole buffer :-(

What exactly are the reasons that we need to scan from BOB?  With the
exception of data type declarations, what else requires to go back
farther that the beginning of the defun in which we start fontifying?



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-06 18:33             ` Daniel Colascione
@ 2021-06-06 20:24               ` Stefan Monnier
  2021-06-06 20:27                 ` Daniel Colascione
  0 siblings, 1 reply; 274+ messages in thread
From: Stefan Monnier @ 2021-06-06 20:24 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: Eli Zaretskii, acm, emacs-devel

> I've been thinking of a new core facility for helping modes implement this
> kind of incremental buffer analysis. Basically, it works like this:
> fontification logically proceeds from bob to eob in fixed-size chunks. After
> each chunk, we checkpoint the state of the fontification engine in a text
> property. Whenever we modify the buffer, we invalidate chunks that the
> modification might have affected and proceed from the last
> known-valid checkpoint.

[ I assume that what you mean by "fontification" is not literally
  placing faces (which is typically what font-lock does), but only
  a subset of that job (the subset that needs to proceed sequentially
  from BOB).  ]

You mean like what we do for `syntax-ppss` (except we keep the
checkpoint data in an alist indexed by positions, rather than in
text-properties)?

I think it would be fairly easy to add some way to keep extra data in
`syntax-ppss-wide/narrow`.

> It's more subtle than it sounds though.
>
> First, we need to support lookahead. Fontification of region [A, B) might do
> lookahead and depend on text in region [B, C).

For `syntax-propertize` we handle this via a `syntax-multiline` text
property, so that changes in the B region cause re-propertization of the
A region.

> Second, writing fontification as some kind of callback with explicit
> checkpoint and restore support is annoying, and nobody's going to do
> that. If it were possible to write fontification programs as coroutines, we
> would keep mode fontification routines simply and declarative and
> automatically do both the chunking and the checkpointing.

When I wrote the `syntax-ppss` code, I did expect to add facilities to
keep extra data in there, but so far the need has not really come up.

        Stefan

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-06 20:24               ` Stefan Monnier
@ 2021-06-06 20:27                 ` Daniel Colascione
  2021-06-06 20:38                   ` Stefan Monnier
  0 siblings, 1 reply; 274+ messages in thread
From: Daniel Colascione @ 2021-06-06 20:27 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: acm, Eli Zaretskii, emacs-devel

On 6/6/21 1:24 PM, Stefan Monnier wrote:
>> I've been thinking of a new core facility for helping modes implement this
>> kind of incremental buffer analysis. Basically, it works like this:
>> fontification logically proceeds from bob to eob in fixed-size chunks. After
>> each chunk, we checkpoint the state of the fontification engine in a text
>> property. Whenever we modify the buffer, we invalidate chunks that the
>> modification might have affected and proceed from the last
>> known-valid checkpoint.
> [ I assume that what you mean by "fontification" is not literally
>    placing faces (which is typically what font-lock does), but only
>    a subset of that job (the subset that needs to proceed sequentially
>    from BOB).  ]
>
> You mean like what we do for `syntax-ppss` (except we keep the
> checkpoint data in an alist indexed by positions, rather than in
> text-properties)?
Yes, but generic.
>
> I think it would be fairly easy to add some way to keep extra data in
> `syntax-ppss-wide/narrow`.
>
>> It's more subtle than it sounds though.
>>
>> First, we need to support lookahead. Fontification of region [A, B) might do
>> lookahead and depend on text in region [B, C).
> For `syntax-propertize` we handle this via a `syntax-multiline` text
> property, so that changes in the B region cause re-propertization of the
> A region.

Manually placing syntax-multiline is annoying and error-prone. Can't we 
instead keep track of what buffer positions were actually inspected?




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-06 19:03             ` Eli Zaretskii
@ 2021-06-06 20:28               ` Stefan Monnier
  2021-06-07  7:35                 ` martin rudalics
  2021-06-07 12:08                 ` Eli Zaretskii
  0 siblings, 2 replies; 274+ messages in thread
From: Stefan Monnier @ 2021-06-06 20:28 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: acm, dancol, emacs-devel

> What exactly are the reasons that we need to scan from BOB?  With the
> exception of data type declarations, what else requires to go back
> farther that the beginning of the defun in which we start fontifying?

It all depends on the language.

E.g. in ELisp, what looks like a defun might actually be in the middle
of a string and there's no reliable way to know if something's in
a string other than to parse from BOB.
In C the situation is somewhat similar but for comments.


        Stefan




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-06 20:27                 ` Daniel Colascione
@ 2021-06-06 20:38                   ` Stefan Monnier
  0 siblings, 0 replies; 274+ messages in thread
From: Stefan Monnier @ 2021-06-06 20:38 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: Eli Zaretskii, acm, emacs-devel

> Manually placing syntax-multiline is annoying and error-prone.
> Can't we instead keep track of what buffer positions were actually inspected?

Depends how the inspection is done, but of course that could be done.
Note that in the current uses of `syntax-propertize`, it's rather
unusual to need `syntax-multiline` (and it's fairly easy to add it in
most cases).  So while I agree with "annoying and error-prone" the
motivation to come up with some automatic way to do it has been
rather low.

In any case, I think this is a very secondary issue compared to the
issue of deciding what it is you want to do in that
"fontification" scan (and then how you want to do it, etc...).

If you want to do something fancier than `parse-partial-sexp`, then that
probably means inventing a new parsing engine, along with
corresponding grammars.  If so, using tree-sitter as that parsing engine
is probably one of the most attractive options since it lets us reuse
existing grammars.

        Stefan

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-06 20:28               ` Stefan Monnier
@ 2021-06-07  7:35                 ` martin rudalics
  2021-06-07 13:20                   ` Stefan Monnier
  2021-06-07 12:08                 ` Eli Zaretskii
  1 sibling, 1 reply; 274+ messages in thread
From: martin rudalics @ 2021-06-07  7:35 UTC (permalink / raw)
  To: Stefan Monnier, Eli Zaretskii; +Cc: acm, dancol, emacs-devel

 >> What exactly are the reasons that we need to scan from BOB?  With the
 >> exception of data type declarations, what else requires to go back
 >> farther that the beginning of the defun in which we start fontifying?
 >
 > It all depends on the language.
 >
 > E.g. in ELisp, what looks like a defun might actually be in the middle
 > of a string and there's no reliable way to know if something's in
 > a string other than to parse from BOB.

Unless `open-paren-in-column-0-is-defun-start' is non-nil.

martin



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-06 20:28               ` Stefan Monnier
  2021-06-07  7:35                 ` martin rudalics
@ 2021-06-07 12:08                 ` Eli Zaretskii
  2021-06-08 15:22                   ` Stefan Monnier
  1 sibling, 1 reply; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-07 12:08 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: acm, dancol, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: acm@muc.de,  dancol@dancol.org,  emacs-devel@gnu.org
> Date: Sun, 06 Jun 2021 16:28:02 -0400
> 
> > What exactly are the reasons that we need to scan from BOB?  With the
> > exception of data type declarations, what else requires to go back
> > farther that the beginning of the defun in which we start fontifying?
> 
> It all depends on the language.
> 
> E.g. in ELisp, what looks like a defun might actually be in the middle
> of a string and there's no reliable way to know if something's in
> a string other than to parse from BOB.
> In C the situation is somewhat similar but for comments.

So you are saying we need that just to know where the current defun
begins?  Any other needs to start from BOB?



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-07  7:35                 ` martin rudalics
@ 2021-06-07 13:20                   ` Stefan Monnier
  2021-06-07 13:37                     ` Eli Zaretskii
                                       ` (2 more replies)
  0 siblings, 3 replies; 274+ messages in thread
From: Stefan Monnier @ 2021-06-07 13:20 UTC (permalink / raw)
  To: martin rudalics; +Cc: Eli Zaretskii, acm, dancol, emacs-devel

>>> What exactly are the reasons that we need to scan from BOB?  With the
>>> exception of data type declarations, what else requires to go back
>>> farther that the beginning of the defun in which we start fontifying?
>> It all depends on the language.
>> E.g. in ELisp, what looks like a defun might actually be in the middle
>> of a string and there's no reliable way to know if something's in
>> a string other than to parse from BOB.
> Unless `open-paren-in-column-0-is-defun-start' is non-nil.

We can use hacks like this one, indeed, but it's not in fashion
nowadays.


        Stefan




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-07 13:20                   ` Stefan Monnier
@ 2021-06-07 13:37                     ` Eli Zaretskii
  2021-06-08  0:06                       ` Daniel Colascione
  2021-06-08 15:16                       ` Stefan Monnier
  2021-06-07 15:58                     ` martin rudalics
  2021-06-08  4:01                     ` Richard Stallman
  2 siblings, 2 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-07 13:37 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: rudalics, dancol, emacs-devel, acm

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Eli Zaretskii <eliz@gnu.org>,  acm@muc.de,  dancol@dancol.org,
>   emacs-devel@gnu.org
> Date: Mon, 07 Jun 2021 09:20:22 -0400
> 
> > Unless `open-paren-in-column-0-is-defun-start' is non-nil.
> 
> We can use hacks like this one, indeed, but it's not in fashion
> nowadays.

Yes, we prefer waiting forever for Emacs to respond to a TAB or RET,
and are okay with "random" fontification which triggered this thread.
The price of fashion, I guess.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-07 13:20                   ` Stefan Monnier
  2021-06-07 13:37                     ` Eli Zaretskii
@ 2021-06-07 15:58                     ` martin rudalics
  2021-06-08  4:01                     ` Richard Stallman
  2 siblings, 0 replies; 274+ messages in thread
From: martin rudalics @ 2021-06-07 15:58 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: acm, Eli Zaretskii, dancol, emacs-devel

 > We can use hacks like this one, indeed, but it's not in fashion
 > nowadays.

So we joined the Carnabetian army.

martin



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-07 13:37                     ` Eli Zaretskii
@ 2021-06-08  0:06                       ` Daniel Colascione
  2021-06-08 15:16                       ` Stefan Monnier
  1 sibling, 0 replies; 274+ messages in thread
From: Daniel Colascione @ 2021-06-08  0:06 UTC (permalink / raw)
  To: Eli Zaretskii, Stefan Monnier; +Cc: rudalics, emacs-devel, acm

On 6/7/21 6:37 AM, Eli Zaretskii wrote:

>> From: Stefan Monnier <monnier@iro.umontreal.ca>
>> Cc: Eli Zaretskii <eliz@gnu.org>,  acm@muc.de,  dancol@dancol.org,
>>    emacs-devel@gnu.org
>> Date: Mon, 07 Jun 2021 09:20:22 -0400
>>
>>> Unless `open-paren-in-column-0-is-defun-start' is non-nil.
>> We can use hacks like this one, indeed, but it's not in fashion
>> nowadays.
> Yes, we prefer waiting forever for Emacs to respond to a TAB or RET,
> and are okay with "random" fontification which triggered this thread.
> The price of fashion, I guess.

If a modern machine you're waiting "forever" to syntactically scan the 
buffer from BOB, something is very wrong. There's just no reason to use 
hacks like open-paren-in-column-0-is-defun-start, especially if we can 
checkpoint parsing.




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-07 13:20                   ` Stefan Monnier
  2021-06-07 13:37                     ` Eli Zaretskii
  2021-06-07 15:58                     ` martin rudalics
@ 2021-06-08  4:01                     ` Richard Stallman
  2021-06-08 15:29                       ` Stefan Monnier
  2 siblings, 1 reply; 274+ messages in thread
From: Richard Stallman @ 2021-06-08  4:01 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: rudalics, eliz, dancol, emacs-devel, acm

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > > Unless `open-paren-in-column-0-is-defun-start' is non-nil.

  > We can use hacks like this one, indeed, but it's not in fashion
  > nowadays.

We have to choose between imperfect options.  We can't afford to
let fashion dictate our choice.

-- 
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)





^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-07 13:37                     ` Eli Zaretskii
  2021-06-08  0:06                       ` Daniel Colascione
@ 2021-06-08 15:16                       ` Stefan Monnier
  1 sibling, 0 replies; 274+ messages in thread
From: Stefan Monnier @ 2021-06-08 15:16 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rudalics, acm, dancol, emacs-devel

>> > Unless `open-paren-in-column-0-is-defun-start' is non-nil.
>> We can use hacks like this one, indeed, but it's not in fashion
>> nowadays.
> Yes, we prefer waiting forever for Emacs to respond to a TAB or RET,

I'd be interested to hear about the cases where you think the time to
reply to RET or TAB would be sped up by
`open-paren-in-column-0-is-defun-start`.

The case I know of where it would make a significant difference in
practice are things like:
- open a large file in a mode that uses a heavy
  `syntax-propertize-function`, such as perl-mode, and jump to the end.
- turn off font-lock-mode, do the same as above (which should be
  quick this time around), and then hit TAB (at which point you should
  see the same delay as you saw above).

So, yes, there is a performance price to pay, but in return you get
simpler ELisp code (because you don't need to implement the hacks), and
a more reliable behavior.

> and are okay with "random" fontification which triggered this thread.
> The price of fashion, I guess.

I think you're confused:
- the "random" fontification in this thread is in CC-mode, which does
  not use the approach I described (and used in syntax-propertize).
  E.g. Alan mentioned that the problematic behavior of CC-mode's
  highlighting can depend on the order in which the chunks are
  fontified, and `syntax-propertize` specifically aims to avoid
  such order-dependency [ And please don't get me wrong:
  an approach like that of `syntax-propertize` wouldn't solve the
  problematic fontification, but it would (mis)fonftify the same way
  every time.  ]
- it's with `open-paren-in-column-0-is-defun-start` that we had
  occasional/random misfontification, and it's indeed to get rid of
  those that we finally changed its default value.

The performance cost is real, but AFAIK this cost gives *less random*
behavior contrary to what you state.  The whole point of the design of
`syntax-propertize` is to try and make it eas(y|ier) to get
correct&reliable behavior (at the cost of sometimes sub-optimal
performance).

AFAIK one of the reasons why Alan doesn't want to use an approach like
that of syntax-propertize in CC-mode is because his guts tell him that
it would be too inefficient for C++.


        Stefan




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-07 12:08                 ` Eli Zaretskii
@ 2021-06-08 15:22                   ` Stefan Monnier
  2021-06-08 15:46                     ` Eli Zaretskii
  0 siblings, 1 reply; 274+ messages in thread
From: Stefan Monnier @ 2021-06-08 15:22 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: acm, dancol, emacs-devel

>> E.g. in ELisp, what looks like a defun might actually be in the middle
>> of a string and there's no reliable way to know if something's in
>> a string other than to parse from BOB.
>> In C the situation is somewhat similar but for comments.
>
> So you are saying we need that just to know where the current defun
> begins?

Not really: the dependency goes the other way around.

The real question is "given a POS determine whether it is inside
a string or a comment or neither", which we need in all kinds of
circumstances (sometimes we need a bit more info than that, of course,
but this one is the killer).

Approaches like `open-paren-in-column-0-is-defun-start` try to answer
this question without parsing from BOB by making an assumption that if
something looks like a defun, then it is neither inside a string nor
a comment.

        Stefan

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-08  4:01                     ` Richard Stallman
@ 2021-06-08 15:29                       ` Stefan Monnier
  2021-06-08 15:52                         ` Eli Zaretskii
                                           ` (2 more replies)
  0 siblings, 3 replies; 274+ messages in thread
From: Stefan Monnier @ 2021-06-08 15:29 UTC (permalink / raw)
  To: Richard Stallman; +Cc: rudalics, eliz, acm, dancol, emacs-devel

>   > We can use hacks like this one, indeed, but it's not in fashion
>   > nowadays.
> We have to choose between imperfect options.  We can't afford to
> let fashion dictate our choice.

Oh boy, I see my use of the term "fashion" has really tipped
people's sensitivities.

All I meant is that given the increase of performance of CPUs (until the
beginning of this century) and a non-corresponding increase in file size
and complexity of language syntax, programmers nowadays prefer correct
behavior over fast behavior, since the correct behavior is fast enough
anyway to be bearable.

Given the lack of improvement in CPU performance over the last decade,
this may well change again, of course, but so far I haven't seen people
shy away from Python and IDEs, so I expect this won't happen in the
near future.

        Stefan

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-08 15:22                   ` Stefan Monnier
@ 2021-06-08 15:46                     ` Eli Zaretskii
  0 siblings, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-08 15:46 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: acm, dancol, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: acm@muc.de,  dancol@dancol.org,  emacs-devel@gnu.org
> Date: Tue, 08 Jun 2021 11:22:21 -0400
> 
> >> E.g. in ELisp, what looks like a defun might actually be in the middle
> >> of a string and there's no reliable way to know if something's in
> >> a string other than to parse from BOB.
> >> In C the situation is somewhat similar but for comments.
> >
> > So you are saying we need that just to know where the current defun
> > begins?
> 
> Not really: the dependency goes the other way around.
> 
> The real question is "given a POS determine whether it is inside
> a string or a comment or neither", which we need in all kinds of
> circumstances (sometimes we need a bit more info than that, of course,
> but this one is the killer).
> 
> Approaches like `open-paren-in-column-0-is-defun-start` try to answer
> this question without parsing from BOB by making an assumption that if
> something looks like a defun, then it is neither inside a string nor
> a comment.

Then I guess you are not describing what CC Mode does, do you.  Which
is the subject of this discussion, AFAIU.  Doesn't CC Mode go to BOB
_a_lot_, and not just to determine whether we are inside a string or a
comment?



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-08 15:29                       ` Stefan Monnier
@ 2021-06-08 15:52                         ` Eli Zaretskii
  2021-06-08 16:36                           ` Stefan Monnier
  2021-06-09  3:39                         ` Richard Stallman
  2021-06-09  8:34                         ` martin rudalics
  2 siblings, 1 reply; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-08 15:52 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: rudalics, dancol, emacs-devel, rms, acm

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: rudalics@gmx.at,  eliz@gnu.org,  acm@muc.de,  dancol@dancol.org,
>   emacs-devel@gnu.org
> Date: Tue, 08 Jun 2021 11:29:07 -0400
> 
> All I meant is that given the increase of performance of CPUs (until the
> beginning of this century) and a non-corresponding increase in file size
> and complexity of language syntax, programmers nowadays prefer correct
> behavior over fast behavior, since the correct behavior is fast enough
> anyway to be bearable.

Not in CC Mode, not IMO anyway.  But perhaps you don't consider what
CC Mode does to be "correct behavior".

And then, of course, there's a question "what is correct"?  When I see
something like

   static foo_t __attribute__((bar)) myvar;

I'm not sure I'd care if everything before "myvar" would be in the
same face and "myvar" in another face.  IOW, it isn't necessarily
important to me that fontification knows that foo_t is a type and not
a keyword.  So searching the file (and perhaps other files) for the
definition of foo_t isn't important -- for the purposes of
fontification.

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-08 15:52                         ` Eli Zaretskii
@ 2021-06-08 16:36                           ` Stefan Monnier
  2021-06-08 18:11                             ` Daniel Colascione
  2021-06-08 18:11                             ` cc-mode fontification feels random Eli Zaretskii
  0 siblings, 2 replies; 274+ messages in thread
From: Stefan Monnier @ 2021-06-08 16:36 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rms, rudalics, acm, dancol, emacs-devel

>> All I meant is that given the increase of performance of CPUs (until the
>> beginning of this century) and a non-corresponding increase in file size
>> and complexity of language syntax, programmers nowadays prefer correct
>> behavior over fast behavior, since the correct behavior is fast enough
>> anyway to be bearable.
> Not in CC Mode, not IMO anyway.  But perhaps you don't consider what
> CC Mode does to be "correct behavior".

My comment was about using hacks like
`open-paren-in-column-0-is-defun-start` to avoid scanning from BOB in
`syntax-ppss/propertize`.

> And then, of course, there's a question "what is correct"?  When I see
> something like
>
>    static foo_t __attribute__((bar)) myvar;
>
> I'm not sure I'd care if everything before "myvar" would be in the
> same face and "myvar" in another face.  IOW, it isn't necessarily
> important to me that fontification knows that foo_t is a type and not
> a keyword.  So searching the file (and perhaps other files) for the
> definition of foo_t isn't important -- for the purposes of
> fontification.

FWIW, my `font-lock-type-face` is customized to:

    '(font-lock-type-face ((t)))

so I'll let you guess my opinion on this ;-)


        Stefan




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-08 16:36                           ` Stefan Monnier
@ 2021-06-08 18:11                             ` Daniel Colascione
  2021-06-08 18:25                               ` Eli Zaretskii
  2021-06-08 18:11                             ` cc-mode fontification feels random Eli Zaretskii
  1 sibling, 1 reply; 274+ messages in thread
From: Daniel Colascione @ 2021-06-08 18:11 UTC (permalink / raw)
  To: Stefan Monnier, Eli Zaretskii; +Cc: rudalics, emacs-devel, rms, acm

On 6/8/21 9:36 AM, Stefan Monnier wrote:

>>> All I meant is that given the increase of performance of CPUs (until the
>>> beginning of this century) and a non-corresponding increase in file size
>>> and complexity of language syntax, programmers nowadays prefer correct
>>> behavior over fast behavior, since the correct behavior is fast enough
>>> anyway to be bearable.
>> Not in CC Mode, not IMO anyway.  But perhaps you don't consider what
>> CC Mode does to be "correct behavior".
> My comment was about using hacks like
> `open-paren-in-column-0-is-defun-start` to avoid scanning from BOB in
> `syntax-ppss/propertize`.
>
>> And then, of course, there's a question "what is correct"?  When I see
>> something like
>>
>>     static foo_t __attribute__((bar)) myvar;
>>
>> I'm not sure I'd care if everything before "myvar" would be in the
>> same face and "myvar" in another face.  IOW, it isn't necessarily
>> important to me that fontification knows that foo_t is a type and not
>> a keyword.  So searching the file (and perhaps other files) for the
>> definition of foo_t isn't important -- for the purposes of
>> fontification.
> FWIW, my `font-lock-type-face` is customized to:
>
>      '(font-lock-type-face ((t)))
>
> so I'll let you guess my opinion on this ;-)

The whole point of fontification is to provide visual hints about the 
semantic structure of source code. If cc-mode can't do that reliably, my 
preference would be for it to not do it at all. Fontification of a 
type-using expression shouldn't change if I move the definition of that 
type from one file to another.

IMHO, we should rely on LSP to figure out what symbols are types, and if 
a LSP isn't available, we shouldn't try to guess.







^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-08 16:36                           ` Stefan Monnier
  2021-06-08 18:11                             ` Daniel Colascione
@ 2021-06-08 18:11                             ` Eli Zaretskii
  2021-06-08 21:25                               ` Stefan Monnier
  1 sibling, 1 reply; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-08 18:11 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: rudalics, dancol, emacs-devel, rms, acm

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: rms@gnu.org,  rudalics@gmx.at,  acm@muc.de,  dancol@dancol.org,
>   emacs-devel@gnu.org
> Date: Tue, 08 Jun 2021 12:36:40 -0400
> 
> >    static foo_t __attribute__((bar)) myvar;
> >
> > I'm not sure I'd care if everything before "myvar" would be in the
> > same face and "myvar" in another face.  IOW, it isn't necessarily
> > important to me that fontification knows that foo_t is a type and not
> > a keyword.  So searching the file (and perhaps other files) for the
> > definition of foo_t isn't important -- for the purposes of
> > fontification.
> 
> FWIW, my `font-lock-type-face` is customized to:
> 
>     '(font-lock-type-face ((t)))

Does that make CC Mode bypass those scans from BOB?



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-08 18:11                             ` Daniel Colascione
@ 2021-06-08 18:25                               ` Eli Zaretskii
  2021-06-08 18:28                                 ` Daniel Colascione
  2021-06-09 18:22                                 ` Alan Mackenzie
  0 siblings, 2 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-08 18:25 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: rudalics, acm, monnier, rms, emacs-devel

> From: Daniel Colascione <dancol@dancol.org>
> Date: Tue, 8 Jun 2021 11:11:21 -0700
> Cc: rudalics@gmx.at, emacs-devel@gnu.org, rms@gnu.org, acm@muc.de
> 
> The whole point of fontification is to provide visual hints about the 
> semantic structure of source code. If cc-mode can't do that reliably, my 
> preference would be for it to not do it at all. Fontification of a 
> type-using expression shouldn't change if I move the definition of that 
> type from one file to another.

I think we agree.  Except that for me, it should also not try if it
cannot do it quickly enough, not only reliably enough.

> IMHO, we should rely on LSP to figure out what symbols are types, and if 
> a LSP isn't available, we shouldn't try to guess.

I was talking about what to do (or not to do) with our existing
regexp- and "syntax"-based fontifications.  I still remember the days
when CC Mode handled that well enough without being a snail it
frequently is now, and that was on a machine about 10 times slower
than the one I use nowadays.  The C language didn't change too much
since then, at least not the flavor I frequently edit.

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-08 18:25                               ` Eli Zaretskii
@ 2021-06-08 18:28                                 ` Daniel Colascione
  2021-06-08 18:54                                   ` Eli Zaretskii
  2021-06-09 18:22                                 ` Alan Mackenzie
  1 sibling, 1 reply; 274+ messages in thread
From: Daniel Colascione @ 2021-06-08 18:28 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rudalics, acm, monnier, rms, emacs-devel

On 6/8/21 11:25 AM, Eli Zaretskii wrote:

>> From: Daniel Colascione <dancol@dancol.org>
>> Date: Tue, 8 Jun 2021 11:11:21 -0700
>> Cc: rudalics@gmx.at, emacs-devel@gnu.org, rms@gnu.org, acm@muc.de
>>
>> The whole point of fontification is to provide visual hints about the
>> semantic structure of source code. If cc-mode can't do that reliably, my
>> preference would be for it to not do it at all. Fontification of a
>> type-using expression shouldn't change if I move the definition of that
>> type from one file to another.
> I think we agree.  Except that for me, it should also not try if it
> cannot do it quickly enough, not only reliably enough.
>
>> IMHO, we should rely on LSP to figure out what symbols are types, and if
>> a LSP isn't available, we shouldn't try to guess.
> I was talking about what to do (or not to do) with our existing
> regexp- and "syntax"-based fontifications.  I still remember the days
> when CC Mode handled that well enough without being a snail it
> frequently is now, and that was on a machine about 10 times slower
> than the one I use nowadays.  The C language didn't change too much
> since then, at least not the flavor I frequently edit.

C++ is a much more complex language and a lot more relevant for modern 
software development.




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-08 18:28                                 ` Daniel Colascione
@ 2021-06-08 18:54                                   ` Eli Zaretskii
  0 siblings, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-08 18:54 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: rudalics, acm, monnier, rms, emacs-devel

> Cc: monnier@iro.umontreal.ca, rudalics@gmx.at, emacs-devel@gnu.org,
>  rms@gnu.org, acm@muc.de
> From: Daniel Colascione <dancol@dancol.org>
> Date: Tue, 8 Jun 2021 11:28:41 -0700
> 
> > I was talking about what to do (or not to do) with our existing
> > regexp- and "syntax"-based fontifications.  I still remember the days
> > when CC Mode handled that well enough without being a snail it
> > frequently is now, and that was on a machine about 10 times slower
> > than the one I use nowadays.  The C language didn't change too much
> > since then, at least not the flavor I frequently edit.
> 
> C++ is a much more complex language and a lot more relevant for modern 
> software development.

Sure, but that doesn't justify the slowdown in C editing I experience
over the years.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-08 18:11                             ` cc-mode fontification feels random Eli Zaretskii
@ 2021-06-08 21:25                               ` Stefan Monnier
  0 siblings, 0 replies; 274+ messages in thread
From: Stefan Monnier @ 2021-06-08 21:25 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rms, rudalics, acm, dancol, emacs-devel

>> FWIW, my `font-lock-type-face` is customized to:
>> 
>>     '(font-lock-type-face ((t)))
>
> Does that make CC Mode bypass those scans from BOB?

It doesn't affect CC-mode, of course.  But I don't known which scans
you're referring to, since CC-mode does not perform many scans from
BOB, AFAIK.


        Stefan




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-08 15:29                       ` Stefan Monnier
  2021-06-08 15:52                         ` Eli Zaretskii
@ 2021-06-09  3:39                         ` Richard Stallman
  2021-06-09  8:34                         ` martin rudalics
  2 siblings, 0 replies; 274+ messages in thread
From: Richard Stallman @ 2021-06-09  3:39 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: rudalics, eliz, dancol, emacs-devel, acm

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > All I meant is that given the increase of performance of CPUs (until the
  > beginning of this century) and a non-corresponding increase in file size
  > and complexity of language syntax, programmers nowadays prefer correct
  > behavior over fast behavior, since the correct behavior is fast enough
  > anyway to be bearable.

That makes sense, in general.  But the alternatives available to us
may not give us a good way to adjust that tradeoff.

-- 
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)





^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-08 15:29                       ` Stefan Monnier
  2021-06-08 15:52                         ` Eli Zaretskii
  2021-06-09  3:39                         ` Richard Stallman
@ 2021-06-09  8:34                         ` martin rudalics
  2021-06-09 13:14                           ` `open-paren-in-column-0-is-defun-start` (was: cc-mode fontification feels random) Stefan Monnier
  2021-06-12 17:29                           ` cc-mode fontification feels random João Távora
  2 siblings, 2 replies; 274+ messages in thread
From: martin rudalics @ 2021-06-09  8:34 UTC (permalink / raw)
  To: Stefan Monnier, Richard Stallman; +Cc: acm, eliz, dancol, emacs-devel

 > Oh boy, I see my use of the term "fashion" has really tipped
 > people's sensitivities.

It was rather the use of the idiom "We can use hacks like this one".  I
see `open-paren-in-column-0-is-defun-start' as a way to subdivide code
into chunks that may be edited and processed independently.  Currently,
we use a monolithic approach (one that works on the whole buffer from
its beginning) for fontification and a chunk-wise approach (as in the
default `beginning-of-defun') for editing proper.

I do not like, for example, that inserting a quotation mark somewhere
into a Lisp buffer, with some delay repaints the entire rest of the
buffer just to undo that when I insert the closing quotation mark.
Maybe these are bad editing habits but I won't change them any more.  So
for me `open-paren-in-column-0-is-defun-start' is not a hack but an
entire philosophy which, unfortunately, doesn't work with fontification.

martin

^ permalink raw reply	[flat|nested] 274+ messages in thread

* `open-paren-in-column-0-is-defun-start` (was: cc-mode fontification feels random)
  2021-06-09  8:34                         ` martin rudalics
@ 2021-06-09 13:14                           ` Stefan Monnier
  2021-06-09 15:15                             ` Yuri Khan
  2021-06-12 17:29                           ` cc-mode fontification feels random João Távora
  1 sibling, 1 reply; 274+ messages in thread
From: Stefan Monnier @ 2021-06-09 13:14 UTC (permalink / raw)
  To: martin rudalics; +Cc: Richard Stallman, eliz, acm, dancol, emacs-devel

> It was rather the use of the idiom "We can use hacks like this one".  I
> see `open-paren-in-column-0-is-defun-start' as a way to subdivide code
> into chunks that may be edited and processed independently.  Currently,
> we use a monolithic approach (one that works on the whole buffer from
> its beginning) for fontification and a chunk-wise approach (as in the
> default `beginning-of-defun') for editing proper.

I see two problems with `open-paren-in-column-0-is-defun-start` (opic0ids):

- The implementation was a lot simpler than what's needed for your
  notion of "chunk-wise editing", thus leading to somewhat arbitrary
  behaviors because we only used the opic0ids property when it was
  convenient, rather than using it at every place where it could change
  the behavior.

- this convention is imposed on top of the definition of the language,
  so it's like editing "C with the opic0ids convention" rather than
  editing "C".  This works fine if your file is indeed written in "C
  with the opic0ids convention", but no so well otherwise.  And that
  convention is specific to Emacs (I can imagine other editors
  supporting a similar convention, but most likely it won't be exactly
  the same one since it's not a widely known convention), so unless all
  the coders agree to use Emacs you'll probably want to enforce that
  convention via some kind of "sanity check" maybe running in a CI.

- I don't think a major mode for language Foo should default to
  assuming that the buffer is written in "Foo with the opic0ids
  convention".


        Stefan




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: `open-paren-in-column-0-is-defun-start` (was: cc-mode fontification feels random)
  2021-06-09 13:14                           ` `open-paren-in-column-0-is-defun-start` (was: cc-mode fontification feels random) Stefan Monnier
@ 2021-06-09 15:15                             ` Yuri Khan
  2021-06-09 15:16                               ` Yuri Khan
  0 siblings, 1 reply; 274+ messages in thread
From: Yuri Khan @ 2021-06-09 15:15 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: Richard Stallman, Emacs developers, martin rudalics,
	Alan Mackenzie, Eli Zaretskii, Daniel Colascione

On Wed, 9 Jun 2021 at 20:16, Stefan Monnier <monnier@iro.umontreal.ca> wrote:

> I see two problems with `open-paren-in-column-0-is-defun-start` (opic0ids):
[…]
> - this convention is imposed on top of the definition of the language,
>   so it's like editing "C with the opic0ids convention" rather than
>   editing "C".  This works fine if your file is indeed written in "C
>   with the opic0ids convention", but no so well otherwise.  And that
>   convention is specific to Emacs

The convention of not indenting lines that start a function, or, at
least, an important landmark in the code, is also supported by ‘diff
--show-c-function’ and ‘git show-function’.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: `open-paren-in-column-0-is-defun-start` (was: cc-mode fontification feels random)
  2021-06-09 15:15                             ` Yuri Khan
@ 2021-06-09 15:16                               ` Yuri Khan
  0 siblings, 0 replies; 274+ messages in thread
From: Yuri Khan @ 2021-06-09 15:16 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: Richard Stallman, Emacs developers, martin rudalics,
	Alan Mackenzie, Eli Zaretskii, Daniel Colascione

On Wed, 9 Jun 2021 at 22:15, Yuri Khan <yuri.v.khan@gmail.com> wrote:

> The convention of not indenting lines that start a function, or, at
> least, an important landmark in the code, is also supported by ‘diff
> --show-c-function’ and ‘git show-function’.

‘git grep --show-function’ I meant, of course.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-08 18:25                               ` Eli Zaretskii
  2021-06-08 18:28                                 ` Daniel Colascione
@ 2021-06-09 18:22                                 ` Alan Mackenzie
  2021-06-09 18:36                                   ` Eli Zaretskii
  2021-06-09 19:05                                   ` Daniel Colascione
  1 sibling, 2 replies; 274+ messages in thread
From: Alan Mackenzie @ 2021-06-09 18:22 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rudalics, Daniel Colascione, monnier, rms, emacs-devel

Hello, Eli.

On Tue, Jun 08, 2021 at 21:25:49 +0300, Eli Zaretskii wrote:
> > From: Daniel Colascione <dancol@dancol.org>
> > Date: Tue, 8 Jun 2021 11:11:21 -0700
> > Cc: rudalics@gmx.at, emacs-devel@gnu.org, rms@gnu.org, acm@muc.de

> > The whole point of fontification is to provide visual hints about
> > the semantic structure of source code. If cc-mode can't do that
> > reliably, my preference would be for it to not do it at all.
> > Fontification of a type-using expression shouldn't change if I move
> > the definition of that type from one file to another.

> I think we agree.  Except that for me, it should also not try if it
> cannot do it quickly enough, not only reliably enough.

Quickly and reliably enough are desirable things, but in competition
with eachother.  Reliably enough is a lot easier to measure, quickly
enough depends on the machine, the degree of optimisation, and above
all, the user's expectations.

> > IMHO, we should rely on LSP to figure out what symbols are types, and if 
> > a LSP isn't available, we shouldn't try to guess.

"Shouldn't try to guess" means taking a great deal of
font-lock-type-faces out of CC Mode.  I don't honestly think the end
result would be any better than what we have at the moment.

> I was talking about what to do (or not to do) with our existing
> regexp- and "syntax"-based fontifications.  I still remember the days
> when CC Mode handled that well enough without being a snail it
> frequently is now, and that was on a machine about 10 times slower
> than the one I use nowadays.

Those old versions had masses of fontification bugs in them.  People
wrote bug reports about them and they got fixed.  Those fixes frequently
involved a loss of speed.  :-(

There have also been several bug reports about unusual buffers getting
fontified at the speed of continental drift, and fixing those has
usually led to a little slowdown for ordinary buffers.  I'm thinking,
for example, about bug #25706, where a 4 MB file took nearly an hour to
scroll through on my machine.  After the fix, it took around 86 seconds.

> The C language didn't change too much since then, at least not the
> flavor I frequently edit.

There are two places where CC Mode can be slow: font locking large areas
of text, and keeping up with somebody typing quickly.  Which of these
bothers you the most?  I have plans for speeding up one of these.

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 18:22                                 ` Alan Mackenzie
@ 2021-06-09 18:36                                   ` Eli Zaretskii
  2021-06-09 18:51                                     ` Daniel Colascione
  2021-06-09 21:03                                     ` Alan Mackenzie
  2021-06-09 19:05                                   ` Daniel Colascione
  1 sibling, 2 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-09 18:36 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: rudalics, dancol, monnier, rms, emacs-devel

> Date: Wed, 9 Jun 2021 18:22:57 +0000
> Cc: Daniel Colascione <dancol@dancol.org>, monnier@iro.umontreal.ca,
>   rudalics@gmx.at, emacs-devel@gnu.org, rms@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> > I think we agree.  Except that for me, it should also not try if it
> > cannot do it quickly enough, not only reliably enough.
> 
> Quickly and reliably enough are desirable things, but in competition
> with eachother.  Reliably enough is a lot easier to measure, quickly
> enough depends on the machine, the degree of optimisation, and above
> all, the user's expectations.

That's why we had (and still have) font-lock-maximum-decoration: so
that users could control the tradeoff.  Unfortunately, support for
that variable is all but absent nowadays, because of the widespread
mistaken assumption that font-lock is fast enough in all modes.

> > > IMHO, we should rely on LSP to figure out what symbols are types, and if 
> > > a LSP isn't available, we shouldn't try to guess.
> 
> "Shouldn't try to guess" means taking a great deal of
> font-lock-type-faces out of CC Mode.  I don't honestly think the end
> result would be any better than what we have at the moment.

You don't think it will be better for what reason?

> > I was talking about what to do (or not to do) with our existing
> > regexp- and "syntax"-based fontifications.  I still remember the days
> > when CC Mode handled that well enough without being a snail it
> > frequently is now, and that was on a machine about 10 times slower
> > than the one I use nowadays.
> 
> Those old versions had masses of fontification bugs in them.

I don't remember bumping into those bugs.  Or maybe they were not
important enough to affect my UX.  Slow redisplay, by contrast, hits
me _every_day_, especially if I need to work with an unoptimized
build.  From where I stand, the balance between performance and
accuracy have shifted to the worse, unfortunately.

> People wrote bug reports about them and they got fixed.  Those fixes
> frequently involved a loss of speed.  :-(

If there's no way of fixing a bug without adversely affecting speed,
we should add user options to control those "fixes", so that people
could choose the balance that fits them.  Sometimes Emacs could itself
decide whether to invoke the "slow" code.  For example, it makes no
sense for users of C to be "punished" because we want more accurate
fontification of C++ sources.

> There have also been several bug reports about unusual buffers getting
> fontified at the speed of continental drift, and fixing those has
> usually led to a little slowdown for ordinary buffers.  I'm thinking,
> for example, about bug #25706, where a 4 MB file took nearly an hour to
> scroll through on my machine.  After the fix, it took around 86 seconds.

Once again, a pathological use case should not punish the usual ones;
if the punishment is too harsh, there should be a way to disable the
support for pathological cases for those who never hit them.

> > The C language didn't change too much since then, at least not the
> > flavor I frequently edit.
> 
> There are two places where CC Mode can be slow: font locking large areas
> of text, and keeping up with somebody typing quickly.  Which of these
> bothers you the most?  I have plans for speeding up one of these.

Both, I guess.  Though the former is probably more prominent, since
I'm not really such a fast typist, but I do happen to scroll through
source quite a lot.

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 18:36                                   ` Eli Zaretskii
@ 2021-06-09 18:51                                     ` Daniel Colascione
  2021-06-09 19:04                                       ` Eli Zaretskii
                                                         ` (2 more replies)
  2021-06-09 21:03                                     ` Alan Mackenzie
  1 sibling, 3 replies; 274+ messages in thread
From: Daniel Colascione @ 2021-06-09 18:51 UTC (permalink / raw)
  To: Eli Zaretskii, Alan Mackenzie; +Cc: rudalics, monnier, rms, emacs-devel



On June 9, 2021 11:37:17 AM Eli Zaretskii <eliz@gnu.org> wrote:

>> Date: Wed, 9 Jun 2021 18:22:57 +0000
>> Cc: Daniel Colascione <dancol@dancol.org>, monnier@iro.umontreal.ca,
>> rudalics@gmx.at, emacs-devel@gnu.org, rms@gnu.org
>> From: Alan Mackenzie <acm@muc.de>
>>
>>> I think we agree.  Except that for me, it should also not try if it
>>> cannot do it quickly enough, not only reliably enough.
>>
>> Quickly and reliably enough are desirable things, but in competition
>> with eachother.  Reliably enough is a lot easier to measure, quickly
>> enough depends on the machine, the degree of optimisation, and above
>> all, the user's expectations.
>
> That's why we had (and still have) font-lock-maximum-decoration: so
> that users could control the tradeoff.  Unfortunately, support for
> that variable is all but absent nowadays, because of the widespread
> mistaken assumption that font-lock is fast enough in all modes.

It should be fast enough for all modes. This isn't 1985. Computers in 
general are *several orders* of magnitude faster than needed to do real 
time syntax highlighting in general. Other editors don't seem to struggle.  
Tree sitter is very fast. If regular editing is stuttering because of 
fontification, we have bad data structures, algorithms, or architectures 
--- that is, bugs. And we shouldn't add user options to paper over bugs. 
That's ridiculous. I can't believe we really want to propose a "please make 
syntax highlighting wrong" user option.





^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 18:51                                     ` Daniel Colascione
@ 2021-06-09 19:04                                       ` Eli Zaretskii
  2021-06-09 20:07                                       ` chad
  2021-06-09 20:17                                       ` Dmitry Gutov
  2 siblings, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-09 19:04 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: acm, emacs-devel, monnier, rms, rudalics

> From: Daniel Colascione <dancol@dancol.org>
> CC: <monnier@iro.umontreal.ca>, <rudalics@gmx.at>, <emacs-devel@gnu.org>, <rms@gnu.org>
> Date: Wed, 09 Jun 2021 11:51:28 -0700
> 
> > That's why we had (and still have) font-lock-maximum-decoration: so
> > that users could control the tradeoff.  Unfortunately, support for
> > that variable is all but absent nowadays, because of the widespread
> > mistaken assumption that font-lock is fast enough in all modes.
> 
> It should be fast enough for all modes. This isn't 1985. Computers in 
> general are *several orders* of magnitude faster than needed to do real 
> time syntax highlighting in general.

I'm all for speeding it up, but the fact is, it isn't always fast
enough, especially in large files/buffers.  As long as it isn't fast
enough, that variable has its place, IMO.

> Other editors don't seem to struggle.  

Do you happen to know why?  Maybe we could use some of the ideas.

> Tree sitter is very fast.

But we don't use it.  I hope we will some day.

> If regular editing is stuttering because of 
> fontification, we have bad data structures, algorithms, or architectures 
> --- that is, bugs. And we shouldn't add user options to paper over bugs. 

I disagree.  These aren't "normal" bugs, these are design bugs, or
maybe even limitations of the methods we use for fontifications.  Such
issues sometimes take time to replace with better ones, and in the
meantime we need to provide reasonably responsive editing.

> That's ridiculous. I can't believe we really want to propose a "please make 
> syntax highlighting wrong" user option.

Not "wrong", just "less granular".  There's no single "right" here.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 18:22                                 ` Alan Mackenzie
  2021-06-09 18:36                                   ` Eli Zaretskii
@ 2021-06-09 19:05                                   ` Daniel Colascione
  2021-06-09 19:11                                     ` Eli Zaretskii
  2021-06-09 20:20                                     ` Alan Mackenzie
  1 sibling, 2 replies; 274+ messages in thread
From: Daniel Colascione @ 2021-06-09 19:05 UTC (permalink / raw)
  To: Alan Mackenzie, Eli Zaretskii; +Cc: rudalics, monnier, rms, emacs-devel



On June 9, 2021 11:23:04 AM Alan Mackenzie <acm@muc.de> wrote:

> Hello, Eli.
>
> On Tue, Jun 08, 2021 at 21:25:49 +0300, Eli Zaretskii wrote:
>>> From: Daniel Colascione <dancol@dancol.org>
>>> Date: Tue, 8 Jun 2021 11:11:21 -0700
>>> Cc: rudalics@gmx.at, emacs-devel@gnu.org, rms@gnu.org, acm@muc.de
>
>>> The whole point of fontification is to provide visual hints about
>>> the semantic structure of source code. If cc-mode can't do that
>>> reliably, my preference would be for it to not do it at all.
>>> Fontification of a type-using expression shouldn't change if I move
>>> the definition of that type from one file to another.
>
>> I think we agree.  Except that for me, it should also not try if it
>> cannot do it quickly enough, not only reliably enough.
>
> Quickly and reliably enough are desirable things, but in competition
> with eachother.  Reliably enough is a lot easier to measure, quickly
> enough depends on the machine, the degree of optimisation, and above
> all, the user's expectations.
>
>>> IMHO, we should rely on LSP to figure out what symbols are types, and if
>>> a LSP isn't available, we shouldn't try to guess.
>
> "Shouldn't try to guess" means taking a great deal of
> font-lock-type-faces out of CC Mode.  I don't honestly think the end
> result would be any better than what we have at the moment.


>
I think it would be better in fact. The whole point of fontification is to 
provide visual clues about the function of a word in a buffer. If I can't 
rely on font lock type face actually distinguishing types from non-types, 
what's the point? If fontification isn't reliable, it's not syntax 
highlighting, but instead a kewl rainbow effect.

ISTM we can only correctly do fontification of type references with the 
help of LSP. Without LSP support, I'd rather we not try to get it right, 
sometimes get it wrong, and make font-lock-type-face unreliable.  (We can 
correctly fontify declarations and definitions I think.)





^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 19:05                                   ` Daniel Colascione
@ 2021-06-09 19:11                                     ` Eli Zaretskii
  2021-06-09 20:20                                     ` Alan Mackenzie
  1 sibling, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-09 19:11 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: acm, emacs-devel, monnier, rms, rudalics

> From: Daniel Colascione <dancol@dancol.org>
> CC: <monnier@iro.umontreal.ca>, <rudalics@gmx.at>, <emacs-devel@gnu.org>, <rms@gnu.org>
> Date: Wed, 09 Jun 2021 12:05:27 -0700
> 
> ISTM we can only correctly do fontification of type references with the 
> help of LSP.

Patches are welcome to integrate LSP support, so that it could be the
main means of fontifying buffers.

> Without LSP support, I'd rather we not try to get it right,
> sometimes get it wrong, and make font-lock-type-face unreliable.
> (We can correctly fontify declarations and definitions I think.)

If we cannot do a reasonably good job in that case, then perhaps we
should indeed refrain from fontifying types.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 18:51                                     ` Daniel Colascione
  2021-06-09 19:04                                       ` Eli Zaretskii
@ 2021-06-09 20:07                                       ` chad
  2021-06-10  6:43                                         ` Eli Zaretskii
  2021-06-09 20:17                                       ` Dmitry Gutov
  2 siblings, 1 reply; 274+ messages in thread
From: chad @ 2021-06-09 20:07 UTC (permalink / raw)
  To: Daniel Colascione
  Cc: Richard Stallman, EMACS development team, martin rudalics,
	Stefan Monnier, Alan Mackenzie, Eli Zaretskii

[-- Attachment #1: Type: text/plain, Size: 1354 bytes --]

On Wed, Jun 9, 2021 at 11:56 AM Daniel Colascione <dancol@dancol.org> wrote:

> It should be fast enough for all modes. This isn't 1985. Computers in
> general are *several orders* of magnitude faster than needed to do real
> time syntax highlighting in general. Other editors don't seem to
> struggle.
> Tree sitter is very fast. If regular editing is stuttering because of
> fontification, we have bad data structures, algorithms, or architectures
> --- that is, bugs. And we shouldn't add user options to paper over bugs.
> That's ridiculous. I can't believe we really want to propose a "please
> make
> syntax highlighting wrong" user option.
>

I'm all for keeping context in mind, and I think that part of that is Eli's
unusual circumstances: running unoptimised builds with extra checking
enabled. I don't know what his particular hardware is like, but my laptop
is a medium-spec i5 from ~4 generations back running debian inside a
lightweight VM, and I can both scroll from top to bottom of src/xdisp.c and
open the file and immediately Esc-> to the end without (being aware of?)
font-lock falling behind.

Are other people having much worse experiences than this? Is there some
other situation where emacs developers are frequently seeing problems? I
don't do anything with C++ anymore, and I haven't bothered setting up LSP
here.

Thanks
~Chad

[-- Attachment #2: Type: text/html, Size: 1835 bytes --]

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 18:51                                     ` Daniel Colascione
  2021-06-09 19:04                                       ` Eli Zaretskii
  2021-06-09 20:07                                       ` chad
@ 2021-06-09 20:17                                       ` Dmitry Gutov
  2 siblings, 0 replies; 274+ messages in thread
From: Dmitry Gutov @ 2021-06-09 20:17 UTC (permalink / raw)
  To: Daniel Colascione, Eli Zaretskii, Alan Mackenzie
  Cc: rudalics, emacs-devel, monnier, rms

On 09.06.2021 21:51, Daniel Colascione wrote:
> And we shouldn't add user options to paper over bugs. That's ridiculous. 
> I can't believe we really want to propose a "please make syntax 
> highlighting wrong" user option.

If it's possible to add a user option to disable or enable the 
fontification of type references in CC Mode, and if its nil value would 
disable the additional parsing logic required to get that "mostly 
right", the result could make both Eli happy with increased performance, 
and you (together with a number of other users) happier with more 
predictable, yet less ambitious syntax highlighting.

And one could then optionally add TreeSitter on top of that.

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 19:05                                   ` Daniel Colascione
  2021-06-09 19:11                                     ` Eli Zaretskii
@ 2021-06-09 20:20                                     ` Alan Mackenzie
  2021-06-09 20:36                                       ` Stefan Monnier
  2021-06-10  2:21                                       ` Daniel Colascione
  1 sibling, 2 replies; 274+ messages in thread
From: Alan Mackenzie @ 2021-06-09 20:20 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: rudalics, Eli Zaretskii, monnier, rms, emacs-devel

Hello, Daniel.

On Wed, Jun 09, 2021 at 12:05:27 -0700, Daniel Colascione wrote:

> On June 9, 2021 11:23:04 AM Alan Mackenzie <acm@muc.de> wrote:

> > Hello, Eli.

> > On Tue, Jun 08, 2021 at 21:25:49 +0300, Eli Zaretskii wrote:
> >>> From: Daniel Colascione <dancol@dancol.org>
> >>> Date: Tue, 8 Jun 2021 11:11:21 -0700
> >>> Cc: rudalics@gmx.at, emacs-devel@gnu.org, rms@gnu.org, acm@muc.de

> >>> The whole point of fontification is to provide visual hints about
> >>> the semantic structure of source code. If cc-mode can't do that
> >>> reliably, my preference would be for it to not do it at all.
> >>> Fontification of a type-using expression shouldn't change if I move
> >>> the definition of that type from one file to another.

> >> I think we agree.  Except that for me, it should also not try if it
> >> cannot do it quickly enough, not only reliably enough.

> > Quickly and reliably enough are desirable things, but in competition
> > with eachother.  Reliably enough is a lot easier to measure, quickly
> > enough depends on the machine, the degree of optimisation, and above
> > all, the user's expectations.

> >>> IMHO, we should rely on LSP to figure out what symbols are types, and if
> >>> a LSP isn't available, we shouldn't try to guess.

> > "Shouldn't try to guess" means taking a great deal of
> > font-lock-type-faces out of CC Mode.  I don't honestly think the end
> > result would be any better than what we have at the moment.



> I think it would be better in fact. The whole point of fontification is to 
> provide visual clues about the function of a word in a buffer.

That's one of the points.  Another point is to provide colour, thus
giving the eye some pattern to orient around.  I think its most important
function is to point out comments, thus making things like

    if (foo)
      bar (); /* comment about bar
    else
      baz (); /* comment about baz */
    
undangerous.  For that case, fine distinctions about types are
irrelevant.

> If I can't rely on font lock type face actually distinguishing types
> from non-types, what's the point?

Because the information about types, though imperfect, is nevertheless
highly useful.

> If fontification isn't reliable, it's not syntax highlighting, but
> instead a kewl rainbow effect.

Now you seem to be saying that either font lock has to be 100% right, or
it's wholly useless.  Is that a fair summary of your position?  If so, do
you disable font lock mode for CC Mode and other modes which can't
guarantee perfect font locking?

> ISTM we can only correctly do fontification of type references with the 
> help of LSP.

I don't think it would be sensible to try to do it otherwise.

> Without LSP support, I'd rather we not try to get it right, sometimes
> get it wrong, and make font-lock-type-face unreliable.  (We can
> correctly fontify declarations and definitions I think.)

That's a rather negative way of putting things, which is a bit indefinite
and wishy-washy.  You could instead try to specify which tokens should get
font-lock-type-face and which shouldn't, thus giving something concrete
to discuss.  I think this will be difficult to do well, and may lead to
the result which I alluded to above.

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 20:20                                     ` Alan Mackenzie
@ 2021-06-09 20:36                                       ` Stefan Monnier
  2021-06-10  7:01                                         ` Daniel Colascione
  2021-06-10  2:21                                       ` Daniel Colascione
  1 sibling, 1 reply; 274+ messages in thread
From: Stefan Monnier @ 2021-06-09 20:36 UTC (permalink / raw)
  To: Alan Mackenzie
  Cc: Daniel Colascione, Eli Zaretskii, rudalics, emacs-devel, rms

> That's a rather negative way of putting things, which is a bit indefinite
> and wishy-washy.  You could instead try to specify which tokens should get
> font-lock-type-face and which shouldn't, thus giving something concrete
> to discuss.  I think this will be difficult to do well, and may lead to
> the result which I alluded to above.

It has to be said also that C/C++ is quite unusual in that knowing which
identifier is a type is necessary for correct parsing.  If it weren't
so, we could reliably highlight types not based on their name but based
on their location in the syntax.

I think an approach like that of tree-sitter should be able (at least in
theory) to give reasonably good highlighting of types based on their
position (tho sadly not in those cases where the syntax is ambiguous).

I don't have a good intuition of how often ambiguities come into play in
real code, nor how much work would be needed to disambiguate most cases
(without relying on discovery of the corresponding type declarations).

If ambiguities are rare enough and/or easy enough to disambiguate
via some simple/local heuristic, then maybe CC-mode could try to
highlight types based on their location rather than based on
their identifiers.  This would make it more stable (not dependent on
the order in which chunks are highlighted) and maybe more reliable.
But I suspect that it's not easy to do that kind of parsing, short of
doing a full parse like tree-sitter does.

        Stefan

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 18:36                                   ` Eli Zaretskii
  2021-06-09 18:51                                     ` Daniel Colascione
@ 2021-06-09 21:03                                     ` Alan Mackenzie
  2021-06-10  2:21                                       ` Daniel Colascione
                                                         ` (2 more replies)
  1 sibling, 3 replies; 274+ messages in thread
From: Alan Mackenzie @ 2021-06-09 21:03 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rudalics, dancol, monnier, rms, emacs-devel

Hello, Eli.

On Wed, Jun 09, 2021 at 21:36:44 +0300, Eli Zaretskii wrote:
> > Date: Wed, 9 Jun 2021 18:22:57 +0000
> > Cc: Daniel Colascione <dancol@dancol.org>, monnier@iro.umontreal.ca,
> >   rudalics@gmx.at, emacs-devel@gnu.org, rms@gnu.org
> > From: Alan Mackenzie <acm@muc.de>

> > > I think we agree.  Except that for me, it should also not try if it
> > > cannot do it quickly enough, not only reliably enough.

> > Quickly and reliably enough are desirable things, but in competition
> > with eachother.  Reliably enough is a lot easier to measure, quickly
> > enough depends on the machine, the degree of optimisation, and above
> > all, the user's expectations.

> That's why we had (and still have) font-lock-maximum-decoration: so
> that users could control the tradeoff.  Unfortunately, support for
> that variable is all but absent nowadays, because of the widespread
> mistaken assumption that font-lock is fast enough in all modes.

That variable is still supported by CC Mode (with the exception of AWK
Mode, where it surely is not needed).

Another possibility would be to replace accurate auxiliary functionality
with rough and ready facilities.  In a scroll through xdisp.c, fontifying
as we go, the following three functions are taking around 30% of the
run-time:

(i) c-bs-at-toplevel-p, which determines whether or not a brace is at the
  top level.
(ii) c-determine-limit, c-determine-+ve-limit, which determine search
  limits approximately ARG non-literal characters before or after point.

By replacing these accurate functions with rough ones, the fontification
would be right most of the time, but a mess at other times (for example,
when there are big comments near point).  (i) is more important for C++
that C, but still makes a difference in C.

If we were to try this, I think a user toggle would be needed.

> > > > IMHO, we should rely on LSP to figure out what symbols are types, and if 
> > > > a LSP isn't available, we shouldn't try to guess.

> > "Shouldn't try to guess" means taking a great deal of
> > font-lock-type-faces out of CC Mode.  I don't honestly think the end
> > result would be any better than what we have at the moment.

> You don't think it will be better for what reason?

Because many users will still want at least the basic types (int, double,
unsigned long, ....) fontified, leading to the very mess Daniel would
like to avoid.   Declarations with basic types tend to be interleaved
with those using project defined types.

> > > I was talking about what to do (or not to do) with our existing
> > > regexp- and "syntax"-based fontifications.  I still remember the days
> > > when CC Mode handled that well enough without being a snail it
> > > frequently is now, and that was on a machine about 10 times slower
> > > than the one I use nowadays.

> > Those old versions had masses of fontification bugs in them.

> I don't remember bumping into those bugs.  Or maybe they were not
> important enough to affect my UX.  Slow redisplay, by contrast, hits
> me _every_day_, especially if I need to work with an unoptimized
> build.  From where I stand, the balance between performance and
> accuracy have shifted to the worse, unfortunately.

OK.  My above suggestion might give ~50% increase in fontification speed.

> > People wrote bug reports about them and they got fixed.  Those fixes
> > frequently involved a loss of speed.  :-(

> If there's no way of fixing a bug without adversely affecting speed,
> we should add user options to control those "fixes", so that people
> could choose the balance that fits them.

I think this would be a bad thing.  There are no (or very few) similar
user options in CC Mode at the moment, and an option to fix or not fix a
bug seems a strange idea, and would make the code quite a bit more
complicated.

> Sometimes Emacs could itself decide whether to invoke the "slow" code.
> For example, it makes no sense for users of C to be "punished" because
> we want more accurate fontification of C++ sources.

There is some truth in this imputation, yes.

> > There have also been several bug reports about unusual buffers
> > getting fontified at the speed of continental drift, and fixing those
> > has usually led to a little slowdown for ordinary buffers.  I'm
> > thinking, for example, about bug #25706, where a 4 MB file took
> > nearly an hour to scroll through on my machine.  After the fix, it
> > took around 86 seconds.

> Once again, a pathological use case should not punish the usual ones;
> if the punishment is too harsh, there should be a way to disable the
> support for pathological cases for those who never hit them.

The punishment is rarely too harsh for a single bug.  But a lot of 2%s,
3%s or 5%s add up over time.  If we were to outlaw a "3% fix", then many
bugs would just be unsolvable.

> > > The C language didn't change too much since then, at least not the
> > > flavor I frequently edit.

> > There are two places where CC Mode can be slow: font locking large areas
> > of text, and keeping up with somebody typing quickly.  Which of these
> > bothers you the most?  I have plans for speeding up one of these.

> Both, I guess.  Though the former is probably more prominent, since
> I'm not really such a fast typist, but I do happen to scroll through
> source quite a lot.

Thanks.  I'll try to come up with speedups in the coming weeks (and
months).

Do you have fast-but-imprecise-scrolling enabled?  That can reduce the
pain.

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 20:20                                     ` Alan Mackenzie
  2021-06-09 20:36                                       ` Stefan Monnier
@ 2021-06-10  2:21                                       ` Daniel Colascione
  2021-06-19  9:25                                         ` Alan Mackenzie
  1 sibling, 1 reply; 274+ messages in thread
From: Daniel Colascione @ 2021-06-10  2:21 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: rudalics, Eli Zaretskii, monnier, rms, emacs-devel



On June 9, 2021 1:20:32 PM Alan Mackenzie <acm@muc.de> wrote:

> Hello, Daniel.
>
> On Wed, Jun 09, 2021 at 12:05:27 -0700, Daniel Colascione wrote:
>
>> On June 9, 2021 11:23:04 AM Alan Mackenzie <acm@muc.de> wrote:
>
>>> Hello, Eli.
>
>>> On Tue, Jun 08, 2021 at 21:25:49 +0300, Eli Zaretskii wrote:
>>>>> From: Daniel Colascione <dancol@dancol.org>
>>>>> Date: Tue, 8 Jun 2021 11:11:21 -0700
>>>>> Cc: rudalics@gmx.at, emacs-devel@gnu.org, rms@gnu.org, acm@muc.de
>
>>>>> The whole point of fontification is to provide visual hints about
>>>>> the semantic structure of source code. If cc-mode can't do that
>>>>> reliably, my preference would be for it to not do it at all.
>>>>> Fontification of a type-using expression shouldn't change if I move
>>>>> the definition of that type from one file to another.
>
>>>> I think we agree.  Except that for me, it should also not try if it
>>>> cannot do it quickly enough, not only reliably enough.
>
>>> Quickly and reliably enough are desirable things, but in competition
>>> with eachother.  Reliably enough is a lot easier to measure, quickly
>>> enough depends on the machine, the degree of optimisation, and above
>>> all, the user's expectations.
>
>>>>> IMHO, we should rely on LSP to figure out what symbols are types, and if
>>>>> a LSP isn't available, we shouldn't try to guess.
>
>>> "Shouldn't try to guess" means taking a great deal of
>>> font-lock-type-faces out of CC Mode.  I don't honestly think the end
>>> result would be any better than what we have at the moment.
>
>
>
>> I think it would be better in fact. The whole point of fontification is to
>> provide visual clues about the function of a word in a buffer.
>
> That's one of the points.  Another point is to provide colour, thus
> giving the eye some pattern to orient around.  I think its most important
> function is to point out comments, thus making things like
>
>    if (foo)
>      bar (); /* comment about bar
>    else
>      baz (); /* comment about baz */
>
> undangerous.  For that case, fine distinctions about types are
> irrelevant.
>
>> If I can't rely on font lock type face actually distinguishing types
>> from non-types, what's the point?
>
> Because the information about types, though imperfect, is nevertheless
> highly useful.
>
>> If fontification isn't reliable, it's not syntax highlighting, but
>> instead a kewl rainbow effect.
>
> Now you seem to be saying that either font lock has to be 100% right, or
> it's wholly useless.  Is that a fair summary of your position?  If so, do
> you disable font lock mode for CC Mode and other modes which can't
> guarantee perfect font locking?
>
>> ISTM we can only correctly do fontification of type references with the
>> help of LSP.
>
> I don't think it would be sensible to try to do it otherwise.
>
>> Without LSP support, I'd rather we not try to get it right, sometimes
>> get it wrong, and make font-lock-type-face unreliable.  (We can
>> correctly fontify declarations and definitions I think.)
>
> That's a rather negative way of putting things, which is a bit indefinite
> and wishy-washy.  You could instead try to specify which tokens should get
> font-lock-type-face and which shouldn't, thus giving something concrete
> to discuss.  I think this will be difficult to do well, and may lead to
> the result which I alluded to above.

Sure. To be more precise: what I propose is not applying 
font-lock-type-face to symbols when we think that symbol is a type solely 
because it's been entered into cc-mode's table of dynamically discovered 
types for the current buffer.


>
> --
> Alan Mackenzie (Nuremberg, Germany).






^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 21:03                                     ` Alan Mackenzie
@ 2021-06-10  2:21                                       ` Daniel Colascione
  2021-06-10  6:55                                         ` Eli Zaretskii
  2021-06-10  6:39                                       ` Eli Zaretskii
  2021-06-10 15:16                                       ` Ergus
  2 siblings, 1 reply; 274+ messages in thread
From: Daniel Colascione @ 2021-06-10  2:21 UTC (permalink / raw)
  To: Alan Mackenzie, Eli Zaretskii; +Cc: rudalics, monnier, rms, emacs-devel



On June 9, 2021 2:03:07 PM Alan Mackenzie <acm@muc.de> wrote:

> Hello, Eli.
>
> On Wed, Jun 09, 2021 at 21:36:44 +0300, Eli Zaretskii wrote:
>>> Date: Wed, 9 Jun 2021 18:22:57 +0000
>>> Cc: Daniel Colascione <dancol@dancol.org>, monnier@iro.umontreal.ca,
>>> rudalics@gmx.at, emacs-devel@gnu.org, rms@gnu.org
>>> From: Alan Mackenzie <acm@muc.de>
>
>>>> I think we agree.  Except that for me, it should also not try if it
>>>> cannot do it quickly enough, not only reliably enough.
>
>>> Quickly and reliably enough are desirable things, but in competition
>>> with eachother.  Reliably enough is a lot easier to measure, quickly
>>> enough depends on the machine, the degree of optimisation, and above
>>> all, the user's expectations.
>
>> That's why we had (and still have) font-lock-maximum-decoration: so
>> that users could control the tradeoff.  Unfortunately, support for
>> that variable is all but absent nowadays, because of the widespread
>> mistaken assumption that font-lock is fast enough in all modes.
>
> That variable is still supported by CC Mode (with the exception of AWK
> Mode, where it surely is not needed).
>
> Another possibility would be to replace accurate auxiliary functionality
> with rough and ready facilities.  In a scroll through xdisp.c, fontifying
> as we go, the following three functions are taking around 30% of the
> run-time:
>
> (i) c-bs-at-toplevel-p, which determines whether or not a brace is at the
>  top level.
> (ii) c-determine-limit, c-determine-+ve-limit, which determine search
>  limits approximately ARG non-literal characters before or after point.



>
> By replacing these accurate functions with rough ones, the fontification
> would be right most of the time, but a mess at other times (for example,
> when there are big comments near point).  (i) is more important for C++
> that C, but still makes a difference i


Another option is adding core support to speed up these operations. I don't 
think we should be sacrificing correctness for speed.

>
>
> If we were to try this, I think a user toggle would be needed.
>
>>>>> IMHO, we should rely on LSP to figure out what symbols are types, and if
>>>>> a LSP isn't available, we shouldn't try to guess.
>
>>> "Shouldn't try to guess" means taking a great deal of
>>> font-lock-type-faces out of CC Mode.  I don't honestly think the end
>>> result would be any better than what we have at the moment.
>
>> You don't think it will be better for what reason?
>
> Because many users will still want at least the basic types (int, double,
> unsigned long, ....) fontified, leading to the very mess Daniel would
> like to avoid.   Declarations with basic types tend to be interleaved
> with those using project defined types.
>
>>>> I was talking about what to do (or not to do) with our existing
>>>> regexp- and "syntax"-based fontifications.  I still remember the days
>>>> when CC Mode handled that well enough without being a snail it
>>>> frequently is now, and that was on a machine about 10 times slower
>>>> than the one I use nowadays.
>
>>> Those old versions had masses of fontification bugs in them.
>
>> I don't remember bumping into those bugs.  Or maybe they were not
>> important enough to affect my UX.  Slow redisplay, by contrast, hits
>> me _every_day_, especially if I need to work with an unoptimized
>> build.  From where I stand, the balance between performance and
>> accuracy have shifted to the worse, unfortunately.
>
> OK.  My above suggestion might give ~50% increase in fontification speed.
>
>>> People wrote bug reports about them and they got fixed.  Those fixes
>>> frequently involved a loss of speed.  :-(
>
>> If there's no way of fixing a bug without adversely affecting speed,
>> we should add user options to control those "fixes", so that people
>> could choose the balance that fits them.
>
> I think this would be a bad thing.  There are no (or very few) similar
> user options in CC Mode at the moment, and an option to fix or not fix a
> bug seems a strange idea, and would make the code quite a bit more
> complicated.
>
>> Sometimes Emacs could itself decide whether to invoke the "slow" code.
>> For example, it makes no sense for users of C to be "punished" because
>> we want more accurate fontification of C++ sources.
>
> There is some truth in this imputation, yes.
>
>>> There have also been several bug reports about unusual buffers
>>> getting fontified at the speed of continental drift, and fixing those
>>> has usually led to a little slowdown for ordinary buffers.  I'm
>>> thinking, for example, about bug #25706, where a 4 MB file took
>>> nearly an hour to scroll through on my machine.  After the fix, it
>>> took around 86 seconds.
>
>> Once again, a pathological use case should not punish the usual ones;
>> if the punishment is too harsh, there should be a way to disable the
>> support for pathological cases for those who never hit them.
>
> The punishment is rarely too harsh for a single bug.  But a lot of 2%s,
> 3%s or 5%s add up over time.  If we were to outlaw a "3% fix", then many
> bugs would just be unsolvable.
>
>>>> The C language didn't change too much since then, at least not the
>>>> flavor I frequently edit.
>
>>> There are two places where CC Mode can be slow: font locking large areas
>>> of text, and keeping up with somebody typing quickly.  Which of these
>>> bothers you the most?  I have plans for speeding up one of these.
>
>> Both, I guess.  Though the former is probably more prominent, since
>> I'm not really such a fast typist, but I do happen to scroll through
>> source quite a lot.
>
> Thanks.  I'll try to come up with speedups in the coming weeks (and
> months).
>
> Do you have fast-but-imprecise-scrolling enabled?  That can reduce the
> pain.
>
> --
> Alan Mackenzie (Nuremberg, Germany).






^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 21:03                                     ` Alan Mackenzie
  2021-06-10  2:21                                       ` Daniel Colascione
@ 2021-06-10  6:39                                       ` Eli Zaretskii
  2021-06-10 16:46                                         ` Alan Mackenzie
  2021-06-10 15:16                                       ` Ergus
  2 siblings, 1 reply; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-10  6:39 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: rudalics, dancol, monnier, rms, emacs-devel

> Date: Wed, 9 Jun 2021 21:03:03 +0000
> Cc: dancol@dancol.org, monnier@iro.umontreal.ca, rudalics@gmx.at,
>   emacs-devel@gnu.org, rms@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> > That's why we had (and still have) font-lock-maximum-decoration: so
> > that users could control the tradeoff.  Unfortunately, support for
> > that variable is all but absent nowadays, because of the widespread
> > mistaken assumption that font-lock is fast enough in all modes.
> 
> That variable is still supported by CC Mode (with the exception of AWK
> Mode, where it surely is not needed).

Does it make a difference, performance-wise?  If not (which is what
ISTR), then that variable isn't really "supported", because supporting
it means that different values of it cause tangible differences in
performance.

> Another possibility would be to replace accurate auxiliary functionality
> with rough and ready facilities.  In a scroll through xdisp.c, fontifying
> as we go, the following three functions are taking around 30% of the
> run-time:
> 
> (i) c-bs-at-toplevel-p, which determines whether or not a brace is at the
>   top level.
> (ii) c-determine-limit, c-determine-+ve-limit, which determine search
>   limits approximately ARG non-literal characters before or after point.
> 
> By replacing these accurate functions with rough ones, the fontification
> would be right most of the time, but a mess at other times (for example,
> when there are big comments near point).  (i) is more important for C++
> that C, but still makes a difference in C.
> 
> If we were to try this, I think a user toggle would be needed.

How about making font-lock-maximum-decoration control that as well?

> > > "Shouldn't try to guess" means taking a great deal of
> > > font-lock-type-faces out of CC Mode.  I don't honestly think the end
> > > result would be any better than what we have at the moment.
> 
> > You don't think it will be better for what reason?
> 
> Because many users will still want at least the basic types (int, double,
> unsigned long, ....) fontified

I'm not sure.  Can you explain why would I care too much about the
basic types (or types in general) standing out?

> > If there's no way of fixing a bug without adversely affecting speed,
> > we should add user options to control those "fixes", so that people
> > could choose the balance that fits them.
> 
> I think this would be a bad thing.  There are no (or very few) similar
> user options in CC Mode at the moment, and an option to fix or not fix a
> bug seems a strange idea

It depends on the bug.  If the bug causes Emacs to infloop or work
very slowly, then sure, no toggle for the fix would make sense.  But I
was talking about "bugs" that cause inaccurate or incorrect
fontifications, and those are much "softer".  At least IMO such "bugs"
are tolerable if they are rare enough, especially if fixing them hurts
redisplay performance and Emacs responsiveness in general.

Don't forget that the display code invokes fontifications also when it
does internal layout calculations whose results are not immediately
shown (or even not at all).  When that happens, some command not
directly related to display could be adversely affected.  So one idea
would be to turn off these expensive parts in those cases.

> > Once again, a pathological use case should not punish the usual ones;
> > if the punishment is too harsh, there should be a way to disable the
> > support for pathological cases for those who never hit them.
> 
> The punishment is rarely too harsh for a single bug.  But a lot of 2%s,
> 3%s or 5%s add up over time.  If we were to outlaw a "3% fix", then many
> bugs would just be unsolvable.

Once again: what kind of "bugs" are those?  If they only cause
imperfect faces, I'm not sure it's unthinkable to disable them, given
some optional value of a user knob.

> Do you have fast-but-imprecise-scrolling enabled?

No.  That's a separate issue, and influences all the modes, even those
where font-lock is light-weight.

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 20:07                                       ` chad
@ 2021-06-10  6:43                                         ` Eli Zaretskii
  0 siblings, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-10  6:43 UTC (permalink / raw)
  To: chad; +Cc: rms, emacs-devel, rudalics, monnier, acm, dancol

> From: chad <yandros@gmail.com>
> Date: Wed, 9 Jun 2021 13:07:17 -0700
> Cc: Richard Stallman <rms@gnu.org>,
>  EMACS development team <emacs-devel@gnu.org>,
>  martin rudalics <rudalics@gmx.at>, Stefan Monnier <monnier@iro.umontreal.ca>,
>  Alan Mackenzie <acm@muc.de>, Eli Zaretskii <eliz@gnu.org>
> 
> I'm all for keeping context in mind, and I think that part of that is Eli's unusual circumstances: running
> unoptimised builds with extra checking enabled. I don't know what his particular hardware is like, but my
> laptop is a medium-spec i5 from ~4 generations back running debian inside a lightweight VM, and I can both
> scroll from top to bottom of src/xdisp.c and open the file and immediately Esc-> to the end without (being
> aware of?) font-lock falling behind. 

Make a C file that's 10 copies of xdisp.c one after the other, and
repeat the experiment.  Then try the same with Emacs 23 to see the
regression.

My machine is a Core i7, albeit an old model of it.  But it still can
run circles around the one Richard Stallman uses, or the one Stefan
said he was using.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10  2:21                                       ` Daniel Colascione
@ 2021-06-10  6:55                                         ` Eli Zaretskii
  2021-06-10  6:58                                           ` Daniel Colascione
  0 siblings, 1 reply; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-10  6:55 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: acm, emacs-devel, monnier, rms, rudalics

> From: Daniel Colascione <dancol@dancol.org>
> Date: Wed, 09 Jun 2021 19:21:23 -0700
> Cc: rudalics@gmx.at, monnier@iro.umontreal.ca, rms@gnu.org, emacs-devel@gnu.org
> 
> > By replacing these accurate functions with rough ones, the fontification
> > would be right most of the time, but a mess at other times (for example,
> > when there are big comments near point).  (i) is more important for C++
> > that C, but still makes a difference i
> 
> Another option is adding core support to speed up these operations. I don't 
> think we should be sacrificing correctness for speed.

If speeding that up is feasible, sure, that's a better alternative.
Sacrificing correctness is a kind-of retreat, justified only when a
better solution is not at hand.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10  6:55                                         ` Eli Zaretskii
@ 2021-06-10  6:58                                           ` Daniel Colascione
  2021-06-10  7:19                                             ` Eli Zaretskii
  0 siblings, 1 reply; 274+ messages in thread
From: Daniel Colascione @ 2021-06-10  6:58 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: acm, emacs-devel, monnier, rms, rudalics



On June 9, 2021 11:55:45 PM Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Daniel Colascione <dancol@dancol.org>
>> Date: Wed, 09 Jun 2021 19:21:23 -0700
>> Cc: rudalics@gmx.at, monnier@iro.umontreal.ca, rms@gnu.org, emacs-devel@gnu.org
>>
>>> By replacing these accurate functions with rough ones, the fontification
>>> would be right most of the time, but a mess at other times (for example,
>>> when there are big comments near point).  (i) is more important for C++
>>> that C, but still makes a difference i
>>
>> Another option is adding core support to speed up these operations. I don't
>> think we should be sacrificing correctness for speed.
>
> If speeding that up is feasible, sure, that's a better alternative.
> Sacrificing correctness is a kind-of retreat, justified only when a
> better solution is not at hand.

Sure. But I started this thread not because cc-mode was slow, but because 
specific design choices led to inconsistent fontification. It'd be a shame 
for it to result in changes that made cc-mode even more inconsistent.





^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 20:36                                       ` Stefan Monnier
@ 2021-06-10  7:01                                         ` Daniel Colascione
  2021-06-10  7:21                                           ` Eli Zaretskii
  0 siblings, 1 reply; 274+ messages in thread
From: Daniel Colascione @ 2021-06-10  7:01 UTC (permalink / raw)
  To: Stefan Monnier, Alan Mackenzie; +Cc: rudalics, Eli Zaretskii, rms, emacs-devel



On June 9, 2021 1:36:42 PM Stefan Monnier <monnier@iro.umontreal.ca> wrote:

>> That's a rather negative way of putting things, which is a bit indefinite
>> and wishy-washy.  You could instead try to specify which tokens should get
>> font-lock-type-face and which shouldn't, thus giving something concrete
>> to discuss.  I think this will be difficult to do well, and may lead to
>> the result which I alluded to above.
>
> It has to be said also that C/C++ is quite unusual in that knowing which
> identifier is a type is necessary for correct parsing.  If it weren't
> so, we could reliably highlight types not based on their name but based
> on their location in the syntax.
>
> I think an approach like that of tree-sitter should be able (at least in
> theory) to give reasonably good highlighting of types based on their
> position (tho sadly not in those cases where the syntax is ambiguous).

The model I've had in mind for dealing with parse ambiguity is an 
incremental GLR parser generating a parse forest, pruning the forest by 
constraint solving on ad-hoc language specific constraints, then picking 
one of the remaining parse trees incrementally to fontify.





^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10  6:58                                           ` Daniel Colascione
@ 2021-06-10  7:19                                             ` Eli Zaretskii
  0 siblings, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-10  7:19 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: acm, emacs-devel, monnier, rms, rudalics

> From: Daniel Colascione <dancol@dancol.org>
> CC: <acm@muc.de>, <rudalics@gmx.at>, <monnier@iro.umontreal.ca>, <rms@gnu.org>, <emacs-devel@gnu.org>
> Date: Wed, 09 Jun 2021 23:58:40 -0700
> 
> > If speeding that up is feasible, sure, that's a better alternative.
> > Sacrificing correctness is a kind-of retreat, justified only when a
> > better solution is not at hand.
> 
> Sure. But I started this thread not because cc-mode was slow, but because 
> specific design choices led to inconsistent fontification. It'd be a shame 
> for it to result in changes that made cc-mode even more inconsistent.

Yes, there are two sub-threads here, about two different aspects of CC
Mode's fontifications.  Not unheard of in our discussions ;-)

From my POV, I'd like both of these issues be fixed at some future
time.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10  7:01                                         ` Daniel Colascione
@ 2021-06-10  7:21                                           ` Eli Zaretskii
  0 siblings, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-10  7:21 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: acm, emacs-devel, monnier, rms, rudalics

> From: Daniel Colascione <dancol@dancol.org>
> Date: Thu, 10 Jun 2021 00:01:38 -0700
> Cc: rudalics@gmx.at, Eli Zaretskii <eliz@gnu.org>, rms@gnu.org,
>  emacs-devel@gnu.org
> 
> The model I've had in mind for dealing with parse ambiguity is an 
> incremental GLR parser generating a parse forest, pruning the forest by 
> constraint solving on ad-hoc language specific constraints, then picking 
> one of the remaining parse trees incrementally to fontify.

I'm not an expert in this area: is this different from what
tree-sitter does?  If so, what are the main differences?



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09 21:03                                     ` Alan Mackenzie
  2021-06-10  2:21                                       ` Daniel Colascione
  2021-06-10  6:39                                       ` Eli Zaretskii
@ 2021-06-10 15:16                                       ` Ergus
  2021-06-10 15:34                                         ` Óscar Fuentes
                                                           ` (2 more replies)
  2 siblings, 3 replies; 274+ messages in thread
From: Ergus @ 2021-06-10 15:16 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Eli Zaretskii, rudalics, dancol, monnier, rms, emacs-devel

Hi:

Sorry to bother, but just to clarify the conclusions because I lost some
messages:

1) What is finally the most desirable/long path/future feature? I mean,,
finally what is preferred by the developers to support in the future?

lsp or tree-sitter?

2) Alan, some time ago there was an issue related with the indentation
that the proper fix substituted some regex with iterative solutions. In
this case, it seems like that happens relatively often for complex
solutions.

Do you think that there is some missing/needed common use
function/API/feature that we could implement in the C side to improve
such iterative solutions?

  Maybe some vectorized "magic" functions that return pre-processed
vectors or low level data structure and avoid lisp loops and object
constructors and the lisp forth and back overheads and/or stressing the
GC?

3) Eli/Stefan do you think are there any missing feature in the low
level API that may simplify/improve integration with LSP or tree-sitters
in the future?

For things like font-lock/display engine I only consider to do as much
as possible in the C side to improve performance. And reduce as much as
possible interacting with the lisp side... Do you think that it may be
possible?

Best,
Ergus.

  
  

On Wed, Jun 09, 2021 at 09:03:03PM +0000, Alan Mackenzie wrote:
>Hello, Eli.
>
>On Wed, Jun 09, 2021 at 21:36:44 +0300, Eli Zaretskii wrote:
>> > Date: Wed, 9 Jun 2021 18:22:57 +0000
>> > Cc: Daniel Colascione <dancol@dancol.org>, monnier@iro.umontreal.ca,
>> >   rudalics@gmx.at, emacs-devel@gnu.org, rms@gnu.org
>> > From: Alan Mackenzie <acm@muc.de>
>
>> > > I think we agree.  Except that for me, it should also not try if it
>> > > cannot do it quickly enough, not only reliably enough.
>
>> > Quickly and reliably enough are desirable things, but in competition
>> > with eachother.  Reliably enough is a lot easier to measure, quickly
>> > enough depends on the machine, the degree of optimisation, and above
>> > all, the user's expectations.
>
>> That's why we had (and still have) font-lock-maximum-decoration: so
>> that users could control the tradeoff.  Unfortunately, support for
>> that variable is all but absent nowadays, because of the widespread
>> mistaken assumption that font-lock is fast enough in all modes.
>
>That variable is still supported by CC Mode (with the exception of AWK
>Mode, where it surely is not needed).
>
>Another possibility would be to replace accurate auxiliary functionality
>with rough and ready facilities.  In a scroll through xdisp.c, fontifying
>as we go, the following three functions are taking around 30% of the
>run-time:
>
>(i) c-bs-at-toplevel-p, which determines whether or not a brace is at the
>  top level.
>(ii) c-determine-limit, c-determine-+ve-limit, which determine search
>  limits approximately ARG non-literal characters before or after point.
>
>By replacing these accurate functions with rough ones, the fontification
>would be right most of the time, but a mess at other times (for example,
>when there are big comments near point).  (i) is more important for C++
>that C, but still makes a difference in C.
>
>If we were to try this, I think a user toggle would be needed.
>
>> > > > IMHO, we should rely on LSP to figure out what symbols are types, and if
>> > > > a LSP isn't available, we shouldn't try to guess.
>
>> > "Shouldn't try to guess" means taking a great deal of
>> > font-lock-type-faces out of CC Mode.  I don't honestly think the end
>> > result would be any better than what we have at the moment.
>
>> You don't think it will be better for what reason?
>
>Because many users will still want at least the basic types (int, double,
>unsigned long, ....) fontified, leading to the very mess Daniel would
>like to avoid.   Declarations with basic types tend to be interleaved
>with those using project defined types.
>
>> > > I was talking about what to do (or not to do) with our existing
>> > > regexp- and "syntax"-based fontifications.  I still remember the days
>> > > when CC Mode handled that well enough without being a snail it
>> > > frequently is now, and that was on a machine about 10 times slower
>> > > than the one I use nowadays.
>
>> > Those old versions had masses of fontification bugs in them.
>
>> I don't remember bumping into those bugs.  Or maybe they were not
>> important enough to affect my UX.  Slow redisplay, by contrast, hits
>> me _every_day_, especially if I need to work with an unoptimized
>> build.  From where I stand, the balance between performance and
>> accuracy have shifted to the worse, unfortunately.
>
>OK.  My above suggestion might give ~50% increase in fontification speed.
>
>> > People wrote bug reports about them and they got fixed.  Those fixes
>> > frequently involved a loss of speed.  :-(
>
>> If there's no way of fixing a bug without adversely affecting speed,
>> we should add user options to control those "fixes", so that people
>> could choose the balance that fits them.
>
>I think this would be a bad thing.  There are no (or very few) similar
>user options in CC Mode at the moment, and an option to fix or not fix a
>bug seems a strange idea, and would make the code quite a bit more
>complicated.
>
>> Sometimes Emacs could itself decide whether to invoke the "slow" code.
>> For example, it makes no sense for users of C to be "punished" because
>> we want more accurate fontification of C++ sources.
>
>There is some truth in this imputation, yes.
>
>> > There have also been several bug reports about unusual buffers
>> > getting fontified at the speed of continental drift, and fixing those
>> > has usually led to a little slowdown for ordinary buffers.  I'm
>> > thinking, for example, about bug #25706, where a 4 MB file took
>> > nearly an hour to scroll through on my machine.  After the fix, it
>> > took around 86 seconds.
>
>> Once again, a pathological use case should not punish the usual ones;
>> if the punishment is too harsh, there should be a way to disable the
>> support for pathological cases for those who never hit them.
>
>The punishment is rarely too harsh for a single bug.  But a lot of 2%s,
>3%s or 5%s add up over time.  If we were to outlaw a "3% fix", then many
>bugs would just be unsolvable.
>
>> > > The C language didn't change too much since then, at least not the
>> > > flavor I frequently edit.
>
>> > There are two places where CC Mode can be slow: font locking large areas
>> > of text, and keeping up with somebody typing quickly.  Which of these
>> > bothers you the most?  I have plans for speeding up one of these.
>
>> Both, I guess.  Though the former is probably more prominent, since
>> I'm not really such a fast typist, but I do happen to scroll through
>> source quite a lot.
>
>Thanks.  I'll try to come up with speedups in the coming weeks (and
>months).
>
>Do you have fast-but-imprecise-scrolling enabled?  That can reduce the
>pain.
>
>-- 
>Alan Mackenzie (Nuremberg, Germany).
>



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 15:16                                       ` Ergus
@ 2021-06-10 15:34                                         ` Óscar Fuentes
  2021-06-10 19:06                                           ` Ergus
  2021-06-10 15:59                                         ` Jim Porter
  2021-06-10 21:02                                         ` Stefan Monnier
  2 siblings, 1 reply; 274+ messages in thread
From: Óscar Fuentes @ 2021-06-10 15:34 UTC (permalink / raw)
  To: emacs-devel

Ergus <spacibba@aol.com> writes:

> For things like font-lock/display engine I only consider to do as much
> as possible in the C side to improve performance. And reduce as much as
> possible interacting with the lisp side... Do you think that it may be
> possible?

Before going this route, we need to check if native-comp is enough of an
improvement and, if it isn't, try to improve it.




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 15:16                                       ` Ergus
  2021-06-10 15:34                                         ` Óscar Fuentes
@ 2021-06-10 15:59                                         ` Jim Porter
  2021-06-10 21:02                                         ` Stefan Monnier
  2 siblings, 0 replies; 274+ messages in thread
From: Jim Porter @ 2021-06-10 15:59 UTC (permalink / raw)
  To: Ergus, Alan Mackenzie
  Cc: rms, emacs-devel, rudalics, monnier, Eli Zaretskii, dancol

On 6/10/2021 8:16 AM, Ergus wrote:
> 1) What is finally the most desirable/long path/future feature? I mean,,
> finally what is preferred by the developers to support in the future?
> 
> lsp or tree-sitter?

Elsewhere in the thread, I and a few others discussed this briefly. The 
solution other editors use (and which I think is ideal) is to start with 
a base that does its best purely by looking at the syntax of the file, 
and then augment that with LSP. For Emacs and CC-mode, this could mean 
continuing to use the current implementation, or switching to something 
built on tree-sitter. Then on top of that, Emacs can consult LSP for 
more-accurate information. I'm not sure whether this means LSP would 
take over entirely or if it would merely augment the base-level 
syntactic highlighting. Figuring that out would probably require doing 
some experiments to see what the best solution for Emacs would look like.

One of the main benefits of continuing to have some form of (non-LSP) 
syntactic highlighting is that it works for everyone. Even if you don't 
have an LSP server installed, you may want to edit a source file in a 
particular language. Your LSP server of choice may also lack full 
semantic highlighting support (it's a pretty new feature, as I 
understand it). Having a reasonably-correct baseline that works 
everywhere is nice, and hopefully there are no plans to get rid of that.

LSP *may* also be too slow in some situations (though this is just a 
guess). For example, when editing a file over TRAMP, the LSP server runs 
on the remote side; if the network is slow, this could result in delayed 
fontification while editing, which reduces the usefulness of 
fontification. In addition, a new checkout of a large project won't have 
any cached LSP information, so analyzing the code enough to generate 
semantic highlighting may take some time. These might not actually be 
problems, but they do make me a bit skeptical about the performance of a 
purely LSP-based fontification system.

- Jim

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10  6:39                                       ` Eli Zaretskii
@ 2021-06-10 16:46                                         ` Alan Mackenzie
  2021-06-10 17:01                                           ` Eli Zaretskii
  2021-06-10 21:06                                           ` Stefan Monnier
  0 siblings, 2 replies; 274+ messages in thread
From: Alan Mackenzie @ 2021-06-10 16:46 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rudalics, dancol, monnier, rms, emacs-devel

Hello, Eli.

On Thu, Jun 10, 2021 at 09:39:06 +0300, Eli Zaretskii wrote:
> > Date: Wed, 9 Jun 2021 21:03:03 +0000
> > Cc: dancol@dancol.org, monnier@iro.umontreal.ca, rudalics@gmx.at,
> >   emacs-devel@gnu.org, rms@gnu.org
> > From: Alan Mackenzie <acm@muc.de>

> > > That's why we had (and still have) font-lock-maximum-decoration: so
> > > that users could control the tradeoff.  Unfortunately, support for
> > > that variable is all but absent nowadays, because of the widespread
> > > mistaken assumption that font-lock is fast enough in all modes.

> > That variable is still supported by CC Mode (with the exception of AWK
> > Mode, where it surely is not needed).

> Does it make a difference, performance-wise?  If not (which is what
> ISTR), then that variable isn't really "supported", because supporting
> it means that different values of it cause tangible differences in
> performance.

Yes, it does make a difference.  On my machine, the times to scroll
through xdisp.c with my favourite benchmark for
font-lock-maximum-decoration set to 3, 2, 1 are 23s, 7.5s, 5.5s.

> > Another possibility would be to replace accurate auxiliary functionality
> > with rough and ready facilities.  In a scroll through xdisp.c, fontifying
> > as we go, the following three functions are taking around 30% of the
> > run-time:

> > (i) c-bs-at-toplevel-p, which determines whether or not a brace is at the
> >   top level.
> > (ii) c-determine-limit, c-determine-+ve-limit, which determine search
> >   limits approximately ARG non-literal characters before or after point.

> > By replacing these accurate functions with rough ones, the fontification
> > would be right most of the time, but a mess at other times (for example,
> > when there are big comments near point).  (i) is more important for C++
> > that C, but still makes a difference in C.

> > If we were to try this, I think a user toggle would be needed.

> How about making font-lock-maximum-decoration control that as well?

Maybe.  It seems, though, that f-l-max-decoration is primarily about the
degree of fontification applied, not its accuracy.

> > > > "Shouldn't try to guess" means taking a great deal of
> > > > font-lock-type-faces out of CC Mode.  I don't honestly think the end
> > > > result would be any better than what we have at the moment.

> > > You don't think it will be better for what reason?

> > Because many users will still want at least the basic types (int, double,
> > unsigned long, ....) fontified

> I'm not sure.  Can you explain why would I care too much about the
> basic types (or types in general) standing out?

Well, I care for my own personal use, because the type fontifications
help optically to separate the different parts of a function without
needing to look too hard.  The coloured bits are the variable
declarations, to a zeroth order approximation.  I suspect different users
have very different needs here.  Doesn't RMS run with font lock switched
off (or is that just a rumour)?

> > > If there's no way of fixing a bug without adversely affecting speed,
> > > we should add user options to control those "fixes", so that people
> > > could choose the balance that fits them.

> > I think this would be a bad thing.  There are no (or very few) similar
> > user options in CC Mode at the moment, and an option to fix or not fix a
> > bug seems a strange idea

> It depends on the bug.  If the bug causes Emacs to infloop or work
> very slowly, then sure, no toggle for the fix would make sense.  But I
> was talking about "bugs" that cause inaccurate or incorrect
> fontifications, and those are much "softer".  At least IMO such "bugs"
> are tolerable if they are rare enough, especially if fixing them hurts
> redisplay performance and Emacs responsiveness in general.

> Don't forget that the display code invokes fontifications also when it
> does internal layout calculations whose results are not immediately
> shown (or even not at all).  When that happens, some command not
> directly related to display could be adversely affected.  So one idea
> would be to turn off these expensive parts in those cases.

That would be difficult.  Frequently a bug fix involves extensive code
changes rather than simply a block of code one could put an `if' around.

> > > Once again, a pathological use case should not punish the usual ones;
> > > if the punishment is too harsh, there should be a way to disable the
> > > support for pathological cases for those who never hit them.

> > The punishment is rarely too harsh for a single bug.  But a lot of
> > 2%s, 3%s or 5%s add up over time.  If we were to outlaw a "3% fix",
> > then many bugs would just be unsolvable.

> Once again: what kind of "bugs" are those?

They're not of any particular kind.  Any bug fix could slow CC Mode down
marginally.  Some have been known to speed it up.

> If they only cause imperfect faces, I'm not sure it's unthinkable to
> disable them, given some optional value of a user knob.

Well, I've fixed around 550 bugs in CC Mode in the last 20 years.
Identifying and reversing a subset of these to revert the performance
would be difficult.

> > Do you have fast-but-imprecise-scrolling enabled?

> No.  That's a separate issue, and influences all the modes, even those
> where font-lock is light-weight.

You could set it buffer locally in c-mode-common-hook, for example.  It
won't solve the basic problem, but it might brighten your day up.

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 16:46                                         ` Alan Mackenzie
@ 2021-06-10 17:01                                           ` Eli Zaretskii
  2021-06-10 17:07                                             ` Daniel Colascione
  2021-06-10 21:06                                           ` Stefan Monnier
  1 sibling, 1 reply; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-10 17:01 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: rudalics, dancol, monnier, rms, emacs-devel

> Date: Thu, 10 Jun 2021 16:46:11 +0000
> Cc: dancol@dancol.org, monnier@iro.umontreal.ca, rudalics@gmx.at,
>   emacs-devel@gnu.org, rms@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> > > That variable is still supported by CC Mode (with the exception of AWK
> > > Mode, where it surely is not needed).
> 
> > Does it make a difference, performance-wise?  If not (which is what
> > ISTR), then that variable isn't really "supported", because supporting
> > it means that different values of it cause tangible differences in
> > performance.
> 
> Yes, it does make a difference.  On my machine, the times to scroll
> through xdisp.c with my favourite benchmark for
> font-lock-maximum-decoration set to 3, 2, 1 are 23s, 7.5s, 5.5s.

Then I suggest to set it to 2 by default.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 17:01                                           ` Eli Zaretskii
@ 2021-06-10 17:07                                             ` Daniel Colascione
  2021-06-10 17:22                                               ` Eli Zaretskii
                                                                 ` (2 more replies)
  0 siblings, 3 replies; 274+ messages in thread
From: Daniel Colascione @ 2021-06-10 17:07 UTC (permalink / raw)
  To: Eli Zaretskii, Alan Mackenzie; +Cc: rudalics, monnier, rms, emacs-devel



On June 10, 2021 10:01:49 AM Eli Zaretskii <eliz@gnu.org> wrote:

>> Date: Thu, 10 Jun 2021 16:46:11 +0000
>> Cc: dancol@dancol.org, monnier@iro.umontreal.ca, rudalics@gmx.at,
>> emacs-devel@gnu.org, rms@gnu.org
>> From: Alan Mackenzie <acm@muc.de>
>>
>>>> That variable is still supported by CC Mode (with the exception of AWK
>>>> Mode, where it surely is not needed).
>>
>>> Does it make a difference, performance-wise?  If not (which is what
>>> ISTR), then that variable isn't really "supported", because supporting
>>> it means that different values of it cause tangible differences in
>>> performance.
>>
>> Yes, it does make a difference.  On my machine, the times to scroll
>> through xdisp.c with my favourite benchmark for
>> font-lock-maximum-decoration set to 3, 2, 1 are 23s, 7.5s, 5.5s.
>
> Then I suggest to set it to 2 by default.

Performance is reasonable most of the time. If it weren't, we'd see rampant 
complaints. Emacs should default to maximum fontification. If it doesn't, 
most users won't even know they can get more.







^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 17:07                                             ` Daniel Colascione
@ 2021-06-10 17:22                                               ` Eli Zaretskii
  2021-06-10 17:33                                                 ` Daniel Colascione
                                                                   ` (2 more replies)
  2021-06-10 17:26                                               ` Óscar Fuentes
  2021-06-10 17:39                                               ` andrés ramírez
  2 siblings, 3 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-10 17:22 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: acm, emacs-devel, monnier, rms, rudalics

> From: Daniel Colascione <dancol@dancol.org>
> Date: Thu, 10 Jun 2021 10:07:52 -0700
> Cc: rudalics@gmx.at, monnier@iro.umontreal.ca, rms@gnu.org, emacs-devel@gnu.org
> 
> > Then I suggest to set it to 2 by default.
> 
> Performance is reasonable most of the time.

Not IME.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 17:07                                             ` Daniel Colascione
  2021-06-10 17:22                                               ` Eli Zaretskii
@ 2021-06-10 17:26                                               ` Óscar Fuentes
  2021-06-10 17:39                                               ` andrés ramírez
  2 siblings, 0 replies; 274+ messages in thread
From: Óscar Fuentes @ 2021-06-10 17:26 UTC (permalink / raw)
  To: emacs-devel

Daniel Colascione <dancol@dancol.org> writes:

>>> Yes, it does make a difference.  On my machine, the times to scroll
>>> through xdisp.c with my favourite benchmark for
>>> font-lock-maximum-decoration set to 3, 2, 1 are 23s, 7.5s, 5.5s.
>>
>> Then I suggest to set it to 2 by default.
>
> Performance is reasonable most of the time. If it weren't, we'd see
> rampant complaints. Emacs should default to maximum fontification. If
> it doesn't, most users won't even know they can get more.

Yes.

And it is remarkable that a thread about incorrect fontification could
yield a change on the defaults that guarantees even more incorrect
fontification :-)




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 17:22                                               ` Eli Zaretskii
@ 2021-06-10 17:33                                                 ` Daniel Colascione
  2021-06-10 17:39                                                   ` Eli Zaretskii
  2021-06-10 17:40                                                 ` Óscar Fuentes
  2021-06-11 16:11                                                 ` Alan Mackenzie
  2 siblings, 1 reply; 274+ messages in thread
From: Daniel Colascione @ 2021-06-10 17:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: acm, rudalics, monnier, rms, emacs-devel

On 6/10/21 10:22 AM, Eli Zaretskii wrote:

>> From: Daniel Colascione <dancol@dancol.org>
>> Date: Thu, 10 Jun 2021 10:07:52 -0700
>> Cc: rudalics@gmx.at, monnier@iro.umontreal.ca, rms@gnu.org, emacs-devel@gnu.org
>>
>>> Then I suggest to set it to 2 by default.
>> Performance is reasonable most of the time.
> Not IME.

Is it true that you run at -O0 and extra checking enabled?




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 17:33                                                 ` Daniel Colascione
@ 2021-06-10 17:39                                                   ` Eli Zaretskii
  0 siblings, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-10 17:39 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: acm, rudalics, monnier, rms, emacs-devel

> Cc: acm@muc.de, emacs-devel@gnu.org, monnier@iro.umontreal.ca, rms@gnu.org,
>  rudalics@gmx.at
> From: Daniel Colascione <dancol@dancol.org>
> Date: Thu, 10 Jun 2021 10:33:56 -0700
> 
> >> Performance is reasonable most of the time.
> > Not IME.
> 
> Is it true that you run at -O0 and extra checking enabled?

Sometimes, yes.  But mostly, no.  My long-term production sessions are
usually a released Emacs compiled with the default options, which
means -O2 and no --enable-checking.  I do use --with-wide-int, but
that incurs only a 30% slowdown.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 17:07                                             ` Daniel Colascione
  2021-06-10 17:22                                               ` Eli Zaretskii
  2021-06-10 17:26                                               ` Óscar Fuentes
@ 2021-06-10 17:39                                               ` andrés ramírez
  2 siblings, 0 replies; 274+ messages in thread
From: andrés ramírez @ 2021-06-10 17:39 UTC (permalink / raw)
  To: Daniel Colascione
  Cc: rms, emacs-devel, rudalics, monnier, Alan Mackenzie,
	Eli Zaretskii

Hi. Daniel. Hi. Guys.
    >> Then I suggest to set it to 2 by default.

    Daniel> Performance is reasonable most of the time. If it weren't, we'd see rampant
    Daniel> complaints. Emacs should default to maximum fontification. If it doesn't, most users
    Daniel> won't even know they can get more.

On my SBC-opiplus2e I have it set to nil. And It is very slow. But I am
aware that people using slow devices are not the majority. That's one of
the reasons I sometimes fire up emacs23 (the speed of light emacs).

GC and timers should also add some weight to the slowness. Take in
account also that on phones these slowness means battery-life. Again not
all people have emacs on their phones.

Best Regards








^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 17:22                                               ` Eli Zaretskii
  2021-06-10 17:33                                                 ` Daniel Colascione
@ 2021-06-10 17:40                                                 ` Óscar Fuentes
  2021-06-10 17:44                                                   ` Eli Zaretskii
  2021-06-11 16:11                                                 ` Alan Mackenzie
  2 siblings, 1 reply; 274+ messages in thread
From: Óscar Fuentes @ 2021-06-10 17:40 UTC (permalink / raw)
  To: emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Daniel Colascione <dancol@dancol.org>
>> Date: Thu, 10 Jun 2021 10:07:52 -0700
>> Cc: rudalics@gmx.at, monnier@iro.umontreal.ca, rms@gnu.org, emacs-devel@gnu.org
>> 
>> > Then I suggest to set it to 2 by default.
>> 
>> Performance is reasonable most of the time.
>
> Not IME.

But your use case is not representative, isn't it? Using a debug build
with checks enabled have a large impact on performance.

BTW, from time to time I use a 2011 netbook with an Atom CPU and have no
complaints while working with 30k lines-long machine-generated (read:
code-dense, comment-sparse, almost no withespace) C++ files.




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 17:40                                                 ` Óscar Fuentes
@ 2021-06-10 17:44                                                   ` Eli Zaretskii
  0 siblings, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-10 17:44 UTC (permalink / raw)
  To: Óscar Fuentes; +Cc: emacs-devel

> From: Óscar Fuentes <ofv@wanadoo.es>
> Date: Thu, 10 Jun 2021 19:40:56 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> From: Daniel Colascione <dancol@dancol.org>
> >> Date: Thu, 10 Jun 2021 10:07:52 -0700
> >> Cc: rudalics@gmx.at, monnier@iro.umontreal.ca, rms@gnu.org, emacs-devel@gnu.org
> >> 
> >> > Then I suggest to set it to 2 by default.
> >> 
> >> Performance is reasonable most of the time.
> >
> > Not IME.
> 
> But your use case is not representative, isn't it? Using a debug build
> with checks enabled have a large impact on performance.

See my other message: you have an inaccurate impression about my use
cases.

And I don't really agree that debug builds are uninteresting: if they
are so slow, it means our fontification is borderline even on
relatively fast machines.

> BTW, from time to time I use a 2011 netbook with an Atom CPU and have no
> complaints while working with 30k lines-long machine-generated (read:
> code-dense, comment-sparse, almost no withespace) C++ files.

Exactly.  So why cannot we have the same level of performance, and
need to rely on compiler optimizations even on a fast i7?



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 15:34                                         ` Óscar Fuentes
@ 2021-06-10 19:06                                           ` Ergus
  2021-06-10 19:28                                             ` Eli Zaretskii
  0 siblings, 1 reply; 274+ messages in thread
From: Ergus @ 2021-06-10 19:06 UTC (permalink / raw)
  To: Óscar Fuentes; +Cc: emacs-devel

On Thu, Jun 10, 2021 at 05:34:31PM +0200, ï¿½scar Fuentes wrote:
>Ergus <spacibba@aol.com> writes:
>
>> For things like font-lock/display engine I only consider to do as much
>> as possible in the C side to improve performance. And reduce as much as
>> possible interacting with the lisp side... Do you think that it may be
>> possible?
>
>Before going this route, we need to check if native-comp is enough of an
>improvement and, if it isn't, try to improve it.
>
>
I work very extensively with jit compilers and similar and with
different architectures (ARM, ePIC, Intel).

In my experience the JIT improvement in performance is very significant
compared to bytecode. Specially for a similar code the difference in
time can be 1 or even 2 orders of magnitude better.

BUT

When translating from high level languages (my experience: cpython and
julia) it requires much more effort, optimization and time to improve
the compiler to get just a same order or performance than a similar C
code. Just creating a low level "intrinsic" or "binding" saves time and
relies in many other optimization the C compiler already have.

Ex: AOS vs SOA, vectorization, parallelization and similar optimizations
are very easy to do at low level (or give the hints to the compiler to
do them). But it is extremely hard to teach a high level compiler to do
themq; basically because of the data structures and types we use in high
level languages.

That's why in python it is so extended to use libraries like Pandas or
Numpy. And every time more and more python packages are just interfaces
to C libraries. Julia on the other hand provides C primitives for
everything and has primitive data types to give more hints to the
compiler... but even with that in real code it can't compare to
Python+numpy. Other languages like PHP have a very good compiler
improved for decades, but even with that, they have moved a lot of their
functionalities to C code with some bindings.

Font locking is a dynamic feature and affects responsiveness, and must
be executed in the background constantly, so even fast will be never too
fast. Responsiveness is an usual complain when new users come from
different editors. But also the languages syntax are expected to become
more complex with the time.

In our case, just accessing the buffer content and passing directly to
tree-sitter in C will be almost trivial at the low level, without types
conversions or extra copies; but also when we receive the output,
processing them with the json library we already link against will be
orders of magnitude simpler and faster, instead of converting them to
lisp object and stressing the gc with temporal objects we don't really
need, and then iterate on the lisp level.

Sadly not all the architectures are supported by libgccjit either. So a
low level call-preprocess solution will work faster for
everyone. Specially for those that still use emacs 23 due to the speed
feeling.

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 19:06                                           ` Ergus
@ 2021-06-10 19:28                                             ` Eli Zaretskii
  2021-06-10 21:56                                               ` Ergus
  0 siblings, 1 reply; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-10 19:28 UTC (permalink / raw)
  To: Ergus; +Cc: ofv, emacs-devel

> Date: Thu, 10 Jun 2021 21:06:22 +0200
> From: Ergus <spacibba@aol.com>
> Cc: emacs-devel@gnu.org
> 
> >Before going this route, we need to check if native-comp is enough of an
> >improvement and, if it isn't, try to improve it.
> >
> >
> I work very extensively with jit compilers and similar and with
> different architectures (ARM, ePIC, Intel).

Our native-compilation feature is not really JIT.  Once a Lisp file
was native-compiled, it is loaded from a file and used without any JIT
step.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 15:16                                       ` Ergus
  2021-06-10 15:34                                         ` Óscar Fuentes
  2021-06-10 15:59                                         ` Jim Porter
@ 2021-06-10 21:02                                         ` Stefan Monnier
  2021-06-11 20:21                                           ` Ergus
  2 siblings, 1 reply; 274+ messages in thread
From: Stefan Monnier @ 2021-06-10 21:02 UTC (permalink / raw)
  To: Ergus; +Cc: Alan Mackenzie, Eli Zaretskii, rudalics, dancol, rms, emacs-devel

> 1) What is finally the most desirable/long path/future feature?
> I mean, finally what is preferred by the developers to support in the future?
>
> lsp or tree-sitter?
      ^^
     and


-- Stefan




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 16:46                                         ` Alan Mackenzie
  2021-06-10 17:01                                           ` Eli Zaretskii
@ 2021-06-10 21:06                                           ` Stefan Monnier
  2021-06-11  6:14                                             ` Eli Zaretskii
  1 sibling, 1 reply; 274+ messages in thread
From: Stefan Monnier @ 2021-06-10 21:06 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Eli Zaretskii, dancol, rudalics, emacs-devel, rms

> Well, I've fixed around 550 bugs in CC Mode in the last 20 years.
> Identifying and reversing a subset of these to revert the performance
> would be difficult.

Clearly  what would work better is to have a clear "test case" where the
performance is poor.  Then we could investigate what is the cause of
this particular problem and see how to fix this (and hopefully other
similar) circumstance.


        Stefan




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 19:28                                             ` Eli Zaretskii
@ 2021-06-10 21:56                                               ` Ergus
  0 siblings, 0 replies; 274+ messages in thread
From: Ergus @ 2021-06-10 21:56 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: ofv, emacs-devel

On Thu, Jun 10, 2021 at 10:28:17PM +0300, Eli Zaretskii wrote:
>> Date: Thu, 10 Jun 2021 21:06:22 +0200
>> From: Ergus <spacibba@aol.com>
>> Cc: emacs-devel@gnu.org
>>
>> >Before going this route, we need to check if native-comp is enough of an
>> >improvement and, if it isn't, try to improve it.
>> >
>> >
>> I work very extensively with jit compilers and similar and with
>> different architectures (ARM, ePIC, Intel).
>
>Our native-compilation feature is not really JIT.  Once a Lisp file
>was native-compiled, it is loaded from a file and used without any JIT
>step.
>
Yes, I know, but it relies on the libgccjit. Doing the compilation at
once or dynamically will generate similar native code any way.

In any case the optimizations it can generate are very limited due to
the limited information about types, alignment, the complexity of data
structures and lisp types are dynamic.

Following the Andrea's blog he actually describes some of the ideas he
has to optimize the compiler. But that's still very limited and will
require a lot of work and probably some small modifications in the Elisp
syntax to make it optimal... something that will require many years and
a lot of discussions.

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 21:06                                           ` Stefan Monnier
@ 2021-06-11  6:14                                             ` Eli Zaretskii
  0 siblings, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-11  6:14 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: acm, dancol, emacs-devel, rms, rudalics

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Eli Zaretskii <eliz@gnu.org>,  dancol@dancol.org,  rudalics@gmx.at,
>   emacs-devel@gnu.org,  rms@gnu.org
> Date: Thu, 10 Jun 2021 17:06:31 -0400
> 
> > Well, I've fixed around 550 bugs in CC Mode in the last 20 years.
> > Identifying and reversing a subset of these to revert the performance
> > would be difficult.
> 
> Clearly  what would work better is to have a clear "test case" where the
> performance is poor.  Then we could investigate what is the cause of
> this particular problem and see how to fix this (and hopefully other
> similar) circumstance.

We already have, and we already did.  (It was even mentioned in this
discussion.)



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 17:22                                               ` Eli Zaretskii
  2021-06-10 17:33                                                 ` Daniel Colascione
  2021-06-10 17:40                                                 ` Óscar Fuentes
@ 2021-06-11 16:11                                                 ` Alan Mackenzie
  2021-06-11 17:53                                                   ` Eli Zaretskii
  2 siblings, 1 reply; 274+ messages in thread
From: Alan Mackenzie @ 2021-06-11 16:11 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rudalics, Daniel Colascione, emacs-devel, monnier, rms

Hello, Eli.

On Thu, Jun 10, 2021 at 20:22:50 +0300, Eli Zaretskii wrote:
> > From: Daniel Colascione <dancol@dancol.org>
> > Date: Thu, 10 Jun 2021 10:07:52 -0700
> > Cc: rudalics@gmx.at, monnier@iro.umontreal.ca, rms@gnu.org, emacs-devel@gnu.org

> > > Then I suggest to set it to 2 by default.

> > Performance is reasonable most of the time.

> Not IME.

I have measured CC Mode's scrolling performance using:

(defmacro time-it (&rest forms)
  "Time the running of a sequence of forms using `float-time'.
Call like this: \"M-: (time-it (foo ...) (bar ...) ...)\"."
  `(let ((start (float-time)))
    ,@forms
    (- (float-time) start)))

together with

M-: (time-it (scroll-up-window) (sit-for 0))

on regions of text which are not yet fontified.  My window has 65 lines
of buffer text.  Starting at the middle of xdisp.c, I see the following
timings for the first few scrolls:

   0.026s, 0.025s, 0.026s, 0.078s, 0.026s, 0.027s.

That is, with the exception of the fourth timing, the scroll operation
takes a little over 1/40 second.

This is in an Emacs-28 compiled with default optimisation, on a 4
year-old first generation Ryzen machine.

For me personally, this scrolling speed, in conjunction with
fast-but-imprecise-scrolling, is acceptable.  I also accept there are
people with slower machines.

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 16:11                                                 ` Alan Mackenzie
@ 2021-06-11 17:53                                                   ` Eli Zaretskii
  2021-06-11 18:02                                                     ` Daniel Colascione
  2021-06-11 18:34                                                     ` Alan Mackenzie
  0 siblings, 2 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-11 17:53 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: rudalics, dancol, emacs-devel, monnier, rms

> Date: Fri, 11 Jun 2021 16:11:19 +0000
> Cc: Daniel Colascione <dancol@dancol.org>, rudalics@gmx.at,
>   monnier@iro.umontreal.ca, rms@gnu.org, emacs-devel@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> I have measured CC Mode's scrolling performance using:
> 
> (defmacro time-it (&rest forms)
>   "Time the running of a sequence of forms using `float-time'.
> Call like this: \"M-: (time-it (foo ...) (bar ...) ...)\"."
>   `(let ((start (float-time)))
>     ,@forms
>     (- (float-time) start)))
> 
> together with
> 
> M-: (time-it (scroll-up-window) (sit-for 0))
> 
> on regions of text which are not yet fontified.  My window has 65 lines
> of buffer text.  Starting at the middle of xdisp.c, I see the following
> timings for the first few scrolls:
> 
>    0.026s, 0.025s, 0.026s, 0.078s, 0.026s, 0.027s.
> 
> That is, with the exception of the fourth timing, the scroll operation
> takes a little over 1/40 second.
> 
> This is in an Emacs-28 compiled with default optimisation, on a 4
> year-old first generation Ryzen machine.
> 
> For me personally, this scrolling speed, in conjunction with
> fast-but-imprecise-scrolling, is acceptable.  I also accept there are
> people with slower machines.

I suggest to compare these times with Emacs 23 to see how we
regressed.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 17:53                                                   ` Eli Zaretskii
@ 2021-06-11 18:02                                                     ` Daniel Colascione
  2021-06-11 18:22                                                       ` Eli Zaretskii
  2021-06-11 18:42                                                       ` Stefan Monnier
  2021-06-11 18:34                                                     ` Alan Mackenzie
  1 sibling, 2 replies; 274+ messages in thread
From: Daniel Colascione @ 2021-06-11 18:02 UTC (permalink / raw)
  To: Eli Zaretskii, Alan Mackenzie; +Cc: rudalics, emacs-devel, monnier, rms

On 6/11/21 10:53 AM, Eli Zaretskii wrote:

>> Date: Fri, 11 Jun 2021 16:11:19 +0000
>> Cc: Daniel Colascione <dancol@dancol.org>, rudalics@gmx.at,
>>    monnier@iro.umontreal.ca, rms@gnu.org, emacs-devel@gnu.org
>> From: Alan Mackenzie <acm@muc.de>
>>
>> I have measured CC Mode's scrolling performance using:
>>
>> (defmacro time-it (&rest forms)
>>    "Time the running of a sequence of forms using `float-time'.
>> Call like this: \"M-: (time-it (foo ...) (bar ...) ...)\"."
>>    `(let ((start (float-time)))
>>      ,@forms
>>      (- (float-time) start)))
>>
>> together with
>>
>> M-: (time-it (scroll-up-window) (sit-for 0))
>>
>> on regions of text which are not yet fontified.  My window has 65 lines
>> of buffer text.  Starting at the middle of xdisp.c, I see the following
>> timings for the first few scrolls:
>>
>>     0.026s, 0.025s, 0.026s, 0.078s, 0.026s, 0.027s.
>>
>> That is, with the exception of the fourth timing, the scroll operation
>> takes a little over 1/40 second.
>>
>> This is in an Emacs-28 compiled with default optimisation, on a 4
>> year-old first generation Ryzen machine.
>>
>> For me personally, this scrolling speed, in conjunction with
>> fast-but-imprecise-scrolling, is acceptable.  I also accept there are
>> people with slower machines.
> I suggest to compare these times with Emacs 23 to see how we
> regressed.


Regression is acceptable in exchange for correctness so long as absolute 
performance is adequate. We're not using 80486s anymore.




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 18:02                                                     ` Daniel Colascione
@ 2021-06-11 18:22                                                       ` Eli Zaretskii
  2021-06-11 18:28                                                         ` Daniel Colascione
  2021-06-11 18:47                                                         ` Alan Mackenzie
  2021-06-11 18:42                                                       ` Stefan Monnier
  1 sibling, 2 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-11 18:22 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: acm, emacs-devel, monnier, rms, rudalics

> Cc: rudalics@gmx.at, monnier@iro.umontreal.ca, rms@gnu.org,
>  emacs-devel@gnu.org
> From: Daniel Colascione <dancol@dancol.org>
> Date: Fri, 11 Jun 2021 11:02:34 -0700
> 
> >>     0.026s, 0.025s, 0.026s, 0.078s, 0.026s, 0.027s.
> >>
> >> That is, with the exception of the fourth timing, the scroll operation
> >> takes a little over 1/40 second.
> >>
> >> This is in an Emacs-28 compiled with default optimisation, on a 4
> >> year-old first generation Ryzen machine.
> >>
> >> For me personally, this scrolling speed, in conjunction with
> >> fast-but-imprecise-scrolling, is acceptable.  I also accept there are
> >> people with slower machines.
> > I suggest to compare these times with Emacs 23 to see how we
> > regressed.
> 
> Regression is acceptable in exchange for correctness so long as absolute 
> performance is adequate. We're not using 80486s anymore.

Here are my times using an optimized build of Emacs 27.2 on a 3.4GHz
Core i7 box:

  0.015625
  0.03125
  0.015625
  0.046875
  0.09375
  0.0625
  0.015625
  0.03125
  0.015625
  0.03125
  0.015625
  0.03125

You consider this to be adequate performance for a single
window-scroll?  (I don't have an optimized build of Emacs 28, but
there's no reason to believe it is faster; quite the opposite.)

And here's the top part of the profile while running the above
benchmark:

  - redisplay_internal (C function)                                 159  65%
   - jit-lock-function                                              158  65%
    - jit-lock-fontify-now                                          158  65%
     - jit-lock--run-functions                                      158  65%
      - run-hook-wrapped                                            158  65%
       - #<compiled -0x1ffffffff8a67860>                            158  65%
	- font-lock-fontify-region                                  157  65%
	 - c-font-lock-fontify-region                               157  65%
	  - font-lock-default-fontify-region                        146  60%
	   - font-lock-fontify-keywords-region                      143  59%
	    - c-font-lock-declarations                               97  40%
	     - c-find-decl-spots                                     97  40%
	      - #<compiled -0x1ffffffff94b65d0>                      73  30%
	       - c-forward-decl-or-cast-1                            38  15%
		- c-forward-type                                     22   9%
		 - c-check-qualified-type                             7   2%

We can stick our heads in the sand as much as we want, but facts are
stubborn things.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 18:22                                                       ` Eli Zaretskii
@ 2021-06-11 18:28                                                         ` Daniel Colascione
  2021-06-11 19:12                                                           ` Alan Mackenzie
  2021-06-11 19:23                                                           ` Eli Zaretskii
  2021-06-11 18:47                                                         ` Alan Mackenzie
  1 sibling, 2 replies; 274+ messages in thread
From: Daniel Colascione @ 2021-06-11 18:28 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: acm, emacs-devel, monnier, rms, rudalics

On 6/11/21 11:22 AM, Eli Zaretskii wrote:

>> Cc: rudalics@gmx.at, monnier@iro.umontreal.ca, rms@gnu.org,
>>   emacs-devel@gnu.org
>> From: Daniel Colascione <dancol@dancol.org>
>> Date: Fri, 11 Jun 2021 11:02:34 -0700
>>
>>>>      0.026s, 0.025s, 0.026s, 0.078s, 0.026s, 0.027s.
>>>>
>>>> That is, with the exception of the fourth timing, the scroll operation
>>>> takes a little over 1/40 second.
>>>>
>>>> This is in an Emacs-28 compiled with default optimisation, on a 4
>>>> year-old first generation Ryzen machine.
>>>>
>>>> For me personally, this scrolling speed, in conjunction with
>>>> fast-but-imprecise-scrolling, is acceptable.  I also accept there are
>>>> people with slower machines.
>>> I suggest to compare these times with Emacs 23 to see how we
>>> regressed.
>> Regression is acceptable in exchange for correctness so long as absolute
>> performance is adequate. We're not using 80486s anymore.
> Here are my times using an optimized build of Emacs 27.2 on a 3.4GHz
> Core i7 box:
>
>    0.015625
>    0.03125
>    0.015625
>    0.046875
>    0.09375
>    0.0625
>    0.015625
>    0.03125
>    0.015625
>    0.03125
>    0.015625
>    0.03125
>
> You consider this to be adequate performance for a single
> window-scroll?  (I don't have an optimized build of Emacs 28, but
> there's no reason to believe it is faster; quite the opposite.)

native-comp?

>
> And here's the top part of the profile while running the above
> benchmark:
>
>    - redisplay_internal (C function)                                 159  65%
>     - jit-lock-function                                              158  65%
>      - jit-lock-fontify-now                                          158  65%
>       - jit-lock--run-functions                                      158  65%
>        - run-hook-wrapped                                            158  65%
>         - #<compiled -0x1ffffffff8a67860>                            158  65%
> 	- font-lock-fontify-region                                  157  65%
> 	 - c-font-lock-fontify-region                               157  65%
> 	  - font-lock-default-fontify-region                        146  60%
> 	   - font-lock-fontify-keywords-region                      143  59%
> 	    - c-font-lock-declarations                               97  40%
> 	     - c-find-decl-spots                                     97  40%
> 	      - #<compiled -0x1ffffffff94b65d0>                      73  30%
> 	       - c-forward-decl-or-cast-1                            38  15%
> 		- c-forward-type                                     22   9%
> 		 - c-check-qualified-type                             7   2%
>
> We can stick our heads in the sand as much as we want, but facts are
> stubborn things.

Hrm. That doesn't seem consistent with Alan's report that we spend a ton 
of time doing work like deciding whether a brace occurs at top-level. My 
question stands: what core facilities can we add to accelerate cc-mode's 
parsing here? There's got to be some efficiency we can gain here.




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 17:53                                                   ` Eli Zaretskii
  2021-06-11 18:02                                                     ` Daniel Colascione
@ 2021-06-11 18:34                                                     ` Alan Mackenzie
  1 sibling, 0 replies; 274+ messages in thread
From: Alan Mackenzie @ 2021-06-11 18:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rudalics, dancol, emacs-devel, monnier, rms

Hello, Eli.

On Fri, Jun 11, 2021 at 20:53:10 +0300, Eli Zaretskii wrote:
> > Date: Fri, 11 Jun 2021 16:11:19 +0000
> > Cc: Daniel Colascione <dancol@dancol.org>, rudalics@gmx.at,
> >   monnier@iro.umontreal.ca, rms@gnu.org, emacs-devel@gnu.org
> > From: Alan Mackenzie <acm@muc.de>

> > I have measured CC Mode's scrolling performance using:

> > (defmacro time-it (&rest forms)
> >   "Time the running of a sequence of forms using `float-time'.
> > Call like this: \"M-: (time-it (foo ...) (bar ...) ...)\"."
> >   `(let ((start (float-time)))
> >     ,@forms
> >     (- (float-time) start)))

> > together with

> > M-: (time-it (scroll-up-window) (sit-for 0))

> > on regions of text which are not yet fontified.  My window has 65 lines
> > of buffer text.  Starting at the middle of xdisp.c, I see the following
> > timings for the first few scrolls:

> >    0.026s, 0.025s, 0.026s, 0.078s, 0.026s, 0.027s.

> > That is, with the exception of the fourth timing, the scroll operation
> > takes a little over 1/40 second.

> > This is in an Emacs-28 compiled with default optimisation, on a 4
> > year-old first generation Ryzen machine.

> > For me personally, this scrolling speed, in conjunction with
> > fast-but-imprecise-scrolling, is acceptable.  I also accept there are
> > people with slower machines.

> I suggest to compare these times with Emacs 23 to see how we
> regressed.

OK, on emacs-23.3 -Q, otherwise exactly the same circumstances,  I get
these timings:

    0.0093s, 0.0089s, 0.0084s, 0.0144s, 0.0094s, 0.0084s.

So the difference is around a factor of 3, perhaps a little more.  "Half
an order of magnitude" perhaps sums it up best.

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 18:02                                                     ` Daniel Colascione
  2021-06-11 18:22                                                       ` Eli Zaretskii
@ 2021-06-11 18:42                                                       ` Stefan Monnier
  2021-06-11 19:31                                                         ` Eli Zaretskii
  2021-06-11 19:48                                                         ` Eli Zaretskii
  1 sibling, 2 replies; 274+ messages in thread
From: Stefan Monnier @ 2021-06-11 18:42 UTC (permalink / raw)
  To: Daniel Colascione
  Cc: Eli Zaretskii, Alan Mackenzie, rudalics, rms, emacs-devel

>>> M-: (time-it (scroll-up-window) (sit-for 0))
>>>
>>> on regions of text which are not yet fontified.  My window has 65 lines
>>> of buffer text.  Starting at the middle of xdisp.c, I see the following
>>> timings for the first few scrolls:
>>>
>>>     0.026s, 0.025s, 0.026s, 0.078s, 0.026s, 0.027s.
>>>
>>> That is, with the exception of the fourth timing, the scroll operation
>>> takes a little over 1/40 second.

FWIW, see below my measurements using Emacs's `master` with the compile
options I happen to use (i.e. extra checks and -Og) on my almost still
new Librem mini (which was running at ~4GHz during that time, so I'd
expect it to be about twice as fast as what I'd see with most of my
other machines).

I used pretty much your above test, except I started it at BOB of
xdisp.c and used:

    M_: (dotimes (_ 700)
          (message "%S" (benchmark-elapse (scroll-up) (sit-for 0)))
          (sleep-for 0.05))

As you can see, the speed doesn't get noticeably worse as we go further
into the file (the first few screenfuls were a bit faster, but after
that it's a wash).

Eli, do you see similar results?
Would you say that this shows the slow behaviors that bother you?

E.g. there used to be a time where I found CC-mode unusably slow in some
cases, but these were typically while editing rather than while
scrolling (i.e. even simple buffer modifications incurred delays
measured in seconds).

FWIW, I ran this same test with `sm-c-mode` (which should handle `xdisp.c`
about as well as CC-mode, but solves an easier problem since it doesn't
try to handle as much of C as CC-mode does (e.g. no support for K&R, no
highlighting of types), nor does it try to handle C++, Java, ...), and
most of the times for it are between 0.02 and 0.04.


        Stefan


0.064421217
0.053839483
0.044885893
0.043997597
0.092963631
0.042693633
0.099381721
0.128002545
0.094545212
0.163745893
0.105989807
0.055903411
0.10900201
0.111409377
0.194541712
0.153313625
0.109133194
0.245912877
0.283898722
0.322253988
0.193199986
0.118792661
0.199273251
0.130073771
0.196526213
0.115176187
0.172246163
0.126954015
0.167790303
0.109683224
0.127960426
0.171211834
0.114826649
0.114200165
0.189255059
0.116494687
0.130894337
0.096504964
0.159958961
0.109482149
0.114501549
0.108971213
0.177497445
0.108309423
0.105783162
0.168977695
0.100549815
0.103706478
0.092696073
0.10432707
0.173879201
0.175646549
0.111500935
0.118432586
0.12011896
0.172456591
0.177120433
0.104136632
0.099530361
0.127795086
0.176916521
0.138550313
0.114789114
0.17856252
0.118454075
0.117133346
0.101272965
0.099138115
0.116056804
0.126078027
0.163253319
0.127315341
0.185968183
0.124788531
0.182340263
0.130805359
0.121452585
0.125351387
0.139440851
0.107537455
0.186858364
0.120384479
0.133356353
0.131290683
0.168943064
0.136992182
0.12346563
0.112871744
0.122782849
0.106021947
0.262331903
0.118295096
0.185145874
0.118002528
0.24931801
0.104444512
0.120716476
0.167382408
0.11458813
0.125722018
0.098804093
0.179202455
0.12640851
0.174556734
0.11220414
0.109073496
0.111259698
0.11418513
0.108025927
0.122940442
0.191836234
0.113417345
0.120711433
0.172513071
0.114420954
0.106913074
0.120929181
0.112327071
0.115024723
0.112698933
0.117357841
0.171556781
0.108914295
0.122564707
0.171817164
0.124725584
0.116682097
0.110775186
0.189281251
0.123835457
0.116927855
0.122824897
0.195963401
0.127717141
0.142261624
0.209577271
0.13124328
0.105018838
0.140045227
0.117403158
0.170725313
0.11485384
0.09650258
0.110668479
0.117975569
0.113766316
0.112954986
0.174354914
0.112653174
0.127833658
0.180525967
0.108714222
0.114321764
0.181837745
0.105400609
0.116630508
0.118542553
0.110567673
0.110128366
0.118041019
0.118595549
0.115326382
0.109436946
0.115400399
0.111347021
0.18042566
0.118994131
0.115646883
0.104489214
0.130443576
0.115561413
0.11330043
0.170368134
0.119400101
0.110526952
0.114555681
0.112566447
0.115888525
0.113044462
0.188394244
0.119813541
0.126508385
0.108936934
0.188695379
0.128201612
0.097573221
0.204059025
0.122495487
0.116058655
0.201060241
0.212312514
0.190197327
0.149310586
0.260126393
0.115149485
0.125418184
0.18942006
0.118149107
0.117835293
0.172673811
0.119002821
0.126055033
0.187659521
0.185205005
0.182849187
0.116862462
0.113111974
0.12542143
0.175057205
0.126864337
0.176218651
0.105942454
0.191953882
0.109533068
0.126686414
0.12604197
0.110572607
0.169785167
0.192896603
0.124740572
0.105335305
0.181733158
0.126450975
0.193657901
0.109094171
0.121084347
0.119585141
0.18067882
0.124754366
0.121194971
0.113421604
0.199118707
0.120581752
0.123201428
0.112947635
0.199405214
0.118820273
0.194467066
0.139457159
0.122085324
0.207810103
0.127238785
0.142071442
0.135402281
0.185030134
0.117510442
0.130970326
0.203497039
0.112685073
0.123192423
0.114474405
0.117449097
0.119929349
0.178402479
0.044960255
0.184778022
0.13186773
0.113406861
0.121064466
0.121199285
0.17902352
0.126698071
0.121117545
0.120106224
0.105877834
0.122465264
0.119232435
0.122804551
0.181922471
0.108515085
0.137086941
0.183930017
0.115787167
0.11794999
0.121208862
0.1163856
0.112712585
0.125896637
0.116050806
0.122970697
0.209042021
0.114536011
0.12732074
0.11918999
0.126965367
0.114274393
0.110505228
0.124278297
0.126557099
0.139104688
0.187700593
0.148332242
0.122385495
0.119986772
0.13254469
0.11980965
0.120371393
0.118032327
0.125577788
0.116801037
0.134561984
0.123288516
0.203589458
0.133222843
0.120893941
0.115931088
0.055410411
0.189834458
0.122465816
0.113808715
0.125036054
0.130117908
0.118056582
0.122033541
0.116559544
0.125301083
0.116939394
0.111072544
0.058279055
0.18299224
0.109533422
0.127404332
0.049588377
0.126764598
0.123352779
0.178826006
0.142064223
0.123598934
0.135688938
0.116330035
0.132189803
0.11364705
0.123380271
0.122618636
0.121231604
0.124962892
0.127782382
0.115393903
0.127666529
0.128069211
0.127825324
0.202362599
0.148214603
0.047154273
0.207692422
0.140212978
0.131902642
0.126609117
0.131795201
0.146280119
0.132221744
0.15205663
0.149419624
0.12882288
0.143127792
0.127806696
0.108093882
0.127447566
0.125514061
0.151355249
0.142197844
0.128111287
0.126984641
0.111681458
0.059230937
0.18387953
0.131424016
0.127260813
0.123185942
0.12301305
0.198837465
0.201908502
0.118353592
0.116802308
0.220584087
0.122908136
0.143131345
0.195682054
0.056502461
0.120201008
0.127099372
0.111206286
0.120740443
0.139805891
0.130569691
0.121373414
0.128916776
0.116291152
0.129381268
0.12844324
0.132286855
0.127196939
0.119538936
0.119440131
0.055392152
0.133164942
0.123128391
0.119713239
0.122640955
0.140901944
0.232109835
0.05099387
0.157270089
0.120189717
0.149400334
0.148006771
0.135561395
0.114432766
0.124214831
0.127588578
0.133017473
0.127939599
0.129795445
0.124383374
0.130871816
0.059958365
0.13643085
0.116147862
0.126365698
0.126586554
0.073479631
0.113455843
0.138131483
0.122755288
0.130340758
0.123803689
0.133688313
0.133731208
0.058815443
0.129348146
0.21755063
0.042720055
0.205209919
0.053552988
0.135913777
0.129090985
0.126755467
0.054468177
0.143547192
0.127561568
0.125057471
0.135675527
0.153656282
0.135994205
0.124501244
0.126975785
0.212048008
0.052969663
0.208492798
0.136217596
0.128715781
0.130466549
0.133027079
0.142198028
0.139497757
0.107914384
0.157040902
0.132118866
0.059025412
0.231755541
0.072431096
0.237608949
0.132376274
0.128392519
0.048447206
0.143444946
0.216714502
0.113765814
0.138750278
0.054166292
0.125345491
0.123151466
0.130740236
0.130040994
0.132336243
0.131015359
0.120283476
0.0819551
0.12897466
0.119582391
0.099782304
0.146355992
0.071737352
0.12601306
0.119875996
0.149615954
0.164011924
0.113920676
0.148488657
0.204961462
0.046083269
0.132240597
0.120290724
0.133797299
0.118957392
0.050409076
0.132897864
0.130272527
0.123380237
0.124661648
0.117979764
0.051353461
0.137747574
0.135186988
0.144030045
0.139358571
0.143437272
0.150285771
0.123336112
0.13059152
0.134459129
0.057900639
0.128294124
0.130709936
0.142835662
0.124570338
0.157248237
0.05272827
0.159828425
0.131068281
0.129289379
0.159432997
0.050178936
0.137730101
0.139991591
0.117379933
0.152119491
0.130505849
0.131793548
0.133941495
0.132393249
0.064910953
0.135175145
0.137056085
0.128050285
0.124205486
0.054973603
0.137880713
0.13144247
0.130749717
0.132145462
0.129195918
0.133067051
0.050073898
0.137713601
0.127667884
0.066216357
0.218338173
0.137968321
0.063958844
0.134524174
0.129759992
0.132279705
0.05465503
0.136361192
0.137618427
0.135622113
0.135047758
0.076815814
0.134360997
0.131905126
0.135704933
0.132362084
0.058676068
0.131205555
0.138812341
0.129555152
0.137330679
0.054380999
0.240273459
0.062407346
0.150339943
0.141908294
0.07518077
0.12682164
0.152640432
0.147469145
0.054359045
0.157950419
0.246097952
0.061191826
0.228765419
0.145952982
0.053982302
0.137021757
0.138045217
0.123689763
0.069250831
0.125857663
0.117612075
0.138906825
0.064224531
0.14983241
0.142864541
0.141848561
0.12703107
0.134363931
0.129297234
0.143293068
0.05942274
0.151750642
0.129740556
0.141618794
0.157558284
0.15051779
0.130591822
0.147420673
0.129570717
0.066815203
0.127132384
0.129291855
0.237577516
0.150971169
0.133464307
0.136805642
0.137268469
0.138594698
0.058999963
0.151746548
0.148502547
0.126773309
0.079724232
0.134307193
0.164472372
0.159711969
0.148037259
0.14977967
0.16937488
0.048476567
0.154464051
0.041739919
let: End of buffer




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 18:22                                                       ` Eli Zaretskii
  2021-06-11 18:28                                                         ` Daniel Colascione
@ 2021-06-11 18:47                                                         ` Alan Mackenzie
  2021-06-11 19:32                                                           ` Eli Zaretskii
  1 sibling, 1 reply; 274+ messages in thread
From: Alan Mackenzie @ 2021-06-11 18:47 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rudalics, Daniel Colascione, emacs-devel, monnier, rms

Hello, Eli.

On Fri, Jun 11, 2021 at 21:22:56 +0300, Eli Zaretskii wrote:
> > Cc: rudalics@gmx.at, monnier@iro.umontreal.ca, rms@gnu.org,
> >  emacs-devel@gnu.org
> > From: Daniel Colascione <dancol@dancol.org>
> > Date: Fri, 11 Jun 2021 11:02:34 -0700

> > >>     0.026s, 0.025s, 0.026s, 0.078s, 0.026s, 0.027s.

> > >> That is, with the exception of the fourth timing, the scroll operation
> > >> takes a little over 1/40 second.

> > >> This is in an Emacs-28 compiled with default optimisation, on a 4
> > >> year-old first generation Ryzen machine.

> > >> For me personally, this scrolling speed, in conjunction with
> > >> fast-but-imprecise-scrolling, is acceptable.  I also accept there are
> > >> people with slower machines.
> > > I suggest to compare these times with Emacs 23 to see how we
> > > regressed.

> > Regression is acceptable in exchange for correctness so long as absolute 
> > performance is adequate. We're not using 80486s anymore.

> Here are my times using an optimized build of Emacs 27.2 on a 3.4GHz
> Core i7 box:

How many buffer lines were in your window?

>   0.015625
>   0.03125
>   0.015625
>   0.046875
>   0.09375
>   0.0625
>   0.015625
>   0.03125
>   0.015625
>   0.03125
>   0.015625
>   0.03125

> You consider this to be adequate performance for a single
> window-scroll?  (I don't have an optimized build of Emacs 28, but
> there's no reason to believe it is faster; quite the opposite.)

What does adequate mean?  With those timings, the font-locking would keep
up with an auto-repeated C-v at around 30 repetitions per second.

[ .... ]

> We can stick our heads in the sand as much as we want, but facts are
> stubborn things.

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 18:28                                                         ` Daniel Colascione
@ 2021-06-11 19:12                                                           ` Alan Mackenzie
  2021-06-11 19:23                                                           ` Eli Zaretskii
  1 sibling, 0 replies; 274+ messages in thread
From: Alan Mackenzie @ 2021-06-11 19:12 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: rudalics, Eli Zaretskii, emacs-devel, monnier, rms

Hello, Daniel.

On Fri, Jun 11, 2021 at 11:28:18 -0700, Daniel Colascione wrote:

[ .... ]

> native-comp?

Native compilation speeds up CC Mode only marginally.  On basically the
same benchmark, it was 13% faster with N.C.

> Hrm. That doesn't seem consistent with Alan's report that we spend a ton 
> of time doing work like deciding whether a brace occurs at top-level. My 
> question stands: what core facilities can we add to accelerate cc-mode's 
> parsing here? There's got to be some efficiency we can gain here.

My gut feeling, not really backed up by much, is that only something like
LSP is really going to help.  There's nothing particularly inefficient in
CC Mode's fontification, it just does a very thorough job.

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 18:28                                                         ` Daniel Colascione
  2021-06-11 19:12                                                           ` Alan Mackenzie
@ 2021-06-11 19:23                                                           ` Eli Zaretskii
  1 sibling, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-11 19:23 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: acm, emacs-devel, monnier, rms, rudalics

> Cc: acm@muc.de, rudalics@gmx.at, monnier@iro.umontreal.ca, rms@gnu.org,
>  emacs-devel@gnu.org
> From: Daniel Colascione <dancol@dancol.org>
> Date: Fri, 11 Jun 2021 11:28:18 -0700
> 
> >    0.015625
> >    0.03125
> >    0.015625
> >    0.046875
> >    0.09375
> >    0.0625
> >    0.015625
> >    0.03125
> >    0.015625
> >    0.03125
> >    0.015625
> >    0.03125
> >
> > You consider this to be adequate performance for a single
> > window-scroll?  (I don't have an optimized build of Emacs 28, but
> > there's no reason to believe it is faster; quite the opposite.)
> 
> native-comp?

No (it's Emacs 27).  But Alan already timed the native and non-native
versions in Emacs 28, and didn't find any tangible difference.  In
fact, the byte-compiled code was slightly faster.

> > 	- font-lock-fontify-region                                  157  65%
> > 	 - c-font-lock-fontify-region                               157  65%
> > 	  - font-lock-default-fontify-region                        146  60%
> > 	   - font-lock-fontify-keywords-region                      143  59%
> > 	    - c-font-lock-declarations                               97  40%
> > 	     - c-find-decl-spots                                     97  40%
> > 	      - #<compiled -0x1ffffffff94b65d0>                      73  30%
> > 	       - c-forward-decl-or-cast-1                            38  15%
> > 		- c-forward-type                                     22   9%
> > 		 - c-check-qualified-type                             7   2%
> >
> > We can stick our heads in the sand as much as we want, but facts are
> > stubborn things.
> 
> Hrm. That doesn't seem consistent with Alan's report that we spend a ton 
> of time doing work like deciding whether a brace occurs at top-level.

Maybe we should produce profiles on different systems and compare
them, so that we are sure the data is solid and repeatable?

Or maybe Alan was talking about Emacs 28, where something has changed
considerably?

> My question stands: what core facilities can we add to accelerate
> cc-mode's parsing here? There's got to be some efficiency we can
> gain here.

I don't think we have an answer to that.  Alan, do you have some
suggestions?

If we don't have anything we already figured out, I guess the answer
should be found by studying the code of the hot spots and looking for
optimization opportunities or algorithmic changes.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 18:42                                                       ` Stefan Monnier
@ 2021-06-11 19:31                                                         ` Eli Zaretskii
  2021-06-11 19:57                                                           ` Stefan Monnier
  2021-06-11 20:06                                                           ` Alan Mackenzie
  2021-06-11 19:48                                                         ` Eli Zaretskii
  1 sibling, 2 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-11 19:31 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: acm, dancol, emacs-devel, rms, rudalics

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Eli Zaretskii <eliz@gnu.org>,  Alan Mackenzie <acm@muc.de>,
>   rudalics@gmx.at,  rms@gnu.org,  emacs-devel@gnu.org
> Date: Fri, 11 Jun 2021 14:42:31 -0400
> 
> I used pretty much your above test, except I started it at BOB of
> xdisp.c and used:
> 
>     M_: (dotimes (_ 700)
>           (message "%S" (benchmark-elapse (scroll-up) (sit-for 0)))
>           (sleep-for 0.05))
> 
> As you can see, the speed doesn't get noticeably worse as we go further
> into the file (the first few screenfuls were a bit faster, but after
> that it's a wash).

Can you produce a profile for that?

> Eli, do you see similar results?

Will Emacs 27.2 do?  If you must see results from an optimized build
of Emacs 28, I'll have to build one first.

> Would you say that this shows the slow behaviors that bother you?

Of course.  100 msec for a single window-scroll is awfully slow.
Especially since the display code itself takes only a fraction of
that time.

> E.g. there used to be a time where I found CC-mode unusably slow in some
> cases, but these were typically while editing rather than while
> scrolling (i.e. even simple buffer modifications incurred delays
> measured in seconds).

Yes, there are other use cases, but even this simple benchmark already
shows that we have a serious problem, IMO.  Compare this with Emacs 23
or with Emacs 28 in Fundamental mode.

> FWIW, I ran this same test with `sm-c-mode` (which should handle `xdisp.c`
> about as well as CC-mode, but solves an easier problem since it doesn't
> try to handle as much of C as CC-mode does (e.g. no support for K&R, no
> highlighting of types), nor does it try to handle C++, Java, ...), and
> most of the times for it are between 0.02 and 0.04.

That is much better, but still too slow, IMO.  Think: it's the time
that it takes us to fontify a single windowful, only a couple of
dozens of lines.  Why does it take so long?



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 18:47                                                         ` Alan Mackenzie
@ 2021-06-11 19:32                                                           ` Eli Zaretskii
  2021-06-11 19:46                                                             ` Alan Mackenzie
  0 siblings, 1 reply; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-11 19:32 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: rudalics, dancol, emacs-devel, monnier, rms

> Date: Fri, 11 Jun 2021 18:47:37 +0000
> Cc: Daniel Colascione <dancol@dancol.org>, rudalics@gmx.at,
>   monnier@iro.umontreal.ca, rms@gnu.org, emacs-devel@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> How many buffer lines were in your window?

34.  It was in "emacs -Q".



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 19:32                                                           ` Eli Zaretskii
@ 2021-06-11 19:46                                                             ` Alan Mackenzie
  2021-06-11 19:50                                                               ` Eli Zaretskii
  0 siblings, 1 reply; 274+ messages in thread
From: Alan Mackenzie @ 2021-06-11 19:46 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rudalics, dancol, emacs-devel, monnier, rms

Hello, Eli.

On Fri, Jun 11, 2021 at 22:32:39 +0300, Eli Zaretskii wrote:
> > Date: Fri, 11 Jun 2021 18:47:37 +0000
> > Cc: Daniel Colascione <dancol@dancol.org>, rudalics@gmx.at,
> >   monnier@iro.umontreal.ca, rms@gnu.org, emacs-devel@gnu.org
> > From: Alan Mackenzie <acm@muc.de>

> > How many buffer lines were in your window?

> 34.  It was in "emacs -Q".

Thanks.  I didn't know emacs -Q on a GUI always gave the same window
height.  On my tty, I get 65 lines.

So, given your windows are about half the height of mine, our timings
were broadly comparable.

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 18:42                                                       ` Stefan Monnier
  2021-06-11 19:31                                                         ` Eli Zaretskii
@ 2021-06-11 19:48                                                         ` Eli Zaretskii
  1 sibling, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-11 19:48 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: acm, dancol, emacs-devel, rms, rudalics

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Eli Zaretskii <eliz@gnu.org>,  Alan Mackenzie <acm@muc.de>,
>  rudalics@gmx.at,  rms@gnu.org,  emacs-devel@gnu.org
> Date: Fri, 11 Jun 2021 14:42:31 -0400
> 
> Eli, do you see similar results?

With Emacs 27.2 built with -O2, I see the same times as for Alan's
benchmark: between 30 and 50 msec per scroll.  With Emacs 28 built
with -O0, I see times that are roughly double of what you show:
average of 200 msec per scroll.  The factor of 2 wrt a -Og build is
expected, so I think our results are consistent.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 19:46                                                             ` Alan Mackenzie
@ 2021-06-11 19:50                                                               ` Eli Zaretskii
  0 siblings, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-11 19:50 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: rudalics, dancol, emacs-devel, monnier, rms

> Date: Fri, 11 Jun 2021 19:46:10 +0000
> Cc: dancol@dancol.org, rudalics@gmx.at, monnier@iro.umontreal.ca, rms@gnu.org,
>   emacs-devel@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> > > How many buffer lines were in your window?
> 
> > 34.  It was in "emacs -Q".
> 
> Thanks.  I didn't know emacs -Q on a GUI always gave the same window
> height.  On my tty, I get 65 lines.
> 
> So, given your windows are about half the height of mine, our timings
> were broadly comparable.

??? Your window was twice as high, but your times are 30% shorter.
How do you conclude that the times are comparable?



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 19:31                                                         ` Eli Zaretskii
@ 2021-06-11 19:57                                                           ` Stefan Monnier
  2021-06-11 23:25                                                             ` Ergus
  2021-06-12  6:38                                                             ` Eli Zaretskii
  2021-06-11 20:06                                                           ` Alan Mackenzie
  1 sibling, 2 replies; 274+ messages in thread
From: Stefan Monnier @ 2021-06-11 19:57 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dancol, acm, rudalics, rms, emacs-devel

> Will Emacs 27.2 do?  If you must see results from an optimized build
> of Emacs 28, I'll have to build one first.

As mentioned, mine was not an optimized build, on the contrary.

>> FWIW, I ran this same test with `sm-c-mode` (which should handle `xdisp.c`
>> about as well as CC-mode, but solves an easier problem since it doesn't
>> try to handle as much of C as CC-mode does (e.g. no support for K&R, no
>> highlighting of types), nor does it try to handle C++, Java, ...), and
>> most of the times for it are between 0.02 and 0.04.
>
> That is much better, but still too slow, IMO.  Think: it's the time
> that it takes us to fontify a single windowful, only a couple of
> dozens of lines.  Why does it take so long?

For comparison, here it is for lisp/subr.el: it seems actually slightly
slower than what I for with xdisp.c when using sm-c-mode.


        Stefan


0.075539393
0.030856317
0.040824289
0.029961978
0.012222597
0.020277377
0.08354889
0.027791121
0.040834603
0.029304419
0.040401518
0.042230931
0.041249748
0.041759172
0.022928028
0.088205301
0.01791448
0.039915906
0.041701691
0.036885006
0.037948645
0.039082061
0.03723824
0.090796121
0.021685216
0.040341389
0.041098352
0.012256459
0.038770756
0.047087185
0.036423884
0.04461722
0.082821279
0.02936458
0.038498799
0.029450549
0.039748644
0.037981817
0.041704413
0.03614839
0.040829019
0.03792122
0.088297379
0.031529965
0.038930449
0.035313203
0.040872462
0.040254486
0.043807937
0.037344524
0.041937701
0.086986891
0.02128011
0.038679053
0.037372497
0.042372958
0.045191831
0.026552158
0.038718167
0.040198771
0.086453442
0.020748667
0.036524354
0.038769191
0.036234863
0.0399449
0.040732675
0.039041865
0.037608296
0.078606241
0.022010691
0.03774944
0.028604627
0.040171841
0.039866605
0.035715879
0.041613829
0.035701447
0.037601563
0.085249827
0.018101252
0.041692999
0.033016519
0.037679106
0.039894138
0.036513263
0.04271547
0.038203434
0.089595139
0.022256597
0.040981642
0.037780354
0.036986214
0.033088927
0.03626288
0.037085366
0.023201762
0.088236647
0.019961394
0.033811429
0.040647559
0.034390619
0.039764122
0.022225068
0.026550511
0.037590894
0.085617853
0.022528
0.040158828
0.039360065
0.015918947
let: End of buffer




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 19:31                                                         ` Eli Zaretskii
  2021-06-11 19:57                                                           ` Stefan Monnier
@ 2021-06-11 20:06                                                           ` Alan Mackenzie
  2021-06-12  6:44                                                             ` Eli Zaretskii
  1 sibling, 1 reply; 274+ messages in thread
From: Alan Mackenzie @ 2021-06-11 20:06 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rudalics, dancol, emacs-devel, Stefan Monnier, rms

Hello, Eli.

On Fri, Jun 11, 2021 at 22:31:45 +0300, Eli Zaretskii wrote:
> > From: Stefan Monnier <monnier@iro.umontreal.ca>
> > Cc: Eli Zaretskii <eliz@gnu.org>,  Alan Mackenzie <acm@muc.de>,
> >   rudalics@gmx.at,  rms@gnu.org,  emacs-devel@gnu.org
> > Date: Fri, 11 Jun 2021 14:42:31 -0400

[ .... ]

> > Would you say that this shows the slow behaviors that bother you?

> Of course.  100 msec for a single window-scroll is awfully slow.
> Especially since the display code itself takes only a fraction of
> that time.

> > E.g. there used to be a time where I found CC-mode unusably slow in
> > some cases, but these were typically while editing rather than while
> > scrolling (i.e. even simple buffer modifications incurred delays
> > measured in seconds).

> Yes, there are other use cases, but even this simple benchmark already
> shows that we have a serious problem, IMO.  Compare this with Emacs 23
> or with Emacs 28 in Fundamental mode.

Why do we have a problem?  If the time taken to fontify a window is less
than the auto-repeat time (the two times are close on a modern machine),
this is surely not a problem for somebody with such a machine.  It could
be a problem for somebody with a slower machine, or running an
unoptimised Emacs.

> > FWIW, I ran this same test with `sm-c-mode` (which should handle `xdisp.c`
> > about as well as CC-mode, but solves an easier problem since it doesn't
> > try to handle as much of C as CC-mode does (e.g. no support for K&R, no
> > highlighting of types), nor does it try to handle C++, Java, ...), and
> > most of the times for it are between 0.02 and 0.04.

> That is much better, but still too slow, IMO.  Think: it's the time
> that it takes us to fontify a single windowful, only a couple of
> dozens of lines.  Why does it take so long?

It does a very thorough job.  For example, one bug fix from many years
ago that I remember involved the fontification of foo in the following:

        ....
        int bar;
    } foo;

What face should foo have?  To answer that, you've got to go back over
the brace expression to see what's there.  If it's

    struct foo
    {
        int baz;
        ....

, we need font-lock-variable-name-face for foo.  On the other hand, if we
have

    typedef struct foo
    {
        int baz;
        ....

, we need font-lock-type-face.  Before the bug fix, foo just got variable
name face.  scan-lists backward over the brace expression takes time,
particularly for something the size of struct frame or even bigger.

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10 21:02                                         ` Stefan Monnier
@ 2021-06-11 20:21                                           ` Ergus
  2021-06-11 20:27                                             ` Stefan Monnier
  0 siblings, 1 reply; 274+ messages in thread
From: Ergus @ 2021-06-11 20:21 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: Alan Mackenzie, Eli Zaretskii, rudalics, dancol, rms, emacs-devel

On Thu, Jun 10, 2021 at 05:02:09PM -0400, Stefan Monnier wrote:
>> 1) What is finally the most desirable/long path/future feature?
>> I mean, finally what is preferred by the developers to support in the future?
>>
>> lsp or tree-sitter?
>      ^^
>     and
>
>
>-- Stefan
>
For what I know about tree-sitter it does not provide the parsers with
the library. Usually they require to be distributed with the programs
with one parser/language. They are veen in different github
repositories.

In Rust application (like helix editor) that's not an issue because
cargo handles that. But for emacs I don't know how can be solved the
technical and legal issues with the dependencies.

Are we going to add their source code to emacs? In systems like mine,
tree-sitter is a package, but the parsers are not.





^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 20:21                                           ` Ergus
@ 2021-06-11 20:27                                             ` Stefan Monnier
  2021-06-11 20:37                                               ` Daniel Colascione
  0 siblings, 1 reply; 274+ messages in thread
From: Stefan Monnier @ 2021-06-11 20:27 UTC (permalink / raw)
  To: Ergus; +Cc: Alan Mackenzie, Eli Zaretskii, rudalics, dancol, rms, emacs-devel

> For what I know about tree-sitter it does not provide the parsers with
> the library.

Of course, not, how could it?  There's a never-ending stream of
programming languages out there.

I don't see why you think that's a problem,


        Stefan




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 20:27                                             ` Stefan Monnier
@ 2021-06-11 20:37                                               ` Daniel Colascione
  2021-06-11 20:52                                                 ` Stefan Monnier
  0 siblings, 1 reply; 274+ messages in thread
From: Daniel Colascione @ 2021-06-11 20:37 UTC (permalink / raw)
  To: Stefan Monnier, Ergus
  Cc: Alan Mackenzie, Eli Zaretskii, emacs-devel, rms, rudalics

On 6/11/21 1:27 PM, Stefan Monnier wrote:

>> For what I know about tree-sitter it does not provide the parsers with
>> the library.
> Of course, not, how could it?  There's a never-ending stream of
> programming languages out there.
>
> I don't see why you think that's a problem,

It's not just licensing.

Another problem with stock tree-sitter is that it makes Emacs less 
self-hosting. Tree-sitter grammars are written in JavaScript. You don't 
need JavaScript to use a grammar, but you do need JavaScript to 
customize a grammar. In addition, Tree-sitter compiles these JavaScript 
grammars to C. To use a customized grammar, an Emacs user would have to 
run node.js (or equally capable JS environment), generate C code, 
compile that C code, and load it into Emacs as a module. That's a big 
departure from the traditional approach to Emacs customization.

These technical choices on the part of the Tree-sitter people are 
unfortunate. I'd prefer an elisp reimplementation of the Tree-sitter 
algorithms, but I doubt we're going to get that any time soon.

Maybe Tree-sitter could be changed to generate an elisp parser and 
compile parsers in a lightweight JS environment like Duktape.

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 20:37                                               ` Daniel Colascione
@ 2021-06-11 20:52                                                 ` Stefan Monnier
  2021-06-12  6:46                                                   ` Eli Zaretskii
  2021-06-12  8:47                                                   ` Daniele Nicolodi
  0 siblings, 2 replies; 274+ messages in thread
From: Stefan Monnier @ 2021-06-11 20:52 UTC (permalink / raw)
  To: Daniel Colascione
  Cc: Ergus, Alan Mackenzie, Eli Zaretskii, rudalics, rms, emacs-devel

> Another problem with stock tree-sitter is that it makes Emacs less
> self-hosting. Tree-sitter grammars are written in JavaScript.

Yes, there are some technical disadvantages to tree-sitter, indeed.
None of them make it unusable, but they do make it less convenient for
ELisp hackers and Emacs users.  So it's not a perfect solution, but
I don't think that should mean we don't want it in our toolbox.

        Stefan

PS: I think we can expect 99% of Emacs users have a Javascript engine
already installed (in the form of a web browser), and with native
compilation Emacs now comes with a runtime dependency on (a substantial
chunk of) GCC, so the extra dependencies introduced by tree-sitter seem
quite workable for the Emacs end-user.  The ELisp hackers working on the
major mode who want to tweak the grammar will suffer more, tho.

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 19:57                                                           ` Stefan Monnier
@ 2021-06-11 23:25                                                             ` Ergus
  2021-06-11 23:52                                                               ` Óscar Fuentes
  2021-06-12  5:20                                                               ` Theodor Thornhill
  2021-06-12  6:38                                                             ` Eli Zaretskii
  1 sibling, 2 replies; 274+ messages in thread
From: Ergus @ 2021-06-11 23:25 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, dancol, acm, rudalics, rms, emacs-devel

Going a bit more into this. And reconsidering tree-sitter.

As there is already a tree-sitter module package with some interesting
functionalities. (I know Eli didn't like some details in it's
implementation)

But maybe it is a good time to try to disable the cc-mode font-locking
(I don't actually know if it is possible to do that), and repeat the
scrolling benchmark only with the tree-sitter-mode and
tree-sitter-hl-mode enabled?

Just to see how it compares and how much of that approach is useful?



On Fri, Jun 11, 2021 at 03:57:10PM -0400, Stefan Monnier wrote:
>> Will Emacs 27.2 do?  If you must see results from an optimized build
>> of Emacs 28, I'll have to build one first.
>
>As mentioned, mine was not an optimized build, on the contrary.
>
>>> FWIW, I ran this same test with `sm-c-mode` (which should handle `xdisp.c`
>>> about as well as CC-mode, but solves an easier problem since it doesn't
>>> try to handle as much of C as CC-mode does (e.g. no support for K&R, no
>>> highlighting of types), nor does it try to handle C++, Java, ...), and
>>> most of the times for it are between 0.02 and 0.04.
>>
>> That is much better, but still too slow, IMO.  Think: it's the time
>> that it takes us to fontify a single windowful, only a couple of
>> dozens of lines.  Why does it take so long?
>
>For comparison, here it is for lisp/subr.el: it seems actually slightly
>slower than what I for with xdisp.c when using sm-c-mode.
>
>
>        Stefan
>
>
>0.075539393
>0.030856317
>0.040824289
>0.029961978
>0.012222597
>0.020277377
>0.08354889
>0.027791121
>0.040834603
>0.029304419
>0.040401518
>0.042230931
>0.041249748
>0.041759172
>0.022928028
>0.088205301
>0.01791448
>0.039915906
>0.041701691
>0.036885006
>0.037948645
>0.039082061
>0.03723824
>0.090796121
>0.021685216
>0.040341389
>0.041098352
>0.012256459
>0.038770756
>0.047087185
>0.036423884
>0.04461722
>0.082821279
>0.02936458
>0.038498799
>0.029450549
>0.039748644
>0.037981817
>0.041704413
>0.03614839
>0.040829019
>0.03792122
>0.088297379
>0.031529965
>0.038930449
>0.035313203
>0.040872462
>0.040254486
>0.043807937
>0.037344524
>0.041937701
>0.086986891
>0.02128011
>0.038679053
>0.037372497
>0.042372958
>0.045191831
>0.026552158
>0.038718167
>0.040198771
>0.086453442
>0.020748667
>0.036524354
>0.038769191
>0.036234863
>0.0399449
>0.040732675
>0.039041865
>0.037608296
>0.078606241
>0.022010691
>0.03774944
>0.028604627
>0.040171841
>0.039866605
>0.035715879
>0.041613829
>0.035701447
>0.037601563
>0.085249827
>0.018101252
>0.041692999
>0.033016519
>0.037679106
>0.039894138
>0.036513263
>0.04271547
>0.038203434
>0.089595139
>0.022256597
>0.040981642
>0.037780354
>0.036986214
>0.033088927
>0.03626288
>0.037085366
>0.023201762
>0.088236647
>0.019961394
>0.033811429
>0.040647559
>0.034390619
>0.039764122
>0.022225068
>0.026550511
>0.037590894
>0.085617853
>0.022528
>0.040158828
>0.039360065
>0.015918947
>let: End of buffer
>
>



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 23:25                                                             ` Ergus
@ 2021-06-11 23:52                                                               ` Óscar Fuentes
  2021-06-12  1:08                                                                 ` Ergus
  2021-06-12  6:50                                                                 ` Eli Zaretskii
  2021-06-12  5:20                                                               ` Theodor Thornhill
  1 sibling, 2 replies; 274+ messages in thread
From: Óscar Fuentes @ 2021-06-11 23:52 UTC (permalink / raw)
  To: emacs-devel

Ergus <spacibba@aol.com> writes:

> Going a bit more into this. And reconsidering tree-sitter.
>
> As there is already a tree-sitter module package with some interesting
> functionalities. (I know Eli didn't like some details in it's
> implementation)
>
> But maybe it is a good time to try to disable the cc-mode font-locking
> (I don't actually know if it is possible to do that), and repeat the
> scrolling benchmark only with the tree-sitter-mode and
> tree-sitter-hl-mode enabled?
>
> Just to see how it compares and how much of that approach is useful?

More easily, you can use some of the editors that already use
tree-sitter for fontification of C/C++ and do the PgDn test (which looks
like a rather silly test to me, because who navigates large files by
holding PgDn and why Emacs should support that terrible use case well?)
This would provide a valuable comparison point for little effort.

Although I'm more interested on accuracy, but it seems that the thread
was effectively and hopelessly hijacked :-/




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 23:52                                                               ` Óscar Fuentes
@ 2021-06-12  1:08                                                                 ` Ergus
  2021-06-12  3:20                                                                   ` Stefan Monnier
  2021-06-12  6:58                                                                   ` Eli Zaretskii
  2021-06-12  6:50                                                                 ` Eli Zaretskii
  1 sibling, 2 replies; 274+ messages in thread
From: Ergus @ 2021-06-12  1:08 UTC (permalink / raw)
  To: Óscar Fuentes; +Cc: emacs-devel

On Sat, Jun 12, 2021 at 01:52:12AM +0200, ï¿½scar Fuentes wrote:
>Ergus <spacibba@aol.com> writes:
>
>> Going a bit more into this. And reconsidering tree-sitter.
>>
>> As there is already a tree-sitter module package with some interesting
>> functionalities. (I know Eli didn't like some details in it's
>> implementation)
>>
>> But maybe it is a good time to try to disable the cc-mode font-locking
>> (I don't actually know if it is possible to do that), and repeat the
>> scrolling benchmark only with the tree-sitter-mode and
>> tree-sitter-hl-mode enabled?
>>
>> Just to see how it compares and how much of that approach is useful?
>
>More easily, you can use some of the editors that already use
>tree-sitter for fontification of C/C++ and do the PgDn test 

We have all the lisp machine overhead in the middle. So doing this will
be like comparing apples with pears.

>(which looks
>like a rather silly test to me, because who navigates large files by
>holding PgDn and why Emacs should support that terrible use case well?)

The scrolling test is because during scrolling we call re-display,
font-look and some hooks. So it is the easiest way to measure all the
syntax highlighting in action.

>This would provide a valuable comparison point for little effort.
>

I don't think so. The tree-sitter mode does not require special effort
to install. And comparing emacs vs emacs is more realistic and useful
IMHO (neovim redisplay is ridiculously fast).

But any way just to start: tree-sitter parses all the text in xdisp.c,
(in my machine), in 0.12 seconds from scratch and re-parses it (reusing
the tree) 10 times faster; in 0.008 ~ 0.01 seconds.

If we count that we don't need to re-parse the file (buffer), but only
the modified regions (that is possible to specify with the api). Then
the times are ridiculous small.

In this case the parse is mostly already done, so scrolling won't need
to parse the text to add the highlighting... so maybe we need something
else to measure the impact (maybe something that modifies the text)

BTW: Eli was concerned about the extra copy of the buffer text to send
it to tree-sitter. In this case the time to memcopy an array with all
xdisp text is ~0.00085 seconds.

Any way if we don't want the copy we can use
ts_parser_set_included_ranges to exclude the gap and pass the text
pointer directly without any copy.

>Although I'm more interested on accuracy, but it seems that the thread
>was effectively and hopelessly hijacked :-/
>
>
To improve accuracy we need to improve the parsing OR add more work to
cc-mode. So that's why we are looking for alternatives.

There are already some interesting information to see tree sitter in
action:

https://www.youtube.com/watch?v=ZwibVdNtFjs
https://www.youtube.com/watch?v=oSrXK8ovBfQ

Where you can actually see that the accuracy should also improve (and
probably some navigation commands)

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12  1:08                                                                 ` Ergus
@ 2021-06-12  3:20                                                                   ` Stefan Monnier
  2021-06-12 11:07                                                                     ` Ergus
  2021-06-12  6:58                                                                   ` Eli Zaretskii
  1 sibling, 1 reply; 274+ messages in thread
From: Stefan Monnier @ 2021-06-12  3:20 UTC (permalink / raw)
  To: Ergus; +Cc: Óscar Fuentes, emacs-devel

> But any way just to start: tree-sitter parses all the text in xdisp.c,
> (in my machine), in 0.12 seconds from scratch and re-parses it (reusing
> the tree) 10 times faster; in 0.008 ~ 0.01 seconds.

Do those times include passing the result of the parse to ELisp and
processing it by applying text-properties for highlighting or is it just
the time for tree-sitter to do the actual parse for itself?


        Stefan




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 23:25                                                             ` Ergus
  2021-06-11 23:52                                                               ` Óscar Fuentes
@ 2021-06-12  5:20                                                               ` Theodor Thornhill
  2021-06-12 13:40                                                                 ` Stefan Monnier
  1 sibling, 1 reply; 274+ messages in thread
From: Theodor Thornhill @ 2021-06-12  5:20 UTC (permalink / raw)
  To: Ergus, Stefan Monnier
  Cc: Eli Zaretskii, dancol, acm, rudalics, rms, emacs-devel

Ergus <spacibba@aol.com> writes:

> Going a bit more into this. And reconsidering tree-sitter.
>
> As there is already a tree-sitter module package with some interesting
> functionalities. (I know Eli didn't like some details in it's
> implementation)

This module us used by csharp mode, in its own
`csharp-tree-sitter-mode`, and uses these packages from melpa:

 - tree-sitter-mode
 - tree-sitter-langs
 - tree-sitter-indent

This bug (https://github.com/emacs-csharp/csharp-mode/issues/164) is an
even simpler test of the performance from CC Mode, which alan _is_
addressing right now, but should be interesting given this thread.
csharp-mode grinds to a halt here, but csharp-tree-sitter-mode handles
this perfectly.

As for correctness there is no comparison.  The tree sitter variant
covers things that aren't even possible using CC Mode variant, like
string interpolation, complicated, nested generics, preprocessor
directives and much, much more. 

@Stefan - I'm not sure I understand what you mean by troublesome for
elisp hackers.  These grammars have a lisp-like dsl, and is pretty
usable through C-M-x and defvars, see:
https://github.com/emacs-csharp/csharp-mode/blob/master/csharp-tree-sitter.el#L44.

IME experience it's not the same as normal elisp hacking, but it's good
enough.  That's only an opinion though.

I don't understand why this shouldn't be doable? There could be an nongnu-ELPA
package with defined grammars, and a way to download and compile
parsers.

As a side node, it is easy to design structural editing also, like
`delete-defun` `change-in-string`, `beginning-and-end-of-defun` etc.

Just take a small look at
https://github.com/emacs-csharp/csharp-mode/blob/master/csharp-tree-sitter.el.

These 410 lines covers way more than what CC Mode is atm.  It would be
*great* to move the tree sitter part to emacs.

Just my two cents :)

Theodor Thornhill

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 19:57                                                           ` Stefan Monnier
  2021-06-11 23:25                                                             ` Ergus
@ 2021-06-12  6:38                                                             ` Eli Zaretskii
  2021-06-12 13:44                                                               ` Stefan Monnier
  1 sibling, 1 reply; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-12  6:38 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: acm, dancol, emacs-devel, rms, rudalics

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: dancol@dancol.org,  acm@muc.de,  rudalics@gmx.at,  rms@gnu.org,
>   emacs-devel@gnu.org
> Date: Fri, 11 Jun 2021 15:57:10 -0400
> 
> > Will Emacs 27.2 do?  If you must see results from an optimized build
> > of Emacs 28, I'll have to build one first.
> 
> As mentioned, mine was not an optimized build, on the contrary.

Well, using -Og _is_ optimizing, albeit less than -O2 does.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 20:06                                                           ` Alan Mackenzie
@ 2021-06-12  6:44                                                             ` Eli Zaretskii
  2021-06-12  8:00                                                               ` Daniel Colascione
  0 siblings, 1 reply; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-12  6:44 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: rudalics, dancol, emacs-devel, monnier, rms

> Date: Fri, 11 Jun 2021 20:06:30 +0000
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>, dancol@dancol.org,
>   rudalics@gmx.at, rms@gnu.org, emacs-devel@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> Why do we have a problem?  If the time taken to fontify a window is less
> than the auto-repeat time (the two times are close on a modern machine),
> this is surely not a problem for somebody with such a machine.  It could
> be a problem for somebody with a slower machine, or running an
> unoptimised Emacs.

It is a problem given how much the current fast machines can do during
that time.  At 3 GHz, 30 msec of CPU time is equivalent to 100 million
machine instructions.

> > That is much better, but still too slow, IMO.  Think: it's the time
> > that it takes us to fontify a single windowful, only a couple of
> > dozens of lines.  Why does it take so long?
> 
> It does a very thorough job.


AFAIU, sm-c-mode doesn't.  And it still takes tens of milliseconds.

> For example, one bug fix from many years
> ago that I remember involved the fontification of foo in the following:
> 
>         ....
>         int bar;
>     } foo;
> 
> What face should foo have?  To answer that, you've got to go back over
> the brace expression to see what's there.  If it's
> 
>     struct foo
>     {
>         int baz;
>         ....
> 
> , we need font-lock-variable-name-face for foo.  On the other hand, if we
> have
> 
>     typedef struct foo
>     {
>         int baz;
>         ....
> 
> , we need font-lock-type-face.  Before the bug fix, foo just got variable
> name face.  scan-lists backward over the brace expression takes time,
> particularly for something the size of struct frame or even bigger.

We should either find a way of making this analysis faster, or give up
on fontifying these two cases differently.  It is IMO unacceptable
that redisplay is slowed down so much by mode-specific fontifications.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 20:52                                                 ` Stefan Monnier
@ 2021-06-12  6:46                                                   ` Eli Zaretskii
  2021-06-12  8:03                                                     ` Daniel Colascione
  2021-06-12  8:47                                                   ` Daniele Nicolodi
  1 sibling, 1 reply; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-12  6:46 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: spacibba, rms, emacs-devel, rudalics, acm, dancol

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Ergus <spacibba@aol.com>,  Alan Mackenzie <acm@muc.de>,  Eli Zaretskii
>  <eliz@gnu.org>,  rudalics@gmx.at,  rms@gnu.org,  emacs-devel@gnu.org
> Date: Fri, 11 Jun 2021 16:52:37 -0400
> 
> > Another problem with stock tree-sitter is that it makes Emacs less
> > self-hosting. Tree-sitter grammars are written in JavaScript.
> 
> Yes, there are some technical disadvantages to tree-sitter, indeed.
> None of them make it unusable, but they do make it less convenient for
> ELisp hackers and Emacs users.  So it's not a perfect solution, but
> I don't think that should mean we don't want it in our toolbox.

I agree that these issues shouldn't prevent us from trying to use TS,
at least as an option.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 23:52                                                               ` Óscar Fuentes
  2021-06-12  1:08                                                                 ` Ergus
@ 2021-06-12  6:50                                                                 ` Eli Zaretskii
  1 sibling, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-12  6:50 UTC (permalink / raw)
  To: Óscar Fuentes; +Cc: emacs-devel

> From: Óscar Fuentes <ofv@wanadoo.es>
> Date: Sat, 12 Jun 2021 01:52:12 +0200
> 
> More easily, you can use some of the editors that already use
> tree-sitter for fontification of C/C++ and do the PgDn test

I don't think that would teach us much, due to stark differences in
architectural design.  AFAIK, those other editors don't even implement
buffer text similar enough to how we do that, and that has significant
influence on the efficiency.

> (which looks like a rather silly test to me, because who navigates
> large files by holding PgDn and why Emacs should support that
> terrible use case well?)

We use it because it's easy, and because it measures the time it takes
to fontify a single window, not necessarily because this is what Emacs
users do most of the time.  If you want to suggest other use cases,
feel free, we can add them to the suite.  I'm quite sure the results
will be similar, modulo the redisplay optimizations.

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12  1:08                                                                 ` Ergus
  2021-06-12  3:20                                                                   ` Stefan Monnier
@ 2021-06-12  6:58                                                                   ` Eli Zaretskii
  2021-06-12 11:01                                                                     ` Ergus
  2021-06-12 14:00                                                                     ` Stefan Monnier
  1 sibling, 2 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-12  6:58 UTC (permalink / raw)
  To: Ergus; +Cc: ofv, emacs-devel

> Date: Sat, 12 Jun 2021 03:08:44 +0200
> From: Ergus <spacibba@aol.com>
> Cc: emacs-devel@gnu.org
> 
> BTW: Eli was concerned about the extra copy of the buffer text to send
> it to tree-sitter. In this case the time to memcopy an array with all
> xdisp text is ~0.00085 seconds.

If the intent is to use buffer-(sub)string, then you forget the price
of consing.  That would trigger frequent GC cycles, which will all but
kill the otherwise fast performance.

> Any way if we don't want the copy we can use
> ts_parser_set_included_ranges to exclude the gap and pass the text
> pointer directly without any copy.

I hope someone will try that and report the results.

The other design issue with TS integration is that I'd like it to plug
into the JIT font-lock interface of the display engine, so that we
don't unnecessarily fontify parts of the buffer that won't be
displayed, and always do fontify the parts that will be.  I don't
really care if TS actually processes a much larger chunk of text, if
it does that quickly enough, but processing the resulting faces will
take time on the Emacs side, and that is better avoided.  More
importantly, integration into JIT font-lock machinery means we don't
need to use other hooks, which is a step back, since using such hooks
for fontification was already shown to have serious problems in pre-21
Emacs: they don't always catch all the changes which require
re-fontification.

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12  6:44                                                             ` Eli Zaretskii
@ 2021-06-12  8:00                                                               ` Daniel Colascione
  2021-06-12  8:08                                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 274+ messages in thread
From: Daniel Colascione @ 2021-06-12  8:00 UTC (permalink / raw)
  To: Eli Zaretskii, Alan Mackenzie; +Cc: rudalics, emacs-devel, monnier, rms



On June 11, 2021 11:45:04 PM Eli Zaretskii <eliz@gnu.org> wrote:

>> Date: Fri, 11 Jun 2021 20:06:30 +0000
>> Cc: Stefan Monnier <monnier@iro.umontreal.ca>, dancol@dancol.org,
>> rudalics@gmx.at, rms@gnu.org, emacs-devel@gnu.org
>> From: Alan Mackenzie <acm@muc.de>
>>
>> Why do we have a problem?  If the time taken to fontify a window is less
>> than the auto-repeat time (the two times are close on a modern machine),
>> this is surely not a problem for somebody with such a machine.  It could
>> be a problem for somebody with a slower machine, or running an
>> unoptimised Emacs.
>
> It is a problem given how much the current fast machines can do during
> that time.  At 3 GHz, 30 msec of CPU time is equivalent to 100 million
> machine instructions.


And if you count electrons, the numbers are in the trillions. So what? Who 
cares how many machine instructions it is? What matters is the latency.
>
>
>>> That is much better, but still too slow, IMO.  Think: it's the time
>>> that it takes us to fontify a single windowful, only a couple of
>>> dozens of lines.  Why does it take so long?
>>
>> It does a very thorough job.
>
>
> AFAIU, sm-c-mode doesn't.  And it still takes tens of milliseconds.
>
>> For example, one bug fix from many years
>> ago that I remember involved the fontification of foo in the following:
>>
>> ....
>> int bar;
>> } foo;
>>
>> What face should foo have?  To answer that, you've got to go back over
>> the brace expression to see what's there.  If it's
>>
>> struct foo
>> {
>> int baz;
>> ....
>>
>> , we need font-lock-variable-name-face for foo.  On the other hand, if we
>> have
>>
>> typedef struct foo
>> {
>> int baz;
>> ....
>>
>> , we need font-lock-type-face.  Before the bug fix, foo just got variable
>> name face.  scan-lists backward over the brace expression takes time,
>> particularly for something the size of struct frame or even bigger.
>
> We should either find a way of making this analysis faster, or give up
> on fontifying these two cases differently.  It is IMO unacceptable
> that redisplay is slowed down so much by mode-specific fontifications.

This is a great example of where incorrect fontification diminishes the 
utility of syntax highlighting more generally. If I can't trust the color 
of a symbol to distinguish a variable declaration from a type declaration, 
why bother fontifying as either? Maybe you'd be more interested in a basic 
c-mode that fontified only comments and strings, and that very quickly, but 
I wouldn't want that.






^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12  6:46                                                   ` Eli Zaretskii
@ 2021-06-12  8:03                                                     ` Daniel Colascione
  2021-06-12  8:13                                                       ` Eli Zaretskii
  2021-06-12 13:51                                                       ` Stefan Monnier
  0 siblings, 2 replies; 274+ messages in thread
From: Daniel Colascione @ 2021-06-12  8:03 UTC (permalink / raw)
  To: Eli Zaretskii, Stefan Monnier; +Cc: acm, spacibba, emacs-devel, rms, rudalics



On June 11, 2021 11:46:18 PM Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Stefan Monnier <monnier@iro.umontreal.ca>
>> Cc: Ergus <spacibba@aol.com>,  Alan Mackenzie <acm@muc.de>,  Eli Zaretskii
>> <eliz@gnu.org>,  rudalics@gmx.at,  rms@gnu.org,  emacs-devel@gnu.org
>> Date: Fri, 11 Jun 2021 16:52:37 -0400
>>
>>> Another problem with stock tree-sitter is that it makes Emacs less
>>> self-hosting. Tree-sitter grammars are written in JavaScript.
>>
>> Yes, there are some technical disadvantages to tree-sitter, indeed.
>> None of them make it unusable, but they do make it less convenient for
>> ELisp hackers and Emacs users.  So it's not a perfect solution, but
>> I don't think that should mean we don't want it in our toolbox.
>
> I agree that these issues shouldn't prevent us from trying to use TS,
> at least as an option.


Sure, but it'd be nice to package TS in such a way that it becomes more 
idiomatically lispy, at least if TS becomes the primary fontification 
engine for some modes. At the very least, it should be possible for users 
to apply ad hoc fontification on top of whatever TS supports. And how could 
something like TS work with, say, bison and flex files without fully 
general multi-mode support (which we also lack)?







^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12  8:00                                                               ` Daniel Colascione
@ 2021-06-12  8:08                                                                 ` Eli Zaretskii
  2021-06-12  9:31                                                                   ` Alan Mackenzie
  0 siblings, 1 reply; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-12  8:08 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: acm, emacs-devel, monnier, rms, rudalics

> From: Daniel Colascione <dancol@dancol.org>
> CC: <monnier@iro.umontreal.ca>, <rudalics@gmx.at>, <rms@gnu.org>, <emacs-devel@gnu.org>
> Date: Sat, 12 Jun 2021 01:00:22 -0700
> 
> > It is a problem given how much the current fast machines can do during
> > that time.  At 3 GHz, 30 msec of CPU time is equivalent to 100 million
> > machine instructions.
> 
> And if you count electrons, the numbers are in the trillions. So what? Who 
> cares how many machine instructions it is? What matters is the latency.

I'm saying that, given how much these machines can do in 30 msec, it
doesn't sound right that we cannot refontify 35 lines of text with all
that processing power.  It tells me that our code is either very
inefficient or does a lot of unnecessary processing (or both).

Alan thought that the performance we have is acceptable.  The numbers
I mentioned would hopefully convince him otherwise.

> > We should either find a way of making this analysis faster, or give up
> > on fontifying these two cases differently.  It is IMO unacceptable
> > that redisplay is slowed down so much by mode-specific fontifications.
> 
> This is a great example of where incorrect fontification diminishes the 
> utility of syntax highlighting more generally. If I can't trust the color 
> of a symbol to distinguish a variable declaration from a type declaration, 
> why bother fontifying as either?

I think we are saying the same, just in different words.

Do you agree that slowing down redisplay so much due to fontification
is unacceptable?



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12  8:03                                                     ` Daniel Colascione
@ 2021-06-12  8:13                                                       ` Eli Zaretskii
  2021-06-12 13:51                                                       ` Stefan Monnier
  1 sibling, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-12  8:13 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: spacibba, rms, emacs-devel, rudalics, monnier, acm

> From: Daniel Colascione <dancol@dancol.org>
> CC: <spacibba@aol.com>, <acm@muc.de>, <rudalics@gmx.at>, <rms@gnu.org>, <emacs-devel@gnu.org>
> Date: Sat, 12 Jun 2021 01:03:03 -0700
> 
> >> Yes, there are some technical disadvantages to tree-sitter, indeed.
> >> None of them make it unusable, but they do make it less convenient for
> >> ELisp hackers and Emacs users.  So it's not a perfect solution, but
> >> I don't think that should mean we don't want it in our toolbox.
> >
> > I agree that these issues shouldn't prevent us from trying to use TS,
> > at least as an option.
> 
> Sure, but it'd be nice to package TS in such a way that it becomes more 
> idiomatically lispy, at least if TS becomes the primary fontification 
> engine for some modes. At the very least, it should be possible for users 
> to apply ad hoc fontification on top of whatever TS supports.

I agree.  Do you consider these goals impractical for some reason?  If
not, then (assuming we otherwise like the results of using TS in
Emacs) we could work towards those goals as followup.

> And how could something like TS work with, say, bison and flex files
> without fully general multi-mode support (which we also lack)?

Good question.  Shouldn't limiting TS to the relevant portions of
buffer text provide the solution?  If not, perhaps we should ask the
TS folks what they suggest.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-11 20:52                                                 ` Stefan Monnier
  2021-06-12  6:46                                                   ` Eli Zaretskii
@ 2021-06-12  8:47                                                   ` Daniele Nicolodi
  2021-06-12  8:57                                                     ` tomas
  2021-06-12 14:04                                                     ` Stefan Monnier
  1 sibling, 2 replies; 274+ messages in thread
From: Daniele Nicolodi @ 2021-06-12  8:47 UTC (permalink / raw)
  To: emacs-devel

On 11/06/2021 22:52, Stefan Monnier wrote:
> PS: I think we can expect 99% of Emacs users have a Javascript engine
> already installed (in the form of a web browser),

The JS engine in a web browser and node are two very different beasts.
The main problem with node is that it is very hard to get it to work in
a self contained way that does not involve downloading JS packages from
the network. I also anticipate some resistance in the Emacs community on
depending on the JS ecosystem where licensing is much more "liberal"
than within the GNU project.

Cheers,
Dan

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12  8:47                                                   ` Daniele Nicolodi
@ 2021-06-12  8:57                                                     ` tomas
  2021-06-12 14:04                                                     ` Stefan Monnier
  1 sibling, 0 replies; 274+ messages in thread
From: tomas @ 2021-06-12  8:57 UTC (permalink / raw)
  To: Daniele Nicolodi; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1239 bytes --]

On Sat, Jun 12, 2021 at 10:47:28AM +0200, Daniele Nicolodi wrote:
> On 11/06/2021 22:52, Stefan Monnier wrote:
> > PS: I think we can expect 99% of Emacs users have a Javascript engine
> > already installed (in the form of a web browser),
> 
> The JS engine in a web browser and node are two very different beasts.

I try to keep both of them at a safe distance, FWIW.

> The main problem with node is that it is very hard to get it to work in
> a self contained way that does not involve downloading JS packages from
> the network.

Yep. This is one of the reasons. Watching with horror some npm build
process (gotta do that from time to time to earn my beans) has borne
the clear decision: for me, just... no. Not in my free time, not as
a voluntary project.

>              I also anticipate some resistance in the Emacs community on
> depending on the JS ecosystem where licensing is much more "liberal"
> than within the GNU project.

Those "liberal" licenses could easily be re-licensed to GPL. A possible
advantage is that we get to enjoy the loud whining about how that's
unfair, while the whiners are usually fine with other actors taking the
software proprietary ;-)

(I know, I know)

Cheers
 - t

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12  8:08                                                                 ` Eli Zaretskii
@ 2021-06-12  9:31                                                                   ` Alan Mackenzie
  0 siblings, 0 replies; 274+ messages in thread
From: Alan Mackenzie @ 2021-06-12  9:31 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rudalics, Daniel Colascione, emacs-devel, monnier, rms

Hello, Eli.

On Sat, Jun 12, 2021 at 11:08:30 +0300, Eli Zaretskii wrote:
> > From: Daniel Colascione <dancol@dancol.org>
> > CC: <monnier@iro.umontreal.ca>, <rudalics@gmx.at>, <rms@gnu.org>, <emacs-devel@gnu.org>
> > Date: Sat, 12 Jun 2021 01:00:22 -0700

> > > It is a problem given how much the current fast machines can do during
> > > that time.  At 3 GHz, 30 msec of CPU time is equivalent to 100 million
> > > machine instructions.

> > And if you count electrons, the numbers are in the trillions. So what? Who 
> > cares how many machine instructions it is? What matters is the latency.

> I'm saying that, given how much these machines can do in 30 msec, it
> doesn't sound right that we cannot refontify 35 lines of text with all
> that processing power.  It tells me that our code is either very
> inefficient or does a lot of unnecessary processing (or both).

Or, due to the quirks of the CC Mode languages, it simply needs that
much processing to do an accurate job (or all three).

> Alan thought that the performance we have is acceptable.  The numbers
> I mentioned would hopefully convince him otherwise.

I think the performance is fully acceptable to a normal user on a 3.4
GHz modern machine.  If the processing power is available, why not make
use of it?

> > > We should either find a way of making this analysis faster, or give up
> > > on fontifying these two cases differently.  It is IMO unacceptable
> > > that redisplay is slowed down so much by mode-specific fontifications.

As mentioned already, we have the facility of setting
font-lock-maximum-decoration to 2, which triples fontification speed.
This comes at the cost of accuracy.  A lot of the bug reports I've
fielded over the years have been about fontification inaccuracies.

> > This is a great example of where incorrect fontification diminishes the 
> > utility of syntax highlighting more generally. If I can't trust the color 
> > of a symbol to distinguish a variable declaration from a type declaration, 
> > why bother fontifying as either?

> I think we are saying the same, just in different words.

> Do you agree that slowing down redisplay so much due to fontification
> is unacceptable?

I think I would answer that on a modern machine (certainly from the last
5 years), using a normally optimised Emacs build, the fontification
isn't slowed down.  On a somewhat slower machine, it could become
unacceptable, but that there are accpetable workarounds (setting
font-lock-maximum-decoration, or enabling fast-but-imprecise-scrolling,
or enabling deferred fontification).  On a much slower machine, the
above doesn't hold, no.

I can't agree that we should expect C Mode to fontify with around the
same amount of processing as Emacs Lisp Mode.  This isn't reasonable.

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12  6:58                                                                   ` Eli Zaretskii
@ 2021-06-12 11:01                                                                     ` Ergus
  2021-06-12 11:25                                                                       ` Eli Zaretskii
  2021-06-12 14:00                                                                     ` Stefan Monnier
  1 sibling, 1 reply; 274+ messages in thread
From: Ergus @ 2021-06-12 11:01 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: ofv, emacs-devel

On Sat, Jun 12, 2021 at 09:58:58AM +0300, Eli Zaretskii wrote:
>> Date: Sat, 12 Jun 2021 03:08:44 +0200
>> From: Ergus <spacibba@aol.com>
>> Cc: emacs-devel@gnu.org
>>
>> BTW: Eli was concerned about the extra copy of the buffer text to send
>> it to tree-sitter. In this case the time to memcopy an array with all
>> xdisp text is ~0.00085 seconds.
>
>If the intent is to use buffer-(sub)string, then you forget the price
>of consing.  That would trigger frequent GC cycles, which will all but
>kill the otherwise fast performance.
>
>> Any way if we don't want the copy we can use
>> ts_parser_set_included_ranges to exclude the gap and pass the text
>> pointer directly without any copy.
>
>I hope someone will try that and report the results.
>
>The other design issue with TS integration is that I'd like it to plug
>into the JIT font-lock interface of the display engine, so that we
>don't unnecessarily fontify parts of the buffer that won't be
>displayed, and always do fontify the parts that will be. 

If I understand something about our cc-mode functionalities (and many of
those functionalities we don't want to loose like indentation and code
navigation). Probably the "right" way to use tree-sitter (maybe Alan
wants give a more precise technical description) is not only fontify but
use the tree information to add contextual information to the text
(something that I think cc-mode does.) And then let font-lock do the
magic.

The tree-sitter tree is basically contextual information, and (for
example) if we have processed the whole buffer and we already have the
tree, then scrolling won't need to parse anything, adding or removing
text is a localized modification, so with the previous tree we can
re-parse only the modified region. The choice may be then if we
propertize the text of the whole buffer or just the visible region OR if
we want to "propertize on demand".

This will save us from the hard parsing in cc-mode to fontify "on the
fly".

> I don't
>really care if TS actually processes a much larger chunk of text, if
>it does that quickly enough, but processing the resulting faces will
>take time on the Emacs side, and that is better avoided. 

But then we won't get all the contextual information we need for
indentation, code navigation or fold the code right?

so we'll be still "sub-utilizing" the tree sitter features that may give
useful functionalities we already have in cc-mode, and we may also like
to have in other more "limited" modes.

> More
>importantly, integration into JIT font-lock machinery means we don't
>need to use other hooks, which is a step back, since using such hooks
>for fontification was already shown to have serious problems in pre-21
>Emacs: they don't always catch all the changes which require
>re-fontification.
>
I see two approaches here:

1) add the tree-sitter properties/faces to the buffer text (fully or
partially on the visible regions)

2) use the tree-sitter information directly from the tree and add the
visible properties from there.

This second one will require a more complete api of tree-sitter
functions exposed to elisp, but in my opinion it worth it in accuracy,
speed and simplicity (a single API to rule them all). And to support
many languages we don't actually have like rust or the fancy C++ > 11. 

+

Remember that TS has the partial parsing options (specifying the regions
to parse), the re-parsing option (using a previous tree for the same
buffer as a hint which reduces the times abruptly), or even a tree
comparison function that produces a new tree with the differences with
the "hint" tree to know what needs to be updated.

Plus all the navigation function like find parent or child nodes,
parsing error handling, iterate over nodes and so on.

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12  3:20                                                                   ` Stefan Monnier
@ 2021-06-12 11:07                                                                     ` Ergus
  0 siblings, 0 replies; 274+ messages in thread
From: Ergus @ 2021-06-12 11:07 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Óscar Fuentes, emacs-devel

On Fri, Jun 11, 2021 at 11:20:58PM -0400, Stefan Monnier wrote:
>> But any way just to start: tree-sitter parses all the text in xdisp.c,
>> (in my machine), in 0.12 seconds from scratch and re-parses it (reusing
>> the tree) 10 times faster; in 0.008 ~ 0.01 seconds.
>


>or is it just
>the time for tree-sitter to do the actual parse for itself?
>

This one. 

It was just a 5 minutes benchmark:

https://github.com/Ergus/tree-sitter-benchmark


>
>        Stefan
>



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12 11:01                                                                     ` Ergus
@ 2021-06-12 11:25                                                                       ` Eli Zaretskii
  2021-06-12 15:04                                                                         ` Ergus
  0 siblings, 1 reply; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-12 11:25 UTC (permalink / raw)
  To: Ergus; +Cc: ofv, emacs-devel

> Date: Sat, 12 Jun 2021 13:01:03 +0200
> From: Ergus <spacibba@aol.com>
> Cc: ofv@wanadoo.es, emacs-devel@gnu.org
> 
> If I understand something about our cc-mode functionalities (and many of
> those functionalities we don't want to loose like indentation and code
> navigation). Probably the "right" way to use tree-sitter (maybe Alan
> wants give a more precise technical description) is not only fontify but
> use the tree information to add contextual information to the text
> (something that I think cc-mode does.) And then let font-lock do the
> magic.
> 
> The tree-sitter tree is basically contextual information, and (for
> example) if we have processed the whole buffer and we already have the
> tree, then scrolling won't need to parse anything, adding or removing
> text is a localized modification, so with the previous tree we can
> re-parse only the modified region. The choice may be then if we
> propertize the text of the whole buffer or just the visible region OR if
> we want to "propertize on demand".
> 
> This will save us from the hard parsing in cc-mode to fontify "on the
> fly".

I'm not sure I understand what you are suggesting.  Can you describe
your suggestion in terms of 'face' text properties and the 'fontified'
property, and explain how those should fit into the existing redisplay
mechanisms?

> > I don't
> >really care if TS actually processes a much larger chunk of text, if
> >it does that quickly enough, but processing the resulting faces will
> >take time on the Emacs side, and that is better avoided. 
> 
> But then we won't get all the contextual information we need for
> indentation, code navigation or fold the code right?

Why not?

> I see two approaches here:
> 
> 1) add the tree-sitter properties/faces to the buffer text (fully or
> partially on the visible regions)
> 
> 2) use the tree-sitter information directly from the tree and add the
> visible properties from there.
> 
> This second one will require a more complete api of tree-sitter
> functions exposed to elisp, but in my opinion it worth it in accuracy,
> speed and simplicity (a single API to rule them all). And to support
> many languages we don't actually have like rust or the fancy C++ > 11. 

Why can't we have both?  The information you are talking about, which
is needed by Emacs features other than fontification, can be used by
those other Emacs features when needed.  You seem to be saying that
these two alternatives are mutually-exclusive, but you didn't explain
why.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12  5:20                                                               ` Theodor Thornhill
@ 2021-06-12 13:40                                                                 ` Stefan Monnier
  2021-06-12 15:56                                                                   ` Theodor Thornhill
  0 siblings, 1 reply; 274+ messages in thread
From: Stefan Monnier @ 2021-06-12 13:40 UTC (permalink / raw)
  To: Theodor Thornhill
  Cc: Ergus, Eli Zaretskii, dancol, acm, rudalics, rms, emacs-devel

> @Stefan - I'm not sure I understand what you mean by troublesome for
> elisp hackers.  These grammars have a lisp-like dsl, and is pretty
> usable through C-M-x and defvars, see:
> https://github.com/emacs-csharp/csharp-mode/blob/master/csharp-tree-sitter.el#L44.

AFAIK the grammar itself is still written in Javascript.

> IME experience it's not the same as normal elisp hacking, but it's good
> enough.  That's only an opinion though.

The disadvantages I see for ELisp hackers are just technical hurdles
that can be overcome with extra tooling.  I'm not particularly worried
about them, indeed.

> These 410 lines covers way more than what CC Mode is atm.  It would be
> *great* to move the tree sitter part to emacs.

Agreed.  Maybe a first step would be to get copyright assignments and
include the tree sitter module in GNU ELPA?


        Stefan




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12  6:38                                                             ` Eli Zaretskii
@ 2021-06-12 13:44                                                               ` Stefan Monnier
  2021-06-12 14:14                                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 274+ messages in thread
From: Stefan Monnier @ 2021-06-12 13:44 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dancol, acm, rudalics, rms, emacs-devel

Eli Zaretskii [2021-06-12 09:38:09] wrote:
>> From: Stefan Monnier <monnier@iro.umontreal.ca>
>> Cc: dancol@dancol.org,  acm@muc.de,  rudalics@gmx.at,  rms@gnu.org,
>>   emacs-devel@gnu.org
>> Date: Fri, 11 Jun 2021 15:57:10 -0400
>> > Will Emacs 27.2 do?  If you must see results from an optimized build
>> > of Emacs 28, I'll have to build one first.
>> As mentioned, mine was not an optimized build, on the contrary.
> Well, using -Og _is_ optimizing, albeit less than -O2 does.

Ah, I see what you mean.  For me the first step of optimizing is to
disable the extra checks, whereas this was built with

    --enable-checking --enable-check-lisp-object-type


-- Stefan




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12  8:03                                                     ` Daniel Colascione
  2021-06-12  8:13                                                       ` Eli Zaretskii
@ 2021-06-12 13:51                                                       ` Stefan Monnier
  1 sibling, 0 replies; 274+ messages in thread
From: Stefan Monnier @ 2021-06-12 13:51 UTC (permalink / raw)
  To: Daniel Colascione
  Cc: Eli Zaretskii, spacibba, acm, rudalics, rms, emacs-devel

> Sure, but it'd be nice to package TS in such a way that it becomes more
> idiomatically lispy, at least if TS becomes the primary fontification engine
> for some modes. At the very least, it should be possible for users to apply
> ad hoc fontification on top of whatever TS supports. And how could something
> like TS work with, say, bison and flex files without fully general
> multi-mode support (which we also lack)?

While I don't think you can compose compiled tree-sitter grammars, the
source grammars can easily be composed, so tree-sitter is perfectly able
to handle "multi-mode" buffers.


        Stefan




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12  6:58                                                                   ` Eli Zaretskii
  2021-06-12 11:01                                                                     ` Ergus
@ 2021-06-12 14:00                                                                     ` Stefan Monnier
  2021-06-12 14:20                                                                       ` Eli Zaretskii
  1 sibling, 1 reply; 274+ messages in thread
From: Stefan Monnier @ 2021-06-12 14:00 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Ergus, ofv, emacs-devel

> The other design issue with TS integration is that I'd like it to plug
> into the JIT font-lock interface of the display engine, so that we
> don't unnecessarily fontify parts of the buffer that won't be
> displayed, and always do fontify the parts that will be.

Hm... AFAIK that's already what emacs-tree-sitter does.


        Stefan




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12  8:47                                                   ` Daniele Nicolodi
  2021-06-12  8:57                                                     ` tomas
@ 2021-06-12 14:04                                                     ` Stefan Monnier
  1 sibling, 0 replies; 274+ messages in thread
From: Stefan Monnier @ 2021-06-12 14:04 UTC (permalink / raw)
  To: Daniele Nicolodi; +Cc: emacs-devel

Daniele Nicolodi [2021-06-12 10:47:28] wrote:
> On 11/06/2021 22:52, Stefan Monnier wrote:
>> PS: I think we can expect 99% of Emacs users have a Javascript engine
>> already installed (in the form of a web browser),
> The JS engine in a web browser and node are two very different beasts.
> The main problem with node is that it is very hard to get it to work in
> a self contained way that does not involve downloading JS packages from
> the network. I also anticipate some resistance in the Emacs community on
> depending on the JS ecosystem where licensing is much more "liberal"
> than within the GNU project.

I prefer to look at it as "how will we do it" than "what problems may
prevent us from doing it".  Otherwise, we'll never get there.


        Stefan




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12 13:44                                                               ` Stefan Monnier
@ 2021-06-12 14:14                                                                 ` Eli Zaretskii
  0 siblings, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-12 14:14 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: acm, dancol, emacs-devel, rms, rudalics

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: dancol@dancol.org,  acm@muc.de,  rudalics@gmx.at,  rms@gnu.org,
>   emacs-devel@gnu.org
> Date: Sat, 12 Jun 2021 09:44:10 -0400
> 
> > Well, using -Og _is_ optimizing, albeit less than -O2 does.
> 
> Ah, I see what you mean.  For me the first step of optimizing is to
> disable the extra checks, whereas this was built with
> 
>     --enable-checking --enable-check-lisp-object-type

I think the effect of these on redisplay speed is exaggerated.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12 14:00                                                                     ` Stefan Monnier
@ 2021-06-12 14:20                                                                       ` Eli Zaretskii
  2021-06-12 14:33                                                                         ` Stefan Monnier
  0 siblings, 1 reply; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-12 14:20 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: ofv, spacibba, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Ergus <spacibba@aol.com>,  ofv@wanadoo.es,  emacs-devel@gnu.org
> Date: Sat, 12 Jun 2021 10:00:26 -0400
> 
> > The other design issue with TS integration is that I'd like it to plug
> > into the JIT font-lock interface of the display engine, so that we
> > don't unnecessarily fontify parts of the buffer that won't be
> > displayed, and always do fontify the parts that will be.
> 
> Hm... AFAIK that's already what emacs-tree-sitter does.

Can you point me to the code which does that?



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12 14:20                                                                       ` Eli Zaretskii
@ 2021-06-12 14:33                                                                         ` Stefan Monnier
  2021-06-12 15:06                                                                           ` Eli Zaretskii
  0 siblings, 1 reply; 274+ messages in thread
From: Stefan Monnier @ 2021-06-12 14:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: spacibba, ofv, emacs-devel

Eli Zaretskii [2021-06-12 17:20:36] wrote:

>> From: Stefan Monnier <monnier@iro.umontreal.ca>
>> Cc: Ergus <spacibba@aol.com>,  ofv@wanadoo.es,  emacs-devel@gnu.org
>> Date: Sat, 12 Jun 2021 10:00:26 -0400
>> 
>> > The other design issue with TS integration is that I'd like it to plug
>> > into the JIT font-lock interface of the display engine, so that we
>> > don't unnecessarily fontify parts of the buffer that won't be
>> > displayed, and always do fontify the parts that will be.
>> 
>> Hm... AFAIK that's already what emacs-tree-sitter does.
>
> Can you point me to the code which does that?

The code is in `tree-sitter-hl.el`, where they define
`tree-sitter-hl-mode` which is enabled by `tree-sitter-hl--setup`
where they

    (add-function :override (local 'font-lock-fontify-region-function)
                  #'tree-sitter-hl--highlight-region)


-- Stefan




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12 11:25                                                                       ` Eli Zaretskii
@ 2021-06-12 15:04                                                                         ` Ergus
  2021-06-12 15:16                                                                           ` Eli Zaretskii
  0 siblings, 1 reply; 274+ messages in thread
From: Ergus @ 2021-06-12 15:04 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: ofv, emacs-devel

On Sat, Jun 12, 2021 at 02:25:45PM +0300, Eli Zaretskii wrote:
>> Date: Sat, 12 Jun 2021 13:01:03 +0200
>> From: Ergus <spacibba@aol.com>
>> Cc: ofv@wanadoo.es, emacs-devel@gnu.org
>>
>> If I understand something about our cc-mode functionalities (and many of
>> those functionalities we don't want to loose like indentation and code
>> navigation). Probably the "right" way to use tree-sitter (maybe Alan
>> wants give a more precise technical description) is not only fontify but
>> use the tree information to add contextual information to the text
>> (something that I think cc-mode does.) And then let font-lock do the
>> magic.
>>
>> The tree-sitter tree is basically contextual information, and (for
>> example) if we have processed the whole buffer and we already have the
>> tree, then scrolling won't need to parse anything, adding or removing
>> text is a localized modification, so with the previous tree we can
>> re-parse only the modified region. The choice may be then if we
>> propertize the text of the whole buffer or just the visible region OR if
>> we want to "propertize on demand".
>>
>> This will save us from the hard parsing in cc-mode to fontify "on the
>> fly".
>
>I'm not sure I understand what you are suggesting.  Can you describe
>your suggestion in terms of 'face' text properties and the 'fontified'
>property, and explain how those should fit into the existing redisplay
>mechanisms?
>
cc-mode have something similar to the tree sitter properties. It is the
information we get in c-syntactic-context or c-langelem-sym. 

I don't actually know where is this information stored now by cc-mode.

But right now it is set in the text just by regions (visible ones) that
are parsed on demand (that's why they impact commands like
scrolling). So there are two operation, 1) the parsing and then 2) setting
this properties to the text (or where they are stored somehow).

In the other hand when we want to get things like
c-defun-name-and-limits we also search on the fly with functions like
c-declaration-limits-1 or c-go-list-backward, that search on the fly and
try to recognize or find the contextual information.

With tree sitter on the other hand:

suppose we have a buffer like:

int main()
{
	int i = 5;

	return 0;
}

The tree sitter parser returns a tree that may be represented like:

(translation_unit
  (function_definition type:
		      (primitive_type) declarator:
		      (function_declarator declarator: (identifier)
					   parameters: (parameter_list))
		      body:
		      (compound_statement
		       (declaration type: (primitive_type)
				    declarator:
				    (init_declarator
				     declarator: (identifier)
				     value: (number_literal)))
		       (return_statement (number_literal)))))

This tree can be traversed, accessed and recalculated very fast; but
after a change, it can be updated even faster and only by sections if we
know the rest haven't change.

When we have a visible region (suppose that we only see the line: int i
= 5; because our screen is very small for this example)

as we know where that line starts in the buffer then we can find the
nearest node that extends in this region using functions like:

ts_node_first_child_for_byte
ts_node_descendant_for_byte_range
ts_node_named_descendant_for_byte_range

the design choice comes here.

1) We can iterate (or traverse) the "usefull" subtree over them to
convert that information in text properties directly (using
ts_tree_cursor_current_field_id). 

But If I remember correctly that could have some implications in
redisplay... right?. Even when we modify properties that are not visible
or belong to an outer node.

2) We never convert the tree information into properties (as we know
them in the text now), but just use the ts_tree_cursor_* set of
functions to access the information and tell to the display engine to
use some faces for it.

So in the lisp side instead of accessing stored information in the
properties we just call a wrapper around tree-sitter C functions.

----

The first approach may be probably simpler to implement, but less
optimal because of the translation between C-Lisp types and adding
properties constantly on every update adds extra work on the lisp side.

This may be optimized a bit using for example
ts_tree_get_changed_ranges.

The second approach may require a bit more of work, but will solve the
issue of indentation and code navigation for all the modes with a common
pattern and a single api. While the display engine could access directly
to all the information from C to C.

The key difference may be that (for example) basic commands like: up-list

1) with the first approach will search on the buffer for text properties
changes, syntax-ppss and so on.

2) with the second one will just call ts_node_parent and go to
ts_node_start_byte.

>> > I don't
>> >really care if TS actually processes a much larger chunk of text, if
>> >it does that quickly enough, but processing the resulting faces will
>> >take time on the Emacs side, and that is better avoided.
>>
>> But then we won't get all the contextual information we need for
>> indentation, code navigation or fold the code right?
>
>Why not?
>
translating also that information may be a lot of work too. 

>> I see two approaches here:
>>
>> 1) add the tree-sitter properties/faces to the buffer text (fully or
>> partially on the visible regions)
>>
>> 2) use the tree-sitter information directly from the tree and add the
>> visible properties from there.
>>
>> This second one will require a more complete api of tree-sitter
>> functions exposed to elisp, but in my opinion it worth it in accuracy,
>> speed and simplicity (a single API to rule them all). And to support
>> many languages we don't actually have like rust or the fancy C++ > 11.
>
>Why can't we have both?  The information you are talking about, which
>is needed by Emacs features other than fontification, can be used by
>those other Emacs features when needed.  You seem to be saying that
>these two alternatives are mutually-exclusive, but you didn't explain
>why.
>
They are not exclusive, but redundant. If we use the current
infrastructure then we will spend a lot of time translating properties
and contextual information. And avoiding to have part of them
outdated. Navigation and indentation will continue to be based on
properties we need to set and update all the time to make the match one
by one.

Basically we will be duplicating the information that is already in the
tree. Creating many list objects, overloading the gc, and so on. So we
potentially will save only the parsing time.

The first one may work with a very primitive api to handle and iterate
the tree-sitter tree. The second one will require to use cursors,
finders and some other features from the tree-sitter API; improving
performance for sure but replacing a lot of the work lisp is doing now.

The second approach will probably make happy the C developers more than
the Lisp ones.

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12 14:33                                                                         ` Stefan Monnier
@ 2021-06-12 15:06                                                                           ` Eli Zaretskii
  2021-06-12 15:46                                                                             ` Stefan Monnier
  0 siblings, 1 reply; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-12 15:06 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: ofv, spacibba, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: spacibba@aol.com,  ofv@wanadoo.es,  emacs-devel@gnu.org
> Date: Sat, 12 Jun 2021 10:33:59 -0400
> 
> The code is in `tree-sitter-hl.el`, where they define
> `tree-sitter-hl-mode` which is enabled by `tree-sitter-hl--setup`
> where they
> 
>     (add-function :override (local 'font-lock-fontify-region-function)
>                   #'tree-sitter-hl--highlight-region)

I've seen that, but it's full of FIXMEs that basically tell me this is
incomplete and perhaps even kludgey?

I don't really understand why the workarounds are needed (nor why
font-lock-keywords would need to still be supported with TS).

And I cannot say I'm happy with the uses of buffer-substring and the
many conversions between character positions and byte positions.

Maybe these could be cleaned up?



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12 15:04                                                                         ` Ergus
@ 2021-06-12 15:16                                                                           ` Eli Zaretskii
  2021-06-12 15:23                                                                             ` Ergus
  0 siblings, 1 reply; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-12 15:16 UTC (permalink / raw)
  To: Ergus; +Cc: ofv, emacs-devel

> Date: Sat, 12 Jun 2021 17:04:02 +0200
> From: Ergus <spacibba@aol.com>
> Cc: ofv@wanadoo.es, emacs-devel@gnu.org
> 
> >> I see two approaches here:
> >>
> >> 1) add the tree-sitter properties/faces to the buffer text (fully or
> >> partially on the visible regions)
> >>
> >> 2) use the tree-sitter information directly from the tree and add the
> >> visible properties from there.
> >>
> >> This second one will require a more complete api of tree-sitter
> >> functions exposed to elisp, but in my opinion it worth it in accuracy,
> >> speed and simplicity (a single API to rule them all). And to support
> >> many languages we don't actually have like rust or the fancy C++ > 11.
> >
> >Why can't we have both?  The information you are talking about, which
> >is needed by Emacs features other than fontification, can be used by
> >those other Emacs features when needed.  You seem to be saying that
> >these two alternatives are mutually-exclusive, but you didn't explain
> >why.
> >
> They are not exclusive, but redundant. If we use the current
> infrastructure then we will spend a lot of time translating properties
> and contextual information.

That depends on what you mean by "current infrastructure".

> And avoiding to have part of them outdated. Navigation and
> indentation will continue to be based on properties we need to set
> and update all the time to make the match one by one.
> 
> Basically we will be duplicating the information that is already in the
> tree. Creating many list objects, overloading the gc, and so on. So we
> potentially will save only the parsing time.

Why would we do a silly thing like that?

> The first one may work with a very primitive api to handle and iterate
> the tree-sitter tree. The second one will require to use cursors,
> finders and some other features from the tree-sitter API; improving
> performance for sure but replacing a lot of the work lisp is doing now.
> 
> The second approach will probably make happy the C developers more than
> the Lisp ones.

So where's the dilemma?



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12 15:16                                                                           ` Eli Zaretskii
@ 2021-06-12 15:23                                                                             ` Ergus
  2021-06-12 15:35                                                                               ` Eli Zaretskii
  0 siblings, 1 reply; 274+ messages in thread
From: Ergus @ 2021-06-12 15:23 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: ofv, emacs-devel

On Sat, Jun 12, 2021 at 06:16:02PM +0300, Eli Zaretskii wrote:
>> Date: Sat, 12 Jun 2021 17:04:02 +0200
>> From: Ergus <spacibba@aol.com>
>> Cc: ofv@wanadoo.es, emacs-devel@gnu.org
>>
>> >> I see two approaches here:
>> >>
>> >> 1) add the tree-sitter properties/faces to the buffer text (fully or
>> >> partially on the visible regions)
>> >>
>> >> 2) use the tree-sitter information directly from the tree and add the
>> >> visible properties from there.
>> >>
>> >> This second one will require a more complete api of tree-sitter
>> >> functions exposed to elisp, but in my opinion it worth it in accuracy,
>> >> speed and simplicity (a single API to rule them all). And to support
>> >> many languages we don't actually have like rust or the fancy C++ > 11.
>> >
>> >Why can't we have both?  The information you are talking about, which
>> >is needed by Emacs features other than fontification, can be used by
>> >those other Emacs features when needed.  You seem to be saying that
>> >these two alternatives are mutually-exclusive, but you didn't explain
>> >why.
>> >
>> They are not exclusive, but redundant. If we use the current
>> infrastructure then we will spend a lot of time translating properties
>> and contextual information.
>
>That depends on what you mean by "current infrastructure".
>
Properties, properties navigation.

>> And avoiding to have part of them outdated. Navigation and
>> indentation will continue to be based on properties we need to set
>> and update all the time to make the match one by one.
>>
>> Basically we will be duplicating the information that is already in the
>> tree. Creating many list objects, overloading the gc, and so on. So we
>> potentially will save only the parsing time.
>
>Why would we do a silly thing like that?
>
to convert the tree into some lisp objects we can use with lisp
functions (to check, read, compare and so on)

>> The first one may work with a very primitive api to handle and iterate
>> the tree-sitter tree. The second one will require to use cursors,
>> finders and some other features from the tree-sitter API; improving
>> performance for sure but replacing a lot of the work lisp is doing now.
>>
>> The second approach will probably make happy the C developers more than
>> the Lisp ones.
>
>So where's the dilemma?
>
For me none, but lisp developers may not like to rely so much in an
external library.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12 15:23                                                                             ` Ergus
@ 2021-06-12 15:35                                                                               ` Eli Zaretskii
  0 siblings, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-12 15:35 UTC (permalink / raw)
  To: Ergus; +Cc: ofv, emacs-devel

> Date: Sat, 12 Jun 2021 17:23:44 +0200
> From: Ergus <spacibba@aol.com>
> Cc: ofv@wanadoo.es, emacs-devel@gnu.org
> 
> >> They are not exclusive, but redundant. If we use the current
> >> infrastructure then we will spend a lot of time translating properties
> >> and contextual information.
> >
> >That depends on what you mean by "current infrastructure".
> >
> Properties, properties navigation.

If you mean the special properties used by CC Mode, then we are not
restricted by using them.  We can invent new ones, if needed.  Or use
something other than text properties, if that makes sense.  IOW, I
don't see why this would be something we need to bother about at this
point.

> >> And avoiding to have part of them outdated. Navigation and
> >> indentation will continue to be based on properties we need to set
> >> and update all the time to make the match one by one.
> >>
> >> Basically we will be duplicating the information that is already in the
> >> tree. Creating many list objects, overloading the gc, and so on. So we
> >> potentially will save only the parsing time.
> >
> >Why would we do a silly thing like that?
> >
> to convert the tree into some lisp objects we can use with lisp
> functions (to check, read, compare and so on)
> 
> >> The first one may work with a very primitive api to handle and iterate
> >> the tree-sitter tree. The second one will require to use cursors,
> >> finders and some other features from the tree-sitter API; improving
> >> performance for sure but replacing a lot of the work lisp is doing now.
> >>
> >> The second approach will probably make happy the C developers more than
> >> the Lisp ones.
> >
> >So where's the dilemma?
> >
> For me none, but lisp developers may not like to rely so much in an
> external library.

We could have accessor functions exposed to Lisp, if that's needed.

Again, I don't see why this should bother us now.  We have enough
means to solve these problems.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12 15:06                                                                           ` Eli Zaretskii
@ 2021-06-12 15:46                                                                             ` Stefan Monnier
  0 siblings, 0 replies; 274+ messages in thread
From: Stefan Monnier @ 2021-06-12 15:46 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: spacibba, ofv, emacs-devel

> I've seen that, but it's full of FIXMEs that basically tell me this is
> incomplete and perhaps even kludgey?

I haven't looked in detail at how it works, but w.r.t its interaction
with font-lock and jit-lock it seems sane.

> I don't really understand why the workarounds are needed (nor why
> font-lock-keywords would need to still be supported with TS).

`font-lock-keywords` is (ab)used by several other packages, like
hi-lock, so a major mode that uses font-lock but sets it up in a way
that ignores `font-lock-keywords` introduces problems.

Maybe instead of overriding `font-lock-fontify-region-function` it would
be better to use a single entry in `font-lock-keywords` which calls something
like `tree-sitter-hl--highlight-region`, but these are minor details
that don't affect the general approach.

> And I cannot say I'm happy with the uses of buffer-substring and the
> many conversions between character positions and byte positions.
> Maybe these could be cleaned up?

I'm pretty sure the code (and the authors) would welcome help making it
cleaner, yes.

        Stefan

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12 13:40                                                                 ` Stefan Monnier
@ 2021-06-12 15:56                                                                   ` Theodor Thornhill
  2021-06-12 16:59                                                                     ` Ergus
  2021-06-12 17:25                                                                     ` Stefan Monnier
  0 siblings, 2 replies; 274+ messages in thread
From: Theodor Thornhill @ 2021-06-12 15:56 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Ergus, Eli Zaretskii, dancol, acm, rudalics, emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> @Stefan - I'm not sure I understand what you mean by troublesome for
>> elisp hackers.  These grammars have a lisp-like dsl, and is pretty
>> usable through C-M-x and defvars, see:
>> https://github.com/emacs-csharp/csharp-mode/blob/master/csharp-tree-sitter.el#L44.
>
> AFAIK the grammar itself is still written in Javascript.
>

Yeah, but compiled parsers can be supplied through CI or something like that.

[...]
>
> Agreed.  Maybe a first step would be to get copyright assignments and
> include the tree sitter module in GNU ELPA?
>

If I read some of these mails correctly it seems like that wouldn't be
possible due to interest from some of the parties involved in the main
package.  I don't know the details on that, though.  And Eli seems
unhappy with what's there.

As for making a little more concrete proposal for how to move forward,
would this be something like what we want?

- create/use c or rust bindings
- create an elisp-layer for interaction with the parse tree
- hook fontification and indentation into that elisp-layer

It feels like the elisp-layer will be the easiest part.  I'm not really
well versed in where to look in the c code of emacs for where and how to
link this, so some pointers would be nice.

It looks like most people agree that tree sitter support is wanted, so
maybe it's time to start doing it?  I can surely have a stab at it, but
I'd like some guidance for how to proceed best - if it's wanted, that
is.

--
Theodor

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12 15:56                                                                   ` Theodor Thornhill
@ 2021-06-12 16:59                                                                     ` Ergus
  2021-06-12 17:51                                                                       ` Theodor Thornhill
  2021-06-12 17:25                                                                     ` Stefan Monnier
  1 sibling, 1 reply; 274+ messages in thread
From: Ergus @ 2021-06-12 16:59 UTC (permalink / raw)
  To: Theodor Thornhill
  Cc: Stefan Monnier, Eli Zaretskii, dancol, acm, rudalics, emacs-devel

On Sat, Jun 12, 2021 at 05:56:34PM +0200, Theodor Thornhill wrote:
>Stefan Monnier <monnier@iro.umontreal.ca> writes:
>
>>> @Stefan - I'm not sure I understand what you mean by troublesome for
>>> elisp hackers.  These grammars have a lisp-like dsl, and is pretty
>>> usable through C-M-x and defvars, see:
>>> https://github.com/emacs-csharp/csharp-mode/blob/master/csharp-tree-sitter.el#L44.
>>
>> AFAIK the grammar itself is still written in Javascript.
>>
>
>Yeah, but compiled parsers can be supplied through CI or something like that.
>
>
>[...]
>>
>> Agreed.  Maybe a first step would be to get copyright assignments and
>> include the tree sitter module in GNU ELPA?
>>
>
>If I read some of these mails correctly it seems like that wouldn't be
>possible due to interest from some of the parties involved in the main
>package.  I don't know the details on that, though.  And Eli seems
>unhappy with what's there.
>
>As for making a little more concrete proposal for how to move forward,
>would this be something like what we want?
>
>- create/use c or rust bindings

Hi: 

Eli and the others will give better info for sure, but just to start
(and also they may correct my ideas):

First there is needed a "mode-local" initialization for the parser based
on the major mode (as explained in the TS doc). The parser probably must
be stored somewhere in the "mode" to avoid parser duplication for the
same language. This should be executed probably once/mode (it may be
perfectly in the lisp side then) and will be a wrapper to call:

ts_parser_new
ts_parser_set_language

After that in the C side I think that all we need is in buffer.{h,c}.

to pass the current_buffer->text->beg (or similar) directly to
ts_parser_parse_string or ts_parser_parse_string_encoding. 

Here we must exclude the gap region maybe with ts_parser_included_ranges
(all that information seems to be there as macros in buffer.h).

Once we have a tree we associate it with the buffer it belongs to. And
then comes the rest.

>- create an elisp-layer for interaction with the parse tree

Basically we need to expose some of them, but it is better if we can
handle the most we can in the C side. Using simpler data types and
handling entire regions with the ts_tree_cursor_* functionalities. Must
of course, some of the will be needed for other functionalities. 

I don't know if we can manage the font-locking from C? But I think that
text properties can.

So the next step is just traverse the visible region of the tree to
convert the info in text properties.

Here will be needed a sort of translation between
ts_language_symbol_count and font-lock faces.

>- hook fontification and indentation into that elisp-layer
>

If I understood what Eli wants to prevent, if we set the properties and
faces in step 2; then these hooks may not be needed.

In most cases we will need to call ts_parser_parse_string somewhere
`after-change-functions` (or maybe earlier I don't know) passing it the
old tree and getting the differences with the new one with
ts_tree_get_changed_ranges.

This returns something much smaller than the tree so maybe we can
convert it into a lisp list to use it in font-lock in the lisp side if
we can't handle most of it in C.

>It feels like the elisp-layer will be the easiest part.  I'm not really
>well versed in where to look in the c code of emacs for where and how to
>link this, so some pointers would be nice.
>
>It looks like most people agree that tree sitter support is wanted, so
>maybe it's time to start doing it?  I can surely have a stab at it, but
>I'd like some guidance for how to proceed best - if it's wanted, that
>is.
>
>--
>Theodor
>

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12 15:56                                                                   ` Theodor Thornhill
  2021-06-12 16:59                                                                     ` Ergus
@ 2021-06-12 17:25                                                                     ` Stefan Monnier
  2021-06-12 17:53                                                                       ` Theodor Thornhill
                                                                                         ` (2 more replies)
  1 sibling, 3 replies; 274+ messages in thread
From: Stefan Monnier @ 2021-06-12 17:25 UTC (permalink / raw)
  To: Theodor Thornhill
  Cc: Ergus, Eli Zaretskii, dancol, acm, rudalics, emacs-devel

>> Agreed.  Maybe a first step would be to get copyright assignments and
>> include the tree sitter module in GNU ELPA?
> If I read some of these mails correctly it seems like that wouldn't be
> possible due to interest from some of the parties involved in the main
> package.  I don't know the details on that, though.

Before we start a parallel effort, we definitely should make every effort
to get copyright assignments for the existing code.  Maybe we can't take
the package as-is because some contributors won't accept to sign the
paperwork, but we can probably get paperwork for a significant fraction
of the code.

That would already help reduce duplicated efforts.

This is very important, not just to reduce the amount of work, but also
to avoid alienating interested parties.

> And Eli seems unhappy with what's there.

That doesn't mean we have to start over from scratch.

> As for making a little more concrete proposal for how to move forward,
> would this be something like what we want?
> - create/use c or rust bindings

I think we'd want to link to the C API of tree-sitter.
There's no point going through Rust at this point, AFAICT.

> - create an elisp-layer for interaction with the parse tree
> - hook fontification and indentation into that elisp-layer

Sounds about right.

        Stefan

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-09  8:34                         ` martin rudalics
  2021-06-09 13:14                           ` `open-paren-in-column-0-is-defun-start` (was: cc-mode fontification feels random) Stefan Monnier
@ 2021-06-12 17:29                           ` João Távora
  2021-06-13  8:50                             ` martin rudalics
  1 sibling, 1 reply; 274+ messages in thread
From: João Távora @ 2021-06-12 17:29 UTC (permalink / raw)
  To: martin rudalics
  Cc: Richard Stallman, emacs-devel, Stefan Monnier, Alan Mackenzie,
	Eli Zaretskii, Daniel Colascione

[-- Attachment #1: Type: text/plain, Size: 628 bytes --]

On Wed, Jun 9, 2021, 09:34 martin rudalics <rudalics@gmx.at> wrote:

> I do not like, for example, that inserting a quotation mark somewhere
> into a Lisp buffer, with some delay repaints the entire rest of the
> buffer just to undo that when I insert the closing quotation mark.
>

Since recently, that shouldn't happen anymore unless you wait a relatively
long time. That time is configurable. Search for "antiblink". I added the
feature and am interested in knowing if it's not performing as it should.

Alternatively, you can also try a parenthesis pairing solution such as
electric-pair-mode.

João

>
>

[-- Attachment #2: Type: text/html, Size: 1202 bytes --]

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12 16:59                                                                     ` Ergus
@ 2021-06-12 17:51                                                                       ` Theodor Thornhill
  0 siblings, 0 replies; 274+ messages in thread
From: Theodor Thornhill @ 2021-06-12 17:51 UTC (permalink / raw)
  To: Ergus; +Cc: Stefan Monnier, Eli Zaretskii, dancol, acm, rudalics, emacs-devel



> Hi: 
>
> Eli and the others will give better info for sure, but just to start
> (and also they may correct my ideas):
>
> First there is needed a "mode-local" initialization for the parser based
> on the major mode (as explained in the TS doc). The parser probably must
> be stored somewhere in the "mode" to avoid parser duplication for the

[...]

Thanks for the input!  Will probably prove invaluable down the line :)
I was hoping it would be as "simple" as this.

--
Theo



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12 17:25                                                                     ` Stefan Monnier
@ 2021-06-12 17:53                                                                       ` Theodor Thornhill
  2021-06-12 17:54                                                                       ` Ergus
  2021-06-12 18:02                                                                       ` Daniel Colascione
  2 siblings, 0 replies; 274+ messages in thread
From: Theodor Thornhill @ 2021-06-12 17:53 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Ergus, Eli Zaretskii, dancol, acm, rudalics, emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>>> Agreed.  Maybe a first step would be to get copyright assignments and
>>> include the tree sitter module in GNU ELPA?
>> If I read some of these mails correctly it seems like that wouldn't be
>> possible due to interest from some of the parties involved in the main
>> package.  I don't know the details on that, though.
>
> Before we start a parallel effort, we definitely should make every effort
> to get copyright assignments for the existing code.  Maybe we can't take
> the package as-is because some contributors won't accept to sign the
> paperwork, but we can probably get paperwork for a significant fraction
> of the code.

Sure - I can open an issue and see where we're at.

>
>> And Eli seems unhappy with what's there.
>
> That doesn't mean we have to start over from scratch.
>

No, absolutely.

>> As for making a little more concrete proposal for how to move forward,
>> would this be something like what we want?
>> - create/use c or rust bindings
>
> I think we'd want to link to the C API of tree-sitter.
> There's no point going through Rust at this point, AFAICT.
>
>> - create an elisp-layer for interaction with the parse tree
>> - hook fontification and indentation into that elisp-layer
>
> Sounds about right.
>

Ok good!

--
Theo



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12 17:25                                                                     ` Stefan Monnier
  2021-06-12 17:53                                                                       ` Theodor Thornhill
@ 2021-06-12 17:54                                                                       ` Ergus
  2021-06-12 18:02                                                                       ` Daniel Colascione
  2 siblings, 0 replies; 274+ messages in thread
From: Ergus @ 2021-06-12 17:54 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: Theodor Thornhill, Eli Zaretskii, dancol, acm, rudalics,
	emacs-devel

On Sat, Jun 12, 2021 at 01:25:14PM -0400, Stefan Monnier wrote:
>>> Agreed.  Maybe a first step would be to get copyright assignments and
>>> include the tree sitter module in GNU ELPA?
>> If I read some of these mails correctly it seems like that wouldn't be
>> possible due to interest from some of the parties involved in the main
>> package.  I don't know the details on that, though.
>
>Before we start a parallel effort, we definitely should make every effort
>to get copyright assignments for the existing code.  Maybe we can't take
>the package as-is because some contributors won't accept to sign the
>paperwork, but we can probably get paperwork for a significant fraction
>of the code.
>
>That would already help reduce duplicated efforts.
>
>This is very important, not just to reduce the amount of work, but also
>to avoid alienating interested parties.
>
I agree, but it looks like Eli wants a different approach for the
calling and a part of the performance issues come from the font-lock and
Lisp hooks and translations forth and back. Something that a package
implemented with modules still do at some level.

Will you write to the authors on GH? There are only 17 contributors, so
not a crazy number of copyrights to get.

What is wondering me is that managing copyright usually is a never
ending problem. We are still waiting for use-package to get all of
them. And every time we say to gather copyrights then there is a dead
time and the topic is forgotten after a while.

The package was designed to be an external feature so it may use some
"inefficient" solutions (lisp calls from C, substring, font-lock init
functions) that could be cleaned and improvement to access internal C
code directly; that will require a deeper knowledge of the package and
emacs C code to be handled, and I don't know how available will be the
developers to do so.



>> And Eli seems unhappy with what's there.
>
>That doesn't mean we have to start over from scratch.
>
That's true. But the approach implemented with modules or internally in
emacs may be very different right?

>> As for making a little more concrete proposal for how to move forward,
>> would this be something like what we want?
>> - create/use c or rust bindings
>
>I think we'd want to link to the C API of tree-sitter.
>There's no point going through Rust at this point, AFAICT.
>
>> - create an elisp-layer for interaction with the parse tree
>> - hook fontification and indentation into that elisp-layer
>
>Sounds about right.
>
>
>        Stefan
>
>



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12 17:25                                                                     ` Stefan Monnier
  2021-06-12 17:53                                                                       ` Theodor Thornhill
  2021-06-12 17:54                                                                       ` Ergus
@ 2021-06-12 18:02                                                                       ` Daniel Colascione
  2021-06-12 18:39                                                                         ` Ergus
  2 siblings, 1 reply; 274+ messages in thread
From: Daniel Colascione @ 2021-06-12 18:02 UTC (permalink / raw)
  To: Stefan Monnier, Theodor Thornhill
  Cc: acm, Ergus, emacs-devel, Eli Zaretskii, rudalics



On June 12, 2021 10:25:19 AM Stefan Monnier <monnier@iro.umontreal.ca> wrote:

>>> Agreed.  Maybe a first step would be to get copyright assignments and
>>> include the tree sitter module in GNU ELPA?
>> If I read some of these mails correctly it seems like that wouldn't be
>> possible due to interest from some of the parties involved in the main
>> package.  I don't know the details on that, though.
>
> Before we start a parallel effort, we definitely should make every effort
> to get copyright assignments for the existing code.  Maybe we can't take
> the package as-is because some contributors won't accept to sign the
> paperwork, but we can probably get paperwork for a significant fraction
> of the code.
>
> That would already help reduce duplicated efforts.
>
> This is very important, not just to reduce the amount of work, but also
> to avoid alienating interested parties.
>
>> And Eli seems unhappy with what's there.
>
> That doesn't mean we have to start over from scratch.
>
>> As for making a little more concrete proposal for how to move forward,
>> would this be something like what we want?
>> - create/use c or rust bindings
>
> I think we'd want to link to the C API of tree-sitter.
> There's no point going through Rust at this point, AFAICT.
>
>> - create an elisp-layer for interaction with the parse tree
>> - hook fontification and indentation into that elisp-layer
>
> Sounds about right.
>
>
>        Stefan

It's very important that the actual parsers be modules, at least 
optionally. It must be possible to customize and develop on a running 
Emacs, without a restart. To do that, if we stick with a model where 
generated parsers are in C, we must unload and reload compiled code. I am 
convinced we can make the module interface efficient enough for this to 
work without measurable overhead.






^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12 18:02                                                                       ` Daniel Colascione
@ 2021-06-12 18:39                                                                         ` Ergus
  0 siblings, 0 replies; 274+ messages in thread
From: Ergus @ 2021-06-12 18:39 UTC (permalink / raw)
  To: Daniel Colascione
  Cc: Stefan Monnier, Theodor Thornhill, Eli Zaretskii, acm, rudalics,
	emacs-devel

On Sat, Jun 12, 2021 at 11:02:36AM -0700, Daniel Colascione wrote:
>
>
>On June 12, 2021 10:25:19 AM Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>
>>>>Agreed.  Maybe a first step would be to get copyright assignments and
>>>>include the tree sitter module in GNU ELPA?
>>>If I read some of these mails correctly it seems like that wouldn't be
>>>possible due to interest from some of the parties involved in the main
>>>package.  I don't know the details on that, though.
>>
>>Before we start a parallel effort, we definitely should make every effort
>>to get copyright assignments for the existing code.  Maybe we can't take
>>the package as-is because some contributors won't accept to sign the
>>paperwork, but we can probably get paperwork for a significant fraction
>>of the code.
>>
>>That would already help reduce duplicated efforts.
>>
>>This is very important, not just to reduce the amount of work, but also
>>to avoid alienating interested parties.
>>
>>>And Eli seems unhappy with what's there.
>>
>>That doesn't mean we have to start over from scratch.
>>
>>>As for making a little more concrete proposal for how to move forward,
>>>would this be something like what we want?
>>>- create/use c or rust bindings
>>
>>I think we'd want to link to the C API of tree-sitter.
>>There's no point going through Rust at this point, AFAICT.
>>
>>>- create an elisp-layer for interaction with the parse tree
>>>- hook fontification and indentation into that elisp-layer
>>
>>Sounds about right.
>>
>>
>>       Stefan
>
>It's very important that the actual parsers be modules, at least 
>optionally. It must be possible to customize and develop on a running 
>Emacs, without a restart. To do that, if we stick with a model where 
>generated parsers are in C, we must unload and reload compiled code. I 
>am convinced we can make the module interface efficient enough for 
>this to work without measurable overhead.
>
Yes of course. Once we have the internal infrastructure the parsers
should be modules that autocompile during the installation (like
vterm). 

If we make a simple infrastructure, those modules won't even require any
lisp code, just some instructions to download and compile the shared
object somewhere with a proper name?




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-12 17:29                           ` cc-mode fontification feels random João Távora
@ 2021-06-13  8:50                             ` martin rudalics
  2021-06-13  9:05                               ` João Távora
  0 siblings, 1 reply; 274+ messages in thread
From: martin rudalics @ 2021-06-13  8:50 UTC (permalink / raw)
  To: João Távora
  Cc: Richard Stallman, emacs-devel, Stefan Monnier, Alan Mackenzie,
	Eli Zaretskii, Daniel Colascione

 >> I do not like, for example, that inserting a quotation mark somewhere
 >> into a Lisp buffer, with some delay repaints the entire rest of the
 >> buffer just to undo that when I insert the closing quotation mark.
 >>
 >
 > Since recently, that shouldn't happen anymore unless you wait a relatively
 > long time. That time is configurable. Search for "antiblink". I added the
 > feature and am interested in knowing if it's not performing as it should.

The idea is good but still not what I want.  I don't want the entire
rest of my buffer get refontified even when I do not stay on the same
line.

What I really wanted is a simple mechanism that refontifies text only at
most until the next open paren in column zero.  ISTR that Emacs behaved
like that in the past - maybe I got used to it back then, maybe also my
memory fails.

 > Alternatively, you can also try a parenthesis pairing solution such as
 > electric-pair-mode.

I dislike electricity.

martin



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-13  8:50                             ` martin rudalics
@ 2021-06-13  9:05                               ` João Távora
  2021-06-13  9:39                                 ` martin rudalics
  0 siblings, 1 reply; 274+ messages in thread
From: João Távora @ 2021-06-13  9:05 UTC (permalink / raw)
  To: martin rudalics
  Cc: Richard Stallman, emacs-devel, Stefan Monnier, Alan Mackenzie,
	Eli Zaretskii, Daniel Colascione

On Sun, Jun 13, 2021 at 9:50 AM martin rudalics <rudalics@gmx.at> wrote:
>
>  >> I do not like, for example, that inserting a quotation mark somewhere
>  >> into a Lisp buffer, with some delay repaints the entire rest of the
>  >> buffer just to undo that when I insert the closing quotation mark.
>  >>
>  >
>  > Since recently, that shouldn't happen anymore unless you wait a relatively
>  > long time. That time is configurable. Search for "antiblink". I added the
>  > feature and am interested in knowing if it's not performing as it should.
>
> The idea is good but still not what I want.  I don't want the entire
> rest of my buffer get refontified even when I do not stay on the same
> line.

Right.  I was  strictly addressing your description of adding a single
quote and then watching the whole buffer repaint itself.  If one does other
editing or movement actions after that, then antiblink throws in the towels
and says "all bets are off, better not assume more things about the user's
fontification intentions".

That's because maybe you _do_ want the whole buffer to be refontified
and are intending to go to some point up to EOB to put the closing quote.

> What I really wanted is a simple mechanism that refontifies text only at
> most until the next open paren in column zero.  ISTR that Emacs behaved
> like that in the past - maybe I got used to it back then, maybe also my
> memory fails.

I don't have that recollection (but I think you have been using Emacs for
longer than me).

Regardless, I would file a bug if I saw that behaviour :-) , because it might
be my sincere intention to have the buffer be repainted.  I could be composing
a docstring or an example where an opening parenthesis happens to
have fallen on the first column of a line.

In fact I seem to remember that in SLIME or SLY that is already problematic
when evaluating a chunk of code where a docstring happens to have that
quirk.   And that it is because of open-paren-at-column-0-is-defun-start.

In another ISTR moment, ISTR that other editors do the same as Emacs
does here , i.e. repaint. I haven't seen an equivalent to antiblink.

João

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-13  9:05                               ` João Távora
@ 2021-06-13  9:39                                 ` martin rudalics
  2021-06-13 10:06                                   ` João Távora
  0 siblings, 1 reply; 274+ messages in thread
From: martin rudalics @ 2021-06-13  9:39 UTC (permalink / raw)
  To: João Távora
  Cc: Richard Stallman, emacs-devel, Stefan Monnier, Alan Mackenzie,
	Eli Zaretskii, Daniel Colascione

 >> The idea is good but still not what I want.  I don't want the entire
 >> rest of my buffer get refontified even when I do not stay on the same
 >> line.
 >
 > Right.  I was  strictly addressing your description of adding a single
 > quote and then watching the whole buffer repaint itself.  If one does other
 > editing or movement actions after that, then antiblink throws in the towels
 > and says "all bets are off, better not assume more things about the user's
 > fontification intentions".
 >
 > That's because maybe you _do_ want the whole buffer to be refontified
 > and are intending to go to some point up to EOB to put the closing quote.

I admit that this probably meets most users' intentions.  What I do not
understand about its implementation is that if `jit-lock-contextually'
is t (as it is usually set by `jit-lock-register'), you unconditionally
add the antiblink mechanism to `post-command-hook' which IIUC causes a
`syntax-ppss' call unconditionally getting run for each command even
when `jit-lock-antiblink-grace' is nil.  Is that perception correct?  If
so, I think that you should not do that.

 > Regardless, I would file a bug if I saw that behaviour :-) ,

... even with `open-paren-in-column-0-is-defun-start' non-nil?

 > because it might
 > be my sincere intention to have the buffer be repainted.  I could be composing
 > a docstring or an example where an opening parenthesis happens to
 > have fallen on the first column of a line.
 >
 > In fact I seem to remember that in SLIME or SLY that is already problematic
 > when evaluating a chunk of code where a docstring happens to have that
 > quirk.   And that it is because of open-paren-at-column-0-is-defun-start.

But `open-paren-in-column-0-is-defun-start' is a customizable variable.
Which means that a user should be able to use it to control the behavior
of Emacs in this regard.  And it is t by default for no obvious reason.

 > In another ISTR moment, ISTR that other editors do the same as Emacs
 > does here , i.e. repaint. I haven't seen an equivalent to antiblink.

martin



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-13  9:39                                 ` martin rudalics
@ 2021-06-13 10:06                                   ` João Távora
  2021-06-13 14:52                                     ` martin rudalics
  0 siblings, 1 reply; 274+ messages in thread
From: João Távora @ 2021-06-13 10:06 UTC (permalink / raw)
  To: martin rudalics
  Cc: Richard Stallman, emacs-devel, Stefan Monnier, Alan Mackenzie,
	Eli Zaretskii, Daniel Colascione

martin rudalics <rudalics@gmx.at> writes:

> I admit that this probably meets most users' intentions.  What I do not
> understand about its implementation is that if `jit-lock-contextually'
> is t (as it is usually set by `jit-lock-register'), you unconditionally
> add the antiblink mechanism to `post-command-hook' which IIUC causes a
> `syntax-ppss' call unconditionally getting run for each command even
> when `jit-lock-antiblink-grace' is nil.  Is that perception correct?  If
> so, I think that you should not do that.p

Oof, I don't have the implementation of it in L1 cache right now.  May
be.  May be not.  The implementation was reviewed closely at the time,
including some extensive performance tests in xdisp.c, coordinated by
Eli.

But looking summarily at the code it doesn't seem to be
"unconditionally" at all.  There are four different conditions that have
to be cumulatively verified before that invocation of syntax-ppss takes
place.  And I seem to remember that syntax-ppss isn't very expensive
anyway.

>> Regardless, I would file a bug if I saw that behaviour :-) ,
>
> ... even with `open-paren-in-column-0-is-defun-start' non-nil?

Depends on whether parts of Emacs's _require_ it to be non-nil.

> But `open-paren-in-column-0-is-defun-start' is a customizable variable.
> Which means that a user should be able to use it to control the behavior
> of Emacs in this regard.  And it is t by default for no obvious
> reason.

I didn't know it was customizbale (or rather didn't bother to check, I
admit).  But again, I seem to remember that customizing it back to nil
wasn't an option and that Emacs would break.  Maybe that has changed?
Is it really truly optional?

João

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-13 10:06                                   ` João Távora
@ 2021-06-13 14:52                                     ` martin rudalics
  2021-06-13 15:25                                       ` João Távora
  0 siblings, 1 reply; 274+ messages in thread
From: martin rudalics @ 2021-06-13 14:52 UTC (permalink / raw)
  To: João Távora
  Cc: Richard Stallman, emacs-devel, Stefan Monnier, Alan Mackenzie,
	Eli Zaretskii, Daniel Colascione

 > But looking summarily at the code it doesn't seem to be
 > "unconditionally" at all.  There are four different conditions that have
 > to be cumulatively verified before that invocation of syntax-ppss takes
 > place.

I think you're right - it seems to end up in "same-line".  The two
`line-beginning-position' calls and the `copy-marker' are gratuitous
though.  This should become a minor mode and crowd `post-command-hook'
only if enabled - otherwise you needlessly punish electric users.  And
it should be documented somewhere.

 > I didn't know it was customizbale (or rather didn't bother to check, I
 > admit).  But again, I seem to remember that customizing it back to nil
 > wasn't an option and that Emacs would break.  Maybe that has changed?
 > Is it really truly optional?

Sure.

martin



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-13 14:52                                     ` martin rudalics
@ 2021-06-13 15:25                                       ` João Távora
  2021-06-14  8:29                                         ` martin rudalics
  0 siblings, 1 reply; 274+ messages in thread
From: João Távora @ 2021-06-13 15:25 UTC (permalink / raw)
  To: martin rudalics
  Cc: Richard Stallman, emacs-devel, Stefan Monnier, Alan Mackenzie,
	Eli Zaretskii, Daniel Colascione

martin rudalics <rudalics@gmx.at> writes:

> The two `line-beginning-position' calls and the `copy-marker' are
> gratuitous though.

One good way to demonstrate such a claim that is to show a patch that
doesn't change behaviour where this gratutiousness isn't present.  I'll
be happy to merge it.

> This should become a minor mode and crowd `post-command-hook'
> only if enabled - otherwise you needlessly punish electric users.  

There is the variable jit-lock-antiblink-grace.  electric-pair-mode --
which I also designed in its current form -- isn't on by default.  The
code in jit-lock-antiblink-grace could theoretically check for it, but
IMO it's a non-issue: as I mentioned I benchmarked the effects of this
in large files according to specific instructions by Eli who was also
concerned about the performance hit and found no evidence of any kind of
punishment.

> And it should be documented somewhere.

See C-h v jit-lock-antiblink-grace.

>> I didn't know it was customizbale (or rather didn't bother to check, I
>> admit).  But again, I seem to remember that customizing it back to nil
>> wasn't an option and that Emacs would break.  Maybe that has changed?
>> Is it really truly optional?
>
> Sure.

If it's a performance-only optimization with non-zero functional
detriment it should be off by default.

João

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-13 15:25                                       ` João Távora
@ 2021-06-14  8:29                                         ` martin rudalics
  2021-06-14  8:40                                           ` João Távora
                                                             ` (2 more replies)
  0 siblings, 3 replies; 274+ messages in thread
From: martin rudalics @ 2021-06-14  8:29 UTC (permalink / raw)
  To: João Távora
  Cc: Richard Stallman, emacs-devel, Stefan Monnier, Alan Mackenzie,
	Eli Zaretskii, Daniel Colascione

 > I benchmarked the effects of this
 > in large files according to specific instructions by Eli who was also
 > concerned about the performance hit and found no evidence of any kind of
 > punishment.

Why on earth should file size have any impact on the performance of this
option?  It introduces a constant overhead that affects every single
character a user types.  You simply delay unfontifying the remaining
part of a buffer until editing leaves the current line.

 >> And it should be documented somewhere.
 >
 > See C-h v jit-lock-antiblink-grace.

It should be documented in at least one manual.

 > If it's a performance-only optimization with non-zero functional
 > detriment it should be off by default.

It is undocumented practice to steadily undermine the role of
`open-paren-in-column-0-is-defun-start' in all modes with
`beginning-of-defun' the only place left where it currently has any
impact.  For some reason, people are afraid though of setting it off by
default (or declaring it obsolete).

martin

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-14  8:29                                         ` martin rudalics
@ 2021-06-14  8:40                                           ` João Távora
  2021-06-14  9:00                                             ` martin rudalics
  2021-06-14 11:28                                           ` Eli Zaretskii
  2021-06-14 14:39                                           ` Stefan Monnier
  2 siblings, 1 reply; 274+ messages in thread
From: João Távora @ 2021-06-14  8:40 UTC (permalink / raw)
  To: martin rudalics
  Cc: Richard Stallman, emacs-devel, Stefan Monnier, Alan Mackenzie,
	Eli Zaretskii, Daniel Colascione

> On Mon, Jun 14, 2021 at 9:29 AM martin rudalics <rudalics@gmx.at> wrote:
>
>  > I benchmarked the effects of this
>  > in large files according to specific instructions by Eli who was also
>  > concerned about the performance hit and found no evidence of any kind of
>  > punishment.
>
> Why on earth should file size have any impact on the performance of this
> option?

A question for Eli perhaps?  The file was xdisp.c where jit-lock is supposedly
quite demanding.

>  It introduces a constant overhead that affects every single
> character a user types.

And that overhead was found to be negligible.  Even small enough
not be easily measured.  In the meantime, if you are concerned with
this, you can produce a benchmark that evidences this overhead.

> > See C-h v jit-lock-antiblink-grace.
> It should be documented in at least one manual.

It could, but other customization options such as jit-lock-stealth-load and
jit-lock-contextually, jut-lock-stealth-nice, jit-lock-stealth-time,
and probably
others aren't.  Not all defcustom's have manual entries (for the record, IMO
this is a good idea: the manual should have a different reading cadence).
But no problem if you want to make a patch that basically adds the contents
of the docstring to display.texi.

João

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-14  8:40                                           ` João Távora
@ 2021-06-14  9:00                                             ` martin rudalics
  2021-06-14  9:14                                               ` João Távora
  0 siblings, 1 reply; 274+ messages in thread
From: martin rudalics @ 2021-06-14  9:00 UTC (permalink / raw)
  To: João Távora
  Cc: Richard Stallman, emacs-devel, Stefan Monnier, Alan Mackenzie,
	Eli Zaretskii, Daniel Colascione

 >> It should be documented in at least one manual.
 >
 > It could, but other customization options such as jit-lock-stealth-load and
 > jit-lock-contextually, jut-lock-stealth-nice, jit-lock-stealth-time,
 > and probably
 > others aren't.  Not all defcustom's have manual entries (for the record, IMO
 > this is a good idea: the manual should have a different reading cadence).

It's a bad state of affairs and nobody seems to care.  The "Multiline
Font Lock Constructs" section, for example, painstakingly explains how
to use `jit-lock-contextually' to handle multiline constructs but does
not bother whether that option has been ever described anywhere.

 > But no problem if you want to make a patch that basically adds the contents
 > of the docstring to display.texi.

You probably should not have followed the bad example of others.  But
I'll stop here telling you how a new option should be introduced.  Sorry
if I have bothered you, it was not my intention.

martin

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-14  9:00                                             ` martin rudalics
@ 2021-06-14  9:14                                               ` João Távora
  0 siblings, 0 replies; 274+ messages in thread
From: João Távora @ 2021-06-14  9:14 UTC (permalink / raw)
  To: martin rudalics
  Cc: Richard Stallman, emacs-devel, Stefan Monnier, Alan Mackenzie,
	Eli Zaretskii, Daniel Colascione

On Mon, Jun 14, 2021 at 10:00 AM martin rudalics <rudalics@gmx.at> wrote:

>  > But no problem if you want to make a patch that basically adds the contents
>  > of the docstring to display.texi.
>
> You probably should not have followed the bad example of others.  But
> I'll stop here telling you how a new option should be introduced.  Sorry
> if I have bothered you, it was not my intention.

You haven't bothered me, don't worry.

For the record, I wasn't following any examples, rather thinking
independently.  Maybe we just have differing views on how  the
Emacs manual should work.  Here, it wouldn't shock me to see
antiblink mentioned in it, but it doesn't shock me to see it absent
either.

João

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-14  8:29                                         ` martin rudalics
  2021-06-14  8:40                                           ` João Távora
@ 2021-06-14 11:28                                           ` Eli Zaretskii
  2021-06-14 14:39                                           ` Stefan Monnier
  2 siblings, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-06-14 11:28 UTC (permalink / raw)
  To: martin rudalics; +Cc: rms, emacs-devel, joaotavora, acm, dancol, monnier

> From: martin rudalics <rudalics@gmx.at>
> Date: Mon, 14 Jun 2021 10:29:27 +0200
> Cc: Richard Stallman <rms@gnu.org>, emacs-devel <emacs-devel@gnu.org>,
>  Stefan Monnier <monnier@iro.umontreal.ca>, Alan Mackenzie <acm@muc.de>,
>  Eli Zaretskii <eliz@gnu.org>, Daniel Colascione <dancol@dancol.org>
> 
>  > See C-h v jit-lock-antiblink-grace.
> 
> It should be documented in at least one manual.

It's an obscure variable whose effect is not easy to explain to
users.  In my testing, its effect was insignificant, so I didn't see
the need to document it in the manual.  If you can show a use case
where its effect is significant, perhaps I will change my mind.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-14  8:29                                         ` martin rudalics
  2021-06-14  8:40                                           ` João Távora
  2021-06-14 11:28                                           ` Eli Zaretskii
@ 2021-06-14 14:39                                           ` Stefan Monnier
  2021-06-15 22:38                                             ` Ergus
  2 siblings, 1 reply; 274+ messages in thread
From: Stefan Monnier @ 2021-06-14 14:39 UTC (permalink / raw)
  To: martin rudalics
  Cc: João Távora, Richard Stallman, emacs-devel,
	Alan Mackenzie, Eli Zaretskii, Daniel Colascione

> It is undocumented practice to steadily undermine the role of
> `open-paren-in-column-0-is-defun-start' in all modes with
> `beginning-of-defun' the only place left where it currently has any
> impact.

You seem to attribute malice to the perpetrators (e.g. yours truly ;-).

Here's my reasoning:

`open-paren-in-column-0-is-defun-start` was used at a few different
places which feel into two categories:

1- `beginning-of-defun`, where the effect is clear, deterministic, and reliable.
2- The rest (mostly `back_comment` in src/syntax.c, but also in some
   parts of font-lock which used `beginning-of-defun`) where the effect
   was not clear and reliable, it was a form of optimization which took
   effect in some cases but not all.

Part (2) has disappeared now, replaced by the `syntax-ppss` cache which
gives more reliable optimization (both in the sense that it speeds
things up more reliably and that it gives a more reliable behavior).

You liked some of the side-effects of (2), apparently.  I can agree with
that, but the old code did not really provide the feature you describe
(e.g. an unclosed comment/string in one defun would not magically stop
from "bleeding" into the next defun, although in some cases it indeed
did stop bleeding at some buffer position which depended on how the
chunks of text happened to be rehighlighted).

So, I suggest you implement the behavior you describe (you might be able
to do that fairly easily by taking some of the code used for
multi-major-mode support (since those also need to confine
syntax-propertization and font-locking in separate blocks)).

        Stefan

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-14 14:39                                           ` Stefan Monnier
@ 2021-06-15 22:38                                             ` Ergus
  0 siblings, 0 replies; 274+ messages in thread
From: Ergus @ 2021-06-15 22:38 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: martin rudalics, João Távora, Richard Stallman,
	emacs-devel, Alan Mackenzie, Eli Zaretskii, Daniel Colascione

Hi Stefan:

Finally anyone wrote to the emacs-tree-sitter module package to request
them to join to elpa and do the paperwork?



On Mon, Jun 14, 2021 at 10:39:49AM -0400, Stefan Monnier wrote:
>> It is undocumented practice to steadily undermine the role of
>> `open-paren-in-column-0-is-defun-start' in all modes with
>> `beginning-of-defun' the only place left where it currently has any
>> impact.
>
>You seem to attribute malice to the perpetrators (e.g. yours truly ;-).
>
>Here's my reasoning:
>
>`open-paren-in-column-0-is-defun-start` was used at a few different
>places which feel into two categories:
>
>1- `beginning-of-defun`, where the effect is clear, deterministic, and reliable.
>2- The rest (mostly `back_comment` in src/syntax.c, but also in some
>   parts of font-lock which used `beginning-of-defun`) where the effect
>   was not clear and reliable, it was a form of optimization which took
>   effect in some cases but not all.
>
>Part (2) has disappeared now, replaced by the `syntax-ppss` cache which
>gives more reliable optimization (both in the sense that it speeds
>things up more reliably and that it gives a more reliable behavior).
>
>You liked some of the side-effects of (2), apparently.  I can agree with
>that, but the old code did not really provide the feature you describe
>(e.g. an unclosed comment/string in one defun would not magically stop
>from "bleeding" into the next defun, although in some cases it indeed
>did stop bleeding at some buffer position which depended on how the
>chunks of text happened to be rehighlighted).
>
>So, I suggest you implement the behavior you describe (you might be able
>to do that fairly easily by taking some of the code used for
>multi-major-mode support (since those also need to confine
>syntax-propertization and font-locking in separate blocks)).
>
>
>        Stefan
>
>



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-10  2:21                                       ` Daniel Colascione
@ 2021-06-19  9:25                                         ` Alan Mackenzie
  2021-06-19 15:24                                           ` Alan Mackenzie
  0 siblings, 1 reply; 274+ messages in thread
From: Alan Mackenzie @ 2021-06-19  9:25 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: rudalics, Eli Zaretskii, monnier, rms, emacs-devel

Hello, Daniel.

On Wed, Jun 09, 2021 at 19:21:20 -0700, Daniel Colascione wrote:


> On June 9, 2021 1:20:32 PM Alan Mackenzie <acm@muc.de> wrote:

[ .... ]

> > You could instead try to specify which tokens should get
> > font-lock-type-face and which shouldn't, thus giving something
> > concrete to discuss.  I think this will be difficult to do well, and
> > may lead to the result which I alluded to above.

> Sure. To be more precise: what I propose is not applying 
> font-lock-type-face to symbols when we think that symbol is a type solely 
> because it's been entered into cc-mode's table of dynamically discovered 
> types for the current buffer.

OK, I'll make a trial implementation of this, controlled by a user option
to switch it on and off.  Give me just a little time.

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-19  9:25                                         ` Alan Mackenzie
@ 2021-06-19 15:24                                           ` Alan Mackenzie
  2021-07-09 14:06                                             ` Daniel Colascione
  0 siblings, 1 reply; 274+ messages in thread
From: Alan Mackenzie @ 2021-06-19 15:24 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: rudalics, Eli Zaretskii, emacs-devel, monnier, rms

Hello again, Daniel.

On Sat, Jun 19, 2021 at 09:25:30 +0000, Alan Mackenzie wrote:
> On Wed, Jun 09, 2021 at 19:21:20 -0700, Daniel Colascione wrote:


> > On June 9, 2021 1:20:32 PM Alan Mackenzie <acm@muc.de> wrote:

> [ .... ]

> > > You could instead try to specify which tokens should get
> > > font-lock-type-face and which shouldn't, thus giving something
> > > concrete to discuss.  I think this will be difficult to do well, and
> > > may lead to the result which I alluded to above.

> > Sure. To be more precise: what I propose is not applying 
> > font-lock-type-face to symbols when we think that symbol is a type solely 
> > because it's been entered into cc-mode's table of dynamically discovered 
> > types for the current buffer.

> OK, I'll make a trial implementation of this, controlled by a user option
> to switch it on and off.  Give me just a little time.

Would you please try the following, setting the new user option to nil,
and let me know how well it meets expectations.  Thanks!


diff -r 92a4592886a1 cc-engine.el
--- a/cc-engine.el	Sun Apr 25 17:26:38 2021 +0000
+++ b/cc-engine.el	Sat Jun 19 15:15:37 2021 +0000
@@ -10441,6 +10441,8 @@
 		 ;; There seems no reason to exclude a token from
 		 ;; fontification just because it's "a known type that can't
 		 ;; be a name or other expression".  2013-09-18.
+		 (or c-fontify-found-types
+		     (not (eq at-type 'found)))
 		 )
 	(let ((c-promote-possible-types t))
 	  (save-excursion
diff -r 92a4592886a1 cc-vars.el
--- a/cc-vars.el	Sun Apr 25 17:26:38 2021 +0000
+++ b/cc-vars.el	Sat Jun 19 15:15:37 2021 +0000
@@ -1639,6 +1639,15 @@
   :type 'c-extra-types-widget
   :group 'c)
 
+(defcustom c-fontify-found-types t
+  "If this variable is non-nil \"found types\" will be fontified as types.
+A \"found type\" is a symbol which is identified as a type by its
+context in the source code.  `c-fontify-found-types' non-nil then
+causes the same symbol to be fontified elsewhere as a type even
+where its context is ambiguous."
+  :type 'boolean
+  :group 'c)
+
 (defcustom c-asymmetry-fontification-flag t
   "Whether to fontify certain ambiguous constructs by white space asymmetry.
 

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-06-19 15:24                                           ` Alan Mackenzie
@ 2021-07-09 14:06                                             ` Daniel Colascione
  2021-07-11 18:12                                               ` Stephen Leake
  0 siblings, 1 reply; 274+ messages in thread
From: Daniel Colascione @ 2021-07-09 14:06 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: rudalics, Eli Zaretskii, monnier, rms, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1297 bytes --]

On 6/19/21 8:24 AM, Alan Mackenzie wrote:
>>> Sure. To be more precise: what I propose is not applying
>>> font-lock-type-face to symbols when we think that symbol is a type solely
>>> because it's been entered into cc-mode's table of dynamically discovered
>>> types for the current buffer.
>> OK, I'll make a trial implementation of this, controlled by a user option
>> to switch it on and off.  Give me just a little time.
> Would you please try the following, setting the new user option to nil,
> and let me know how well it meets expectations.  Thanks!
>
>
> diff -r 92a4592886a1 cc-engine.el
> --- a/cc-engine.el	Sun Apr 25 17:26:

Thanks. This patch is an improvement in the sense that there's more 
consistency in fontification. cc-mode is still very inconsistent in 
fontification though. Take a look at the attached screenshot, which is 
of this program:

void
deallocate_one_arg(void* const ptr_to_arg,
                    MethodProgramPc& pc);

void
deallocate_one_arg(void* const ptr_to_arg,
                    MethodProgramPc& pc);

I'm confused about why "MethodProgramPc" is fontified as a type in the 
first prototype but not the second. I still wish we had a 
general-purpose mechanism for ensuring that problems like this didn't 
happen.



[-- Attachment #2: prototypes.png --]
[-- Type: image/png, Size: 26648 bytes --]

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-07-09 14:06                                             ` Daniel Colascione
@ 2021-07-11 18:12                                               ` Stephen Leake
  2021-07-15 18:13                                                 ` Perry E. Metzger
  0 siblings, 1 reply; 274+ messages in thread
From: Stephen Leake @ 2021-07-11 18:12 UTC (permalink / raw)
  To: Daniel Colascione
  Cc: rms, emacs-devel, rudalics, monnier, Alan Mackenzie,
	Eli Zaretskii

Daniel Colascione <dancol@dancol.org> writes:

> ...  cc-mode is still very inconsistent in
> fontification though. Take a look at the attached screenshot, which is
> of this program:
>
> void
> deallocate_one_arg(void* const ptr_to_arg,
>                    MethodProgramPc& pc);
>
> void
> deallocate_one_arg(void* const ptr_to_arg,
>                    MethodProgramPc& pc);
>
> I'm confused about why "MethodProgramPc" is fontified as a type in the
> first prototype but not the second. I still wish we had a
> general-purpose mechanism for ensuring that problems like this didn't
> happen.

The ELPA package wisi provides a parser-based fontification engine,
which makes things like this more consistent. It is currently used for
ada-mode. Note that syntax errors in the source can cause bad
fontification, but the wisi parser has very robust error-correction, so
it usually does a good job even with syntax errors.

However, parsing C and C++ is complicated by macros; wisi makes no
provision for that.

Another option is an LSP based system via the EPLA package eglot; that
delegates fontification to the language server. I don't know how well
the C/C++ language servers perform for this.

-- 
-- Stephe

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: cc-mode fontification feels random
  2021-07-11 18:12                                               ` Stephen Leake
@ 2021-07-15 18:13                                                 ` Perry E. Metzger
  2021-07-15 22:43                                                   ` Tree Sitter (was Re: cc-mode fontification feels random) Perry E. Metzger
  0 siblings, 1 reply; 274+ messages in thread
From: Perry E. Metzger @ 2021-07-15 18:13 UTC (permalink / raw)
  To: emacs-devel


On 7/11/21 14:12, Stephen Leake wrote:
> The ELPA package wisi provides a parser-based fontification engine,
> which makes things like this more consistent. It is currently used for
> ada-mode. Note that syntax errors in the source can cause bad
> fontification, but the wisi parser has very robust error-correction, so
> it usually does a good job even with syntax errors.
>
> However, parsing C and C++ is complicated by macros; wisi makes no
> provision for that.
>
> Another option is an LSP based system via the EPLA package eglot; that
> delegates fontification to the language server. I don't know how well
> the C/C++ language servers perform for this.

Using LSP for fontification is unfortunately not sufficiently high 
performance. LSP is really intended for things like providing type 
information or enabling refactorings.

I note that several other modern editors now make use of the "Tree 
Sitter" library (see https://github.com/tree-sitter/tree-sitter ) which 
was designed explicitly to provide a C library for incremental 
programming language parsing for text editors. It allows for very 
consistent fontification in other editors like Atom, and is available 
under the MIT license, which would permit it to be included in Emacs.

A very good presentation a few years ago by the author, including an 
explanation of how Tree Sitter enables high quality fontification in 
editors like Atom, can be viewed on youtube: 
https://www.youtube.com/watch?v=Jes3bD6P0To

Perry





^ permalink raw reply	[flat|nested] 274+ messages in thread

* Tree Sitter (was Re: cc-mode fontification feels random)
  2021-07-15 18:13                                                 ` Perry E. Metzger
@ 2021-07-15 22:43                                                   ` Perry E. Metzger
  2021-07-19 23:49                                                     ` Stephen Leake
  0 siblings, 1 reply; 274+ messages in thread
From: Perry E. Metzger @ 2021-07-15 22:43 UTC (permalink / raw)
  To: emacs-devel

On 7/15/21 14:13, Perry E. Metzger wrote:
> Using LSP for fontification is unfortunately not sufficiently high 
> performance. LSP is really intended for things like providing type 
> information or enabling refactorings.
>
> I note that several other modern editors now make use of the "Tree 
> Sitter" library (see https://github.com/tree-sitter/tree-sitter ) 
> which was designed explicitly to provide a C library for incremental 
> programming language parsing for text editors. It allows for very 
> consistent fontification in other editors like Atom, and is available 
> under the MIT license, which would permit it to be included in Emacs.
>
> A very good presentation a few years ago by the author, including an 
> explanation of how Tree Sitter enables high quality fontification in 
> editors like Atom, can be viewed on youtube: 
> https://www.youtube.com/watch?v=Jes3bD6P0To
>
Apologies for not having been present for, er, the extensive previous 
discussion on Tree Sitter. I discovered it looking at the archives. I 
still believe that it would be a great thing to integrate into the base 
of Emacs. The algorithms it employs are excellent, it's extremely fast, 
and it handles the issues of real editors (like dealing with partial 
code fragments and incrementally changing the parse on every keystroke) 
very efficiently.

Perry





^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: Tree Sitter (was Re: cc-mode fontification feels random)
  2021-07-15 22:43                                                   ` Tree Sitter (was Re: cc-mode fontification feels random) Perry E. Metzger
@ 2021-07-19 23:49                                                     ` Stephen Leake
  2021-07-20 14:53                                                       ` Perry E. Metzger
  0 siblings, 1 reply; 274+ messages in thread
From: Stephen Leake @ 2021-07-19 23:49 UTC (permalink / raw)
  To: Perry E. Metzger; +Cc: emacs-devel

"Perry E. Metzger" <perry@piermont.com> writes:

> Apologies for not having been present for, er, the extensive previous
> discussion on Tree Sitter. I discovered it looking at the archives. 

Ok.

> I still believe that it would be a great thing to integrate into the
> base of Emacs. The algorithms it employs are excellent, it's extremely
> fast, and it handles the issues of real editors (like dealing with
> partial code fragments and incrementally changing the parse on every
> keystroke) very efficiently.

+1.

I'm working on adding incremental parse to wisi (and have been for
almost a year now ...). I believe wisi has stronger error recovery than
tree-sitter, which allows it to support indentation.

-- 
-- Stephe



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: Tree Sitter (was Re: cc-mode fontification feels random)
  2021-07-19 23:49                                                     ` Stephen Leake
@ 2021-07-20 14:53                                                       ` Perry E. Metzger
  2021-07-21  0:04                                                         ` Stephen Leake
  0 siblings, 1 reply; 274+ messages in thread
From: Perry E. Metzger @ 2021-07-20 14:53 UTC (permalink / raw)
  To: Stephen Leake; +Cc: emacs-devel

On 7/19/21 19:49, Stephen Leake wrote:
> "Perry E. Metzger" <perry@piermont.com> writes:
>
>> Apologies for not having been present for, er, the extensive previous
>> discussion on Tree Sitter. I discovered it looking at the archives.
> Ok.
>
>> I still believe that it would be a great thing to integrate into the
>> base of Emacs. The algorithms it employs are excellent, it's extremely
>> fast, and it handles the issues of real editors (like dealing with
>> partial code fragments and incrementally changing the parse on every
>> keystroke) very efficiently.
> +1.
>
> I'm working on adding incremental parse to wisi (and have been for
> almost a year now ...). I believe wisi has stronger error recovery than
> tree-sitter, which allows it to support indentation.


Tree sitter can reparse an entire file in a few milliseconds. This is 
almost impossible to achieve in elisp I suspect. Because of this, it can 
reparse on every keystroke, which is quite an astonishing thing.


Perry




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: Tree Sitter (was Re: cc-mode fontification feels random)
  2021-07-20 14:53                                                       ` Perry E. Metzger
@ 2021-07-21  0:04                                                         ` Stephen Leake
  2021-07-21  1:28                                                           ` Stefan Monnier
  2021-07-22 14:00                                                           ` Perry E. Metzger
  0 siblings, 2 replies; 274+ messages in thread
From: Stephen Leake @ 2021-07-21  0:04 UTC (permalink / raw)
  To: Perry E. Metzger; +Cc: emacs-devel

"Perry E. Metzger" <perry@piermont.com> writes:

> On 7/19/21 19:49, Stephen Leake wrote:
>> "Perry E. Metzger" <perry@piermont.com> writes:
>>
>>> Apologies for not having been present for, er, the extensive previous
>>> discussion on Tree Sitter. I discovered it looking at the archives.
>> Ok.
>>
>>> I still believe that it would be a great thing to integrate into the
>>> base of Emacs. The algorithms it employs are excellent, it's extremely
>>> fast, and it handles the issues of real editors (like dealing with
>>> partial code fragments and incrementally changing the parse on every
>>> keystroke) very efficiently.
>> +1.
>>
>> I'm working on adding incremental parse to wisi (and have been for
>> almost a year now ...). I believe wisi has stronger error recovery than
>> tree-sitter, which allows it to support indentation.
>
>
> Tree sitter can reparse an entire file in a few milliseconds. This is
> almost impossible to achieve in elisp I suspect. 

Yes. wisi.el is an elisp interface to an external process that is
implemented in Ada. At some point, it will also support an internal
module, which will avoid having to send text; larger files will be
faster. When I finish implementing incremental parse, it should be as
fast as tree-sitter.

> Because of this, it can reparse on every keystroke, which is quite an
> astonishing thing.

There are some reports in the tree-sitter issues of reparsing taking
longer, on large files. So there are some parts of the algorithm that
are proportional to the buffer length, while most of the algorithm is
proportional to the changes length. 

Consider; if the parse tree stores absolute buffer position for each
token, then when you insert 5 chars at the beginning of the buffer, the
buffer position of every node in the tree must be shifted by 5 chars.
That process is linear in the length of the buffer (it can also be very
fast).

Alternately, you can only store the length of each token (as tree-sitter
does); then when processing queries, you have to add up all the lengths
of the preceding tokens in order to report the buffer position of the
information you are computing. That is also linear in the length of the
buffer.

We'll have to see how fast wisi is; I'm making good progress in my
testing, but there are still a few non-incremental algorithms to convert.

-- 
-- Stephe

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: Tree Sitter (was Re: cc-mode fontification feels random)
  2021-07-21  0:04                                                         ` Stephen Leake
@ 2021-07-21  1:28                                                           ` Stefan Monnier
  2021-07-21 14:43                                                             ` Perry E. Metzger
  2021-07-22 14:00                                                           ` Perry E. Metzger
  1 sibling, 1 reply; 274+ messages in thread
From: Stefan Monnier @ 2021-07-21  1:28 UTC (permalink / raw)
  To: Stephen Leake; +Cc: Perry E. Metzger, emacs-devel

> Alternately, you can only store the length of each token (as tree-sitter
> does); then when processing queries, you have to add up all the lengths
> of the preceding tokens in order to report the buffer position of the
> information you are computing. That is also linear in the length of the
> buffer.

You can probably get better than linear performance in "the usual case"
by storing the total length of the subtree at each node of the AST.
It's still theoretically linear in the worst case, of course.


        Stefan




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: Tree Sitter (was Re: cc-mode fontification feels random)
  2021-07-21  1:28                                                           ` Stefan Monnier
@ 2021-07-21 14:43                                                             ` Perry E. Metzger
  2021-07-21 16:21                                                               ` Daniel Colascione
  0 siblings, 1 reply; 274+ messages in thread
From: Perry E. Metzger @ 2021-07-21 14:43 UTC (permalink / raw)
  To: Stefan Monnier, Stephen Leake; +Cc: emacs-devel

On 7/20/21 21:28, Stefan Monnier wrote:
>> Alternately, you can only store the length of each token (as tree-sitter
>> does); then when processing queries, you have to add up all the lengths
>> of the preceding tokens in order to report the buffer position of the
>> information you are computing. That is also linear in the length of the
>> buffer.
> You can probably get better than linear performance in "the usual case"
> by storing the total length of the subtree at each node of the AST.
> It's still theoretically linear in the worst case, of course.
>
Thought I would note that there's a substantial literature now on 
incremental parsing, especially the sort that is needed for editor 
tools. One doesn't need to reinvent the algorithms, they're out there 
waiting to be used. The Tree Sitter project is based on previous 
published work.

There are good links at the end of the 
https://tree-sitter.github.io/tree-sitter/ web page, but I thought I'd 
link to some a couple of them directly:

Practical Algorithms for Incremental Software Development Environments
https://www2.eecs.berkeley.edu/Pubs/TechRpts/1997/CSD-97-946.pdf

Incremental Analysis of Real Programming Languages
https://www.semanticscholar.org/paper/Incremental-analysis-of-real-programming-languages-Wagner-Graham/ca69018c29cc415820ed207d7e1d391e2da1656f?p2df

There's also this paper on error recovery for LR parsers, but for some 
reason it won't load for me right now.
http://www.dtic.mil/dtic/tr/fulltext/u2/a043470.pdf


Perry




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: Tree Sitter (was Re: cc-mode fontification feels random)
  2021-07-21 14:43                                                             ` Perry E. Metzger
@ 2021-07-21 16:21                                                               ` Daniel Colascione
  2021-07-21 19:15                                                                 ` Perry E. Metzger
  0 siblings, 1 reply; 274+ messages in thread
From: Daniel Colascione @ 2021-07-21 16:21 UTC (permalink / raw)
  To: Perry E. Metzger, Stefan Monnier, Stephen Leake; +Cc: emacs-devel

On 7/21/21 7:43 AM, Perry E. Metzger wrote:
> On 7/20/21 21:28, Stefan Monnier wrote:
>>> Alternately, you can only store the length of each token (as 
>>> tree-sitter
>>> does); then when processing queries, you have to add up all the lengths
>>> of the preceding tokens in order to report the buffer position of the
>>> information you are computing. That is also linear in the length of the
>>> buffer.
>> You can probably get better than linear performance in "the usual case"
>> by storing the total length of the subtree at each node of the AST.
>> It's still theoretically linear in the worst case, of course.
>>
> Thought I would note that there's a substantial literature now on 
> incremental parsing, especially the sort that is needed for editor 
> tools. One doesn't need to reinvent the algorithms, they're out there 
> waiting to be used. The Tree Sitter project is based on previous 
> published work.

There is indeed a big literature! I wish there were a bigger literature 
on *composable* incremental parsers though. IMHO, what we need is an 
incremental GLR system (yes, GLR is bad worst-case, but it's not a 
practical concern) that spits out a parse *forest* which we then pare 
down to a parse tree with ad-hoc syntactic consistency rules. Something 
like this naturally supports multi-language modes and incorporation of 
out-of-band semantic information.





^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: Tree Sitter (was Re: cc-mode fontification feels random)
  2021-07-21 16:21                                                               ` Daniel Colascione
@ 2021-07-21 19:15                                                                 ` Perry E. Metzger
  2021-07-22  1:16                                                                   ` Daniel Colascione
  0 siblings, 1 reply; 274+ messages in thread
From: Perry E. Metzger @ 2021-07-21 19:15 UTC (permalink / raw)
  To: Daniel Colascione, Stefan Monnier, Stephen Leake; +Cc: emacs-devel

On 7/21/21 12:21, Daniel Colascione wrote:
> On 7/21/21 7:43 AM, Perry E. Metzger wrote:
>> Thought I would note that there's a substantial literature now on 
>> incremental parsing, especially the sort that is needed for editor 
>> tools. One doesn't need to reinvent the algorithms, they're out there 
>> waiting to be used. The Tree Sitter project is based on previous 
>> published work.
>
> There is indeed a big literature! I wish there were a bigger 
> literature on *composable* incremental parsers though. IMHO, what we 
> need is an incremental GLR system (yes, GLR is bad worst-case, but 
> it's not a practical concern) that spits out a parse *forest* which we 
> then pare down to a parse tree with ad-hoc syntactic consistency 
> rules. Something like this naturally supports multi-language modes and 
> incorporation of out-of-band semantic information.
>
Tree sitter handles GLR.

Perry




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: Tree Sitter (was Re: cc-mode fontification feels random)
  2021-07-21 19:15                                                                 ` Perry E. Metzger
@ 2021-07-22  1:16                                                                   ` Daniel Colascione
  2021-07-22 13:18                                                                     ` Perry E. Metzger
                                                                                       ` (2 more replies)
  0 siblings, 3 replies; 274+ messages in thread
From: Daniel Colascione @ 2021-07-22  1:16 UTC (permalink / raw)
  To: Perry E. Metzger, Stefan Monnier, Stephen Leake; +Cc: emacs-devel


On 7/21/21 12:15 PM, Perry E. Metzger wrote:
> On 7/21/21 12:21, Daniel Colascione wrote:
>> On 7/21/21 7:43 AM, Perry E. Metzger wrote:
>>> Thought I would note that there's a substantial literature now on 
>>> incremental parsing, especially the sort that is needed for editor 
>>> tools. One doesn't need to reinvent the algorithms, they're out 
>>> there waiting to be used. The Tree Sitter project is based on 
>>> previous published work.
>>
>> There is indeed a big literature! I wish there were a bigger 
>> literature on *composable* incremental parsers though. IMHO, what we 
>> need is an incremental GLR system (yes, GLR is bad worst-case, but 
>> it's not a practical concern) that spits out a parse *forest* which 
>> we then pare down to a parse tree with ad-hoc syntactic consistency 
>> rules. Something like this naturally supports multi-language modes 
>> and incorporation of out-of-band semantic information.
>>
> Tree sitter handles GLR.
>

Cool. How does it prune the parse forest?




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: Tree Sitter (was Re: cc-mode fontification feels random)
  2021-07-22  1:16                                                                   ` Daniel Colascione
@ 2021-07-22 13:18                                                                     ` Perry E. Metzger
  2021-07-22 13:49                                                                     ` Yuan Fu
  2021-07-24 20:05                                                                     ` [SPAM UNSURE] " Stephen Leake
  2 siblings, 0 replies; 274+ messages in thread
From: Perry E. Metzger @ 2021-07-22 13:18 UTC (permalink / raw)
  To: Daniel Colascione, Stefan Monnier, Stephen Leake; +Cc: emacs-devel

On 7/21/21 21:16, Daniel Colascione wrote:
>
> On 7/21/21 12:15 PM, Perry E. Metzger wrote:
>> On 7/21/21 12:21, Daniel Colascione wrote:
>>> There is indeed a big literature! I wish there were a bigger 
>>> literature on *composable* incremental parsers though. IMHO, what we 
>>> need is an incremental GLR system (yes, GLR is bad worst-case, but 
>>> it's not a practical concern) that spits out a parse *forest* which 
>>> we then pare down to a parse tree with ad-hoc syntactic consistency 
>>> rules. Something like this naturally supports multi-language modes 
>>> and incorporation of out-of-band semantic information.
>>>
>> Tree sitter handles GLR.
>>
>
> Cool. How does it prune the parse forest?

I'd read the papers it is based on (and the documentation), they'll do a 
better job than me.

Perry





^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: Tree Sitter (was Re: cc-mode fontification feels random)
  2021-07-22  1:16                                                                   ` Daniel Colascione
  2021-07-22 13:18                                                                     ` Perry E. Metzger
@ 2021-07-22 13:49                                                                     ` Yuan Fu
  2021-07-24 20:05                                                                     ` [SPAM UNSURE] " Stephen Leake
  2 siblings, 0 replies; 274+ messages in thread
From: Yuan Fu @ 2021-07-22 13:49 UTC (permalink / raw)
  To: Daniel Colascione
  Cc: emacs-devel, Stephen Leake, Stefan Monnier, Perry E. Metzger

[-- Attachment #1: Type: text/plain, Size: 1193 bytes --]



> On Jul 21, 2021, at 9:16 PM, Daniel Colascione <dancol@dancol.org> wrote:
> 
> 
> On 7/21/21 12:15 PM, Perry E. Metzger wrote:
>> On 7/21/21 12:21, Daniel Colascione wrote:
>>> On 7/21/21 7:43 AM, Perry E. Metzger wrote:
>>>> Thought I would note that there's a substantial literature now on incremental parsing, especially the sort that is needed for editor tools. One doesn't need to reinvent the algorithms, they're out there waiting to be used. The Tree Sitter project is based on previous published work.
>>> 
>>> There is indeed a big literature! I wish there were a bigger literature on *composable* incremental parsers though. IMHO, what we need is an incremental GLR system (yes, GLR is bad worst-case, but it's not a practical concern) that spits out a parse *forest* which we then pare down to a parse tree with ad-hoc syntactic consistency rules. Something like this naturally supports multi-language modes and incorporation of out-of-band semantic information.
>>> 
>> Tree sitter handles GLR.
>> 
> 
> Cool. How does it prune the parse forest?

I’m not an expert, but the author talked about using grammar definition to reject branches in his talk.

Yuan

[-- Attachment #2: Type: text/html, Size: 4014 bytes --]

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: Tree Sitter (was Re: cc-mode fontification feels random)
  2021-07-21  0:04                                                         ` Stephen Leake
  2021-07-21  1:28                                                           ` Stefan Monnier
@ 2021-07-22 14:00                                                           ` Perry E. Metzger
  2021-07-24  1:17                                                             ` Richard Stallman
  2021-07-24 19:59                                                             ` Stephen Leake
  1 sibling, 2 replies; 274+ messages in thread
From: Perry E. Metzger @ 2021-07-22 14:00 UTC (permalink / raw)
  To: emacs-devel

On 7/20/21 20:04, Stephen Leake wrote:
> Yes. wisi.el is an elisp interface to an external process that is
> implemented in Ada. At some point, it will also support an internal
> module, which will avoid having to send text; larger files will be
> faster. When I finish implementing incremental parse, it should be as
> fast as tree-sitter.

I agree that Ada is in many ways a superior language to C -- C is horrid 
-- but Emacs is written in C, and having a core feature of Emacs depend 
on Ada code is unlikely to be widely acceptable. Everyone compiling 
Emacs has a C compiler, but not everyone has an Ada compiler. Making an 
Ada compiler a prerequisite to compiling Emacs might be controversial.

Perry

(*C + elisp, but of course the elisp infrastructure is written in C and 
ships with Emacs.)





^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: Tree Sitter (was Re: cc-mode fontification feels random)
  2021-07-22 14:00                                                           ` Perry E. Metzger
@ 2021-07-24  1:17                                                             ` Richard Stallman
  2021-07-25 16:13                                                               ` Stephen Leake
  2021-07-24 19:59                                                             ` Stephen Leake
  1 sibling, 1 reply; 274+ messages in thread
From: Richard Stallman @ 2021-07-24  1:17 UTC (permalink / raw)
  To: Perry E. Metzger; +Cc: emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

The features that the DoD liked so much about Ada, to me make it feel
very clunky.  You have to declare so much!

What advantages does wisi.el's Ada module have over Tree Sitter?

-- 
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: Tree Sitter (was Re: cc-mode fontification feels random)
  2021-07-22 14:00                                                           ` Perry E. Metzger
  2021-07-24  1:17                                                             ` Richard Stallman
@ 2021-07-24 19:59                                                             ` Stephen Leake
  2021-07-24 21:21                                                               ` OFF-TOPIC: Ada availability (was: Tree Sitter) Óscar Fuentes
  1 sibling, 1 reply; 274+ messages in thread
From: Stephen Leake @ 2021-07-24 19:59 UTC (permalink / raw)
  To: Perry E. Metzger; +Cc: emacs-devel

"Perry E. Metzger" <perry@piermont.com> writes:

> On 7/20/21 20:04, Stephen Leake wrote:
>> Yes. wisi.el is an elisp interface to an external process that is
>> implemented in Ada. At some point, it will also support an internal
>> module, which will avoid having to send text; larger files will be
>> faster. When I finish implementing incremental parse, it should be as
>> fast as tree-sitter.
>
> I agree that Ada is in many ways a superior language to C -- C is
> horrid -- but Emacs is written in C, and having a core feature of
> Emacs depend on Ada code is unlikely to be widely acceptable. 

Yes, I never intended wisi to be a core part of Emacs, and I can't stand
programming in C. But I suppose anything is possible.

> Everyone compiling Emacs has a C compiler, but not everyone has an Ada
> compiler. 

Actually, anyone that uses gcc can easily have an Ada compiler; it's
either already there or easily installed. 

-- 
-- Stephe



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: [SPAM UNSURE] Re: Tree Sitter (was Re: cc-mode fontification feels random)
  2021-07-22  1:16                                                                   ` Daniel Colascione
  2021-07-22 13:18                                                                     ` Perry E. Metzger
  2021-07-22 13:49                                                                     ` Yuan Fu
@ 2021-07-24 20:05                                                                     ` Stephen Leake
  2021-07-25  0:41                                                                       ` Daniel Colascione
  2021-07-25 18:01                                                                       ` Perry E. Metzger
  2 siblings, 2 replies; 274+ messages in thread
From: Stephen Leake @ 2021-07-24 20:05 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: emacs-devel, Stefan Monnier, Perry E. Metzger

Daniel Colascione <dancol@dancol.org> writes:

> On 7/21/21 12:15 PM, Perry E. Metzger wrote:
>> On 7/21/21 12:21, Daniel Colascione wrote:
>>> On 7/21/21 7:43 AM, Perry E. Metzger wrote:
>>>> Thought I would note that there's a substantial literature now on
>>>> incremental parsing, especially the sort that is needed for editor
>>>> tools. One doesn't need to reinvent the algorithms, they're out
>>>> there waiting to be used. The Tree Sitter project is based on
>>>> previous published work.
>>>
>>> There is indeed a big literature! I wish there were a bigger
>>> literature on *composable* incremental parsers though. IMHO, what
>>> we need is an incremental GLR system (yes, GLR is bad worst-case,
>>> but it's not a practical concern) that spits out a parse *forest*
>>> which we then pare down to a parse tree with ad-hoc syntactic
>>> consistency rules. Something like this naturally supports
>>> multi-language modes and incorporation of out-of-band semantic
>>> information.
>>>
>> Tree sitter handles GLR.
>>
>
> Cool. How does it prune the parse forest?

wisi also uses GLR. It prunes trees during parse when the parse stacks
contained in the trees are identical; it uses error recover cost and
length to decide which tree to delete, or picks one at random. It's an
error if more than one tree is alive at the end of parse. That's because
programming languages must be unambiguous. It would be possible to adapt
the wisi parser to use some other pruning strategy. 

-- 
-- Stephe



^ permalink raw reply	[flat|nested] 274+ messages in thread

* OFF-TOPIC: Ada availability (was: Tree Sitter)
  2021-07-24 19:59                                                             ` Stephen Leake
@ 2021-07-24 21:21                                                               ` Óscar Fuentes
  2021-07-25  7:31                                                                 ` tomas
  0 siblings, 1 reply; 274+ messages in thread
From: Óscar Fuentes @ 2021-07-24 21:21 UTC (permalink / raw)
  To: emacs-devel

Stephen Leake <stephen_leake@stephe-leake.org> writes:

>> Everyone compiling Emacs has a C compiler, but not everyone has an Ada
>> compiler. 
>
> Actually, anyone that uses gcc can easily have an Ada compiler; it's
> either already there or easily installed. 

While gcc 11.2 is days away from being released, MSYS2 is stuck with gcc
10.3 because Ada does not build. This is the second time on the last few
years that MSYS2 can't upgrade to the latest gcc because of Ada.

I think that MSYS2 will eventually drop Ada support, which is a radical
step because gcc-Ada can only be bootstrapped with gcc-Ada, so once it
is removed it will be a burden to get it back.

I'm afraid that eventually all non-primary platforms will suffer from
this problem, because of the limited resources of gcc-Ada's maintainers
and the small user community.

Then we could discuss the wisdom of depending on a key component written
on a language with very few hackers around ("few" considering that the
scarceness of C contributors is a concern for Emacs maintainers.)

Don't get me wrong, I've heard great praises for Ada from people whom I
respect, so I'm ready to concede that it is a great language. But, as we
all know too well, that's not necessary nor sufficient to be a sensible
choice on practical terms.

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: [SPAM UNSURE] Re: Tree Sitter (was Re: cc-mode fontification feels random)
  2021-07-24 20:05                                                                     ` [SPAM UNSURE] " Stephen Leake
@ 2021-07-25  0:41                                                                       ` Daniel Colascione
  2021-07-26  4:24                                                                         ` [SPAM UNSURE] " Stephen Leake
  2021-07-25 18:01                                                                       ` Perry E. Metzger
  1 sibling, 1 reply; 274+ messages in thread
From: Daniel Colascione @ 2021-07-25  0:41 UTC (permalink / raw)
  To: Stephen Leake; +Cc: emacs-devel, Stefan Monnier, Perry E. Metzger

On 7/24/21 1:05 PM, Stephen Leake wrote:

> Daniel Colascione <dancol@dancol.org> writes:
>
>> On 7/21/21 12:15 PM, Perry E. Metzger wrote:
>>> On 7/21/21 12:21, Daniel Colascione wrote:
>>>> On 7/21/21 7:43 AM, Perry E. Metzger wrote:
>>>>> Thought I would note that there's a substantial literature now on
>>>>> incremental parsing, especially the sort that is needed for editor
>>>>> tools. One doesn't need to reinvent the algorithms, they're out
>>>>> there waiting to be used. The Tree Sitter project is based on
>>>>> previous published work.
>>>> There is indeed a big literature! I wish there were a bigger
>>>> literature on *composable* incremental parsers though. IMHO, what
>>>> we need is an incremental GLR system (yes, GLR is bad worst-case,
>>>> but it's not a practical concern) that spits out a parse *forest*
>>>> which we then pare down to a parse tree with ad-hoc syntactic
>>>> consistency rules. Something like this naturally supports
>>>> multi-language modes and incorporation of out-of-band semantic
>>>> information.
>>>>
>>> Tree sitter handles GLR.
>>>
>> Cool. How does it prune the parse forest?
> wisi also uses GLR. It prunes trees during parse when the parse stacks
> contained in the trees are identical; it uses error recover cost and
> length to decide which tree to delete, or picks one at random. It's an
> error if more than one tree is alive at the end of parse. That's because
> programming languages must be unambiguous. It would be possible to adapt
> the wisi parser to use some other pruning strategy.


Programs *as a whole*, properly understood by a compiler or execution 
environment, must be unambiguous. That's true. But when we're editing, 
we're dealing with program fragments, sometimes damaged by user 
modifications, and have to do our best given fragmentary information. 
All I'm suggesting is that it'd be useful to use language-specific 
semantic rules to disambiguate parse trees: for example, if in location 
L1, symbol T can be a type or a name, and in location L2, symbol T is 
definitely a type, then we should regard symbol T as a type in location 
L1 too. Iterate until we reach a fixed point, and only *then* apply the 
more general disambiguation strategies you've mentioned.




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: OFF-TOPIC: Ada availability (was: Tree Sitter)
  2021-07-24 21:21                                                               ` OFF-TOPIC: Ada availability (was: Tree Sitter) Óscar Fuentes
@ 2021-07-25  7:31                                                                 ` tomas
  0 siblings, 0 replies; 274+ messages in thread
From: tomas @ 2021-07-25  7:31 UTC (permalink / raw)
  To: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1124 bytes --]

On Sat, Jul 24, 2021 at 11:21:57PM +0200, Óscar Fuentes wrote:
> Stephen Leake <stephen_leake@stephe-leake.org> writes:
> 
> >> Everyone compiling Emacs has a C compiler, but not everyone has an Ada
> >> compiler. 
> >
> > Actually, anyone that uses gcc can easily have an Ada compiler; it's
> > either already there or easily installed. 
> 
> While gcc 11.2 is days away from being released, MSYS2 is stuck with gcc
> 10.3 because Ada does not build [...]

> Then we could discuss the wisdom of depending on a key component written
> on a language with very few hackers around [...]

Indeed an important point (although myself, I'm a staunch defender of
diversity in software).

For an example on how such a story may unfold, see the sad story of the
SKS keyserver [1]. A central component of the web of trust has an important
bug and nobody dares to touch it... because it's written in OCaml [2].

Cheers

[1] https://csirt.cy/en/sks-keyserver-network-under-attack/
[2] Yeah, that's simplifying a bit, but the point is how utterly
   important "code accessibility" is in free software.

 - t

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: Tree Sitter (was Re: cc-mode fontification feels random)
  2021-07-24  1:17                                                             ` Richard Stallman
@ 2021-07-25 16:13                                                               ` Stephen Leake
  2021-07-25 19:52                                                                 ` Ada (was Re: Tree Sitter) Perry E. Metzger
  2021-07-26  2:23                                                                 ` Tree Sitter (was Re: cc-mode fontification feels random) John Yates
  0 siblings, 2 replies; 274+ messages in thread
From: Stephen Leake @ 2021-07-25 16:13 UTC (permalink / raw)
  To: Richard Stallman; +Cc: emacs-devel, Perry E. Metzger

Richard Stallman <rms@gnu.org> writes:

> [[[ To any NSA and FBI agents reading my email: please consider    ]]]
> [[[ whether defending the US Constitution against all enemies,     ]]]
> [[[ foreign or domestic, requires you to follow Snowden's example. ]]]
>
> The features that the DoD liked so much about Ada, to me make it feel
> very clunky.  You have to declare so much!

Yes, and then the compiler checks everything for you, so the code is
much more likely to be correct before you start testing.

It also helps when modifying/extending code; if it doesn't compile,
you've done something wrong, and the error messages point to what to
fix.

In addition, SPARK (https://www.adacore.com/sparkpro) is a formal proof
system designed for Ada, giving you even more power to build programs
that are correct.

A long time ago, I was working on a system that was programmed in C++. I
re-implemented it in Ada; it was pretty clear that I could write correct
code at least 4 times faster in Ada than in C++. Now I only write code
in something other than Ada if there is no way to use Ada (for example,
my music app on Android is in Java; it's nominally possible to write Ada
code for Android, but it takes a _lot_ of work, and would break with
every new release of Android).

> What advantages does wisi.el's Ada module have over Tree Sitter?

That's not entirely clear yet. I believe the error recovery in wisi is
more powerful than tree-sitter's, but I'm probably biased, and it's
hard to come up with a good objective metric until we get both fully
integrated into Emacs. It is clear that good error recovery is essential
to implementing indentation using a parser; tree-sitter is not
advertised as supporting indentation, while indentation is a primary
purpose of wisi.

The parser generator in wisi is more powerful in some ways; it can
handle LR1 table generation for Ada, using a grammar that closely
follows the grammar in the Ada Language Reference Manual; tree-sitter
can't handle that. tree-sitter could probably handle it if someone
spends time simplifying/optimizing the grammar.

The tree-sitter parser generator allows specifying token precedence to
resolve grammar conflicts; wisi has no support for that (it could be
added).

tree-sitter has been around for a while, and there are many people and
editors using it and working on it. wisi is just me working on it, and
Emacs using it for ada-mode.

tree-sitter also provides bindings to the parser for other languages.
That is possible with wisi, but I don't have the bandwidth for it.

-- 
-- Stephe

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: [SPAM UNSURE] Re: Tree Sitter (was Re: cc-mode fontification feels random)
  2021-07-24 20:05                                                                     ` [SPAM UNSURE] " Stephen Leake
  2021-07-25  0:41                                                                       ` Daniel Colascione
@ 2021-07-25 18:01                                                                       ` Perry E. Metzger
  1 sibling, 0 replies; 274+ messages in thread
From: Perry E. Metzger @ 2021-07-25 18:01 UTC (permalink / raw)
  To: Stephen Leake, Daniel Colascione; +Cc: Stefan Monnier, emacs-devel

On 7/24/21 16:05, Stephen Leake wrote:
>>> Tree sitter handles GLR.
>>>
>> Cool. How does it prune the parse forest?
> wisi also uses GLR. It prunes trees during parse when the parse stacks
> contained in the trees are identical; it uses error recover cost and
> length to decide which tree to delete, or picks one at random. It's an
> error if more than one tree is alive at the end of parse. That's because
> programming languages must be unambiguous. It would be possible to adapt
> the wisi parser to use some other pruning strategy.
>
So, you've said you don't intend for wisi to be shipped as part of GNU 
Emacs. Some of us are talking about incorporating Tree Sitter directly 
in GNU Emacs. Given this, I'm not sure why it is important to bring wisi 
up as though it was an alternative to Tree Sitter?


Perry





^ permalink raw reply	[flat|nested] 274+ messages in thread

* Ada (was Re: Tree Sitter)
  2021-07-25 16:13                                                               ` Stephen Leake
@ 2021-07-25 19:52                                                                 ` Perry E. Metzger
  2021-07-26  5:05                                                                   ` Stephen Leake
  2021-07-27  0:26                                                                   ` Richard Stallman
  2021-07-26  2:23                                                                 ` Tree Sitter (was Re: cc-mode fontification feels random) John Yates
  1 sibling, 2 replies; 274+ messages in thread
From: Perry E. Metzger @ 2021-07-25 19:52 UTC (permalink / raw)
  To: Stephen Leake, Richard Stallman; +Cc: emacs-devel

This is getting off topic, so I've changed the Subject: line. It 
probably shouldn't continue to be pursued here, this entire response may 
be seriously past what should be on the -devel list and I don't blame 
people for tuning out.

On 7/25/21 12:13, Stephen Leake wrote:
> Richard Stallman <rms@gnu.org> writes:
>
>>
>> The features that the DoD liked so much about Ada, to me make it feel
>> very clunky.  You have to declare so much!
> Yes, and then the compiler checks everything for you, so the code is
> much more likely to be correct before you start testing.
>
> It also helps when modifying/extending code; if it doesn't compile,
> you've done something wrong, and the error messages point to what to
> fix.

The Ada compiler doesn't actually statically guarantee all safety. In 
particular, issues like concurrency safety violations, use after free, 
null pointer dereference, etc. are all possible in Ada. It's better than 
C certainly, as it provides for things like array bounds checking and 
has a strong static type system, but Ada is very much an early 1980s 
language and it shows.

There are far more modern systems programming languages out there (like 
Rust) that statically guarantee far more, including that use after free 
is impossible, that  threads cannot have data races, that null pointers 
cannot exist. This should not be surprising, as type theory (and 
programming language theory in general) has advanced dramatically in the 
last 40 years. Rust in particular is an excellent language, and in 
addition to superior safety, has far better ergonomics.

I honestly cannot see why anyone would write a program now in Ada rather 
than in Rust if their interest was high assurance combined with high 
programmer productivity. There is no axis on which Ada is superior, and 
many on which it is far worse.

All that said, I _can_ think of good reasons to work in C when dealing 
with GNU Emacs, most specifically, that Emacs is (at least currently) 
written in C. Use of a language like Ada makes the tooling situation for 
a potential user much less pleasant.

> In addition, SPARK (https://www.adacore.com/sparkpro) is a formal proof
> system designed for Ada, giving you even more power to build programs
> that are correct.

Yes, and for C I can use VST from Princeton for the same purpose (VST 
being a Coq-based separation logic based on CompCert), there are several 
other formal semantics for C that can be used for the same purpose 
(including other Coq based CompCert derived semantics as well as the K 
based semantics done by Chucky Ellison), and the Rustbelt project (not 
yet quite as production ready) provides a Coq semantics for Rust with 
which can be used for the same purpose.

There are two formally verified operating system kernels in existence, 
SEL4 and CertiKOS. Both are written in C. I don't think working in C is 
an optimal path to creating such systems, it's a dangerous language, but 
I do want to point out that SPARK is nothing special at this point.

> A long time ago, I was working on a system that was programmed in C++. I
> re-implemented it in Ada; it was pretty clear that I could write correct
> code at least 4 times faster in Ada than in C++. Now I only write code
> in something other than Ada if there is no way to use Ada (for example,
> my music app on Android is in Java; it's nominally possible to write Ada
> code for Android, but it takes a _lot_ of work, and would break with
> every new release of Android).
>
I recommend you try out Rust. That said, it, too, is not the right path 
for writing Android apps.

>> What advantages does wisi.el's Ada module have over Tree Sitter?
> That's not entirely clear yet. I believe the error recovery in wisi is
> more powerful than tree-sitter's, but I'm probably biased, and it's
> hard to come up with a good objective metric until we get both fully
> integrated into Emacs. It is clear that good error recovery is essential
> to implementing indentation using a parser; tree-sitter is not
> advertised as supporting indentation, while indentation is a primary
> purpose of wisi.

Error recovery in Tree Sitter is excellent.

Tree sitter has also been used for indentation. The videos presenting 
Tree Sitter make it clear that font highlighting, code folding, 
indentation and many other purposes are all envisioned for the library.

>
> The parser generator in wisi is more powerful in some ways; it can
> handle LR1 table generation for Ada, using a grammar that closely
> follows the grammar in the Ada Language Reference Manual; tree-sitter
> can't handle that. tree-sitter could probably handle it if someone
> spends time simplifying/optimizing the grammar.

Tree sitter can handle arbitrary LR and GLR grammars. Almost any other 
grammar for a real programming language (say an LL1 grammar) can be 
mechanically transformed into one of those.

> The tree-sitter parser generator allows specifying token precedence to
> resolve grammar conflicts; wisi has no support for that (it could be
> added).
>
> tree-sitter has been around for a while, and there are many people and
> editors using it and working on it. wisi is just me working on it, and
> Emacs using it for ada-mode.
>
> tree-sitter also provides bindings to the parser for other languages.
> That is possible with wisi, but I don't have the bandwidth for it.
>
Generally, I'm in favor of people trying experiments. People should 
spend their time however they like, and should follow wherever their 
muse takes them. This is how we learn. I encourage you to keep working 
on your project.

However, in the current circumstance, I suspect that the wisi effort is 
less likely to produce a robust result that Emacs can rely on. It is 
written in a language not generally used for Emacs code. It is being 
worked on by single individual instead of a large community. It does not 
currently have obvious advantages in terms of features.

Perry

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: Tree Sitter (was Re: cc-mode fontification feels random)
  2021-07-25 16:13                                                               ` Stephen Leake
  2021-07-25 19:52                                                                 ` Ada (was Re: Tree Sitter) Perry E. Metzger
@ 2021-07-26  2:23                                                                 ` John Yates
  1 sibling, 0 replies; 274+ messages in thread
From: John Yates @ 2021-07-26  2:23 UTC (permalink / raw)
  To: Stephen Leake; +Cc: Perry E. Metzger, Richard Stallman, Emacs developers

On Sun, Jul 25, 2021 at 12:14 PM Stephen Leake
<stephen_leake@stephe-leake.org> wrote:
>
> Richard Stallman <rms@gnu.org> writes:
> >
> > The features that the DoD liked so much about Ada, to me make it feel
> > very clunky.  You have to declare so much!
>
> Yes, and then the compiler checks everything for you, so the code is
> much more likely to be correct before you start testing.
>
> It also helps when modifying/extending code; if it doesn't compile,
> you've done something wrong, and the error messages point to what to
> fix.

I started my programming career in the early 70s programming PDP-8s
and PDP-11s in assembler.  In the late 70's I joined DEC's Vax Pascal V2
project.  Coding in Bliss-32 was a revelation: high level control flow and
no longer having to do my own register allocation.  Still there were neither
true types nor function signatures (think original pre-prototypes K&R C,)
Once the compiler was shipped my next project was writing a bug tracker
in Pascal.  If Bliss had been a revelation then coding in strongly typed
Pascal was my "conversion on the road to Damascus".

On rare occasions I still end up having to write some assembler.  But
otherwise give me as much declaration and type-checking as possible.

/john

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: [SPAM UNSURE] Re: [SPAM UNSURE] Re: Tree Sitter (was Re: cc-mode fontification feels random)
  2021-07-25  0:41                                                                       ` Daniel Colascione
@ 2021-07-26  4:24                                                                         ` Stephen Leake
  0 siblings, 0 replies; 274+ messages in thread
From: Stephen Leake @ 2021-07-26  4:24 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: emacs-devel, Stefan Monnier, Perry E. Metzger

Daniel Colascione <dancol@dancol.org> writes:

> On 7/24/21 1:05 PM, Stephen Leake wrote:
>
>> Daniel Colascione <dancol@dancol.org> writes:
>>
>>> On 7/21/21 12:15 PM, Perry E. Metzger wrote:
>>>> On 7/21/21 12:21, Daniel Colascione wrote:
>>>>> On 7/21/21 7:43 AM, Perry E. Metzger wrote:
>>>>>> Thought I would note that there's a substantial literature now on
>>>>>> incremental parsing, especially the sort that is needed for editor
>>>>>> tools. One doesn't need to reinvent the algorithms, they're out
>>>>>> there waiting to be used. The Tree Sitter project is based on
>>>>>> previous published work.
>>>>> There is indeed a big literature! I wish there were a bigger
>>>>> literature on *composable* incremental parsers though. IMHO, what
>>>>> we need is an incremental GLR system (yes, GLR is bad worst-case,
>>>>> but it's not a practical concern) that spits out a parse *forest*
>>>>> which we then pare down to a parse tree with ad-hoc syntactic
>>>>> consistency rules. Something like this naturally supports
>>>>> multi-language modes and incorporation of out-of-band semantic
>>>>> information.
>>>>>
>>>> Tree sitter handles GLR.
>>>>
>>> Cool. How does it prune the parse forest?
>> wisi also uses GLR. It prunes trees during parse when the parse stacks
>> contained in the trees are identical; it uses error recover cost and
>> length to decide which tree to delete, or picks one at random. It's an
>> error if more than one tree is alive at the end of parse. That's because
>> programming languages must be unambiguous. It would be possible to adapt
>> the wisi parser to use some other pruning strategy.
>
>
> Programs *as a whole*, properly understood by a compiler or execution
> environment, must be unambiguous. That's true. But when we're editing,
> we're dealing with program fragments, sometimes damaged by user
> modifications, and have to do our best given fragmentary information.

Right. That's why wisi has robust error recovery.

> All I'm suggesting is that it'd be useful to use language-specific
> semantic rules to disambiguate parse trees: 

So far, wisi is only used for Ada; I did not need any disambiguation
rules that seemed language-specific. That may change when/if other
languages use wisi.

> for example, if in location L1, symbol T can be a type or a name, and
> in location L2, symbol T is definitely a type, then we should regard
> symbol T as a type in location L1 too. 

That might be possible, but it adds a layer of semantic analysis that
could be slow.

-- 
-- Stephe



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: Ada (was Re: Tree Sitter)
  2021-07-25 19:52                                                                 ` Ada (was Re: Tree Sitter) Perry E. Metzger
@ 2021-07-26  5:05                                                                   ` Stephen Leake
  2021-07-26  9:42                                                                     ` Stephen Leake
  2021-07-26 13:45                                                                     ` Perry E. Metzger
  2021-07-27  0:26                                                                   ` Richard Stallman
  1 sibling, 2 replies; 274+ messages in thread
From: Stephen Leake @ 2021-07-26  5:05 UTC (permalink / raw)
  To: Perry E. Metzger; +Cc: Richard Stallman, emacs-devel

"Perry E. Metzger" <perry@piermont.com> writes:

> This is getting off topic, so I've changed the Subject: line. It
> probably shouldn't continue to be pursued here, this entire response
> may be seriously past what should be on the -devel list and I don't
> blame people for tuning out.
>
> On 7/25/21 12:13, Stephen Leake wrote:
>> Richard Stallman <rms@gnu.org> writes:
>>
>>>
>>> The features that the DoD liked so much about Ada, to me make it feel
>>> very clunky.  You have to declare so much!
>> Yes, and then the compiler checks everything for you, so the code is
>> much more likely to be correct before you start testing.
>>
>> It also helps when modifying/extending code; if it doesn't compile,
>> you've done something wrong, and the error messages point to what to
>> fix.

<snip>

> There are far more modern systems programming languages out there
> (like Rust) that statically guarantee far more, including that use
> after free is impossible, that  threads cannot have data races, that
> null pointers cannot exist. This should not be surprising, as type
> theory (and programming language theory in general) has advanced
> dramatically in the last 40 years.

Ada has kept up with some of that; the next ISO version is due in 2022.

> Rust in particular is an excellent language, and in addition to
> superior safety, has far better ergonomics.

I have heard good things about Rust, and some of the tree-sitter
infrastructure is written in Rust. I guess it's time to take my own
advice and learn a new language.

Rewritting wisi in Rust would be an interesting challenge. Although I'm
not clear that would make it more acceptable to the Emacs project.

> I honestly cannot see why anyone would write a program now in Ada
> rather than in Rust if their interest was high assurance combined with
> high programmer productivity. There is no axis on which Ada is
> superior, and many on which it is far worse.

In my defense, I started wisi when Ada was still very current, before
Rust was available.

>> In addition, SPARK (https://www.adacore.com/sparkpro) is a formal proof
>> system designed for Ada, giving you even more power to build programs
>> that are correct.
>
> Yes, and for C I can use VST from Princeton for the same purpose (VST
> being a Coq-based separation logic based on CompCert), there are
> several other formal semantics for C that can be used for the same
> purpose (including other Coq based CompCert derived semantics as well
> as the K based semantics done by Chucky Ellison), and the Rustbelt
> project (not yet quite as production ready) provides a Coq semantics
> for Rust with which can be used for the same purpose.

So Ada/SPARK is better than Rust/Rustbelt here :).

> There are two formally verified operating system kernels in existence,
> SEL4 and CertiKOS. Both are written in C. I don't think working in C
> is an optimal path to creating such systems, it's a dangerous
> language, but I do want to point out that SPARK is nothing special at
> this point.

ok.

>>> What advantages does wisi.el's Ada module have over Tree Sitter?
>> That's not entirely clear yet. I believe the error recovery in wisi is
>> more powerful than tree-sitter's, but I'm probably biased, and it's
>> hard to come up with a good objective metric until we get both fully
>> integrated into Emacs. It is clear that good error recovery is essential
>> to implementing indentation using a parser; tree-sitter is not
>> advertised as supporting indentation, while indentation is a primary
>> purpose of wisi.
>
> Error recovery in Tree Sitter is excellent.

Ok; that says nothing about whether it is better than wisi.

I propose a metric in my draft paper on wisitoken error correction [1];
length of 'diff' output on the corrected token list. (WisiToken is the
name of the parser generator/runtime used by the Gnu ELPA wisi package).
By that metric, on the set of files I used, wisitoken error correction
is better.

I made wisitoken error correction that robust in order to meet the
demands of indenting in the face of syntax errors.

I've read the papers provided as references for tree-sitter error
correction; it is not as powerful as wisi. For example, it does not
consider inserting tokens to fix the error, which is essential when
parsing a half-typed statement.

> Tree sitter has also been used for indentation. The videos presenting
> Tree Sitter make it clear that font highlighting, code folding,
> indentation and many other purposes are all envisioned for the
> library.

Ok, I missed that (I'd much rather read a document than watch a video).

I found https://codeberg.org/FelipeLema/tree-sitter-indent.el; I'll have
to play with it.

>> The parser generator in wisi is more powerful in some ways; it can
>> handle LR1 table generation for Ada, using a grammar that closely
>> follows the grammar in the Ada Language Reference Manual; tree-sitter
>> can't handle that. tree-sitter could probably handle it if someone
>> spends time simplifying/optimizing the grammar.
>
> Tree sitter can handle arbitrary LR and GLR grammars.

The Ada grammar is GLR; that's why wisi can handle it. I reported
tree-sitter not being able to handle Ada here:
https://github.com/tree-sitter/tree-sitter/issues/693

--
-- Stephe

[1] https://stephe-leake.org/ada/error_correction_algorithm.pdf



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: Ada (was Re: Tree Sitter)
  2021-07-26  5:05                                                                   ` Stephen Leake
@ 2021-07-26  9:42                                                                     ` Stephen Leake
  2021-07-26 14:01                                                                       ` Perry E. Metzger
  2021-07-26 13:45                                                                     ` Perry E. Metzger
  1 sibling, 1 reply; 274+ messages in thread
From: Stephen Leake @ 2021-07-26  9:42 UTC (permalink / raw)
  To: Perry E. Metzger; +Cc: Richard Stallman, emacs-devel

Stephen Leake <stephen_leake@stephe-leake.org> writes:

>> There are far more modern systems programming languages out there
>> (like Rust) that statically guarantee far more, including that use
>> after free is impossible, that  threads cannot have data races, that
>> null pointers cannot exist. This should not be surprising, as type
>> theory (and programming language theory in general) has advanced
>> dramatically in the last 40 years.
>
> Ada has kept up with some of that; the next ISO version is due in
> 2022.

The current Ada/SPARK allows enforcing the ownership model for pointers.
Ada 2022 has structures supporting parallelizing loops and blocks.

-- 
-- Stephe



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: Ada (was Re: Tree Sitter)
  2021-07-26  5:05                                                                   ` Stephen Leake
  2021-07-26  9:42                                                                     ` Stephen Leake
@ 2021-07-26 13:45                                                                     ` Perry E. Metzger
  1 sibling, 0 replies; 274+ messages in thread
From: Perry E. Metzger @ 2021-07-26 13:45 UTC (permalink / raw)
  To: Stephen Leake; +Cc: emacs-devel

On 7/26/21 01:05, Stephen Leake wrote:
> "Perry E. Metzger" <perry@piermont.com> writes:
>
>> There are far more modern systems programming languages out there
>> (like Rust) that statically guarantee far more, including that use
>> after free is impossible, that  threads cannot have data races, that
>> null pointers cannot exist. This should not be surprising, as type
>> theory (and programming language theory in general) has advanced
>> dramatically in the last 40 years.
> Ada has kept up with some of that; the next ISO version is due in 2022.

I'm sure that improvements are being worked on, but the state of the art 
in language design really has moved on. No version of Ada has affine 
types, and no version of Ada _could_ get affine types without breaking 
the entire existing codebase.

It's not easy to retrofit it to the language without completely altering 
the language. If you start working in a language with linear or affine 
types you will immediately see why.

I agree that strongly typed languages with detection of array bounds 
violations and other undefined behavior are superior. Your own 
experience is evidence for that. You got lots of years of good work done 
in Ada that would not have been as easy in languages like C or C++. I 
agree those languages are full of traps for the unwary. I do not think, 
however, that Ada is the best choice for any new project being 
undertaken today. As I said, the state of the art has advanced 
dramatically in 40 years.

>
>> Rust in particular is an excellent language, and in addition to
>> superior safety, has far better ergonomics.
> I have heard good things about Rust, and some of the tree-sitter
> infrastructure is written in Rust. I guess it's time to take my own
> advice and learn a new language.
>
> Rewritting wisi in Rust would be an interesting challenge. Although I'm
> not clear that would make it more acceptable to the Emacs project.

I think you should try it anyway. At very worst, you will learn a great 
deal. You might also produce a library of interest to other people, and 
I think having more tools of this sort in the world is better.

That said, I think for good or ill, the current iteration of Emacs works 
best with code written in C. That might eventually change, of course. 
Even the Linux kernel is now getting support for code written in Rust. 
Perhaps eventually, with the arrival of better and better versions of 
the Rust GCC front end and other tools, it will make sense for Emacs to 
have a mixed implementation. I don't think that time has arrived yet.

(BTW, note that Tree Sitter is not written in Rust, though it does have 
a Rust API available.)

>>> In addition, SPARK (https://www.adacore.com/sparkpro) is a formal proof
>>> system designed for Ada, giving you even more power to build programs
>>> that are correct.
>> Yes, and for C I can use VST from Princeton for the same purpose (VST
>> being a Coq-based separation logic based on CompCert), there are
>> several other formal semantics for C that can be used for the same
>> purpose (including other Coq based CompCert derived semantics as well
>> as the K based semantics done by Chucky Ellison), and the Rustbelt
>> project (not yet quite as production ready) provides a Coq semantics
>> for Rust with which can be used for the same purpose.
> So Ada/SPARK is better than Rust/Rustbelt here :).

For now, but if I had to write a small high assurance real time kernel 
at the moment, I'd probably write it in a little language embedded in 
straight Coq with extraction to something with an efficient compiler 
(like Rust) even if the extraction wasn't verified.

>
>> Tree sitter can handle arbitrary LR and GLR grammars.
> The Ada grammar is GLR; that's why wisi can handle it. I reported
> tree-sitter not being able to handle Ada here:
> https://github.com/tree-sitter/tree-sitter/issues/693

Thank you for reporting that.

Perry

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: Ada (was Re: Tree Sitter)
  2021-07-26  9:42                                                                     ` Stephen Leake
@ 2021-07-26 14:01                                                                       ` Perry E. Metzger
  0 siblings, 0 replies; 274+ messages in thread
From: Perry E. Metzger @ 2021-07-26 14:01 UTC (permalink / raw)
  To: Stephen Leake; +Cc: emacs-devel

On 7/26/21 05:42, Stephen Leake wrote:
> Stephen Leake <stephen_leake@stephe-leake.org> writes:
>
>>> There are far more modern systems programming languages out there
>>> (like Rust) that statically guarantee far more, including that use
>>> after free is impossible, that  threads cannot have data races, that
>>> null pointers cannot exist. This should not be surprising, as type
>>> theory (and programming language theory in general) has advanced
>>> dramatically in the last 40 years.
>> Ada has kept up with some of that; the next ISO version is due in
>> 2022.
> The current Ada/SPARK allows enforcing the ownership model for pointers.
> Ada 2022 has structures supporting parallelizing loops and blocks.
>
A separation logic like SPARK is necessarily going to have tools to 
track pointers. That's rather different from the general language itself 
being able to do that. Vectorizing loops is cool but not in the same 
class as what Rust makes possible. But I really think we should drop 
this, it's not Emacs related any more.


Perry





^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: Ada (was Re: Tree Sitter)
  2021-07-25 19:52                                                                 ` Ada (was Re: Tree Sitter) Perry E. Metzger
  2021-07-26  5:05                                                                   ` Stephen Leake
@ 2021-07-27  0:26                                                                   ` Richard Stallman
  2021-07-27 12:38                                                                     ` Perry E. Metzger
  1 sibling, 1 reply; 274+ messages in thread
From: Richard Stallman @ 2021-07-27  0:26 UTC (permalink / raw)
  To: Perry E. Metzger; +Cc: stephen_leake, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

How about moving the discussion of Ada and Rust to emacs-tangents?

-- 
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)





^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: Ada (was Re: Tree Sitter)
  2021-07-27  0:26                                                                   ` Richard Stallman
@ 2021-07-27 12:38                                                                     ` Perry E. Metzger
  0 siblings, 0 replies; 274+ messages in thread
From: Perry E. Metzger @ 2021-07-27 12:38 UTC (permalink / raw)
  To: rms; +Cc: stephen_leake, emacs-devel


On 7/26/21 20:26, Richard Stallman wrote:
> How about moving the discussion of Ada and Rust to emacs-tangents?
>
Entirely reasonable.

Perry





^ permalink raw reply	[flat|nested] 274+ messages in thread

* [PATCH] Re: cc-mode fontification feels random
  2021-06-04  3:16 cc-mode fontification feels random Daniel Colascione
                   ` (2 preceding siblings ...)
  2021-06-04 15:54 ` Alan Mackenzie
@ 2021-08-30 18:50 ` Alan Mackenzie
  2021-08-30 19:03   ` Perry E. Metzger
  2021-08-30 19:25   ` Eli Zaretskii
  3 siblings, 2 replies; 274+ messages in thread
From: Alan Mackenzie @ 2021-08-30 18:50 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: emacs-devel

Hello, Daniel.

On Thu, Jun 03, 2021 at 20:16:53 -0700, Daniel Colascione wrote:
> As long as I can remember, cc-mode fontification has felt totally 
> random, with actual faces depending on happenstance of previously-parsed 
> types, luck of the draw in jit-lock chunking, and so on. Is there any 
> *general* way that we can make fontification more robust and consistent?

> For years and years now, I've been thinking we just need more 
> deterministic parser-and-based mode support, and I still think that, but 
> on a realistic level, that doesn't seem to be coming any time soon.

> In the meantime, is there any general approach we might be able to use 
> to get stuff like the attached to stop happening?

Here, "stuff like the attached" was having some types correctly
fontified, others not.  This was due to the order, somewhat random, in
which a type is recognised as such and entered into a CC Mode table, and
its use being scanned in a jit-lock chunk.

The following patch is an attempt to improve this situation.  It is best
used with jit-stealth-lock enabled.  I have tried it out with the
following settings:

    jit-lock-stealth-load: 200 ; i.e. inactive.
    jit-lock-stealth-nice: 0.1 ; 100 ms between fontifying stealth
                                 chunks.
    jit-lock-stealth-time: 1   ; 1 second idle time before stealth kicks
                                 in.

Whenever a new found type is entered into the CC Mode table, it marks
all occurrences of the type in the buffer for fontification (by setting
the 'fontified text property to nil on it), and causes an immediate
redisplay when there are occurrences of the new type in a window.

I think stealth lock could be enhanced by having it fontify several
500-byte chunks together, say until 0.05s time has been taken up.  This
could speed up stealth fontification while still leaving the keyboard
responsive to the user.

Anyhow, could you try the patch please, in particular on the source code
which you posted a picture of back in June, and see how well it,
together with stealth fontification, helps with the random
fontification.  (The patch is still a fairly rough sketch, not a
finished patch.)

Thanks!



diff -r a811a06c82c2 cc-engine.el
--- a/cc-engine.el	Sat Aug 21 10:14:48 2021 +0000
+++ b/cc-engine.el	Mon Aug 30 18:23:44 2021 +0000
@@ -173,6 +173,9 @@
 (cc-bytecomp-defvar c-doc-line-join-end-ch)
 (defvar c-syntactic-context)
 (defvar c-syntactic-element)
+(defvar c-new-id-start)
+(defvar c-new-id-end)
+(defvar c-new-id-is-type)
 (cc-bytecomp-defvar c-min-syn-tab-mkr)
 (cc-bytecomp-defvar c-max-syn-tab-mkr)
 (cc-bytecomp-defun c-clear-syn-tab)
@@ -6839,21 +6842,47 @@
   (setq c-found-types
 	(make-hash-table :test #'equal :weakness nil)))
 
+;;;; OLD STOUGH, 2021-08-23
+;; (defun c-add-type (from to)
+;;   ;; Add the given region as a type in `c-found-types'.  If the region
+;;   ;; doesn't match an existing type but there is a type which is equal
+;;   ;; to the given one except that the last character is missing, then
+;;   ;; the shorter type is removed.  That's done to avoid adding all
+;;   ;; prefixes of a type as it's being entered and font locked.  This
+;;   ;; doesn't cover cases like when characters are removed from a type
+;;   ;; or added in the middle.  We'd need the position of point when the
+;;   ;; font locking is invoked to solve this well.
+;;   ;;
+;;   ;; This function might do hidden buffer changes.
+;;   (let ((type (c-syntactic-content from to c-recognize-<>-arglists)))
+;;     (unless (gethash type c-found-types)
+;;       (remhash (substring type 0 -1) c-found-types)
+;;       (puthash type t c-found-types))))
+;;;; NEW STOUGH, 2021-08-29
+(defun c-add-type-1 (from to)
+  ;; FIXME!!!
+  (let ((type (c-syntactic-content from to c-recognize-<>-arglists)))
+    (unless (gethash type c-found-types)
+      (puthash type t c-found-types)
+      (when (and (eq (string-match c-symbol-key type) 0)
+		 (eq (match-end 0) (length type)))
+	(c-fontify-new-found-type type)))))
+
 (defun c-add-type (from to)
-  ;; Add the given region as a type in `c-found-types'.  If the region
-  ;; doesn't match an existing type but there is a type which is equal
-  ;; to the given one except that the last character is missing, then
-  ;; the shorter type is removed.  That's done to avoid adding all
-  ;; prefixes of a type as it's being entered and font locked.  This
-  ;; doesn't cover cases like when characters are removed from a type
-  ;; or added in the middle.  We'd need the position of point when the
-  ;; font locking is invoked to solve this well.
+  ;; Add the given region as a type in `c-found-types'.  If the region is or
+  ;; overlaps an identifier which might be being typed in, don't record it.
+  ;; This is tested by checking `c-new-id-start' and `c-new-id-end'.  That's
+  ;; done to avoid adding all prefixes of a type as it's being entered and
+  ;; font locked.  This is a bit rough and ready, but now covers adding
+  ;; characters into the middle of an identifer.
   ;;
   ;; This function might do hidden buffer changes.
-  (let ((type (c-syntactic-content from to c-recognize-<>-arglists)))
-    (unless (gethash type c-found-types)
-      (remhash (substring type 0 -1) c-found-types)
-      (puthash type t c-found-types))))
+  (if (and c-new-id-start c-new-id-end
+	   (<= from c-new-id-end) (>= to c-new-id-start))
+      (setq c-new-id-is-type t)
+    (c-add-type-1 from to)))
+;;;; END OF NEW STOUGH
+;;;; END OF NEW STOUGH
 
 (defun c-unfind-type (name)
   ;; Remove the "NAME" from c-found-types, if present.
diff -r a811a06c82c2 cc-fonts.el
--- a/cc-fonts.el	Sat Aug 21 10:14:48 2021 +0000
+++ b/cc-fonts.el	Mon Aug 30 18:23:44 2021 +0000
@@ -2253,6 +2253,48 @@
     ;; defvar will install its default value later on.
     (makunbound def-var)))
 
+;;;; NEW STOUGH, 2021-08-29
+;; `c-re-redisplay-timer' is a timer which, when triggered, causes a
+;; redisplay.
+(defvar c-re-redisplay-timer nil)
+
+(defun c-force-redisplay (start end)
+  ;; Force redisplay immediately.  This assumes `font-lock-support-mode' is
+  ;; 'jit-lock-mode.  Set the variable `c-re-redisplay-timer' to nil.
+  (jit-lock-force-redisplay (copy-marker start) (copy-marker end))
+  (setq c-re-redisplay-timer nil))
+
+(defun c-fontify-new-found-type (type)
+  ;; Cause the fontification of TYPE, a string, wherever it occurs in the
+  ;; buffer.  If TYPE is currently displayed in a window, cause redisplay to
+  ;; happen "instantaneously".  These actions are done only when jit-lock-mode
+  ;; is active.
+  (when (and (boundp 'font-lock-support-mode)
+	     (eq font-lock-support-mode 'jit-lock-mode))
+    (c-save-buffer-state
+	((window-boundaries
+	  (mapcar (lambda (win)
+		    (cons (window-start win)
+			  (window-end win)))
+		  (get-buffer-window-list (current-buffer) 'no-mini t)))
+	 (target-re (concat "\\_<" type "\\_>")))
+      (save-excursion
+	(save-restriction
+	  (widen)
+	  (goto-char (point-min))
+	  (while (re-search-forward target-re nil t)
+	    (put-text-property (match-beginning 0) (match-end 0)
+			       'fontified nil)
+	    (dolist (win-boundary window-boundaries)
+	      (when (and (< (match-beginning 0) (cdr win-boundary))
+			 (> (match-end 0) (car win-boundary))
+			 (c-get-char-property (match-beginning 0) 'fontified)
+			 (not c-re-redisplay-timer))
+		(setq c-re-redisplay-timer
+		      (run-with-timer 0 nil #'c-force-redisplay
+				      (match-beginning 0) (match-end 0)))))))))))
+
+;;;; END OF NEW STOUGH, 2021-08-29
 \f
 ;;; C.
 
diff -r a811a06c82c2 cc-mode.el
--- a/cc-mode.el	Sat Aug 21 10:14:48 2021 +0000
+++ b/cc-mode.el	Mon Aug 30 18:23:44 2021 +0000
@@ -173,6 +173,16 @@
   (when c-buffer-is-cc-mode
     (save-restriction
       (widen)
+;;;; NEW STOUGH, 2021-08-23
+      (let ((lst (buffer-list)))
+	(catch 'found
+	  (dolist (b lst)
+	    (if (and (not (eq b (current-buffer)))
+		     (with-current-buffer b
+		       c-buffer-is-cc-mode))
+		(throw 'found nil)))
+	  (remove-hook 'post-command-hook 'c-post-command)))
+;;;; END OF NEW STOUGH
       (c-save-buffer-state ()
 	(c-clear-char-properties (point-min) (point-max) 'category)
 	(c-clear-char-properties (point-min) (point-max) 'syntax-table)
@@ -728,6 +738,9 @@
   (or (memq 'add-hook-local c-emacs-features)
       (make-local-hook 'after-change-functions))
   (add-hook 'after-change-functions 'c-after-change nil t)
+;;;; NEW STOUGH, 2021-08-23
+  (add-hook 'post-command-hook 'c-post-command)
+;;;; END OF NEW STOUGH
   (when (boundp 'font-lock-extend-after-change-region-function)
     (set (make-local-variable 'font-lock-extend-after-change-region-function)
 	 'c-extend-after-change-region))) ; Currently (2009-05) used by all
@@ -1936,6 +1949,45 @@
 	;; confused by already processed single quotes.
 	(narrow-to-region (point) (point-max))))))
 
+;;;; NEW STOUGH, 2021-08-22
+;; The next two variables record the bounds of an identifier currently being
+;; typed in.  These are used to prevent such a partial identifier being
+;; recorded as a found type by c-add-type.
+(defvar c-new-id-start nil)
+(make-variable-buffer-local 'c-new-id-start)
+(defvar c-new-id-end nil)
+(make-variable-buffer-local 'c-new-id-end)
+;;;; NEW STOUGH, 2021-08-29
+;; The next variable, when non-nil, records that the previous two variables
+;; define a type.
+(defvar c-new-id-is-type nil)
+(make-variable-buffer-local 'c-new-id-is-type)
+;;;; END OF NEW STOUGH
+
+(defun c-update-new-id (end)
+  ;; Fill this in.  FIXME!!!
+  (save-excursion
+    (goto-char end)
+    (let ((id-beg (c-on-identifier)))
+      (setq c-new-id-start id-beg
+	    c-new-id-end (and id-beg
+			      (progn (c-end-of-current-token) (point)))))))
+
+
+(defun c-post-command ()
+  ;; If point was inside of a new identifier and no longer is, record that
+  ;; fact.
+  (when (and c-buffer-is-cc-mode
+	     c-new-id-start c-new-id-end
+	     (or (> (point) c-new-id-end)
+		 (< (point) c-new-id-start)))
+    (when c-new-id-is-type
+      (c-add-type-1 c-new-id-start c-new-id-end))
+    (setq c-new-id-start nil
+	  c-new-id-end nil
+	  c-new-id-is-type nil)))
+;;;; END OF NEW STOUGH
+
 (defun c-before-change (beg end)
   ;; Function to be put in `before-change-functions'.  Primarily, this calls
   ;; the language dependent `c-get-state-before-change-functions'.  It is
@@ -2133,6 +2185,9 @@
 						      c->-as-paren-syntax)
 		    (c-clear-char-property-with-value beg end 'syntax-table nil)))
 
+;;;; NEW STOUGH, 2021-08-22
+		(c-update-new-id end)
+;;;; END OF NEW STOUGH
 		(c-trim-found-types beg end old-len) ; maybe we don't
 						     ; need all of these.
 		(c-invalidate-sws-region-after beg end old-len)



-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: [PATCH] Re: cc-mode fontification feels random
  2021-08-30 18:50 ` [PATCH] " Alan Mackenzie
@ 2021-08-30 19:03   ` Perry E. Metzger
  2021-08-30 19:18     ` Alan Mackenzie
  2021-08-30 19:25   ` Eli Zaretskii
  1 sibling, 1 reply; 274+ messages in thread
From: Perry E. Metzger @ 2021-08-30 19:03 UTC (permalink / raw)
  To: emacs-devel

On 8/30/21 14:50, Alan Mackenzie wrote:
>> For years and years now, I've been thinking we just need more
>> deterministic parser-and-based mode support, and I still think that, but
>> on a realistic level, that doesn't seem to be coming any time soon.

I note that Tree Sitter integration is in active development now...

>
>> In the meantime, is there any general approach we might be able to use
>> to get stuff like the attached to stop happening?
> Here, "stuff like the attached" was having some types correctly
> fontified, others not.  This was due to the order, somewhat random, in
> which a type is recognised as such and entered into a CC Mode table, and
> its use being scanned in a jit-lock chunk.
>
> The following patch is an attempt to improve this situation.

I think we are inevitably hitting the wall here, because it is not 
possible to parse a context free grammar with regular expressions. One 
can only move around the suck, one can't actually remove it without 
parsing the underlying language.


Perry





^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: [PATCH] Re: cc-mode fontification feels random
  2021-08-30 19:03   ` Perry E. Metzger
@ 2021-08-30 19:18     ` Alan Mackenzie
  0 siblings, 0 replies; 274+ messages in thread
From: Alan Mackenzie @ 2021-08-30 19:18 UTC (permalink / raw)
  To: Perry E. Metzger; +Cc: emacs-devel

Hello, Perry.

On Mon, Aug 30, 2021 at 15:03:43 -0400, Perry E. Metzger wrote:
> On 8/30/21 14:50, Alan Mackenzie wrote:
> >> For years and years now, I've been thinking we just need more
> >> deterministic parser-and-based mode support, and I still think that, but
> >> on a realistic level, that doesn't seem to be coming any time soon.

> I note that Tree Sitter integration is in active development now...

Yes.  At some time in the future it will work, presumably well.

> >> In the meantime, is there any general approach we might be able to use
> >> to get stuff like the attached to stop happening?
> > Here, "stuff like the attached" was having some types correctly
> > fontified, others not.  This was due to the order, somewhat random, in
> > which a type is recognised as such and entered into a CC Mode table, and
> > its use being scanned in a jit-lock chunk.

> > The following patch is an attempt to improve this situation.

> I think we are inevitably hitting the wall here, because it is not 
> possible to parse a context free grammar with regular expressions. One 
> can only move around the suck, one can't actually remove it without 
> parsing the underlying language.

I'm not aiming at perfection.  It's a fairly simple hack whose aim is to
reduce the level of Daniel's (and others') irritation.  I think there's
now a general understanding that parsing the language is needed for
accurate fontification (and indentation).  But that is some way off,
yet.

> Perry

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: [PATCH] Re: cc-mode fontification feels random
  2021-08-30 18:50 ` [PATCH] " Alan Mackenzie
  2021-08-30 19:03   ` Perry E. Metzger
@ 2021-08-30 19:25   ` Eli Zaretskii
  2021-08-30 19:28     ` Daniel Colascione
  2021-08-30 20:03     ` Alan Mackenzie
  1 sibling, 2 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-08-30 19:25 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: dancol, emacs-devel

> Date: Mon, 30 Aug 2021 18:50:34 +0000
> From: Alan Mackenzie <acm@muc.de>
> Cc: emacs-devel@gnu.org
> 
> The following patch is an attempt to improve this situation.  It is best
> used with jit-stealth-lock enabled.  I have tried it out with the
> following settings:
> 
>     jit-lock-stealth-load: 200 ; i.e. inactive.
>     jit-lock-stealth-nice: 0.1 ; 100 ms between fontifying stealth
>                                  chunks.
>     jit-lock-stealth-time: 1   ; 1 second idle time before stealth kicks
>                                  in.
> 
> Whenever a new found type is entered into the CC Mode table, it marks
> all occurrences of the type in the buffer for fontification (by setting
> the 'fontified text property to nil on it), and causes an immediate
> redisplay when there are occurrences of the new type in a window.

So you are saying that this will cause the display of the visible
portion of the window to flicker whenever jit-lock-stealth finds such
a "new type"?  That could annoy, can't it?  jit-lock-stealth is for
fontifying portions of the buffer(s) that are not on display, it would
be wrong for it to apply this enhancement, I think, certainly by
default.

And jit-lock-stealth-time of 1 sec is too short.  I use 16, because
once jit-lock-stealth starts fontifying a chunk, Emacs can be
relatively slow to react to keyboard input, so I prefer to let
jit-lock-stealth start its thing only when there's a very good chance
I indeed stopped typing, not just thinking about something for a
second or two.

> I think stealth lock could be enhanced by having it fontify several
> 500-byte chunks together, say until 0.05s time has been taken up.  This
> could speed up stealth fontification while still leaving the keyboard
> responsive to the user.

Users can already arrange for that by manipulating the
jit-stealth-lock parameters.  Why should we change code to force upon
everyone what seems like a good idea to you (but isn't a good idea
IME, see above)?  If you want that, you can easily arrange for Emacs
to behave like that without changing any code.

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: [PATCH] Re: cc-mode fontification feels random
  2021-08-30 19:25   ` Eli Zaretskii
@ 2021-08-30 19:28     ` Daniel Colascione
  2021-08-30 19:37       ` Eli Zaretskii
  2021-08-30 20:11       ` Stefan Monnier
  2021-08-30 20:03     ` Alan Mackenzie
  1 sibling, 2 replies; 274+ messages in thread
From: Daniel Colascione @ 2021-08-30 19:28 UTC (permalink / raw)
  To: Eli Zaretskii, Alan Mackenzie; +Cc: emacs-devel

On 8/30/21 12:25 PM, Eli Zaretskii wrote:
>> Date: Mon, 30 Aug 2021 18:50:34 +0000
>> From: Alan Mackenzie <acm@muc.de>
>> Cc: emacs-devel@gnu.org
>>
>> The following patch is an attempt to improve this situation.  It is best
>> used with jit-stealth-lock enabled.  I have tried it out with the
>> following settings:
>>
>>      jit-lock-stealth-load: 200 ; i.e. inactive.
>>      jit-lock-stealth-nice: 0.1 ; 100 ms between fontifying stealth
>>                                   chunks.
>>      jit-lock-stealth-time: 1   ; 1 second idle time before stealth kicks
>>                                   in.
>>
>> Whenever a new found type is entered into the CC Mode table, it marks
>> all occurrences of the type in the buffer for fontification (by setting
>> the 'fontified text property to nil on it), and causes an immediate
>> redisplay when there are occurrences of the new type in a window.
> So you are saying that this will cause the display of the visible
> portion of the window to flicker whenever jit-lock-stealth finds such
> a "new type"?  That could annoy, can't it?  jit-lock-stealth is for
> fontifying portions of the buffer(s) that are not on display, it would
> be wrong for it to apply this enhancement, I think, certainly by
> default.

Any literal "flicker" issues should have been fixed for a long time now. 
What do you mean? That the fontification might change? That happens anyway.


>
> And jit-lock-stealth-time of 1 sec is too short.  I use 16, because
> once jit-lock-stealth starts fontifying a chunk, Emacs can be
> relatively slow to react to keyboard input, so I prefer to let
> jit-lock-stealth start its thing only when there's a very good chance
> I indeed stopped typing, not just thinking about something for a
> second or two.

Isn't this problem what while-no-input is intended to prevent?

>> I think stealth lock could be enhanced by having it fontify several
>> 500-byte chunks together, say until 0.05s time has been taken up.  This
>> could speed up stealth fontification while still leaving the keyboard
>> responsive to the user.
> Users can already arrange for that by manipulating the
> jit-stealth-lock parameters.  Why should we change code to force upon
> everyone what seems like a good idea to you (but isn't a good idea
> IME, see above)?  If you want that, you can easily arrange for Emacs
> to behave like that without changing any code.

Because good defaults matter, and if we have proper input preemption of 
stealth JIT working, there should be no downside.





^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: [PATCH] Re: cc-mode fontification feels random
  2021-08-30 19:28     ` Daniel Colascione
@ 2021-08-30 19:37       ` Eli Zaretskii
  2021-08-30 20:11       ` Stefan Monnier
  1 sibling, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-08-30 19:37 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: acm, emacs-devel

> Cc: emacs-devel@gnu.org
> From: Daniel Colascione <dancol@dancol.org>
> Date: Mon, 30 Aug 2021 12:28:01 -0700
> 
> > So you are saying that this will cause the display of the visible
> > portion of the window to flicker whenever jit-lock-stealth finds such
> > a "new type"?  That could annoy, can't it?  jit-lock-stealth is for
> > fontifying portions of the buffer(s) that are not on display, it would
> > be wrong for it to apply this enhancement, I think, certainly by
> > default.
> 
> Any literal "flicker" issues should have been fixed for a long time now. 
> What do you mean? That the fontification might change? That happens anyway.

It doesn't happen with jit-stealth-lock now, because it only handles
portions of the buffer that are not displayed.  The portions that are
displayed get fontified before they are shown in the window.

> > And jit-lock-stealth-time of 1 sec is too short.  I use 16, because
> > once jit-lock-stealth starts fontifying a chunk, Emacs can be
> > relatively slow to react to keyboard input, so I prefer to let
> > jit-lock-stealth start its thing only when there's a very good chance
> > I indeed stopped typing, not just thinking about something for a
> > second or two.
> 
> Isn't this problem what while-no-input is intended to prevent?

while-no-input doesn't cause an immediate interruption of a running
Lisp code, as you well know.

> > Users can already arrange for that by manipulating the
> > jit-stealth-lock parameters.  Why should we change code to force upon
> > everyone what seems like a good idea to you (but isn't a good idea
> > IME, see above)?  If you want that, you can easily arrange for Emacs
> > to behave like that without changing any code.
> 
> Because good defaults matter, and if we have proper input preemption of 
> stealth JIT working, there should be no downside.

I invite you to try.  I use jit-stealth-lock all the time, and that is
my experience.  With it running, Emacs sometimes is slow to respond to
keyboard input.  "Slow" relatively, of course: it takes less than a
second even in the slow cases, but that's already somewhat annoying.
Which is why I avoid letting it run when I'm still typing, albeit
slowly.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: [PATCH] Re: cc-mode fontification feels random
  2021-08-30 19:25   ` Eli Zaretskii
  2021-08-30 19:28     ` Daniel Colascione
@ 2021-08-30 20:03     ` Alan Mackenzie
  2021-08-31 11:53       ` Eli Zaretskii
  1 sibling, 1 reply; 274+ messages in thread
From: Alan Mackenzie @ 2021-08-30 20:03 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dancol, emacs-devel

Hello, Eli.

On Mon, Aug 30, 2021 at 22:25:31 +0300, Eli Zaretskii wrote:
> > Date: Mon, 30 Aug 2021 18:50:34 +0000
> > From: Alan Mackenzie <acm@muc.de>
> > Cc: emacs-devel@gnu.org

> > The following patch is an attempt to improve this situation.  It is best
> > used with jit-stealth-lock enabled.  I have tried it out with the
> > following settings:

> >     jit-lock-stealth-load: 200 ; i.e. inactive.
> >     jit-lock-stealth-nice: 0.1 ; 100 ms between fontifying stealth
> >                                  chunks.
> >     jit-lock-stealth-time: 1   ; 1 second idle time before stealth kicks
> >                                  in.

> > Whenever a new found type is entered into the CC Mode table, it marks
> > all occurrences of the type in the buffer for fontification (by setting
> > the 'fontified text property to nil on it), and causes an immediate
> > redisplay when there are occurrences of the new type in a window.

> So you are saying that this will cause the display of the visible
> portion of the window to flicker whenever jit-lock-stealth finds such
> a "new type"?

There will be a one-time change from no fontification of foo to
font-lock-type-face.  The actual area of the screen getting refontified
is one declaration.

> That could annoy, can't it?

It might.  It didn't annoy me whilst trying it out.  The real question
is, will it annoy less than having types permanently unfontified.

> jit-lock-stealth is for fontifying portions of the buffer(s) that are
> not on display, it would be wrong for it to apply this enhancement, I
> think, certainly by default.

jit-lock has a presumption that the fontification of one part of a buffer
doesn't influence the fontification of another part.  This isn't the case
in CC Mode.

My impression is that jit-lock isn't much used.  It's disabled by
default.  What I'm envisaging is repurposing it to solve a real problem.

> And jit-lock-stealth-time of 1 sec is too short.  I use 16, because
> once jit-lock-stealth starts fontifying a chunk, Emacs can be
> relatively slow to react to keyboard input, so I prefer to let
> jit-lock-stealth start its thing only when there's a very good chance
> I indeed stopped typing, not just thinking about something for a
> second or two.

On my first try this evening with jit-lock, I set jit-lock-stealth-nice
to zero.  I completely lost control of my session, and hat to abort it
and restart.  But with the parameters set appropriately, can stealth
really make Emacs slow to react?

> > I think stealth lock could be enhanced by having it fontify several
> > 500-byte chunks together, say until 0.05s time has been taken up.  This
> > could speed up stealth fontification while still leaving the keyboard
> > responsive to the user.

> Users can already arrange for that by manipulating the
> jit-stealth-lock parameters.

I didn't see that this evening.  Stealth fontifies one chunk at a time,
then waits the "nice" time before fontifying the next chunk.  The time
for stealth to get through xdisp.c was several minutes, possibly even
many minutes.  This is unimportant for general background fontification,
but more important if using stealth to detect types.

> Why should we change code to force upon everyone what seems like a good
> idea to you (but isn't a good idea IME, see above)?

Well, I was thinking of these enhancements to stealth being more of an
option than forced upon people.  For example, a new variable
jit-lock-stealth-time-block could be nil by default, or set to 0.05 to
fontify for 0.05 seconds before attending to user input.

0.05 seconds is enough for about 5 500-byte chunks in CC Mode on my
machine.  That has the potential to speed up stealth fontification by a
factor a little less than 5.

> If you want that, you can easily arrange for Emacs to behave like that
> without changing any code.

I didn't see any way of doing it while looking at the code earlier on.
What exactly are you thinking of, here?

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: [PATCH] Re: cc-mode fontification feels random
  2021-08-30 19:28     ` Daniel Colascione
  2021-08-30 19:37       ` Eli Zaretskii
@ 2021-08-30 20:11       ` Stefan Monnier
  2021-08-31 10:54         ` Alan Mackenzie
  2021-08-31 13:18         ` [PATCH] " Eli Zaretskii
  1 sibling, 2 replies; 274+ messages in thread
From: Stefan Monnier @ 2021-08-30 20:11 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: Eli Zaretskii, Alan Mackenzie, emacs-devel

> Because good defaults matter, and if we have proper input preemption of
> stealth JIT working, there should be no downside.

`jit-lock-stealth-time` defaults to nil (i.e. stealth fontification is
disabled by default) for a good reason: it used to be enabled.

The known downsides are:
- we don't have good input preemption
- it eats up your battery with no clear benefit
  [ Because you end up re-fontifying the whole rest of the buffer after
    every buffer modification.  ]

If we want to keep the buffers fully fontified, then it's crucial to make
sure every tiny buffer modification doesn't force re-fontifying most of the
rest of the buffer.


        Stefan




^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: [PATCH] Re: cc-mode fontification feels random
  2021-08-30 20:11       ` Stefan Monnier
@ 2021-08-31 10:54         ` Alan Mackenzie
  2021-08-31 13:23           ` Eli Zaretskii
  2021-08-31 18:56           ` Stefan Monnier
  2021-08-31 13:18         ` [PATCH] " Eli Zaretskii
  1 sibling, 2 replies; 274+ messages in thread
From: Alan Mackenzie @ 2021-08-31 10:54 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, Daniel Colascione, emacs-devel

Hello, Stefan.

On Mon, Aug 30, 2021 at 16:11:18 -0400, Stefan Monnier wrote:
> > Because good defaults matter, and if we have proper input preemption of
> > stealth JIT working, there should be no downside.

> `jit-lock-stealth-time` defaults to nil (i.e. stealth fontification is
> disabled by default) for a good reason: it used to be enabled.

> The known downsides are:
> - we don't have good input preemption
> - it eats up your battery with no clear benefit
>   [ Because you end up re-fontifying the whole rest of the buffer after
>     every buffer modification.  ]

Ah, so that's why stealth was disabled by default - the battery.

For my current project (finding CC Mode "found types") stealth
fontification is not ideal - what I need is a single pass through the
whole buffer to detect the types, and that should be as fast as
possible.  The current stealth fontification is not fast (it doesn't
need to be) and carries on working until Emacs is shut down.

So, it seems I want something like stealth, but not quite.  How about,
say jit-lock-single-fontification - it would apply to individual buffers
only, would scan through the buffer precisely once, and would do just
enough 500-byte chunks at at time to take 0.05 seconds (configurable).

On my machine in C Mode, approximately 100 chunks are fontified per
second.  So if we had jit-lock-single-nice at 0.1 seconds, the time
taken to scan xdisp.c would be approximately 1 to 1.5 minutes.  On a
smaller file, say syntax.c (a tenth of the size) it would take 5 - 10
seconds.  This is barely worse than context fontification, and would
only happen at mode start up.

> If we want to keep the buffers fully fontified, then it's crucial to make
> sure every tiny buffer modification doesn't force re-fontifying most of the
> rest of the buffer.

Yes.  This feels like a difficult problem, otherwise it would have been
done a long time ago.

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: [PATCH] Re: cc-mode fontification feels random
  2021-08-30 20:03     ` Alan Mackenzie
@ 2021-08-31 11:53       ` Eli Zaretskii
  0 siblings, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-08-31 11:53 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: dancol, emacs-devel

> Date: Mon, 30 Aug 2021 20:03:01 +0000
> Cc: dancol@dancol.org, emacs-devel@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> > So you are saying that this will cause the display of the visible
> > portion of the window to flicker whenever jit-lock-stealth finds such
> > a "new type"?
> 
> There will be a one-time change from no fontification of foo to
> font-lock-type-face.  The actual area of the screen getting refontified
> is one declaration.
> 
> > That could annoy, can't it?
> 
> It might.  It didn't annoy me whilst trying it out.  The real question
> is, will it annoy less than having types permanently unfontified.

I don't think we will know soon enough.  Which is why I think this
should be optional behavior.

> > jit-lock-stealth is for fontifying portions of the buffer(s) that are
> > not on display, it would be wrong for it to apply this enhancement, I
> > think, certainly by default.
> 
> jit-lock has a presumption that the fontification of one part of a buffer
> doesn't influence the fontification of another part.

More accurately, that fontification of some part doesn't affect the
parts of the buffer before that.

> My impression is that jit-lock isn't much used.  It's disabled by
> default.  What I'm envisaging is repurposing it to solve a real problem.

As one user who uses jit-stealth-lock all the time, I don't think it's
wise to make jit-stealth-lock do this additional job by default.  They
are two separate jobs, and the optimal values of parameters that
control jit-stealth-lock are different for each job.  So this
re-purposing shouldn't be unconditional, IMO.

> > And jit-lock-stealth-time of 1 sec is too short.  I use 16, because
> > once jit-lock-stealth starts fontifying a chunk, Emacs can be
> > relatively slow to react to keyboard input, so I prefer to let
> > jit-lock-stealth start its thing only when there's a very good chance
> > I indeed stopped typing, not just thinking about something for a
> > second or two.
> 
> On my first try this evening with jit-lock, I set jit-lock-stealth-nice
> to zero.  I completely lost control of my session, and hat to abort it
> and restart.

Of course, you aren't supposed to do that.

> But with the parameters set appropriately, can stealth really make
> Emacs slow to react?

IME, yes, because we our way of interrupting a Lisp program is not
quick enough in some real-life situations.

> Well, I was thinking of these enhancements to stealth being more of an
> option than forced upon people.  For example, a new variable
> jit-lock-stealth-time-block could be nil by default, or set to 0.05 to
> fontify for 0.05 seconds before attending to user input.

IME, 0.05 sec is already borderline for good responsiveness UX.

> > If you want that, you can easily arrange for Emacs to behave like that
> > without changing any code.
> 
> I didn't see any way of doing it while looking at the code earlier on.
> What exactly are you thinking of, here?

Lower jit-lock-stealth-nice and jit-lock-stealth-time to small
values.  Optionally, enlarge jit-lock-chunk-size as well.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: [PATCH] Re: cc-mode fontification feels random
  2021-08-30 20:11       ` Stefan Monnier
  2021-08-31 10:54         ` Alan Mackenzie
@ 2021-08-31 13:18         ` Eli Zaretskii
  1 sibling, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-08-31 13:18 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: acm, dancol, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Eli Zaretskii <eliz@gnu.org>,  Alan Mackenzie <acm@muc.de>,
>  emacs-devel@gnu.org
> Date: Mon, 30 Aug 2021 16:11:18 -0400
> 
> `jit-lock-stealth-time` defaults to nil (i.e. stealth fontification is
> disabled by default) for a good reason: it used to be enabled.
> 
> The known downsides are:
> - we don't have good input preemption
> - it eats up your battery with no clear benefit
>   [ Because you end up re-fontifying the whole rest of the buffer after
>     every buffer modification.  ]

Regarding the second point: I'd like to put it back into the right
perspective, as someone who has jit-lock-stealth turned on all the
time.

Yes, editing a buffer will trigger stealth fontification.  But:

  . Unless you make jit-lock-stealth-time very small, fontifications
    only kick in if you stop editing, so they are unlikely to be as
    wasteful as it could sound, definitely not "after every buffer
    modification".
  . It is very rare to be editing many buffers at the same time.
    Typically, you edit a couple, and leave the rest alone,
    unmodified.  E.g., my current session has more than 400 buffers,
    and that's not too many by Emacs standards, most of them are
    almost never edited.

The upside, of course, is that my buffers are always fontified, and
scrolling through them is much faster.  And I seldom if ever use
laptops for development.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: [PATCH] Re: cc-mode fontification feels random
  2021-08-31 10:54         ` Alan Mackenzie
@ 2021-08-31 13:23           ` Eli Zaretskii
  2021-08-31 16:02             ` Alan Mackenzie
  2021-08-31 18:56           ` Stefan Monnier
  1 sibling, 1 reply; 274+ messages in thread
From: Eli Zaretskii @ 2021-08-31 13:23 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: dancol, monnier, emacs-devel

> Date: Tue, 31 Aug 2021 10:54:23 +0000
> Cc: Daniel Colascione <dancol@dancol.org>, Eli Zaretskii <eliz@gnu.org>,
>   emacs-devel@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> For my current project (finding CC Mode "found types") stealth
> fontification is not ideal - what I need is a single pass through the
> whole buffer to detect the types, and that should be as fast as
> possible.  The current stealth fontification is not fast (it doesn't
> need to be) and carries on working until Emacs is shut down.

The purpose of jit-lock-stealth is different, that's why.

> So, it seems I want something like stealth, but not quite.

Yes, you want a different feature.

> How about, say jit-lock-single-fontification - it would apply to
> individual buffers only, would scan through the buffer precisely
> once, and would do just enough 500-byte chunks at at time to take
> 0.05 seconds (configurable).
> 
> On my machine in C Mode, approximately 100 chunks are fontified per
> second.  So if we had jit-lock-single-nice at 0.1 seconds, the time
> taken to scan xdisp.c would be approximately 1 to 1.5 minutes.  On a
> smaller file, say syntax.c (a tenth of the size) it would take 5 - 10
> seconds.  This is barely worse than context fontification, and would
> only happen at mode start up.

What happens if you unleash this jit-lock-single-fontification, and
then type at a relatively slow pace: does Emacs still feel responsive
enough?  And how much of CPU does it use in that case?



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: [PATCH] Re: cc-mode fontification feels random
  2021-08-31 13:23           ` Eli Zaretskii
@ 2021-08-31 16:02             ` Alan Mackenzie
  2021-08-31 16:21               ` Eli Zaretskii
  0 siblings, 1 reply; 274+ messages in thread
From: Alan Mackenzie @ 2021-08-31 16:02 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dancol, monnier, emacs-devel

Hello, Eli.

On Tue, Aug 31, 2021 at 16:23:08 +0300, Eli Zaretskii wrote:
> > Date: Tue, 31 Aug 2021 10:54:23 +0000
> > Cc: Daniel Colascione <dancol@dancol.org>, Eli Zaretskii <eliz@gnu.org>,
> >   emacs-devel@gnu.org
> > From: Alan Mackenzie <acm@muc.de>

> > For my current project (finding CC Mode "found types") stealth
> > fontification is not ideal - what I need is a single pass through the
> > whole buffer to detect the types, and that should be as fast as
> > possible.  The current stealth fontification is not fast (it doesn't
> > need to be) and carries on working until Emacs is shut down.

> The purpose of jit-lock-stealth is different, that's why.

> > So, it seems I want something like stealth, but not quite.

> Yes, you want a different feature.

> > How about, say jit-lock-single-fontification - it would apply to
> > individual buffers only, would scan through the buffer precisely
> > once, and would do just enough 500-byte chunks at at time to take
> > 0.05 seconds (configurable).

> > On my machine in C Mode, approximately 100 chunks are fontified per
> > second.  So if we had jit-lock-single-nice at 0.1 seconds, the time
> > taken to scan xdisp.c would be approximately 1 to 1.5 minutes.  On a
> > smaller file, say syntax.c (a tenth of the size) it would take 5 - 10
> > seconds.  This is barely worse than context fontification, and would
> > only happen at mode start up.

> What happens if you unleash this jit-lock-single-fontification, and
> then type at a relatively slow pace: does Emacs still feel responsive
> enough?  And how much of CPU does it use in that case?

I'm guessing at the moment, but I'm reckoning that if
jit-lock-single-fontification uses (marginally over) 0.05s at a time,
followed by (at least) 0.1s break, normal typing at up to ten characters
per second shouldn't be affected.  Am I being optimistic here?  I think
you said in another post that your system's response with stealth
sometimes is "under a second".  That sounds like "not very much" under a
second, which is sluggish and something we want to avoid.

As for how much CPU, then I think it would use one third of a single
core's CPU time on an otherwise idle Emacs.  That's 0.05s strenuous
activity followed by 0.1s break.

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: [PATCH] Re: cc-mode fontification feels random
  2021-08-31 16:02             ` Alan Mackenzie
@ 2021-08-31 16:21               ` Eli Zaretskii
  2021-08-31 16:46                 ` Alan Mackenzie
  0 siblings, 1 reply; 274+ messages in thread
From: Eli Zaretskii @ 2021-08-31 16:21 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: dancol, monnier, emacs-devel

> Date: Tue, 31 Aug 2021 16:02:45 +0000
> Cc: monnier@iro.umontreal.ca, dancol@dancol.org, emacs-devel@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> > What happens if you unleash this jit-lock-single-fontification, and
> > then type at a relatively slow pace: does Emacs still feel responsive
> > enough?  And how much of CPU does it use in that case?
> 
> I'm guessing at the moment

Well, my point, if it wasn't clear, was to suggest that you or someone
else actually try that and see what happens and how does it feel.
Emacs is a complex beast, so the results might surprise us all.

> but I'm reckoning that if
> jit-lock-single-fontification uses (marginally over) 0.05s at a time,
> followed by (at least) 0.1s break, normal typing at up to ten characters
> per second shouldn't be affected.

Not sure where did the 0.05 sec number come from.  You want to limit
the fontification of a chunk to that time, but we don't have an
efficient method of doing that, because Emacs is not an
interrupt-driven program.  So if a Lisp program starts some heavy
processing, it will only be able to stop when it gets to looking at
how much time passed, or tests some flag set by a signal handler.
There can be no guarantee this can never exceed 50 msec.

> As for how much CPU, then I think it would use one third of a single
> core's CPU time on an otherwise idle Emacs.  That's 0.05s strenuous
> activity followed by 0.1s break.

30% of an execution unit is not negligible.  When I see something like
that on my system that I presume should be idle, I start looking for
an offender.  And I'm not sure the 30% estimate is accurate.  Once
again, let's measure it.

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: [PATCH] Re: cc-mode fontification feels random
  2021-08-31 16:21               ` Eli Zaretskii
@ 2021-08-31 16:46                 ` Alan Mackenzie
  2021-08-31 17:02                   ` Eli Zaretskii
  0 siblings, 1 reply; 274+ messages in thread
From: Alan Mackenzie @ 2021-08-31 16:46 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dancol, monnier, emacs-devel

Hello, Eli.

On Tue, Aug 31, 2021 at 19:21:40 +0300, Eli Zaretskii wrote:
> > Date: Tue, 31 Aug 2021 16:02:45 +0000
> > Cc: monnier@iro.umontreal.ca, dancol@dancol.org, emacs-devel@gnu.org
> > From: Alan Mackenzie <acm@muc.de>

> > > What happens if you unleash this jit-lock-single-fontification, and
> > > then type at a relatively slow pace: does Emacs still feel responsive
> > > enough?  And how much of CPU does it use in that case?

> > I'm guessing at the moment

> Well, my point, if it wasn't clear, was to suggest that you or someone
> else actually try that and see what happens and how does it feel.
> Emacs is a complex beast, so the results might surprise us all.

Yes, you were clear, sorry I was not as clear in my reply.  I'm
currently working out how to implement it.  It shouldn't be too
difficult or take too long.

> > but I'm reckoning that if jit-lock-single-fontification uses
> > (marginally over) 0.05s at a time, followed by (at least) 0.1s
> > break, normal typing at up to ten characters per second shouldn't be
> > affected.

> Not sure where did the 0.05 sec number come from.

I remember reading somewhere that this is the approximate length of time
which is effectively instantaneous to the human consciousness.  It would
in any case be a configurable option.

> You want to limit the fontification of a chunk to that time, but we
> don't have an efficient method of doing that, because Emacs is not an
> interrupt-driven program.

On my 4 year old HW, approximately 5 CC Mode chunks fit into that time,
on average.  I think the way to do it is to fontify a chunk at a time,
and if the 0.05s hasn't yet been used up, fontify another one, and so
on.  This should work fine on a modern machine, it might cause
sluggishness on an older slower machine, which would be fontifying a
single chunk which might take, say, 0.1s or 0.2s.  This might
necessitate different settings than for a fast machine.

> So if a Lisp program starts some heavy processing, it will only be
> able to stop when it gets to looking at how much time passed, or tests
> some flag set by a signal handler.  There can be no guarantee this can
> never exceed 50 msec.

No.  But doing a chunk at a time, checking the clock after each chunk,
will probably be granular enough.  Again, lets try it and see.

> > As for how much CPU, then I think it would use one third of a single
> > core's CPU time on an otherwise idle Emacs.  That's 0.05s strenuous
> > activity followed by 0.1s break.

> 30% of an execution unit is not negligible.  When I see something like
> that on my system that I presume should be idle, I start looking for
> an offender.  And I'm not sure the 30% estimate is accurate.  Once
> again, let's measure it.

Indeed!

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: [PATCH] Re: cc-mode fontification feels random
  2021-08-31 16:46                 ` Alan Mackenzie
@ 2021-08-31 17:02                   ` Eli Zaretskii
  0 siblings, 0 replies; 274+ messages in thread
From: Eli Zaretskii @ 2021-08-31 17:02 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: dancol, monnier, emacs-devel

> Date: Tue, 31 Aug 2021 16:46:15 +0000
> Cc: monnier@iro.umontreal.ca, dancol@dancol.org, emacs-devel@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> > You want to limit the fontification of a chunk to that time, but we
> > don't have an efficient method of doing that, because Emacs is not an
> > interrupt-driven program.
> 
> On my 4 year old HW, approximately 5 CC Mode chunks fit into that time,
> on average.  I think the way to do it is to fontify a chunk at a time,
> and if the 0.05s hasn't yet been used up, fontify another one, and so
> on.

It mostly does, but sometimes doesn't.  We all know that there are
some chunks of C code whose fontification takes a long time.  IME,
this happens frequently enough (judging by the time it sometimes takes
Emacs to respond to a keypress) to be mildly annoying.

> > So if a Lisp program starts some heavy processing, it will only be
> > able to stop when it gets to looking at how much time passed, or tests
> > some flag set by a signal handler.  There can be no guarantee this can
> > never exceed 50 msec.
> 
> No.  But doing a chunk at a time, checking the clock after each chunk,
> will probably be granular enough.

Until you bump into a chunk where it doesn't.

> Again, lets try it and see.

Yes, let's.



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: [PATCH] Re: cc-mode fontification feels random
  2021-08-31 10:54         ` Alan Mackenzie
  2021-08-31 13:23           ` Eli Zaretskii
@ 2021-08-31 18:56           ` Stefan Monnier
  2021-08-31 21:17             ` Alan Mackenzie
  1 sibling, 1 reply; 274+ messages in thread
From: Stefan Monnier @ 2021-08-31 18:56 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Daniel Colascione, Eli Zaretskii, emacs-devel

> So, it seems I want something like stealth, but not quite.  How about,
> say jit-lock-single-fontification - it would apply to individual buffers
> only, would scan through the buffer precisely once, and would do just
> enough 500-byte chunks at at time to take 0.05 seconds (configurable).

I still don't think it's a good use of resources: fontifying the whole
buffer is usually completely unnecessary because most of that buffer
will never be displayed, and because we can fontifying them fast enough
that doing it ahead of time won't save us enough time when we do display
those parts.

[ Typical concrete problems show up when you visit a hundred files
  (e.g. for a search&replace, or when loading your desktop.el) after which
  your Emacs stays busy for a long time fontifying all those buffers.  ]

For my own use, the benefit of "correct" highlighting of types is not
worth any effort at all.  Maybe some users would enjoy the improved
highlighting of your new code, but if so I think you'd be better off
running an ad-hoc timer that does nothing else than scan for type
declarations (without doing the rest of font-lock).  It'll be faster and
you won't need to care about what jit-lock does.

        Stefan

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: [PATCH] Re: cc-mode fontification feels random
  2021-08-31 18:56           ` Stefan Monnier
@ 2021-08-31 21:17             ` Alan Mackenzie
  2021-08-31 21:47               ` Stefan Monnier
  2021-10-22 20:13               ` [Committed PATCH] " Alan Mackenzie
  0 siblings, 2 replies; 274+ messages in thread
From: Alan Mackenzie @ 2021-08-31 21:17 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, Daniel Colascione, emacs-devel

Hello, Stefan.

On Tue, Aug 31, 2021 at 14:56:13 -0400, Stefan Monnier wrote:
> > So, it seems I want something like stealth, but not quite.  How about,
> > say jit-lock-single-fontification - it would apply to individual buffers
> > only, would scan through the buffer precisely once, and would do just
> > enough 500-byte chunks at at time to take 0.05 seconds (configurable).

> I still don't think it's a good use of resources: fontifying the whole
> buffer is usually completely unnecessary because most of that buffer
> will never be displayed, and because we can fontifying them fast enough
> that doing it ahead of time won't save us enough time when we do display
> those parts.

It is the incorrectness of the display of some CC Mode types, rather
than the speed of fontification which is the issue here.

> [ Typical concrete problems show up when you visit a hundred files
>   (e.g. for a search&replace, or when loading your desktop.el) after which
>   your Emacs stays busy for a long time fontifying all those buffers.  ]

It may be busy, but it remains responsive, as in stealth fontification.
Is this really a problem, as long as one's running on mains power, not a
battery.

> For my own use, the benefit of "correct" highlighting of types is not
> worth any effort at all.

I envisage this facility being enabled by a user option.

> Maybe some users would enjoy the improved highlighting of your new
> code, ....

Daniel Colascione, the OP of this thread, most assuredly would.

> .... but if so I think you'd be better off running an ad-hoc timer
> that does nothing else than scan for type declarations (without doing
> the rest of font-lock).  It'll be faster and you won't need to care
> about what jit-lock does.

Thanks, that's a brilliant idea!  It will be somewhat faster rather than
much faster (because parsing declarations is what sucks up the time in
CC Mode's fontification), but as you say, assuming most parts of most
buffers never get displayed, it will be a net win.

It will be more work to code up, though.  I'm currently quite some way
into hacking up jit-lock-single-fontification, so I'm going to get that
working first to see how well it works.  Then I'll hack up your idea,
and confirm it works better.

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: [PATCH] Re: cc-mode fontification feels random
  2021-08-31 21:17             ` Alan Mackenzie
@ 2021-08-31 21:47               ` Stefan Monnier
  2021-10-22 20:13               ` [Committed PATCH] " Alan Mackenzie
  1 sibling, 0 replies; 274+ messages in thread
From: Stefan Monnier @ 2021-08-31 21:47 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Daniel Colascione, Eli Zaretskii, emacs-devel

> It may be busy, but it remains responsive, as in stealth fontification.

Even if it remains responsive, it's less responsive in my experience.

And it's using more CPU power, which can be a problem because it reduces
the battery's lifetime, or because it increases the temperature and
hence the fan speed and hence the noise, or because it slows down
other tasks.

There are cases where it can be beneficial overall (which is why Eli
enables it), but definitely not in my use cases.
And I think enabling it by default would be a bad asking for trouble.

>> Maybe some users would enjoy the improved highlighting of your new
>> code, ....
> Daniel Colascione, the OP of this thread, most assuredly would.

Maybe, maybe not.  Maybe he'd still find it too annoying that `foo` is
highlighted as a type when `bar` isn't just because `bar` is defined
in another file.

        Stefan

^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: [Committed PATCH] Re: cc-mode fontification feels random
  2021-08-31 21:17             ` Alan Mackenzie
  2021-08-31 21:47               ` Stefan Monnier
@ 2021-10-22 20:13               ` Alan Mackenzie
  2021-10-24 20:18                 ` Alan Mackenzie
  1 sibling, 1 reply; 274+ messages in thread
From: Alan Mackenzie @ 2021-10-22 20:13 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, Daniel Colascione, emacs-devel

Hello, Stefan.

On Tue, Aug 31, 2021 at 21:17:12 +0000, Alan Mackenzie wrote:
> On Tue, Aug 31, 2021 at 14:56:13 -0400, Stefan Monnier wrote:
> > > So, it seems I want something like stealth, but not quite.  How about,
> > > say jit-lock-single-fontification - it would apply to individual buffers
> > > only, would scan through the buffer precisely once, and would do just
> > > enough 500-byte chunks at at time to take 0.05 seconds (configurable).

[ .... ]

> It is the incorrectness of the display of some CC Mode types, rather
> than the speed of fontification which is the issue here.

[ .... ]

> It may be busy, but it remains responsive, as in stealth fontification.
> Is this really a problem, as long as one's running on mains power, not a
> battery.

> > For my own use, the benefit of "correct" highlighting of types is not
> > worth any effort at all.

> I envisage this facility being enabled by a user option.

Indeed, though the option is "opt out" rather than "opt in".  This
facility is only active for a few seconds after initialising a CC Mode
mode.

[ .... ]

> > .... but if so I think you'd be better off running an ad-hoc timer
> > that does nothing else than scan for type declarations (without doing
> > the rest of font-lock).  It'll be faster and you won't need to care
> > about what jit-lock does.

> Thanks, that's a brilliant idea!  It will be somewhat faster rather than
> much faster (because parsing declarations is what sucks up the time in
> CC Mode's fontification), but as you say, assuming most parts of most
> buffers never get displayed, it will be a net win.

> It will be more work to code up, though.  I'm currently quite some way
> into hacking up jit-lock-single-fontification, so I'm going to get that
> working first to see how well it works.

I never finished this.  It doesn't matter any more.

> Then I'll hack up your idea, and confirm it works better.

Well, I've coded it up and committed it to master.  The fontification of
CC Mode types should now be somewhat less random than previously.

> >         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 274+ messages in thread

* Re: [Committed PATCH] Re: cc-mode fontification feels random
  2021-10-22 20:13               ` [Committed PATCH] " Alan Mackenzie
@ 2021-10-24 20:18                 ` Alan Mackenzie
  0 siblings, 0 replies; 274+ messages in thread
From: Alan Mackenzie @ 2021-10-24 20:18 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, Daniel Colascione, emacs-devel

Hello again, Stefan.

I've committed a corrected version, after my commit from Friday caused
Emacs to hang for several minutes.  It probably wants a NEWS item.
Otherwise, I'm hoping it won't be noticed (except by Eli, of course ;-).

"Found types" should now be systematically and correctly fontified, even
if it takes a short while after starting the major mode.

On Fri, Oct 22, 2021 at 20:13:25 +0000, Alan Mackenzie wrote:
> On Tue, Aug 31, 2021 at 21:17:12 +0000, Alan Mackenzie wrote:
> > On Tue, Aug 31, 2021 at 14:56:13 -0400, Stefan Monnier wrote:
> > > > So, it seems I want something like stealth, but not quite.  How about,
> > > > say jit-lock-single-fontification - it would apply to individual buffers
> > > > only, would scan through the buffer precisely once, and would do just
> > > > enough 500-byte chunks at at time to take 0.05 seconds (configurable).

> [ .... ]

> > It is the incorrectness of the display of some CC Mode types, rather
> > than the speed of fontification which is the issue here.

> [ .... ]

> > It may be busy, but it remains responsive, as in stealth fontification.
> > Is this really a problem, as long as one's running on mains power, not a
> > battery.

> > > For my own use, the benefit of "correct" highlighting of types is not
> > > worth any effort at all.

> > I envisage this facility being enabled by a user option.

> Indeed, though the option is "opt out" rather than "opt in".  This
> facility is only active for a few seconds after initialising a CC Mode
> mode.

> [ .... ]

> > > .... but if so I think you'd be better off running an ad-hoc timer
> > > that does nothing else than scan for type declarations (without doing
> > > the rest of font-lock).  It'll be faster and you won't need to care
> > > about what jit-lock does.

> > Thanks, that's a brilliant idea!  It will be somewhat faster rather than
> > much faster (because parsing declarations is what sucks up the time in
> > CC Mode's fontification), but as you say, assuming most parts of most
> > buffers never get displayed, it will be a net win.

> > It will be more work to code up, though.  I'm currently quite some way
> > into hacking up jit-lock-single-fontification, so I'm going to get that
> > working first to see how well it works.

> I never finished this.  It doesn't matter any more.

> > Then I'll hack up your idea, and confirm it works better.

> Well, I've coded it up and committed it to master.  The fontification of
> CC Mode types should now be somewhat less random than previously.

> > >         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 274+ messages in thread

end of thread, other threads:[~2021-10-24 20:18 UTC | newest]

Thread overview: 274+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-04  3:16 cc-mode fontification feels random Daniel Colascione
2021-06-04  6:10 ` Eli Zaretskii
2021-06-04  7:10   ` Theodor Thornhill
2021-06-04 10:08     ` João Távora
2021-06-04 10:39       ` Eli Zaretskii
2021-06-04 10:59         ` Philipp
2021-06-04 11:05           ` João Távora
2021-06-04 11:22             ` Eli Zaretskii
2021-06-04 12:44               ` Dmitry Gutov
2021-06-04 13:46               ` João Távora
2021-06-04 14:11                 ` Eli Zaretskii
2021-06-04 11:18           ` Eli Zaretskii
2021-06-04 16:43       ` Jim Porter
     [not found]         ` <83k0n9l9pv.fsf@gnu.org>
2021-06-04 19:41           ` Jim Porter
2021-06-04 19:53             ` Eli Zaretskii
2021-06-04 20:05               ` Jim Porter
2021-06-04 20:11                 ` Joost Kremers
2021-06-05  6:51                   ` Eli Zaretskii
2021-06-05 10:14                     ` Joost Kremers
2021-06-05 11:31                       ` Eli Zaretskii
2021-06-05 12:12                         ` Joost Kremers
2021-06-05 13:23                     ` Stefan Monnier
2021-06-05 17:08                       ` Óscar Fuentes
2021-06-05 17:31                         ` Stefan Monnier
2021-06-05 17:32                         ` Eli Zaretskii
2021-06-05 18:46                     ` João Távora
2021-06-05  6:41                 ` Eli Zaretskii
2021-06-05  9:32                   ` João Távora
2021-06-05  9:59                     ` Ergus
2021-06-05 11:29                       ` Eli Zaretskii
2021-06-05 11:55                         ` Daniel Colascione
2021-06-05 12:27                           ` Eli Zaretskii
2021-06-05 17:59                             ` Jim Porter
2021-06-05 18:56                               ` Daniel Martín
2021-06-05 12:43                         ` Ergus
2021-06-05 13:59                       ` Remote GUI Emacs really works (was: cc-mode fontification feels random) Óscar Fuentes
2021-06-05 11:25                     ` cc-mode fontification feels random Eli Zaretskii
2021-06-05  9:46                   ` Ergus
2021-06-05 11:27                     ` Eli Zaretskii
2021-06-04 20:14               ` Yuri Khan
2021-06-04 10:25     ` Eli Zaretskii
2021-06-04 10:05   ` Daniel Colascione
2021-06-04 10:22     ` Eli Zaretskii
2021-06-04 10:34       ` João Távora
2021-06-04 10:43         ` Eli Zaretskii
2021-06-04 18:25         ` Stefan Monnier
2021-06-04 18:36           ` Daniel Colascione
2021-06-04 19:11             ` Eli Zaretskii
2021-06-04 19:16               ` Daniel Colascione
2021-06-04 19:26                 ` Eli Zaretskii
2021-06-04 19:33                   ` Daniel Colascione
2021-06-04 19:51                     ` Eli Zaretskii
2021-06-05  0:29             ` Stefan Monnier
2021-06-05  6:32               ` Eli Zaretskii
2021-06-04 19:07           ` Eli Zaretskii
2021-06-04 19:26             ` Daniel Colascione
2021-06-04 19:32               ` Eli Zaretskii
2021-06-04 10:41       ` Eli Zaretskii
2021-06-04 10:42 ` Ergus
2021-06-04 15:54 ` Alan Mackenzie
2021-06-04 18:30   ` Daniel Colascione
2021-06-06 11:37     ` Alan Mackenzie
2021-06-06 11:57       ` Eli Zaretskii
2021-06-06 12:27         ` Alan Mackenzie
2021-06-06 12:44           ` Eli Zaretskii
2021-06-06 14:19             ` Alan Mackenzie
2021-06-06 17:06               ` Eli Zaretskii
2021-06-06 17:44       ` Stefan Monnier
2021-06-06 18:00         ` Eli Zaretskii
2021-06-06 18:18           ` Stefan Monnier
2021-06-06 18:33             ` Daniel Colascione
2021-06-06 20:24               ` Stefan Monnier
2021-06-06 20:27                 ` Daniel Colascione
2021-06-06 20:38                   ` Stefan Monnier
2021-06-06 19:03             ` Eli Zaretskii
2021-06-06 20:28               ` Stefan Monnier
2021-06-07  7:35                 ` martin rudalics
2021-06-07 13:20                   ` Stefan Monnier
2021-06-07 13:37                     ` Eli Zaretskii
2021-06-08  0:06                       ` Daniel Colascione
2021-06-08 15:16                       ` Stefan Monnier
2021-06-07 15:58                     ` martin rudalics
2021-06-08  4:01                     ` Richard Stallman
2021-06-08 15:29                       ` Stefan Monnier
2021-06-08 15:52                         ` Eli Zaretskii
2021-06-08 16:36                           ` Stefan Monnier
2021-06-08 18:11                             ` Daniel Colascione
2021-06-08 18:25                               ` Eli Zaretskii
2021-06-08 18:28                                 ` Daniel Colascione
2021-06-08 18:54                                   ` Eli Zaretskii
2021-06-09 18:22                                 ` Alan Mackenzie
2021-06-09 18:36                                   ` Eli Zaretskii
2021-06-09 18:51                                     ` Daniel Colascione
2021-06-09 19:04                                       ` Eli Zaretskii
2021-06-09 20:07                                       ` chad
2021-06-10  6:43                                         ` Eli Zaretskii
2021-06-09 20:17                                       ` Dmitry Gutov
2021-06-09 21:03                                     ` Alan Mackenzie
2021-06-10  2:21                                       ` Daniel Colascione
2021-06-10  6:55                                         ` Eli Zaretskii
2021-06-10  6:58                                           ` Daniel Colascione
2021-06-10  7:19                                             ` Eli Zaretskii
2021-06-10  6:39                                       ` Eli Zaretskii
2021-06-10 16:46                                         ` Alan Mackenzie
2021-06-10 17:01                                           ` Eli Zaretskii
2021-06-10 17:07                                             ` Daniel Colascione
2021-06-10 17:22                                               ` Eli Zaretskii
2021-06-10 17:33                                                 ` Daniel Colascione
2021-06-10 17:39                                                   ` Eli Zaretskii
2021-06-10 17:40                                                 ` Óscar Fuentes
2021-06-10 17:44                                                   ` Eli Zaretskii
2021-06-11 16:11                                                 ` Alan Mackenzie
2021-06-11 17:53                                                   ` Eli Zaretskii
2021-06-11 18:02                                                     ` Daniel Colascione
2021-06-11 18:22                                                       ` Eli Zaretskii
2021-06-11 18:28                                                         ` Daniel Colascione
2021-06-11 19:12                                                           ` Alan Mackenzie
2021-06-11 19:23                                                           ` Eli Zaretskii
2021-06-11 18:47                                                         ` Alan Mackenzie
2021-06-11 19:32                                                           ` Eli Zaretskii
2021-06-11 19:46                                                             ` Alan Mackenzie
2021-06-11 19:50                                                               ` Eli Zaretskii
2021-06-11 18:42                                                       ` Stefan Monnier
2021-06-11 19:31                                                         ` Eli Zaretskii
2021-06-11 19:57                                                           ` Stefan Monnier
2021-06-11 23:25                                                             ` Ergus
2021-06-11 23:52                                                               ` Óscar Fuentes
2021-06-12  1:08                                                                 ` Ergus
2021-06-12  3:20                                                                   ` Stefan Monnier
2021-06-12 11:07                                                                     ` Ergus
2021-06-12  6:58                                                                   ` Eli Zaretskii
2021-06-12 11:01                                                                     ` Ergus
2021-06-12 11:25                                                                       ` Eli Zaretskii
2021-06-12 15:04                                                                         ` Ergus
2021-06-12 15:16                                                                           ` Eli Zaretskii
2021-06-12 15:23                                                                             ` Ergus
2021-06-12 15:35                                                                               ` Eli Zaretskii
2021-06-12 14:00                                                                     ` Stefan Monnier
2021-06-12 14:20                                                                       ` Eli Zaretskii
2021-06-12 14:33                                                                         ` Stefan Monnier
2021-06-12 15:06                                                                           ` Eli Zaretskii
2021-06-12 15:46                                                                             ` Stefan Monnier
2021-06-12  6:50                                                                 ` Eli Zaretskii
2021-06-12  5:20                                                               ` Theodor Thornhill
2021-06-12 13:40                                                                 ` Stefan Monnier
2021-06-12 15:56                                                                   ` Theodor Thornhill
2021-06-12 16:59                                                                     ` Ergus
2021-06-12 17:51                                                                       ` Theodor Thornhill
2021-06-12 17:25                                                                     ` Stefan Monnier
2021-06-12 17:53                                                                       ` Theodor Thornhill
2021-06-12 17:54                                                                       ` Ergus
2021-06-12 18:02                                                                       ` Daniel Colascione
2021-06-12 18:39                                                                         ` Ergus
2021-06-12  6:38                                                             ` Eli Zaretskii
2021-06-12 13:44                                                               ` Stefan Monnier
2021-06-12 14:14                                                                 ` Eli Zaretskii
2021-06-11 20:06                                                           ` Alan Mackenzie
2021-06-12  6:44                                                             ` Eli Zaretskii
2021-06-12  8:00                                                               ` Daniel Colascione
2021-06-12  8:08                                                                 ` Eli Zaretskii
2021-06-12  9:31                                                                   ` Alan Mackenzie
2021-06-11 19:48                                                         ` Eli Zaretskii
2021-06-11 18:34                                                     ` Alan Mackenzie
2021-06-10 17:26                                               ` Óscar Fuentes
2021-06-10 17:39                                               ` andrés ramírez
2021-06-10 21:06                                           ` Stefan Monnier
2021-06-11  6:14                                             ` Eli Zaretskii
2021-06-10 15:16                                       ` Ergus
2021-06-10 15:34                                         ` Óscar Fuentes
2021-06-10 19:06                                           ` Ergus
2021-06-10 19:28                                             ` Eli Zaretskii
2021-06-10 21:56                                               ` Ergus
2021-06-10 15:59                                         ` Jim Porter
2021-06-10 21:02                                         ` Stefan Monnier
2021-06-11 20:21                                           ` Ergus
2021-06-11 20:27                                             ` Stefan Monnier
2021-06-11 20:37                                               ` Daniel Colascione
2021-06-11 20:52                                                 ` Stefan Monnier
2021-06-12  6:46                                                   ` Eli Zaretskii
2021-06-12  8:03                                                     ` Daniel Colascione
2021-06-12  8:13                                                       ` Eli Zaretskii
2021-06-12 13:51                                                       ` Stefan Monnier
2021-06-12  8:47                                                   ` Daniele Nicolodi
2021-06-12  8:57                                                     ` tomas
2021-06-12 14:04                                                     ` Stefan Monnier
2021-06-09 19:05                                   ` Daniel Colascione
2021-06-09 19:11                                     ` Eli Zaretskii
2021-06-09 20:20                                     ` Alan Mackenzie
2021-06-09 20:36                                       ` Stefan Monnier
2021-06-10  7:01                                         ` Daniel Colascione
2021-06-10  7:21                                           ` Eli Zaretskii
2021-06-10  2:21                                       ` Daniel Colascione
2021-06-19  9:25                                         ` Alan Mackenzie
2021-06-19 15:24                                           ` Alan Mackenzie
2021-07-09 14:06                                             ` Daniel Colascione
2021-07-11 18:12                                               ` Stephen Leake
2021-07-15 18:13                                                 ` Perry E. Metzger
2021-07-15 22:43                                                   ` Tree Sitter (was Re: cc-mode fontification feels random) Perry E. Metzger
2021-07-19 23:49                                                     ` Stephen Leake
2021-07-20 14:53                                                       ` Perry E. Metzger
2021-07-21  0:04                                                         ` Stephen Leake
2021-07-21  1:28                                                           ` Stefan Monnier
2021-07-21 14:43                                                             ` Perry E. Metzger
2021-07-21 16:21                                                               ` Daniel Colascione
2021-07-21 19:15                                                                 ` Perry E. Metzger
2021-07-22  1:16                                                                   ` Daniel Colascione
2021-07-22 13:18                                                                     ` Perry E. Metzger
2021-07-22 13:49                                                                     ` Yuan Fu
2021-07-24 20:05                                                                     ` [SPAM UNSURE] " Stephen Leake
2021-07-25  0:41                                                                       ` Daniel Colascione
2021-07-26  4:24                                                                         ` [SPAM UNSURE] " Stephen Leake
2021-07-25 18:01                                                                       ` Perry E. Metzger
2021-07-22 14:00                                                           ` Perry E. Metzger
2021-07-24  1:17                                                             ` Richard Stallman
2021-07-25 16:13                                                               ` Stephen Leake
2021-07-25 19:52                                                                 ` Ada (was Re: Tree Sitter) Perry E. Metzger
2021-07-26  5:05                                                                   ` Stephen Leake
2021-07-26  9:42                                                                     ` Stephen Leake
2021-07-26 14:01                                                                       ` Perry E. Metzger
2021-07-26 13:45                                                                     ` Perry E. Metzger
2021-07-27  0:26                                                                   ` Richard Stallman
2021-07-27 12:38                                                                     ` Perry E. Metzger
2021-07-26  2:23                                                                 ` Tree Sitter (was Re: cc-mode fontification feels random) John Yates
2021-07-24 19:59                                                             ` Stephen Leake
2021-07-24 21:21                                                               ` OFF-TOPIC: Ada availability (was: Tree Sitter) Óscar Fuentes
2021-07-25  7:31                                                                 ` tomas
2021-06-08 18:11                             ` cc-mode fontification feels random Eli Zaretskii
2021-06-08 21:25                               ` Stefan Monnier
2021-06-09  3:39                         ` Richard Stallman
2021-06-09  8:34                         ` martin rudalics
2021-06-09 13:14                           ` `open-paren-in-column-0-is-defun-start` (was: cc-mode fontification feels random) Stefan Monnier
2021-06-09 15:15                             ` Yuri Khan
2021-06-09 15:16                               ` Yuri Khan
2021-06-12 17:29                           ` cc-mode fontification feels random João Távora
2021-06-13  8:50                             ` martin rudalics
2021-06-13  9:05                               ` João Távora
2021-06-13  9:39                                 ` martin rudalics
2021-06-13 10:06                                   ` João Távora
2021-06-13 14:52                                     ` martin rudalics
2021-06-13 15:25                                       ` João Távora
2021-06-14  8:29                                         ` martin rudalics
2021-06-14  8:40                                           ` João Távora
2021-06-14  9:00                                             ` martin rudalics
2021-06-14  9:14                                               ` João Távora
2021-06-14 11:28                                           ` Eli Zaretskii
2021-06-14 14:39                                           ` Stefan Monnier
2021-06-15 22:38                                             ` Ergus
2021-06-07 12:08                 ` Eli Zaretskii
2021-06-08 15:22                   ` Stefan Monnier
2021-06-08 15:46                     ` Eli Zaretskii
2021-06-05 20:25   ` Dmitry Gutov
2021-06-06 11:53     ` Alan Mackenzie
2021-06-06 17:08       ` Dmitry Gutov
2021-08-30 18:50 ` [PATCH] " Alan Mackenzie
2021-08-30 19:03   ` Perry E. Metzger
2021-08-30 19:18     ` Alan Mackenzie
2021-08-30 19:25   ` Eli Zaretskii
2021-08-30 19:28     ` Daniel Colascione
2021-08-30 19:37       ` Eli Zaretskii
2021-08-30 20:11       ` Stefan Monnier
2021-08-31 10:54         ` Alan Mackenzie
2021-08-31 13:23           ` Eli Zaretskii
2021-08-31 16:02             ` Alan Mackenzie
2021-08-31 16:21               ` Eli Zaretskii
2021-08-31 16:46                 ` Alan Mackenzie
2021-08-31 17:02                   ` Eli Zaretskii
2021-08-31 18:56           ` Stefan Monnier
2021-08-31 21:17             ` Alan Mackenzie
2021-08-31 21:47               ` Stefan Monnier
2021-10-22 20:13               ` [Committed PATCH] " Alan Mackenzie
2021-10-24 20:18                 ` Alan Mackenzie
2021-08-31 13:18         ` [PATCH] " Eli Zaretskii
2021-08-30 20:03     ` Alan Mackenzie
2021-08-31 11:53       ` Eli Zaretskii

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).