From: Stefan Monnier <monnier@IRO.UMontreal.CA>
To: Alan Mackenzie <acm@muc.de>
Cc: cc-mode-help@lists.sourceforge.net, emacs-devel@gnu.org
Subject: Re: A possible way for CC Mode to resolve its sluggishness
Date: Sun, 28 Apr 2019 21:46:25 -0400
Message-ID: <jwvsgu1k3zo.fsf-monnier+emacs@gnu.org>
In-Reply-To: <20190427135725.GB4822@ACM> (Alan Mackenzie's message of "Sat, 27 Apr 2019 13:57:25 +0000")

>> Part of the problem is that CC-mode is very eager in its management of
>> syntax information: the `syntax-table` text-properties are always kept
>> up-to-date over the whole buffer right after every single change.
> That is not part of the problem.  That is part of the challenge.

Keeping all those properties always up-to-date, whether we use them or
not, is not in itself of any benefit to the end-user.

So it's a self-imposed challenge.  I think there are enough challenges
without having to add this one to the lot.

>> Modes using syntax-propertize work more lazily:
>> before-change-functions only marks that some change occurred at
>> position POS and the syntax-table properties after that position are
>> only updated afterward on-demand.
>
> Yes, but it is somewhat unclear whether, how, and when modes using
> syntax-propertize can update syntax-table properties on positions
> _before_ a change.

When syntax-propertize-function is called on BEG..END it's not expected
to touch anything outside of the BEG..END region (I think it's perfectly
safe to touch things after END, but it's definitely risky to touch
things before BEG).

When a change at buffer position POS requires updating syntax-table (or
other) text-properties at some earlier position, this can be indicated
to syntax-propertize via syntax-propertize-extend-region-functions.
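
To make this concrete, here is a minimal sketch, not taken from any real
mode.  The names quote-demo-propertize and quote-demo-mode are made up
for illustration; syntax-propertize-function,
syntax-propertize-extend-region-functions and
syntax-propertize-wholelines are the actual names in syntax.el.  It
gives punctuation syntax to an unclosed `"` on a line, which is a
property that has to be revised when the closing quote is later typed
further along the same line.  The extend-region function moves BEG back
to the beginning of the line so that the earlier opener is reconsidered:

```elisp
(defun quote-demo-propertize (start end)
  "Give punctuation syntax to the last `\"' of each line with an odd count.
Hypothetical example; assumes START and END fall on line boundaries,
which `syntax-propertize-wholelines' ensures."
  (goto-char start)
  (while (< (point) end)
    (let ((eol (line-end-position))
          (quotes '()))
      ;; Collect the positions of all double quotes on this line.
      (while (search-forward "\"" eol t)
        (push (1- (point)) quotes))
      ;; Odd count: treat the last quote as an unclosed opener, so the
      ;; "string" does not swallow the following lines.
      (when (= 1 (% (length quotes) 2))
        (put-text-property (car quotes) (1+ (car quotes))
                           'syntax-table (string-to-syntax "."))))
    (forward-line 1)))

(define-derived-mode quote-demo-mode prog-mode "QuoteDemo"
  "Hypothetical major mode used only to illustrate `syntax-propertize'."
  (setq-local syntax-propertize-function #'quote-demo-propertize)
  ;; This is already the default value; it is set here only to make the
  ;; dependency explicit.
  (setq-local syntax-propertize-extend-region-functions
              '(syntax-propertize-wholelines)))
```

Nothing here runs eagerly on each change: the properties are recomputed
only when something asks for syntax information at or after the
modified position.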

>> Maybe another part of the problem is that CC-mode tries to do more than
>> most other major modes: e.g. the highlighting of unclosed strings.
>> For plain single-line strings this can be fairly cheap, but for
>> multiline strings, keeping this information constantly up-to-date over
>> the whole buffer can be costly.
>
> CC Mode is successful in this regard.  The highlighting with
> warning-face of unclosed string openers is a useful feature which other
> modes could emulate.

I don't think "successful" is an appropriate description (e.g. I don't
know what a failure would be).  What I do know is that this
highlighting imposes significant additional work (e.g. because it more
often requires updating text-properties before the place where the
text was modified), and that it's an incomplete feature: it only does
that for strings.

Other major modes provide a more complete implementation of that feature
via flymake/lsp without imposing that extra work on
before/after-change-functions.

It also means that those modes don't try to give a definite answer to
questions of what to do in the face of invalid code
(e.g. non-terminated strings or comments, mismatched parens, ...)
and limit themselves to trying to avoid doing something "obviously"
wrong in those cases.

I know you think this is a bad design decision, but it's a design
decision that is largely unavoidable (e.g. C-mode has to do the same
when faced with some uses of CPP which render the "un-preprocessed"
code unparsable) and Emacs usually gets a lot of benefits from it in
terms of simplicity and performance.

> But that's a digression from the topic of this thread.

Indeed.

>> Not sure whether you intend this to be just a change to CC-mode (it does
>> sound like it can all be implemented in Elisp) or you intend for some
>> change at the C level.
> At the Lisp level.  I hadn't even considered any C enhancements.

Good.  Then I think it's worth a try.

>> But to tell you the truth, other than CC-mode, I'm having a hard time
>> imagining which other major mode will want to use such a thing.
>> Performance of syntax-propertize is not stellar but doesn't seem
>> problematic, and it is not too hard to use (its functioning is not
>> exactly the same as what a real lexer would do, but you can make use of
>> the language spec more or less straightforwardly), ....
> Again, can syntax-propertize work on positions _before_ a buffer change?

Yes.  There has not been much need for it, tho, so support for it is
fairly primitive.

E.g. most modes that have such needs don't use
syntax-propertize-extend-region-functions but rely on
jit-lock-multiline, which makes this kind of "update an earlier part of
the buffer" happen at a later time, and hence causes the text-properties
to be left invalid for a while.

Delaying updates this way is OK for font-lock highlighting, but is wrong
for syntax-propertize, so we should change those modes to use
syntax-propertize-extend-region-functions where relevant.
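
As a rough sketch of what such a conversion could look like for a
multi-line construct: assume a made-up language where `<<` ... `>>`
delimits a string that may span lines, and where an opener should only
count as a string delimiter once its closer exists.  All the blk-demo-*
names are hypothetical; only the syntax-propertize-* variables and
functions are real.  The extend-region function moves BEG back to an
opener that has no closer before BEG, so typing `>>` several lines below
gets the opener re-propertized in the same pass:

```elisp
(defun blk-demo-extend-region (start end)
  "Move START back to a `<<' that has no `>>' between it and START.
Hypothetical example.  Returns nil when no extension is needed."
  (save-excursion
    (goto-char start)
    (when (and (re-search-backward "<<\\|>>" nil t)
               (looking-at "<<"))
      (cons (point) end))))

(defun blk-demo-propertize (start end)
  "Mark each `<<' in START..END that has a matching `>>' as a string.
Both delimiters get generic-string-fence syntax; an unclosed `<<' is
left alone."
  (goto-char start)
  (let ((fence (string-to-syntax "|")))
    ;; The `<' test avoids an invalid search bound once a closer has
    ;; been found beyond END.
    (while (and (< (point) end)
                (search-forward "<<" end t))
      (let ((open (match-beginning 0)))
        ;; The closer may lie after END.
        (when (search-forward ">>" nil t)
          (put-text-property open (1+ open) 'syntax-table fence)
          (put-text-property (1- (point)) (point) 'syntax-table fence))))))

(define-derived-mode blk-demo-mode prog-mode "BlkDemo"
  "Hypothetical major mode with multi-line `<<'...`>>' strings."
  (setq-local syntax-propertize-function #'blk-demo-propertize)
  (setq-local syntax-propertize-extend-region-functions
              '(syntax-propertize-wholelines blk-demo-extend-region)))
```

Note that the earlier opener is only revisited once something calls
syntax-propertize for a position at or after the change, which is part
of why I call the current support primitive.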

>> Maybe if we want to speed things up, we should consider a new parsing
>> engine (instead of parse-partial-sexp and syntax-tables) based maybe on
>> a DFA for the tokenizer and GLR parser on top.  That might arguably be
>> more generally useful and easier to use (in the sense that one can more
>> or less follow the language spec when implementing the major mode).
> That would be a lot of design and a lot of work, and sounds like
> something from the distant rather than medium future.

Agreed.

> The indentation and font-lock routines would have to be rewritten for
> each mode using it.

It would probably be a good idea to make use of parsing info for those,
but not indispensable.


        Stefan



