From: Stefan Monnier <monnier@iro.umontreal.ca>
To: cc-mode-help@lists.sourceforge.net
Cc: emacs-devel@gnu.org
Subject: Re: A possible way for CC Mode to resolve its sluggishness
Date: Fri, 26 Apr 2019 22:10:23 -0400
Message-ID: <jwv4l6kkzf6.fsf-monnier+emacs@gnu.org>
In-Reply-To: 20190426193056.GC4720@ACM

> The problem is that CC Mode's before/after-change-functions are very
> general, and scan the buffer looking for situations which only arise
> sporadically.  Things like an open string getting closed, or a > being
> inserted which needs to be checked for a template delimiter.  However,
> these expensive checks are performed for _every_ buffer change.  Even
> doing something like inserting a letter or a digit causes the full range
> of tests to be performed.  This is not good.

Part of the problem is that CC-mode is very eager in its management of
syntax information: the `syntax-table` text-properties are always kept
up-to-date over the whole buffer right after every single change.

Modes using syntax-propertize work more lazily: before-change-functions
only record that some change occurred at position POS, and the syntax-table
properties after that position are only updated afterward, on demand.

CC-mode tries to make up for it by being more clever about which parts of
the buffer after position POS actually need to be updated, but when
there are several consecutive changes, the extra work performed between
each one of those changes adds up quickly.

[ Of course, there are cases where the approach used in
  syntax-propertize loses big time.  E.g. if you have a loop that first
  modifies a char near point-min, then asks for the syntax-table
  properties near point-max, and then repeats... performance will suck.
  But luckily I haven't yet seen a real-world use case where
  this occurs.  ]
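
To make the trade-off concrete, here is a toy cost model of the lazy
scheme, written in Python only so that it is easy to run.  It is a
sketch under assumptions, not the actual syntax-propertize code: the
class `LazyProps`, its methods and the buffer size are all made up for
illustration, and "cost" just counts scanned characters.

```python
class LazyProps:
    """Toy model of on-demand property updates (all names hypothetical)."""

    def __init__(self, size):
        self.size = size    # buffer length in characters
        self.done = 0       # properties are valid for positions < done
        self.scanned = 0    # total number of characters scanned so far

    def before_change(self, pos):
        # Cheap: only remember the earliest position that became stale.
        self.done = min(self.done, pos)

    def ensure(self, end):
        # On demand: rescan from the stale position up to END, if needed.
        if end > self.done:
            self.scanned += end - self.done
            self.done = end


def burst_cost(size=100_000):
    """Ten consecutive edits, then one request just past the edits."""
    props = LazyProps(size)
    props.ensure(size)              # initial full scan
    for pos in range(500, 510):
        props.before_change(pos)    # no scanning happens here
    props.ensure(600)
    return props.scanned


def loop_cost(size=100_000):
    """Edit near point-min, then ask near point-max, ten times over."""
    props = LazyProps(size)
    props.ensure(size)              # initial full scan
    for _ in range(10):
        props.before_change(1)
        props.ensure(size)          # nearly a full rescan every time
    return props.scanned


print(burst_cost())  # 100100: the ten edits cost 100 characters in total
print(loop_cost())   # 1099990: each iteration rescans 99999 characters
```

In this model the burst of edits is almost free because nothing is
rescanned until something asks for the properties, while the
alternating loop pays for close to a full rescan on every iteration.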

Maybe another part of the problem is that CC-mode tries to do more than
most other major modes: e.g. the highlighting of unclosed strings.
For plain single-line strings this can be fairly cheap, but for
multiline strings, keeping this information constantly up-to-date over
the whole buffer can be costly.

Most other major modes just let the font-lock-string-face bleed further
than the user intended, which requires much less work and works well
enough for all other syntactic elements (CC-mode doesn't highlight
unclosed parens, or mismatched parens, or `do` with missing `while`,
...).  When needed these many different kinds of errors are detected and
shown to the user via things like flymake or LSP instead, which work
much more lazily w.r.t. buffer changes, so they don't need the same kind
of engineering effort to make them fast enough.

> Thoughts?

Not sure whether you intend this to be just a change to CC-mode (it does
sound like it can all be implemented in Elisp) or you intend for some
change at the C level.  My gut feeling is that the checks you suggest in
(iii) could be implemented in Elisp without losing too much performance
(they should spend most of their time within a few C primitives), tho it
depends on the specifics of the cases you'll want to catch.  Also if you
want to implement it in C those same specifics will need to be spelled
out to figure out how a major mode will communicate them to the C code
(for this to be useful beyond CC-mode, it would need to be very general,
so it could be tricky to design).

But to tell you the truth, other than CC-mode, I'm having a hard time
imagining which other major mode will want to use such a thing.
Performance of syntax-propertize is not stellar but doesn't seem
problematic, and it is not too hard to use (its functioning is not
exactly the same as what a real lexer would do, but you can make use of
the language spec more or less straightforwardly), whereas I get the
impression that your suggestion relies on properties of the language
which are not often used, so are less familiar to the average
mode implementor (and a language spec is unlikely to help you figure out
what to do).

Maybe if we want to speed things up, we should consider a new parsing
engine (instead of parse-partial-sexp and syntax-tables) based maybe on
a DFA for the tokenizer and a GLR parser on top.  That might arguably be
more generally useful and easier to use (in the sense that one can more
or less follow the language spec when implementing the major mode).
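
As a rough illustration of the tokenizer half of that idea, here is a
minimal hand-written DFA in Python.  It is only a sketch under heavy
assumptions: the state names and the functions `step` and `scan` are
invented for this example, and it only knows about double-quoted
strings with backslash escapes and `//` line comments (no block
comments, no character literals, no GLR layer).

```python
CODE, SLASH, STRING, ESCAPE, COMMENT = (
    "code", "slash", "string", "escape", "comment")


def step(state, ch):
    """Return the next DFA state after reading character CH in STATE."""
    if state in (CODE, SLASH):
        if ch == '"':
            return STRING
        if ch == "/":
            # A second consecutive slash starts a line comment.
            return COMMENT if state == SLASH else SLASH
        return CODE
    if state == STRING:
        if ch == "\\":
            return ESCAPE
        return CODE if ch == '"' else STRING
    if state == ESCAPE:
        # Whatever follows a backslash stays inside the string.
        return STRING
    # state == COMMENT: a line comment ends at the newline.
    return CODE if ch == "\n" else COMMENT


def scan(text, state=CODE):
    """Run the DFA over TEXT from STATE and return the final state."""
    for ch in text:
        state = step(state, ch)
    return state


print(scan('x = "abc'))       # string (the string is still open)
print(scan('x = "a\\"b"'))    # code (the escaped quote does not close it)
print(scan('// "hi'))         # comment
```

Since the next state depends only on the current state and the next
character, a scan can be resumed from any position whose state was
saved: `scan('abc"', scan('x = "'))` gives the same result as
`scan('x = "abc"')`.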


        Stefan



