From: Stefan Monnier <monnier@iro.umontreal.ca>
To: Eli Zaretskii <eliz@gnu.org>
Cc: casouri@gmail.com, akrl@sdf.org, emacs-devel@gnu.org
Subject: Re: Reliable after-change-functions (via: Using incremental parsing in Emacs)
Date: Tue, 31 Mar 2020 11:11:22 -0400 [thread overview]
Message-ID: <jwvpncsk1x9.fsf-monnier+emacs@gnu.org> (raw)
In-Reply-To: <83369o3bvb.fsf@gnu.org> (Eli Zaretskii's message of "Tue, 31 Mar 2020 16:14:16 +0300")
>> IIUC, tree-sitter starts by parsing the whole buffer anyway, and then
>> keeps the parse tree up-to-date in response to buffer changes.
> Why does it need the entire buffer up front?
Because as a general rule you cannot parse a region without looking at
all the preceding text. That's why when we fontify START..BEG we need
to begin by computing the `syntax-ppss` at START, which involved passing
the whole text from `point-min` to START though `parse-partial-sexp`.
> that sounds like a potential performance killer.
Indeed. And so does this `syntax-ppss` call we have.
It's OK as long as the parsing is fast enough and you don't use it in
too large buffers.
E.g. I expect that most programming major modes currently exhibit
significant delays when you jump to the end of multi-GB buffer because
of that `syntax-ppss` call.
> Fontifying a small part of a buffer doesn't need its entire text.
Sadly, it does. In specific cases you may be able to speed things up,
but that's only applicable to some cases.
I'm sure there could be other approaches that focus on trying to parse as
little of the buffer text as possible (e.g. SMIE follows this kind of
idea), but it's difficult to make them work with a "normal" grammar,
providing a full parse tree and giving a reliable result (and without
it degenerating to parsing the whole buffer anyway in most cases).
> In any case, I hope that passing the buffer to tree-sitter doesn't
> involve marshalling the entire buffer text via a function call as a
> huge string, or some such.
These are internal implementation details that can be tweaked later on.
I do expect that the code currently needs to call `buffer-string` or its
moral equivalent. But if the resources this requires are significant
enough to worry about, then it's a great news: it means the parsing
itself is very fast.
> We should instead request that tree-sitter exposes an API through
> which we could give it direct access to buffer text as 2 parts, before
> and after the gap, like we do with regex code. Otherwise this will be
> a bottleneck in the long run, not unlike the problem we have with LSP.
I'm not sure exactly which problem with LSP you're thinking about, but
I doubt `buffer-string` is a significant component of a performance
problem with LSP: the time to pass that string to the server via a pipe
should dwarf it.
> I still don't see why it would need the entire buffer for this class
> of applications. Did anyone try the alternatives, in particular on
> very large buffers?
What alternatives?
How large is "very large" here?
Stefan
next prev parent reply other threads:[~2020-03-31 15:11 UTC|newest]
Thread overview: 139+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-03-29 18:46 Using incremental parsing in Emacs (via: emacs rendering comparisson between emacs23 and emacs26.3) Stefan Monnier
2020-03-29 19:05 ` Andrea Corallo
2020-03-29 19:18 ` Eli Zaretskii
2020-03-29 19:29 ` Reliable after-change-functions (via: Using incremental parsing in Emacs) Yuan Fu
2020-03-30 14:04 ` Eli Zaretskii
2020-03-30 15:06 ` Stefan Monnier
2020-03-30 17:14 ` Yuan Fu
2020-03-30 17:54 ` Stefan Monnier
2020-03-30 18:43 ` Štěpán Němec
2020-03-30 18:46 ` Stefan Monnier
2020-03-30 19:02 ` Yuan Fu
2020-03-30 19:10 ` Eli Zaretskii
2020-03-30 19:21 ` Yuan Fu
2020-03-31 3:56 ` Štěpán Němec
2020-03-31 13:16 ` Eli Zaretskii
2020-03-31 13:36 ` Štěpán Němec
2020-03-31 14:34 ` Eli Zaretskii
2020-03-31 15:37 ` Štěpán Němec
2020-03-31 15:58 ` Eli Zaretskii
2020-03-31 16:18 ` Štěpán Němec
2020-03-31 17:38 ` Eli Zaretskii
2020-04-01 0:57 ` Stephen Leake
2020-03-30 19:42 ` Stefan Monnier
2020-03-30 19:27 ` Štěpán Němec
2020-03-31 2:24 ` Eli Zaretskii
2020-03-31 3:10 ` Stefan Monnier
2020-03-31 13:14 ` Eli Zaretskii
2020-03-31 14:31 ` Dmitry Gutov
2020-03-31 15:36 ` Eli Zaretskii
2020-03-31 15:45 ` Dmitry Gutov
2020-03-31 17:16 ` Stefan Monnier
2020-03-31 17:48 ` Eli Zaretskii
2020-03-31 19:35 ` Stefan Monnier
2020-04-01 2:23 ` Eli Zaretskii
2020-03-31 15:11 ` Stefan Monnier [this message]
2020-03-31 15:44 ` Eli Zaretskii
2020-03-31 17:10 ` Stefan Monnier
2020-03-31 17:19 ` Jorge Javier Araya Navarro
2020-03-31 17:46 ` Eli Zaretskii
2020-03-31 18:42 ` 조성빈
2020-03-31 19:29 ` Eli Zaretskii
2020-03-31 18:47 ` Dmitry Gutov
2020-03-31 18:48 ` Noam Postavsky
2020-03-31 19:02 ` Dmitry Gutov
2020-03-31 19:26 ` Eli Zaretskii
2020-03-31 19:50 ` Dmitry Gutov
2020-04-01 2:28 ` Eli Zaretskii
2020-04-01 3:49 ` Dmitry Gutov
2020-04-01 4:14 ` Eli Zaretskii
2020-04-01 13:47 ` Dmitry Gutov
2020-04-01 14:04 ` Eli Zaretskii
2020-04-01 14:55 ` Eli Zaretskii
2020-04-01 15:16 ` Dmitry Gutov
2020-04-01 15:59 ` Eli Zaretskii
2020-04-01 21:48 ` Dmitry Gutov
2020-04-01 22:29 ` Stefan Monnier
2020-04-02 14:23 ` Eli Zaretskii
2020-04-02 16:17 ` Dmitry Gutov
2020-04-02 18:25 ` Eli Zaretskii
2020-04-03 14:40 ` Tuấn-Anh Nguyễn
2020-04-03 16:10 ` Dmitry Gutov
2020-04-01 13:52 ` Alan Mackenzie
2020-04-01 14:10 ` Eli Zaretskii
2020-04-01 15:27 ` Dmitry Gutov
2020-04-01 15:44 ` Jorge Javier Araya Navarro
2020-04-01 16:03 ` Eli Zaretskii
2020-04-01 21:21 ` Dmitry Gutov
2020-04-02 14:09 ` Eli Zaretskii
2020-04-02 18:03 ` 조성빈 via "Emacs development discussions.
2020-04-02 18:27 ` Yuan Fu
2020-04-02 19:39 ` Stefan Monnier
2020-04-01 15:22 ` Dmitry Gutov
2020-04-04 11:06 ` Alan Mackenzie
2020-04-04 11:26 ` Eli Zaretskii
2020-04-04 14:14 ` Andrea Corallo
2020-04-04 14:41 ` Eli Zaretskii
2020-04-04 15:04 ` Andrea Corallo
2020-04-04 15:38 ` Richard Copley
2020-04-04 11:27 ` Eli Zaretskii
2020-04-04 12:01 ` Dmitry Gutov
2020-04-04 12:36 ` Alan Mackenzie
2020-04-04 12:40 ` Dmitry Gutov
2020-04-04 13:02 ` Eli Zaretskii
2020-04-04 16:09 ` Dmitry Gutov
2020-04-04 16:38 ` Eli Zaretskii
2020-04-04 16:45 ` Eli Zaretskii
2020-04-04 17:22 ` Richard Copley
2020-04-04 17:50 ` Eli Zaretskii
2020-04-04 18:29 ` Andrea Corallo
2020-04-04 18:56 ` Richard Copley
2020-04-04 20:36 ` Andrea Corallo
2020-04-04 17:36 ` Dmitry Gutov
2020-04-04 17:47 ` Eli Zaretskii
2020-04-04 18:02 ` Dmitry Gutov
2020-04-04 23:01 ` Stefan Monnier
2020-04-06 14:25 ` Yuan Fu
2020-04-06 19:55 ` Jorge Javier Araya Navarro
2020-04-04 17:29 ` Dmitry Gutov
2020-04-04 17:38 ` Eli Zaretskii
2020-04-04 17:57 ` Dmitry Gutov
2020-03-31 16:13 ` Alan Third
2020-03-31 17:55 ` Eli Zaretskii
2020-03-30 3:35 ` Using incremental parsing in Emacs (via: emacs rendering comparisson between emacs23 and emacs26.3) Stefan Monnier
2020-03-30 6:02 ` Eli Zaretskii
2020-03-30 13:33 ` Stefan Monnier
2020-03-30 14:09 ` Eli Zaretskii
2020-03-30 15:03 ` Stefan Monnier
2020-04-01 0:39 ` Stephen Leake
-- strict thread matches above, loose matches on Subject: below --
2020-03-31 17:07 Reliable after-change-functions (via: Using incremental parsing in Emacs) Tuấn Anh Nguyễn
2020-03-31 17:50 ` Eli Zaretskii
2020-04-01 6:17 ` Tuấn Anh Nguyễn
2020-04-01 13:26 ` Eli Zaretskii
2020-04-01 15:47 ` Jorge Javier Araya Navarro
2020-04-01 16:07 ` Eli Zaretskii
2020-04-01 17:55 ` Tuấn-Anh Nguyễn
2020-04-01 19:33 ` Eli Zaretskii
2020-04-01 23:38 ` Stephen Leake
2020-04-02 0:25 ` Stephen Leake
2020-04-02 2:46 ` Stefan Monnier
2020-04-02 4:36 ` Tuấn-Anh Nguyễn
2020-04-02 14:44 ` Eli Zaretskii
2020-04-02 15:19 ` Stefan Monnier
2020-04-02 5:21 ` Tuấn-Anh Nguyễn
2020-04-02 14:36 ` Eli Zaretskii
2020-04-03 2:27 ` Stephen Leake
2020-04-03 7:43 ` Eli Zaretskii
2020-04-03 17:45 ` Stephen Leake
2020-04-03 18:31 ` Eli Zaretskii
2020-04-04 0:04 ` Stephen Leake
2020-04-04 7:13 ` Eli Zaretskii
2020-04-02 4:21 ` Tuấn-Anh Nguyễn
2020-04-02 5:19 ` Jorge Javier Araya Navarro
2020-04-02 9:29 ` Stephen Leake
2020-04-02 10:37 ` Andrea Corallo
2020-04-02 11:14 ` Tuấn-Anh Nguyễn
2020-04-02 13:02 ` Stefan Monnier
2020-04-02 15:06 ` Eli Zaretskii
2020-04-02 15:02 ` Eli Zaretskii
2020-04-03 14:34 ` Tuấn-Anh Nguyễn
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://www.gnu.org/software/emacs/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=jwvpncsk1x9.fsf-monnier+emacs@gnu.org \
--to=monnier@iro.umontreal.ca \
--cc=akrl@sdf.org \
--cc=casouri@gmail.com \
--cc=eliz@gnu.org \
--cc=emacs-devel@gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).