From: Stephen Leake <stephen_leake@stephe-leake.org>
To: emacs-devel <emacs-devel@gnu.org>
Subject: Re: Reliable after-change-functions (via: Using incremental parsing in Emacs)
Date: Fri, 03 Apr 2020 09:45:44 -0800
Message-ID: <868sjcfoon.fsf@stephe-leake.org>
In-Reply-To: <83zhbtvwsm.fsf@gnu.org> (Eli Zaretskii's message of "Fri, 03 Apr 2020 10:43:53 +0300")
Eli Zaretskii <eliz@gnu.org> writes:
>> From: Stephen Leake <stephen_leake@stephe-leake.org>
>> Date: Thu, 02 Apr 2020 18:27:59 -0800
>>
>> > Such copying is not really scalable, and IMO should be avoided.
>> > During active editing, redisplay runs very frequently, and having to
>> > copy portions of the buffer, let alone all of it, each time, which
>> > necessarily requires memory allocation, consing of Lisp objects, etc.,
>> > will produce significant memory pressure, expensive heap
>> > allocations/deallocations, and a lot of GC. Recall that on many
>> > modern platforms Emacs doesn't really return memory to the system,
>> > which means we risk increasing the memory footprint, and create
>> > system-wide memory pressure. It isn't a catastrophe, but we should
>> > try to avoid it if possible.
>>
>> Ok. I know very little about the internal storage of text in Emacs.
>> There are at least two strings with a gap at the current edit point; if
>> we pass a simple pointer to tree-sitter, it will have to handle the gap.
>
> Tree-sitter allows the application to define a "reader" function that
> it will then call to access buffer text. That function should cope
> with the gap.
and also with the encoding, which you did not address. I don't see how
that is different from the C level buffer-substring. Certainly there
should be a module function buffer-substring that is as efficient as possible.
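
To make the gap handling concrete, here is a minimal sketch in C of what
such a reader could look like. It is not Emacs or tree-sitter code:
`struct gap_buffer`, `gap_buffer_text_len` and `gap_buffer_read` are
hypothetical names, and the buffer model (one allocation with a single
gap) is an assumption for illustration. The reader returns the longest
contiguous chunk starting at a logical byte offset, so the caller never
sees the gap. It does nothing about the encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of buffer text: one allocation with a gap in it.
   Text bytes live in data[0 .. gap_start) and data[gap_end .. size). */
struct gap_buffer {
  const char *data;
  size_t gap_start;
  size_t gap_end;
  size_t size;
};

/* Number of text bytes, not counting the gap. */
size_t gap_buffer_text_len(const struct gap_buffer *b)
{
  return b->size - (b->gap_end - b->gap_start);
}

/* Return a pointer to the longest contiguous run of text that starts at
   logical offset BYTE_INDEX, and store its length in *BYTES_READ.
   At or past the end of the text, store 0 and return "". */
const char *gap_buffer_read(const struct gap_buffer *b, size_t byte_index,
                            uint32_t *bytes_read)
{
  assert(b->gap_start <= b->gap_end && b->gap_end <= b->size);

  if (byte_index >= gap_buffer_text_len(b)) {
    *bytes_read = 0;
    return "";
  }
  if (byte_index < b->gap_start) {
    /* Before the gap: the chunk ends where the gap begins. */
    *bytes_read = (uint32_t)(b->gap_start - byte_index);
    return b->data + byte_index;
  }
  /* After the gap: skip over it to find the physical address. */
  size_t physical = byte_index + (b->gap_end - b->gap_start);
  *bytes_read = (uint32_t)(b->size - physical);
  return b->data + physical;
}
```

A parser that pulls text through a function like this asks again with a
larger offset once it has consumed a chunk, so nothing is copied.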
>> You mention "consing of Lisp objects" above, which says to me that the
>> text is stored in a more complex structure.
>
> I meant the consing that is necessary to make a buffer-substring that
> will be passed to the parser.
Since we are calling the parser from C (if it is linked into Emacs, or
in a module), I still don't understand. Does C code have to cons to
create a string? It will have to allocate if the requested range is not
contiguous in the buffer.
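
A sketch of what I mean, under the same assumption of a single-gap
buffer. `struct gap_buffer` and `gap_buffer_range` are hypothetical
names, not Emacs internals. A range that lies entirely on one side of
the gap is returned as a pointer into the buffer; only a range that
spans the gap is copied into freshly allocated memory.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model: text is data[0 .. gap_start) and
   data[gap_end .. size). */
struct gap_buffer {
  const char *data;
  size_t gap_start, gap_end, size;
};

/* Return a pointer to LEN text bytes starting at logical offset START.
   If the range is contiguous, the pointer refers to the buffer itself
   and *OWNED is NULL.  If the range spans the gap, the bytes are copied
   into malloc'd memory, *OWNED is set to that copy, and the caller must
   free it.  Returns NULL for an out-of-range request or on allocation
   failure. */
const char *gap_buffer_range(const struct gap_buffer *b, size_t start,
                             size_t len, char **owned)
{
  assert(b->gap_start <= b->gap_end && b->gap_end <= b->size);
  size_t gap_len = b->gap_end - b->gap_start;
  size_t text_len = b->size - gap_len;

  *owned = NULL;
  if (start > text_len || len > text_len - start)
    return NULL;
  if (start + len <= b->gap_start)      /* entirely before the gap */
    return b->data + start;
  if (start >= b->gap_start)            /* entirely after the gap */
    return b->data + start + gap_len;

  /* The range spans the gap: copy the two pieces into one string. */
  size_t before = b->gap_start - start;
  char *copy = malloc(len);
  if (copy == NULL)
    return NULL;
  memcpy(copy, b->data + start, before);
  memcpy(copy + before, b->data + b->gap_end, len - before);
  *owned = copy;
  return copy;
}
```

The allocation here is a plain malloc, not a Lisp object, which is why
the consing does not look necessary to me.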
>> Avoiding _all_ copying is impossible; the parser must store the contents of
>> each token in some way. Typically that is done by storing
>> pointers/indices into the text buffer that contains the entire text.
>
> I don't think tree-sitter does that, because the text it gets is
> ephemeral. If we pass it a buffer-substring, it's a temporary string
> which will be GCed after it's used; if we pass it pointers to buffer
> text, those pointers can be invalid after GC, because GC can relocate
> buffer text to a different memory region.
Hmm.
https://tree-sitter.github.io/tree-sitter/using-parsers#providing-the-code
says:

    Syntax nodes store their position in the source code both in terms
    of raw bytes and row/column coordinates

In the case of passing a pointer to a string (or buffer, etc), those
positions are relative to that original buffer. So the Emacs buffer is
serving as the parse buffer. Ok, that avoids any copying.
If we pass a buffer-substring to the parser, we are then responsible for
mapping positions relative to the substring into positions relative to
the full buffer. wisi delegates that to the parser; it can pass
start-char-pos and start-byte-pos to the parser along with a string.
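
The mapping itself is just an offset. A minimal sketch, with
hypothetical names (`struct text_pos`, `to_buffer_pos`) that are not
taken from wisi or tree-sitter, and assuming the parser reports 0-based
offsets relative to the start of the substring:

```c
#include <assert.h>
#include <stddef.h>

/* A position expressed both in bytes and in characters. */
struct text_pos {
  size_t byte_pos;
  size_t char_pos;
};

/* Translate REL, an offset relative to the start of a substring, into a
   position in the full buffer.  SUBSTRING_START is where the substring
   begins in the full buffer (the start-byte-pos and start-char-pos
   handed to the parser along with the string). */
struct text_pos to_buffer_pos(struct text_pos substring_start,
                              struct text_pos rel)
{
  /* Every character occupies at least one byte. */
  assert(rel.char_pos <= rel.byte_pos);

  struct text_pos result = {
    substring_start.byte_pos + rel.byte_pos,
    substring_start.char_pos + rel.char_pos
  };
  return result;
}
```

Byte and character positions have to be carried separately because
they diverge as soon as the text contains a multibyte character.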
>> >> In sum, the short answer is "yes, you must parse the whole file, unless
>> >> your language is particularly simple".
>> >
>> > Funny, my conclusion from reading your detailed description was
>> > entirely different.
>>
>> I need more than that to respond in a helpful way.
>
> Well, you said:
>
>> To some extent, that depends on the language.
>
> and then went on to describing how each language might _not_ need a
> full parse in many cases. Thus the conclusion sounded a bit radical
> to me.
Ok, we are putting different spins on what "particularly simple" means.
A more neutral phrasing would be:

    Some languages require parsing the whole file, some do not.

--
-- Stephe