From: Eli Zaretskii <eliz@gnu.org>
To: "Tuấn-Anh Nguyễn" <ubolonton@gmail.com>
Cc: emacs-devel@gnu.org
Subject: Re: Reliable after-change-functions (via: Using incremental parsing in Emacs)
Date: Wed, 01 Apr 2020 22:33:05 +0300 [thread overview]
Message-ID: <831rp7ypam.fsf@gnu.org> (raw)
In-Reply-To: <CAOGL6j7X2PEbSV13rKM6j77trHJ-dij2ivssPqYNKniC0NgDVQ@mail.gmail.com> (message from Tuấn-Anh Nguyễn on Thu, 2 Apr 2020 00:55:45 +0700)
> From: Tuấn-Anh Nguyễn <ubolonton@gmail.com>
> Date: Thu, 2 Apr 2020 00:55:45 +0700
> Cc: emacs-devel@gnu.org
>
> > Did you consider using the API where an application can provide a
> > function to return text at a given offset? Such a function could be
> > relatively easily implemented for Emacs.
> >
>
> I don't understand what you mean. Below I'll explain how it works
> currently. [...] If dynamic modules have direct access to the
> buffer text, none of the above is an issue.
>
> Such direct access can be enabled by something like this:
>
> char* (*access_buffer_text) (emacs_env *env,
> emacs_value buffer,
> ptrdiff_t byte_offset,
> ptrdiff_t *size_inout);
>
> Of course, such an API would require extensive documentation on how it
> must be used, to ensure safety and correctness.
I think you are moving too fast, and keep the current implementation
in sight too much.
What I suggest is to step back and see how such direct access, if it
were available, could be used with tree-sitter. Let's forget about
modules for a moment and consider tree-sitter linked with Emacs and
capable of calling any C function in core. How would you use that?
Buffer text is not exactly UTF-8, it's a superset of UTF-8. So one
question to answer is what to do with byte sequences that are not
valid UTF-8. Any suggestions or ideas? How does tree-sitter handle
invalid byte sequences in general?
Also, direct access to buffer text generally means we must make sure
GC never runs as long as pointers to buffer text are lying around.
Can any Lisp run between calls to the reader function that the
tree-sitter parser calls to access the buffer text? If so, we need to
take care of that issue.
Next, I'm still asking whether parsing the whole buffer when it is
first created is necessary. Can we pass to the parser just a small
chunk (say, 500 bytes) of the buffer around the window-full to be
displayed next? If this presents problems, what are those problems?
IOW, the issue with exposing access to buffer text to modules is IMO
secondary. My suggestion is first to figure out how to do this stuff
efficiently from within Emacs itself, as if the module interface were
not part of the equation. We can add that aspect back later.
And yes, doing this by consing strings is not a good idea, it will
slow things down and cause a lot of GC. It is best avoided. Thus my
questions above.
> > Btw, what do you do with the tree returned by the tree-sitter parser?
> > store it in some buffer-local variable? If so, how much memory does
> > such a tree take, and when, if ever, is that memory released?
> >
>
> It's stored in a buffer-local variable. I haven't measured the memory
> they take. Memory is released when the tree object is garbage-collected
> (it's a `user-ptr').
So if I have many hundreds of buffers, I could have such a tree in
each one of them indefinitely? Perhaps that's one more design issue
to consider, given that the parsing is so fast. Similar to what we do
with image and face caches -- we flush them from time to time, to keep
the memory footprint in check. So a buffer that was not current more
than some time interval ago could have its tree GCed.
next prev parent reply other threads:[~2020-04-01 19:33 UTC|newest]
Thread overview: 142+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-03-31 17:07 Reliable after-change-functions (via: Using incremental parsing in Emacs) Tuấn Anh Nguyễn
2020-03-31 17:50 ` Eli Zaretskii
2020-04-01 6:17 ` Tuấn Anh Nguyễn
2020-04-01 13:26 ` Eli Zaretskii
2020-04-01 15:47 ` Jorge Javier Araya Navarro
2020-04-01 16:07 ` Eli Zaretskii
2020-04-01 17:55 ` Tuấn-Anh Nguyễn
2020-04-01 19:33 ` Eli Zaretskii [this message]
2020-04-01 23:38 ` Stephen Leake
2020-04-02 0:25 ` Stephen Leake
2020-04-02 2:46 ` Stefan Monnier
2020-04-02 4:36 ` Tuấn-Anh Nguyễn
2020-04-02 14:44 ` Eli Zaretskii
2020-04-02 15:19 ` Stefan Monnier
2020-04-03 2:49 ` [SPAM UNSURE] " Stephen Leake
2020-04-03 7:47 ` Eli Zaretskii
2020-04-03 18:11 ` Stephen Leake
2020-04-03 18:46 ` Eli Zaretskii
2020-04-04 0:05 ` Stephen Leake
2020-04-03 8:11 ` Robert Pluim
2020-04-03 11:00 ` Eli Zaretskii
2020-04-03 11:09 ` Robert Pluim
2020-04-03 12:44 ` Eli Zaretskii
2020-04-03 11:21 ` John Yates
2020-04-03 12:50 ` Eli Zaretskii
2020-04-02 5:21 ` Tuấn-Anh Nguyễn
2020-04-02 9:24 ` [SPAM UNSURE] " Stephen Leake
2020-04-02 14:36 ` Eli Zaretskii
2020-04-03 2:27 ` Stephen Leake
2020-04-03 7:43 ` Eli Zaretskii
2020-04-03 17:45 ` Stephen Leake
2020-04-03 18:31 ` Eli Zaretskii
2020-04-04 0:04 ` Stephen Leake
2020-04-04 7:13 ` Eli Zaretskii
2020-04-02 4:21 ` Tuấn-Anh Nguyễn
2020-04-02 5:19 ` Jorge Javier Araya Navarro
2020-04-02 9:29 ` Stephen Leake
2020-04-02 10:37 ` Andrea Corallo
2020-04-02 11:14 ` Tuấn-Anh Nguyễn
2020-04-02 13:02 ` Stefan Monnier
2020-04-02 15:06 ` Eli Zaretskii
2020-04-02 15:02 ` Eli Zaretskii
2020-04-03 14:34 ` Tuấn-Anh Nguyễn
-- strict thread matches above, loose matches on Subject: below --
2020-03-29 18:46 Using incremental parsing in Emacs (via: emacs rendering comparisson between emacs23 and emacs26.3) Stefan Monnier
2020-03-29 19:05 ` Andrea Corallo
2020-03-29 19:18 ` Eli Zaretskii
2020-03-29 19:29 ` Reliable after-change-functions (via: Using incremental parsing in Emacs) Yuan Fu
2020-03-30 14:04 ` Eli Zaretskii
2020-03-30 15:06 ` Stefan Monnier
2020-03-30 17:14 ` Yuan Fu
2020-03-30 17:54 ` Stefan Monnier
2020-03-30 18:43 ` Štěpán Němec
2020-03-30 18:46 ` Stefan Monnier
2020-03-30 19:02 ` Yuan Fu
2020-03-30 19:10 ` Eli Zaretskii
2020-03-30 19:21 ` Yuan Fu
2020-03-31 3:56 ` Štěpán Němec
2020-03-31 13:16 ` Eli Zaretskii
2020-03-31 13:36 ` Štěpán Němec
2020-03-31 14:34 ` Eli Zaretskii
2020-03-31 15:37 ` Štěpán Němec
2020-03-31 15:58 ` Eli Zaretskii
2020-03-31 16:18 ` Štěpán Němec
2020-03-31 17:38 ` Eli Zaretskii
2020-04-01 0:57 ` Stephen Leake
2020-03-30 19:42 ` Stefan Monnier
2020-03-30 19:27 ` Štěpán Němec
2020-03-31 2:24 ` Eli Zaretskii
2020-03-31 3:10 ` Stefan Monnier
2020-03-31 13:14 ` Eli Zaretskii
2020-03-31 14:31 ` Dmitry Gutov
2020-03-31 15:36 ` Eli Zaretskii
2020-03-31 15:45 ` Dmitry Gutov
2020-03-31 17:16 ` Stefan Monnier
2020-03-31 17:48 ` Eli Zaretskii
2020-03-31 19:35 ` Stefan Monnier
2020-04-01 2:23 ` Eli Zaretskii
2020-03-31 15:11 ` Stefan Monnier
2020-03-31 15:44 ` Eli Zaretskii
2020-03-31 17:10 ` Stefan Monnier
2020-03-31 17:19 ` Jorge Javier Araya Navarro
2020-03-31 17:46 ` Eli Zaretskii
2020-03-31 18:42 ` 조성빈
2020-03-31 19:29 ` Eli Zaretskii
2020-03-31 18:47 ` Dmitry Gutov
2020-03-31 18:48 ` Noam Postavsky
2020-03-31 19:02 ` Dmitry Gutov
2020-03-31 19:26 ` Eli Zaretskii
2020-03-31 19:50 ` Dmitry Gutov
2020-04-01 2:28 ` Eli Zaretskii
2020-04-01 3:49 ` Dmitry Gutov
2020-04-01 4:14 ` Eli Zaretskii
2020-04-01 13:47 ` Dmitry Gutov
2020-04-01 14:04 ` Eli Zaretskii
2020-04-01 14:55 ` Eli Zaretskii
2020-04-01 15:16 ` Dmitry Gutov
2020-04-01 15:59 ` Eli Zaretskii
2020-04-01 21:48 ` Dmitry Gutov
2020-04-01 22:29 ` Stefan Monnier
2020-04-02 14:23 ` Eli Zaretskii
2020-04-02 16:17 ` Dmitry Gutov
2020-04-02 18:25 ` Eli Zaretskii
2020-04-03 14:40 ` Tuấn-Anh Nguyễn
2020-04-03 16:10 ` Dmitry Gutov
2020-04-01 13:52 ` Alan Mackenzie
2020-04-01 14:10 ` Eli Zaretskii
2020-04-01 15:27 ` Dmitry Gutov
2020-04-01 15:44 ` Jorge Javier Araya Navarro
2020-04-01 16:03 ` Eli Zaretskii
2020-04-01 21:21 ` Dmitry Gutov
2020-04-02 14:09 ` Eli Zaretskii
2020-04-02 18:03 ` 조성빈 via "Emacs development discussions.
2020-04-02 18:27 ` Yuan Fu
2020-04-02 19:39 ` Stefan Monnier
2020-04-01 15:22 ` Dmitry Gutov
2020-04-04 11:06 ` Alan Mackenzie
2020-04-04 11:26 ` Eli Zaretskii
2020-04-04 14:14 ` Andrea Corallo
2020-04-04 14:41 ` Eli Zaretskii
2020-04-04 15:04 ` Andrea Corallo
2020-04-04 15:38 ` Richard Copley
2020-04-04 11:27 ` Eli Zaretskii
2020-04-04 12:01 ` Dmitry Gutov
2020-04-04 12:36 ` Alan Mackenzie
2020-04-04 12:40 ` Dmitry Gutov
2020-04-04 13:02 ` Eli Zaretskii
2020-04-04 16:09 ` Dmitry Gutov
2020-04-04 16:38 ` Eli Zaretskii
2020-04-04 16:45 ` Eli Zaretskii
2020-04-04 17:22 ` Richard Copley
2020-04-04 17:50 ` Eli Zaretskii
2020-04-04 18:29 ` Andrea Corallo
2020-04-04 18:56 ` Richard Copley
2020-04-04 20:36 ` Andrea Corallo
2020-04-04 17:36 ` Dmitry Gutov
2020-04-04 17:47 ` Eli Zaretskii
2020-04-04 18:02 ` Dmitry Gutov
2020-04-04 23:01 ` Stefan Monnier
2020-04-06 14:25 ` Yuan Fu
2020-04-06 19:55 ` Jorge Javier Araya Navarro
2020-04-04 17:29 ` Dmitry Gutov
2020-04-04 17:38 ` Eli Zaretskii
2020-04-04 17:57 ` Dmitry Gutov
2020-03-31 16:13 ` Alan Third
2020-03-31 17:55 ` Eli Zaretskii
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://www.gnu.org/software/emacs/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=831rp7ypam.fsf@gnu.org \
--to=eliz@gnu.org \
--cc=emacs-devel@gnu.org \
--cc=ubolonton@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).