From: Ihor Radchenko <yantar92@posteo.net>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: casouri@gmail.com, 70077@debbugs.gnu.org, qhong@alum.mit.edu,
frederic.bour@lakaban.net, joaotavora@gmail.com,
mail@nicolasgoaziou.fr, acm@muc.de, Eli Zaretskii <eliz@gnu.org>,
stephen_leake@stephe-leake.org, alan.zimm@gmail.com,
phillip.lord@russet.org.uk
Subject: bug#70077: An easier way to track buffer changes
Date: Tue, 02 Apr 2024 16:21:25 +0000 [thread overview]
Message-ID: <87ttkjspkq.fsf@localhost> (raw)
In-Reply-To: <jwv4jcjq16n.fsf-monnier+emacs@gnu.org>
Stefan Monnier <monnier@iro.umontreal.ca> writes:
>>> I'm not sure how to combine the benefits of combining small changes into
>>> larger ones with the benefits of keeping distant changes separate.
>> I am not sure if combining small changes into larger ones is at all a
>> good idea.
>
> In my experience, if the amount of work to be done "per change" is not
> completely trivial, combining small changes is indispensable (the worst
> part being that the cases where it's needed are sufficiently infrequent
> that naive users of `*-change-functions` won't know about that, so we
> end up with unacceptable slowdowns in corner cases).
I agree, but only when subsequent changes intersect each other.
When distant changes are combined together, they cover a lot of buffer
text that has not been changed - they may trigger a lot more work
compared to more granular changes. Consider, for example, two changes
near the beginning and the end of the buffer:
(1..10) and (1000000...(point-max))
If such changes are tracked by tree-sitter or other parser, there is a
world of difference between requesting to re-parse individual segments
and the whole buffer.
>> The changes are stored in strings, which get allocated and
>> re-allocated repeatedly.
>
> Indeed. Tho for N small changes, my code should perform only O(log N)
> re-allocations.
May you explain a bit more about this?
My reading of `track-changes--before' is that `concat' is called on
every change except when a new change is already inside the previously
recorded one. (aside: I have a hard time reading the code because of
confusing slot names: bbeg vs beg??)
>> Repeated string allocations, especially when strings keep growing
>> towards the buffer size, is likely going to increase consing and make
>> GCs more frequent.
>
> Similar allocations presumably take place anyway while running the code
> (e.g. for the `buffer-undo-list`), so I'm hoping the effect will be
> "lost in the noise".
I disagree. `buffer-undo-list' only includes the text that has been
actually changed. It never creates strings that span between
distant regions. As a terminal case, consider alternating between first
and second half of the buffer, starting in the middle and editing
towards the buffer boundaries - this will involve re-allocation of
buffer-long strings on average.
>>> We could expose a list of simultaneous (and thus disjoint) changes,
>>> which avoids the last problem. But it's a fair bit more work for us, it
>>> makes the API more complex for the clients, and it's rarely what the
>>> clients really want anyway.
>> FYI, Org parser maintains such a list.
>
> Could you point me to the relevant code (I failed to find it)?
That code handles a lot more than just changes, so it might not be the
best reference. Anyway...
See the docstring of `org-element--cache-sync-requests', and
the branch in `org-element--cache-submit-request' starting from
;; Current changes can be merged with first sync request: we
;; can save a partial cache synchronization.
>> We previously discussed a similar API in
>> https://yhetil.org/emacs-bugs/87o7iq1emo.fsf@localhost/
>
> IIUC this discusses a *sequence* of edits. In the point to which you
> replied I was discussing keeping a list of *simultaneous* edits.
Hmm. By referring to buffer-undo-list, I meant that intersecting edits
will be automatically merged, as it is usually done in undo system.
I am also not 100% sure why edits being simultaneous is any relevant.
Maybe we are talking about different things?
What I was talking about is (1) automatically merging subsequent edits
like 1..5 2..7 7..9 into 1..9. (2) if a subsequent edit does not
intersect with previous edited region, record it separately without
merging.
> This said, the downside in both cases is that the specific data that we
> need from such a list tends to depend on the user. E.g. you suggest
>
> (BEG END_BEFORE END_AFTER COUNTER)
>
> but that is not sufficient reconstruct the corresponding buffer state,
> so things like Eglot/CRDT can't use it. Ideally for CRDT I think you'd
> want a sequence of
>
> (BEG END-BEFORE STRING-AFTER)
Right. It's just that Org mode does not need STRING-AFTER, which is why
I did not think about it in my proposal. Of course, having STRING-AFTER
is required to get full reconstruction of the buffer state.
> but for Eglot this is not sufficient because Eglot needs to convert BEG
> and END_BEFORE into LSP positions (i.e. "line+col") and for that it
> needs to reproduce the past buffer state. So instead, what Eglot needs
> (and does indeed build using `*-change-functions`) is a sequence of
>
> (LSP-BEG LSP-END-BEFORE STRING-AFTER)
>
> [ Tho it seems it also needs a "LENGTH" of the deleted chunk, not sure
> exactly why, but I guess it's a piece of redundant info the servers
> can use to sanity-check the data? ]
It is to support deprecated LSP spec (I presume that older LSP servers
may still be using the old spec):
https://microsoft.github.io/language-server-protocol/specifications/specification-3-15/
export type TextDocumentContentChangeEvent = {
* The range of the document that changed.
range: Range;
* The optional length of the range that got replaced.
* @deprecated use range instead.
rangeLength?: number;
* The new text for the provided range.
text: string;
> The code changes were actually quite simple, so I have now included it.
> See my current PoC code below.
Am I missing something, or will `track-changes--signal-if-disjoint'
alter `track-changes--state' when it calls `track-changes-fetch' ->
`track-changes-reset'? Won't that interfere with the information passed
to non-disjoint trackers?
--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
next prev parent reply other threads:[~2024-04-02 16:21 UTC|newest]
Thread overview: 50+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-03-29 16:15 bug#70077: An easier way to track buffer changes Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-03-29 18:12 ` Eli Zaretskii
2024-03-29 18:53 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-03-30 6:34 ` Eli Zaretskii
2024-03-30 14:58 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-03-30 16:45 ` Eli Zaretskii
2024-03-31 2:57 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-01 11:53 ` Ihor Radchenko
2024-04-01 14:51 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-01 17:49 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-02 14:22 ` Ihor Radchenko
2024-04-02 15:17 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-02 16:21 ` Ihor Radchenko [this message]
2024-04-02 17:51 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-03 12:34 ` Ihor Radchenko
2024-04-03 12:45 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-04 17:58 ` Ihor Radchenko
2024-03-30 3:17 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-03-30 5:09 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-03-29 22:20 ` phillip.lord
2024-03-29 22:59 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-03-30 6:46 ` Eli Zaretskii
2024-03-30 12:06 ` phillip.lord
2024-03-30 13:39 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-03-30 9:51 ` Ihor Radchenko
2024-03-30 12:49 ` Eli Zaretskii
2024-03-30 13:19 ` Ihor Radchenko
2024-03-30 13:31 ` Eli Zaretskii
2024-03-30 14:09 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-05 22:12 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-06 8:43 ` Eli Zaretskii
2024-04-08 15:24 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-08 15:53 ` Eli Zaretskii
2024-04-08 17:17 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-08 17:27 ` Andrea Corallo
2024-04-08 18:36 ` Eli Zaretskii
2024-04-08 20:57 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-09 4:10 ` Eli Zaretskii
2024-04-08 20:45 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-09 3:56 ` Eli Zaretskii
2024-04-09 23:30 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-13 13:44 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-06 17:37 ` Dmitry Gutov
2024-04-06 19:44 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-07 14:40 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-07 15:47 ` Dmitry Gutov
2024-04-07 14:07 ` Ihor Radchenko
2024-04-08 16:06 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-09 17:35 ` Ihor Radchenko
2024-04-10 2:02 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87ttkjspkq.fsf@localhost \
--to=yantar92@posteo.net \
--cc=70077@debbugs.gnu.org \
--cc=acm@muc.de \
--cc=alan.zimm@gmail.com \
--cc=casouri@gmail.com \
--cc=eliz@gnu.org \
--cc=frederic.bour@lakaban.net \
--cc=joaotavora@gmail.com \
--cc=mail@nicolasgoaziou.fr \
--cc=monnier@iro.umontreal.ca \
--cc=phillip.lord@russet.org.uk \
--cc=qhong@alum.mit.edu \
--cc=stephen_leake@stephe-leake.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this external index
https://git.savannah.gnu.org/cgit/emacs.git
https://git.savannah.gnu.org/cgit/emacs/org-mode.git
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.