all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: Ihor Radchenko <yantar92@posteo.net>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: casouri@gmail.com, 70077@debbugs.gnu.org, qhong@alum.mit.edu,
	frederic.bour@lakaban.net, joaotavora@gmail.com,
	mail@nicolasgoaziou.fr, acm@muc.de, Eli Zaretskii <eliz@gnu.org>,
	stephen_leake@stephe-leake.org, alan.zimm@gmail.com,
	phillip.lord@russet.org.uk
Subject: bug#70077: An easier way to track buffer changes
Date: Tue, 02 Apr 2024 16:21:25 +0000	[thread overview]
Message-ID: <87ttkjspkq.fsf@localhost> (raw)
In-Reply-To: <jwv4jcjq16n.fsf-monnier+emacs@gnu.org>

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>>> I'm not sure how to combine the benefits of combining small changes into
>>> larger ones with the benefits of keeping distant changes separate.
>> I am not sure if combining small changes into larger ones is at all a
>> good idea.
>
> In my experience, if the amount of work to be done "per change" is not
> completely trivial, combining small changes is indispensable (the worst
> part being that the cases where it's needed are sufficiently infrequent
> that naive users of `*-change-functions` won't know about that, so we
> end up with unacceptable slowdowns in corner cases).

I agree, but only when subsequent changes intersect each other.
When distant changes are combined together, they cover a lot of buffer
text that has not been changed - they may trigger a lot more work
compared to more granular changes. Consider, for example, two changes
near the beginning and the end of the buffer:
(1..10) and (1000000...(point-max))
If such changes are tracked by tree-sitter or other parser, there is a
world of difference between requesting to re-parse individual segments
and the whole buffer.

>> The changes are stored in strings, which get allocated and
>> re-allocated repeatedly.
>
> Indeed.  Tho for N small changes, my code should perform only O(log N)
> re-allocations.

May you explain a bit more about this?
My reading of `track-changes--before' is that `concat' is called on
every change except when a new change is already inside the previously
recorded one. (aside: I have a hard time reading the code because of
confusing slot names: bbeg vs beg??)

>> Repeated string allocations, especially when strings keep growing
>> towards the buffer size, is likely going to increase consing and make
>> GCs more frequent.
>
> Similar allocations presumably take place anyway while running the code
> (e.g. for the `buffer-undo-list`), so I'm hoping the effect will be
> "lost in the noise".

I disagree. `buffer-undo-list' only includes the text that has been
actually changed. It never creates strings that span between
distant regions. As a terminal case, consider alternating between first
and second half of the buffer, starting in the middle and editing
towards the buffer boundaries - this will involve re-allocation of
buffer-long strings on average.

>>> We could expose a list of simultaneous (and thus disjoint) changes,
>>> which avoids the last problem.  But it's a fair bit more work for us, it
>>> makes the API more complex for the clients, and it's rarely what the
>>> clients really want anyway.
>> FYI, Org parser maintains such a list.
>
> Could you point me to the relevant code (I failed to find it)?

That code handles a lot more than just changes, so it might not be the
best reference. Anyway...

See the docstring of `org-element--cache-sync-requests', and
the branch in `org-element--cache-submit-request' starting from
	  ;; Current changes can be merged with first sync request: we
	  ;; can save a partial cache synchronization.

>> We previously discussed a similar API in
>> https://yhetil.org/emacs-bugs/87o7iq1emo.fsf@localhost/
>
> IIUC this discusses a *sequence* of edits.  In the point to which you
> replied I was discussing keeping a list of *simultaneous* edits.

Hmm. By referring to buffer-undo-list, I meant that intersecting edits
will be automatically merged, as it is usually done in undo system.

I am also not 100% sure why edits being simultaneous is any relevant.
Maybe we are talking about different things?

What I was talking about is (1) automatically merging subsequent edits
like 1..5 2..7 7..9 into 1..9. (2) if a subsequent edit does not
intersect with previous edited region, record it separately without
merging.

> This said, the downside in both cases is that the specific data that we
> need from such a list tends to depend on the user.  E.g. you suggest
>
>     (BEG END_BEFORE END_AFTER COUNTER)
>
> but that is not sufficient reconstruct the corresponding buffer state,
> so things like Eglot/CRDT can't use it.  Ideally for CRDT I think you'd
> want a sequence of
>
>     (BEG END-BEFORE STRING-AFTER)

Right. It's just that Org mode does not need STRING-AFTER, which is why
I did not think about it in my proposal. Of course, having STRING-AFTER
is required to get full reconstruction of the buffer state.

> but for Eglot this is not sufficient because Eglot needs to convert BEG
> and END_BEFORE into LSP positions (i.e. "line+col") and for that it
> needs to reproduce the past buffer state.  So instead, what Eglot needs
> (and does indeed build using `*-change-functions`) is a sequence of
>
>     (LSP-BEG LSP-END-BEFORE STRING-AFTER)
>
> [ Tho it seems it also needs a "LENGTH" of the deleted chunk, not sure
>   exactly why, but I guess it's a piece of redundant info the servers
>   can use to sanity-check the data?  ]

It is to support deprecated LSP spec (I presume that older LSP servers
may still be using the old spec):

https://microsoft.github.io/language-server-protocol/specifications/specification-3-15/
export type TextDocumentContentChangeEvent = {
	 * The range of the document that changed.
	range: Range;

	 * The optional length of the range that got replaced.
	 * @deprecated use range instead.
	rangeLength?: number;

	 * The new text for the provided range.
	text: string;

> The code changes were actually quite simple, so I have now included it.
> See my current PoC code below.

Am I missing something, or will `track-changes--signal-if-disjoint'
alter `track-changes--state' when it calls `track-changes-fetch' ->
`track-changes-reset'? Won't that interfere with the information passed
to non-disjoint trackers?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>





  reply	other threads:[~2024-04-02 16:21 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-29 16:15 bug#70077: An easier way to track buffer changes Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-03-29 18:12 ` Eli Zaretskii
2024-03-29 18:53   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-03-30  6:34     ` Eli Zaretskii
2024-03-30 14:58       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-03-30 16:45         ` Eli Zaretskii
2024-03-31  2:57           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-01 11:53         ` Ihor Radchenko
2024-04-01 14:51           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-01 17:49             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-02 14:22               ` Ihor Radchenko
2024-04-02 15:17                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-02 16:21                   ` Ihor Radchenko [this message]
2024-04-02 17:51                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-03 12:34                       ` Ihor Radchenko
2024-04-03 12:45                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-04 17:58                           ` Ihor Radchenko
2024-03-30  3:17   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-03-30  5:09     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-03-29 22:20 ` phillip.lord
2024-03-29 22:59   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-03-30  6:46     ` Eli Zaretskii
2024-03-30 12:06     ` phillip.lord
2024-03-30 13:39       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-03-30  9:51 ` Ihor Radchenko
2024-03-30 12:49   ` Eli Zaretskii
2024-03-30 13:19     ` Ihor Radchenko
2024-03-30 13:31       ` Eli Zaretskii
2024-03-30 14:09   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-05 22:12 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-06  8:43   ` Eli Zaretskii
2024-04-08 15:24     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-08 15:53       ` Eli Zaretskii
2024-04-08 17:17         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-08 17:27           ` Andrea Corallo
2024-04-08 18:36           ` Eli Zaretskii
2024-04-08 20:57             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-09  4:10               ` Eli Zaretskii
2024-04-08 20:45       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-09  3:56         ` Eli Zaretskii
2024-04-09 23:30           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-13 13:44             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-06 17:37   ` Dmitry Gutov
2024-04-06 19:44     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-07 14:40       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-07 15:47         ` Dmitry Gutov
2024-04-07 14:07   ` Ihor Radchenko
2024-04-08 16:06     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-09 17:35       ` Ihor Radchenko
2024-04-10  2:02         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87ttkjspkq.fsf@localhost \
    --to=yantar92@posteo.net \
    --cc=70077@debbugs.gnu.org \
    --cc=acm@muc.de \
    --cc=alan.zimm@gmail.com \
    --cc=casouri@gmail.com \
    --cc=eliz@gnu.org \
    --cc=frederic.bour@lakaban.net \
    --cc=joaotavora@gmail.com \
    --cc=mail@nicolasgoaziou.fr \
    --cc=monnier@iro.umontreal.ca \
    --cc=phillip.lord@russet.org.uk \
    --cc=qhong@alum.mit.edu \
    --cc=stephen_leake@stephe-leake.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.