bug#70077: An easier way to track buffer changes - Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors

all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed

From: Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org>
To: Eli Zaretskii <eliz@gnu.org>
Cc: yantar92@posteo.net, 70077@debbugs.gnu.org, casouri@gmail.com,
	qhong@alum.mit.edu, frederic.bour@lakaban.net,
	joaotavora@gmail.com, mail@nicolasgoaziou.fr, acm@muc.de,
	stephen_leake@stephe-leake.org, alan.zimm@gmail.com,
	phillip.lord@russet.org.uk
Subject: bug#70077: An easier way to track buffer changes
Date: Sat, 30 Mar 2024 10:58:40 -0400	[thread overview]
Message-ID: <jwv7chjvmho.fsf-monnier+emacs@gnu.org> (raw)
In-Reply-To: <86cyrcdy80.fsf@gnu.org> (Eli Zaretskii's message of "Sat, 30 Mar 2024 09:34:39 +0300")

> If the last point, i.e. the problems caused by limitations on what can
> be safely done from modification hooks, is basically the main
> advantage, then I think I understand the rationale.

That's one of the purposes but not the only one.

> Otherwise, the above looks like doing all the job in
> after-change-functions, and it is not clear to me how is that better,
> since if track-changes-fetch will fetch a series of changes,

`track-changes-fetch` will call its function argument only once.
If several changes happened since last time, `track-changes.el` will
summarize them into a single (BEG END BEFORE).

> deciding how to handle them could be much harder than handling them
> one by one when each one happens.  For example, the BEGIN..END values
> no longer reflect the current buffer contents,

Indeed, one of the purposes of the proposed API is to allow delaying the
handling to a later time, and if you keep individual changes, then it
becomes very difficult to make sense of those changes because they refer
to buffer states that aren't available any more.

That's why `track-changes.el` bundles those into a single (BEG END
BEFORE) change, which makes sure BEG/END are currently-valid buffer
positions and thus easy to use.
The previous buffer state is not directly available but can be
easily reconstructed from (BEG END BEFORE).

> Also, all of your examples seem to have the signal function just call
> track-changes-fetch and do almost nothing else, so I wonder why we
> need a separate function for that,

It gives more freedom to the clients.  For example it would allow
`eglot.el` to get rid of the `eglot--recent-changes` list of changes:
instead of calling `track-changes-fetch` directly from the signal and
collect the changes in `eglot--recent-changes`, it would delay the call
to `track-changes-fetch` to the idle-timer where we run
`eglot--signal-textDocument/didChange` so it would never need to send
more than a "single change" to the server.

Similarly, it would allow changing the specific moment the signal is
called without necessarily moving the moment the changes are performed.
This may be necessary in `eglot.el`: there are various places where we
check the boolean value of `eglot--recent-changes` to know if we're
in sync with the server.  In the current code this value because non-nil
as soon as a modification is made, whereas with my patch it becomes
non-nil only after the end of the command that made the first change.
I don't know yet if the difference is important, but if it is, then
we'd want to ask `track-changes.el` to send the signal more promptly
(there is a FIXME to add such a feature), although we would still want
to delay the `track-changes-fetch` so it's called only at the end of
the command.

> and more specifically what would be a use case where the registered
> signal function does NOT call track-changes-fetch, but does something
> else, and track-changes-fetch is then called outside of the
> signal function.

As mentioned, a use-case I imagine are cases where we want to delay the
processing of changes to an idle timer.  In the current `eglot.el` that
idle timer processes a sequence of changes, but for some use cases it
may too difficult (for the reasons discussed above: it can be difficult
to make sense of earlier changes once they're disconnected from the
corresponding buffer state), so they'd instead prefer to call
`track-changes-fetch` from the idle timer (and thus let
`track-changes.el` combine all those changes).

> Finally, the doc string of track-changes-register does not describe
> the exact place in the buffer-change sequence where the signal
> function will be called, which makes it harder to reason about it.
> Will it be called where we now call signal_after_change or somewhere
> else?

Good point.  Currently it's called via `funcall-later` which isn't in
`master` but can be thought of as `run-with-timer` with a 0 delay.
[ In reality it relies on `pending_funcalls` instead.  ]

But indeed, I have a FIXME to let the caller request to signal
as soon as possible, i.e. directly from the `after-change-functions`.

I think it's better to use something like `funcall-later` by default
than to signal directly from `after-change-functions` because most
coders don't realize that `after-change-functions` can be called
thousands of times for a single command (and in normal testing, they'll
probably never bump into such a case either).  So waiting for the end of
the current command (via `funcall-later`) provides a behavior closer to
what most naive coders expect, I believe, and will save them from the
corner cased they weren't aware of.

> And how do you guarantee that the signal function will not be
> called again until track-changes-fetch is called?

By removing the tracker from `track-changes--clean-trackers` (and
re-adding it once `track-changes-fetch` is finished, which is the main
reason I make `track-changes-fetch` call a function argument rather
than making it return (BEG END CHANGES)).

In a later email you wrote:

>>     (concat (buffer-substring (point-min) beg)
>>             before
>>             (buffer-substring end (point-max)))
>
> But if you get several changes, the above will need to be done in
> reverse order, back-to-front, no?

No, because you (the client) never get several changes.

>> I don't mean to suggest to do that, since it's costly for large
>> buffers
> Exactly.  But if the buffer _is_ large, then what to do?

It all depends on the specific cases.  E.g. in the case of `eglot.el` we
don't need the full content of the buffer before the change.  We only
really need to know how many newlines were in `before` as well as some
kind of length of the last line of `before`.  I compute that
by inserting `before` into a temp buffer.  Note that this is
proportional to the size of `before` and not to the total size of
the buffer.

If we want to do better, I think we'd then need a more complex API where
the client can specify more precisely (presumably via some kind of
function) what information we need to record about the "before state"
(and how to update that information as new changes are performed).

I don't have a good idea for what such an API could look like.

>> Also, it would fix only the problem of pairing, and not the other ones.
> So the main/only problem this mechanism solves is the lack of pairing
> between before/after calls to modification hooks?

No, the text you cite is saying the opposite: that we don't want to
solve only the pairing.  I hope/want it to solve all the
problems I mentioned:

    - before and after calls are not necessarily paired.
    - the beg/end values don't always match.
    - there can be thousands of calls from within a single command.
    - these hooks are run at a fairly low-level so there are things they
      really shouldn't do, such as modify the buffer or wait.
    - the after call doesn't get enough info to rebuild the before-change state,
      so some callers need to use both before-c-f and after-c-f (and then
      deal with the first two points above).

I don't claim it solves them all in a perfect way, tho.

        Stefan

next prev parent reply	other threads:[~2024-03-30 14:58 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-29 16:15 bug#70077: An easier way to track buffer changes Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-03-29 18:12 ` Eli Zaretskii
2024-03-29 18:53   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-03-30  6:34     ` Eli Zaretskii
2024-03-30 14:58       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors [this message]
2024-03-30 16:45         ` Eli Zaretskii
2024-03-31  2:57           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-01 11:53         ` Ihor Radchenko
2024-04-01 14:51           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-01 17:49             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-02 14:22               ` Ihor Radchenko
2024-04-02 15:17                 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-02 16:21                   ` Ihor Radchenko
2024-04-02 17:51                     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-03 12:34                       ` Ihor Radchenko
2024-04-03 12:45                         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-04 17:58                           ` Ihor Radchenko
2024-03-30  3:17   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-03-30  5:09     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-03-29 22:20 ` phillip.lord
2024-03-29 22:59   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-03-30  6:46     ` Eli Zaretskii
2024-03-30 12:06     ` phillip.lord
2024-03-30 13:39       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-03-30  9:51 ` Ihor Radchenko
2024-03-30 12:49   ` Eli Zaretskii
2024-03-30 13:19     ` Ihor Radchenko
2024-03-30 13:31       ` Eli Zaretskii
2024-03-30 14:09   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-05 22:12 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-06  8:43   ` Eli Zaretskii
2024-04-08 15:24     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-08 15:53       ` Eli Zaretskii
2024-04-08 17:17         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-08 17:27           ` Andrea Corallo
2024-04-08 18:36           ` Eli Zaretskii
2024-04-08 20:57             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-09  4:10               ` Eli Zaretskii
2024-04-08 20:45       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-09  3:56         ` Eli Zaretskii
2024-04-09 23:30           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-13 13:44             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-06 17:37   ` Dmitry Gutov
2024-04-06 19:44     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-07 14:40       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-07 15:47         ` Dmitry Gutov
2024-04-07 14:07   ` Ihor Radchenko
2024-04-08 16:06     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-04-09 17:35       ` Ihor Radchenko
2024-04-10  2:02         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=jwv7chjvmho.fsf-monnier+emacs@gnu.org \
    --to=bug-gnu-emacs@gnu.org \
    --cc=70077@debbugs.gnu.org \
    --cc=acm@muc.de \
    --cc=alan.zimm@gmail.com \
    --cc=casouri@gmail.com \
    --cc=eliz@gnu.org \
    --cc=frederic.bour@lakaban.net \
    --cc=joaotavora@gmail.com \
    --cc=mail@nicolasgoaziou.fr \
    --cc=monnier@iro.umontreal.ca \
    --cc=phillip.lord@russet.org.uk \
    --cc=qhong@alum.mit.edu \
    --cc=stephen_leake@stephe-leake.org \
    --cc=yantar92@posteo.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.