From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of text editors" Newsgroups: gmane.emacs.bugs Subject: bug#70077: An easier way to track buffer changes Date: Sat, 30 Mar 2024 10:58:40 -0400 Message-ID: References: <86frw8ewk9.fsf@gnu.org> <86cyrcdy80.fsf@gnu.org> Reply-To: Stefan Monnier Mime-Version: 1.0 Content-Type: text/plain Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="20326"; mail-complaints-to="usenet@ciao.gmane.io" User-Agent: Gnus/5.13 (Gnus v5.13) Cc: yantar92@posteo.net, 70077@debbugs.gnu.org, casouri@gmail.com, qhong@alum.mit.edu, frederic.bour@lakaban.net, joaotavora@gmail.com, mail@nicolasgoaziou.fr, acm@muc.de, stephen_leake@stephe-leake.org, alan.zimm@gmail.com, phillip.lord@russet.org.uk To: Eli Zaretskii Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Sat Mar 30 15:59:19 2024 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1rqaAx-00054X-9E for geb-bug-gnu-emacs@m.gmane-mx.org; Sat, 30 Mar 2024 15:59:19 +0100 Original-Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rqaAk-0004S8-KB; Sat, 30 Mar 2024 10:59:06 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rqaAe-0004Rq-L7 for bug-gnu-emacs@gnu.org; Sat, 30 Mar 2024 10:59:00 -0400 Original-Received: from debbugs.gnu.org ([2001:470:142:5::43]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1rqaAe-0008QY-CU for bug-gnu-emacs@gnu.org; Sat, 30 Mar 2024 10:59:00 -0400 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1rqaAf-0004xu-Vf for bug-gnu-emacs@gnu.org; Sat, 30 Mar 2024 10:59:01 -0400 X-Loop: help-debbugs@gnu.org Resent-From: Stefan Monnier Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sat, 30 Mar 2024 14:59:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 70077 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch Original-Received: via spool by 70077-submit@debbugs.gnu.org id=B70077.171181073419059 (code B ref 70077); Sat, 30 Mar 2024 14:59:01 +0000 Original-Received: (at 70077) by debbugs.gnu.org; 30 Mar 2024 14:58:54 +0000 Original-Received: from localhost ([127.0.0.1]:46015 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rqaAX-0004xL-Aj for submit@debbugs.gnu.org; Sat, 30 Mar 2024 10:58:53 -0400 Original-Received: from mailscanner.iro.umontreal.ca ([132.204.25.50]:59256) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rqaAV-0004x6-H7 for 70077@debbugs.gnu.org; Sat, 30 Mar 2024 10:58:52 -0400 Original-Received: from pmg3.iro.umontreal.ca (localhost [127.0.0.1]) by pmg3.iro.umontreal.ca (Proxmox) with ESMTP id AAFD5441506; Sat, 30 Mar 2024 10:58:43 -0400 (EDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=iro.umontreal.ca; s=mail; t=1711810721; bh=P1DGJ8aMWNYPARIfB+KpIiHJqtgrioIXFBM5CiRbo4w=; h=From:To:Cc:Subject:In-Reply-To:References:Date:From; b=em2LDzB5CYfPFTvTSBFwv+5F3DKRHp1Ib1gT0r8Cy5p5IypbqQkYBBypRzk7RfSyR i/GW9w1oZAmH6HUTAAsedFb90yVuNb1gJZT8D3bSPLHfztXlKwWkd6y07Oq6ju9d/z Ysk1cHZgtiNOS1PPeWvExUA+Bupiyrgqs+xrSxusoHQ5zLVPliWnV+WJTXxI3nFV8r qShktHowq3k5XzOU2vp+v5/U+yCmK4yL33XTb5UWGthvC64VYc7F9xmkTy/uPC6pNc PfCK9LhmCCFnUyJ8cd2NgjrlOofiGO54NJHygfQBJ3zUs8NcO9se5Qv6+3bop0NOe/ ZadhanINy376g== Original-Received: from mail01.iro.umontreal.ca (unknown [172.31.2.1]) by pmg3.iro.umontreal.ca (Proxmox) with ESMTP id 7FD21441421; Sat, 30 Mar 2024 10:58:41 -0400 (EDT) Original-Received: from pastel (unknown [45.72.201.215]) by mail01.iro.umontreal.ca (Postfix) with ESMTPSA id 296221201E7; Sat, 30 Mar 2024 10:58:41 -0400 (EDT) In-Reply-To: <86cyrcdy80.fsf@gnu.org> (Eli Zaretskii's message of "Sat, 30 Mar 2024 09:34:39 +0300") X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Original-Sender: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Xref: news.gmane.io gmane.emacs.bugs:282377 Archived-At: > If the last point, i.e. the problems caused by limitations on what can > be safely done from modification hooks, is basically the main > advantage, then I think I understand the rationale. That's one of the purposes but not the only one. > Otherwise, the above looks like doing all the job in > after-change-functions, and it is not clear to me how is that better, > since if track-changes-fetch will fetch a series of changes, `track-changes-fetch` will call its function argument only once. If several changes happened since last time, `track-changes.el` will summarize them into a single (BEG END BEFORE). > deciding how to handle them could be much harder than handling them > one by one when each one happens. For example, the BEGIN..END values > no longer reflect the current buffer contents, Indeed, one of the purposes of the proposed API is to allow delaying the handling to a later time, and if you keep individual changes, then it becomes very difficult to make sense of those changes because they refer to buffer states that aren't available any more. That's why `track-changes.el` bundles those into a single (BEG END BEFORE) change, which makes sure BEG/END are currently-valid buffer positions and thus easy to use. The previous buffer state is not directly available but can be easily reconstructed from (BEG END BEFORE). > Also, all of your examples seem to have the signal function just call > track-changes-fetch and do almost nothing else, so I wonder why we > need a separate function for that, It gives more freedom to the clients. For example it would allow `eglot.el` to get rid of the `eglot--recent-changes` list of changes: instead of calling `track-changes-fetch` directly from the signal and collect the changes in `eglot--recent-changes`, it would delay the call to `track-changes-fetch` to the idle-timer where we run `eglot--signal-textDocument/didChange` so it would never need to send more than a "single change" to the server. Similarly, it would allow changing the specific moment the signal is called without necessarily moving the moment the changes are performed. This may be necessary in `eglot.el`: there are various places where we check the boolean value of `eglot--recent-changes` to know if we're in sync with the server. In the current code this value because non-nil as soon as a modification is made, whereas with my patch it becomes non-nil only after the end of the command that made the first change. I don't know yet if the difference is important, but if it is, then we'd want to ask `track-changes.el` to send the signal more promptly (there is a FIXME to add such a feature), although we would still want to delay the `track-changes-fetch` so it's called only at the end of the command. > and more specifically what would be a use case where the registered > signal function does NOT call track-changes-fetch, but does something > else, and track-changes-fetch is then called outside of the > signal function. As mentioned, a use-case I imagine are cases where we want to delay the processing of changes to an idle timer. In the current `eglot.el` that idle timer processes a sequence of changes, but for some use cases it may too difficult (for the reasons discussed above: it can be difficult to make sense of earlier changes once they're disconnected from the corresponding buffer state), so they'd instead prefer to call `track-changes-fetch` from the idle timer (and thus let `track-changes.el` combine all those changes). > Finally, the doc string of track-changes-register does not describe > the exact place in the buffer-change sequence where the signal > function will be called, which makes it harder to reason about it. > Will it be called where we now call signal_after_change or somewhere > else? Good point. Currently it's called via `funcall-later` which isn't in `master` but can be thought of as `run-with-timer` with a 0 delay. [ In reality it relies on `pending_funcalls` instead. ] But indeed, I have a FIXME to let the caller request to signal as soon as possible, i.e. directly from the `after-change-functions`. I think it's better to use something like `funcall-later` by default than to signal directly from `after-change-functions` because most coders don't realize that `after-change-functions` can be called thousands of times for a single command (and in normal testing, they'll probably never bump into such a case either). So waiting for the end of the current command (via `funcall-later`) provides a behavior closer to what most naive coders expect, I believe, and will save them from the corner cased they weren't aware of. > And how do you guarantee that the signal function will not be > called again until track-changes-fetch is called? By removing the tracker from `track-changes--clean-trackers` (and re-adding it once `track-changes-fetch` is finished, which is the main reason I make `track-changes-fetch` call a function argument rather than making it return (BEG END CHANGES)). In a later email you wrote: >> (concat (buffer-substring (point-min) beg) >> before >> (buffer-substring end (point-max))) > > But if you get several changes, the above will need to be done in > reverse order, back-to-front, no? No, because you (the client) never get several changes. >> I don't mean to suggest to do that, since it's costly for large >> buffers > Exactly. But if the buffer _is_ large, then what to do? It all depends on the specific cases. E.g. in the case of `eglot.el` we don't need the full content of the buffer before the change. We only really need to know how many newlines were in `before` as well as some kind of length of the last line of `before`. I compute that by inserting `before` into a temp buffer. Note that this is proportional to the size of `before` and not to the total size of the buffer. If we want to do better, I think we'd then need a more complex API where the client can specify more precisely (presumably via some kind of function) what information we need to record about the "before state" (and how to update that information as new changes are performed). I don't have a good idea for what such an API could look like. >> Also, it would fix only the problem of pairing, and not the other ones. > So the main/only problem this mechanism solves is the lack of pairing > between before/after calls to modification hooks? No, the text you cite is saying the opposite: that we don't want to solve only the pairing. I hope/want it to solve all the problems I mentioned: - before and after calls are not necessarily paired. - the beg/end values don't always match. - there can be thousands of calls from within a single command. - these hooks are run at a fairly low-level so there are things they really shouldn't do, such as modify the buffer or wait. - the after call doesn't get enough info to rebuild the before-change state, so some callers need to use both before-c-f and after-c-f (and then deal with the first two points above). I don't claim it solves them all in a perfect way, tho. Stefan