From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of text editors" Newsgroups: gmane.emacs.bugs Subject: bug#70077: An easier way to track buffer changes Date: Tue, 02 Apr 2024 13:51:42 -0400 Message-ID: References: <86frw8ewk9.fsf@gnu.org> <86cyrcdy80.fsf@gnu.org> <877chh8fjk.fsf@localhost> <87wmpfsv2y.fsf@localhost> <87ttkjspkq.fsf@localhost> Reply-To: Stefan Monnier Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="31955"; mail-complaints-to="usenet@ciao.gmane.io" User-Agent: Gnus/5.13 (Gnus v5.13) Cc: casouri@gmail.com, 70077@debbugs.gnu.org, qhong@alum.mit.edu, frederic.bour@lakaban.net, joaotavora@gmail.com, mail@nicolasgoaziou.fr, acm@muc.de, Eli Zaretskii , stephen_leake@stephe-leake.org, alan.zimm@gmail.com, phillip.lord@russet.org.uk To: Ihor Radchenko Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Tue Apr 02 19:52:39 2024 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1rriJK-00086Z-2O for geb-bug-gnu-emacs@m.gmane-mx.org; Tue, 02 Apr 2024 19:52:38 +0200 Original-Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rriIy-0000Ng-J9; Tue, 02 Apr 2024 13:52:16 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rriIg-0000KS-Sl for bug-gnu-emacs@gnu.org; Tue, 02 Apr 2024 13:52:01 -0400 Original-Received: from debbugs.gnu.org ([2001:470:142:5::43]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1rriIg-0008RZ-Kd for bug-gnu-emacs@gnu.org; Tue, 02 Apr 2024 13:51:58 -0400 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1rriIk-0001Xg-7G for bug-gnu-emacs@gnu.org; Tue, 02 Apr 2024 13:52:02 -0400 X-Loop: help-debbugs@gnu.org Resent-From: Stefan Monnier Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Tue, 02 Apr 2024 17:52:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 70077 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch Original-Received: via spool by 70077-submit@debbugs.gnu.org id=B70077.17120803205920 (code B ref 70077); Tue, 02 Apr 2024 17:52:02 +0000 Original-Received: (at 70077) by debbugs.gnu.org; 2 Apr 2024 17:52:00 +0000 Original-Received: from localhost ([127.0.0.1]:56172 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rriIh-0001XP-I5 for submit@debbugs.gnu.org; Tue, 02 Apr 2024 13:52:00 -0400 Original-Received: from mailscanner.iro.umontreal.ca ([132.204.25.50]:19075) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rriIf-0001Wz-Dj for 70077@debbugs.gnu.org; Tue, 02 Apr 2024 13:51:58 -0400 Original-Received: from pmg3.iro.umontreal.ca (localhost [127.0.0.1]) by pmg3.iro.umontreal.ca (Proxmox) with ESMTP id 520DA442379; Tue, 2 Apr 2024 13:51:47 -0400 (EDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=iro.umontreal.ca; s=mail; t=1712080305; bh=a/oiLDnMuo7F/+MgqslqKeDcbREiovQzSFZZfyaMFWM=; h=From:To:Cc:Subject:In-Reply-To:References:Date:From; b=EqFYOoCMXO+apmnbz73SbAxvDyroxNSi8y+ESWH+UuhdwgN4WWhAryeWMXp+90eEN shZzaNvRUh8X5jk2ShmK0kwiQwCIPX4XYcUf2Gph6Pz4Vj/14m8Rawmqw0gAFj3yrN lLzZBvN0cCgxsnnjJuQIABf/MUzrXsuoTwSggOtjejVdIJMFKuiE+k1qnocL4XQfAY o/ylDq0PZ/kYdWcA+idozrgv8jQ8zy2fnX64MsdiHE3FIrhKYByI6RD3Buhw/uS3WN RuH/XM9iA/Z4eLmr02/6sDIPKrwGU3OLG2jwoXX9p1pipsLL0bJ97OS272O7X8GouN 2F8ZuQoTJOaEg== Original-Received: from mail01.iro.umontreal.ca (unknown [172.31.2.1]) by pmg3.iro.umontreal.ca (Proxmox) with ESMTP id 4809E441FC1; Tue, 2 Apr 2024 13:51:45 -0400 (EDT) Original-Received: from pastel (unknown [45.72.201.215]) by mail01.iro.umontreal.ca (Postfix) with ESMTPSA id E41D5120682; Tue, 2 Apr 2024 13:51:44 -0400 (EDT) In-Reply-To: <87ttkjspkq.fsf@localhost> (Ihor Radchenko's message of "Tue, 02 Apr 2024 16:21:25 +0000") X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Original-Sender: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Xref: news.gmane.io gmane.emacs.bugs:282536 Archived-At: > I agree, but only when subsequent changes intersect each other. > When distant changes are combined together, they cover a lot of buffer > text that has not been changed - they may trigger a lot more work > compared to more granular changes. Consider, for example, two changes > near the beginning and the end of the buffer: > (1..10) and (1000000...(point-max)) > If such changes are tracked by tree-sitter or other parser, there is a > world of difference between requesting to re-parse individual segments > and the whole buffer. Oh, definitely. For some clients, the amount of work doesn't really depend on the size of the change, tho (OTOH the work done by `track-changes.el` *is* affected by the size of the change). Handling disjoint changes can be very important. I'm fairly satisfied with my `track-changes-register-disjoint` solution to this problem. >>> The changes are stored in strings, which get allocated and >>> re-allocated repeatedly. >> Indeed. Tho for N small changes, my code should perform only O(log N) >> re-allocations. > May you explain a bit more about this? > My reading of `track-changes--before' is that `concat' is called on > every change except when a new change is already inside the previously > recorded one. The "previously recorded one" (i.e. the info stored in `track-changes--before-beg/end/string`) is a conservative over-approximation of the changed area. We (at least) double its size every time we have to grow it, specifically to reduce the number of times we need to grow it (otherwise we quickly fall into the O(N=C2=B2) trap). > (aside: I have a hard time reading the code because of > confusing slot names: bbeg vs beg??) "bbeg/bend" stands for "before beg" and "before end" (i.e. the positions as they were before the change). BTW, those two slots don't exist any more in the latest code I sent (but vars with those names are still present here and there). >>>> We could expose a list of simultaneous (and thus disjoint) changes, >>>> which avoids the last problem. But it's a fair bit more work for us, = it >>>> makes the API more complex for the clients, and it's rarely what the >>>> clients really want anyway. >>> FYI, Org parser maintains such a list. >> Could you point me to the relevant code (I failed to find it)? > That code handles a lot more than just changes, so it might not be the > best reference. Anyway... > See the docstring of `org-element--cache-sync-requests', and > the branch in `org-element--cache-submit-request' starting from > ;; Current changes can be merged with first sync request: we > ;; can save a partial cache synchronization. Thanks. [ I see I did search the right file, but the code was too intertwined with other things for me to find that part. ] [ Further discussions of Org's code moved off-list. ] > Hmm. By referring to buffer-undo-list, I meant that intersecting edits > will be automatically merged, as it is usually done in undo system. [ Actually `buffer-undo-list` doesn't do very much merging, if any. AFAIK the only merging that happens there is to "amalgamate" consecutive `self-insert-command`s or consecutive single-char `delete-char`s. =F0=9F= =99=82 ] > I am also not 100% sure why edits being simultaneous is any relevant. If they're not simultaneous it means that they describe N different buffer states and that to interpret a specific change in the list (e.g. to convert buffer positions to line+col positions) you may have to reconstruct its before&after states by applying the other changes. For some use cases this is quite inconvenient. In any case, it's not very important, I think. > What I was talking about is (1) automatically merging subsequent edits > like 1..5 2..7 7..9 into 1..9. (2) if a subsequent edit does not > intersect with previous edited region, record it separately without > merging. That's what you can get now with `track-changes-register-disjoint`. >> The code changes were actually quite simple, so I have now included it. >> See my current PoC code below. > Am I missing something, or will `track-changes--signal-if-disjoint' > alter `track-changes--state' when it calls `track-changes-fetch' -> > `track-changes-reset'? [ I assume you meant `track-changes--clean-state` instead of `track-changes-reset`. ]=20 Yes it will (and there's a bug in `track-changes--before` because of that that I still need to fix =F0=9F=99=81), but that should not be a probl= em for the clients. > Won't that interfere with the information passed to > non-disjoint trackers? No: the separate states will be (re)combined when those trackers call `track-changes-fetch`. There is a nearby poor interference between clients, tho: if one client calls `track-changes-fetch` eagerly all the time, it may prevent another client from getting a "disjoint change" signal because disjointness is noticed only between changes that occurred since the last call to `track-changes--clean-state` (which is called mostly by `track-changes-fetch`). I guess this deserves a FIXME. Stefan