* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made @ 2023-08-22 9:30 Ihor Radchenko 2023-08-22 12:22 ` Eli Zaretskii 0 siblings, 1 reply; 34+ messages in thread From: Ihor Radchenko @ 2023-08-22 9:30 UTC (permalink / raw) To: 65451 Consider the following reproducer: 1. Create /tmp/bug.el with the following contents (defun my/setup () (interactive) (defun my/before-change (beg end) (warn "Before: %d %d" beg end)) (add-hook 'before-change-functions #'my/before-change nil 'local) (defun my/after-change (beg end pre) (warn "After: %d %d delta: %d" beg end (- end beg pre))) (add-hook 'after-change-functions #'my/after-change nil 'local)) 2. Create /tmp/bug.org https://www.gnu.org/software/emacs/manual/html_node/emacs/Mode-Line.html -UUU: @**- F2 UTF-8-demo.txt Top (1,0) <N> (Text s-/ WS Wrap) | string | meaning | note | |--------+------------------------------------------+----------------------------------------------------------------------| | - | input method | | | UUU | coding system (keyboard terminal buffer) | (U utf-8-unix) | | : | end-of-line convention | (: LF) (/ CR) (\ CRLF) | | @ | emacsclient | | | ** | buffer status | (-- unmodified) (** modified) (%% read-only) (%* read-only_modified) | | - | default-directory | (- local) (@ remote) | | F2 | frame name | (F2 the-2nd-frame) | | | | | 3. emacs -Q -l /tmp/bug.el /tmp/bug.org 4. M-x my/setup 5. Move point to | F2 | frame name | (F2 the-2nd-frame) | | <point> | | | 6. Insert "UTF" 7. M-x dabbrev-expand 8. Observe the following in the *Warnings* buffer ⛔ Warning (emacs): Before: 1278 1278 ⛔ Warning (emacs): After: 1278 1279 delta: 1 ⛔ Warning (emacs): Before: 1284 1285 ⛔ Warning (emacs): After: 1284 1284 delta: -1 ⛔ Warning (emacs): Before: 1279 1279 ⛔ Warning (emacs): After: 1279 1280 delta: 1 ⛔ Warning (emacs): Before: 1284 1285 ⛔ Warning (emacs): After: 1284 1284 delta: -1 ⛔ Warning (emacs): Before: 1280 1280 ⛔ Warning (emacs): After: 1280 1281 delta: 1 ⛔ Warning (emacs): Before: 1284 1285 ⛔ Warning (emacs): After: 1284 1284 delta: -1 ⛔ Warning (emacs): Before: 1278 1281 ⛔ Warning (emacs): Before: 1278 1288 ⛔ Warning (emacs): After: 1278 1288 delta: 0 ⛔ Warning (emacs): After: 1278 1288 delta: 7 The last bit of the messages corresponds to dabbrev expansion of "UTF" to "utf-8-unix": ⛔ Warning (emacs): Before: 1278 1281 ⛔ Warning (emacs): Before: 1278 1288 ⛔ Warning (emacs): After: 1278 1288 delta: 0 ⛔ Warning (emacs): After: 1278 1288 delta: 7 Note how "After: 1278 1288 delta: 0" reports a change of "utf-8-unix" that did not alter the buffer text size. It is trigerred _before_ "After: 1278 1288 delta: 7" that corresponds to replacing "UTF" with "utf-8-unix". The order of after-change notifications thus do not correspond to the order of buffer changes. Org mode is relying upon the correct change order to update parser cache with buffer modifications. The same recipe using Emacs 27 yields the correct order <...> Warning (emacs): Before: 1278 1281 Warning (emacs): After: 1278 1288 delta: 7 Warning (emacs): Before: 1278 1288 Warning (emacs): After: 1278 1288 delta: 0 In GNU Emacs 30.0.50 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.38, cairo version 1.17.8) of 2023-08-14 built on localhost Repository revision: d483b38070120f17b1d00975081d27191d1deacc Repository branch: master Windowing system distributor 'The X.Org Foundation', version 11.0.12101008 System Description: Gentoo Linux -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2023-08-22 9:30 bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made Ihor Radchenko @ 2023-08-22 12:22 ` Eli Zaretskii 2023-08-22 12:42 ` Ihor Radchenko 2024-03-30 13:51 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 0 siblings, 2 replies; 34+ messages in thread From: Eli Zaretskii @ 2023-08-22 12:22 UTC (permalink / raw) To: Ihor Radchenko; +Cc: 65451 > From: Ihor Radchenko <yantar92@posteo.net> > Date: Tue, 22 Aug 2023 09:30:47 +0000 > > The last bit of the messages corresponds to dabbrev expansion of "UTF" > to "utf-8-unix": > > ⛔ Warning (emacs): Before: 1278 1281 > ⛔ Warning (emacs): Before: 1278 1288 > ⛔ Warning (emacs): After: 1278 1288 delta: 0 > ⛔ Warning (emacs): After: 1278 1288 delta: 7 > > Note how "After: 1278 1288 delta: 0" reports a change of "utf-8-unix" > that did not alter the buffer text size. It is trigerred _before_ > "After: 1278 1288 delta: 7" that corresponds to replacing "UTF" with > "utf-8-unix". The inner Before+After is because replace-match also changes the letter-case of the replaced text (to match the letter-case of the original). > The order of after-change notifications thus do not correspond to the > order of buffer changes. Org mode is relying upon the correct change > order to update parser cache with buffer modifications. I think Org mode is relying on something it should not. This particular use case aside, Emacs is allowed to call a function that changes the buffer from a function that itself changes the buffer, and it is allowed to call that inner function _before_ it did all the changes it intended to do. In such cases you will see Before+After pairs embedded inside an outer Before+After pair, because most functions which end up calling these hooks try not to call them too often, so they generally call the before-hook once at the beginning and the after-hook once near the end. IOW, we try to report all of the changes in one go, instead of reporting them one small chunk (perhaps even one character) at a time. And that inevitably leads to situations like this one. Which is perfectly normal, from my POV, and not a bug at all. I'd hate to make our code more complex, let alone slower, just to satisfy the reliance on the perfect pairing like you do. (We've been through this in other discussions: Org uses these hooks for purposes they were not designed for. The ELisp manual even warns about such assumptions.) > The same recipe using Emacs 27 yields the correct order AFAIR, Emacs 27 had quite a few bugs in this area, which were fixed in development since then. See bug#42424, for example. ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2023-08-22 12:22 ` Eli Zaretskii @ 2023-08-22 12:42 ` Ihor Radchenko 2023-08-22 12:58 ` Eli Zaretskii 2024-03-30 13:51 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 1 sibling, 1 reply; 34+ messages in thread From: Ihor Radchenko @ 2023-08-22 12:42 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 65451 Eli Zaretskii <eliz@gnu.org> writes: >> The order of after-change notifications thus do not correspond to the >> order of buffer changes. Org mode is relying upon the correct change >> order to update parser cache with buffer modifications. > > I think Org mode is relying on something it should not. Then, I'd like to point back to the previous discussion where I asked to expose to Elisp information about buffer changes available to tree-sitter. https://yhetil.org/emacs-devel/83tu8jq2vl.fsf@gnu.org/ If there is no guarantee that buffer modifications are not signaled in order, on-the-fly updates of the parsed AST will become impossible for Org, even if using markers (as you suggested in the linked thread). In fact, I am not sure if tree-sitter will behave correctly if it is signaled changes in incorrect order. https://tree-sitter.github.io/tree-sitter/using-parsers#advanced-parsing says In applications like text editors, you often need to re-parse a file after its source code has changed. Tree-sitter is designed to support this use case efficiently. There are two steps required. First, you must edit the syntax tree, which adjusts the ranges of its nodes so that they stay in sync with the code. typedef struct { uint32_t start_byte; uint32_t old_end_byte; uint32_t new_end_byte; TSPoint start_point; TSPoint old_end_point; TSPoint new_end_point; } TSInputEdit; void ts_tree_edit(TSTree *, const TSInputEdit *); Note how the API is similar to `after-change-functions' - it expects edited region boundaries before/after the edit. If Emacs feeds the following to tree-sitter ⛔ Warning (emacs): After: 1278 1288 delta: 0 ⛔ Warning (emacs): After: 1278 1288 delta: 7 , the first "edit" query will apply to wrong range in the cached AST. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2023-08-22 12:42 ` Ihor Radchenko @ 2023-08-22 12:58 ` Eli Zaretskii 2023-08-22 13:41 ` Ihor Radchenko 0 siblings, 1 reply; 34+ messages in thread From: Eli Zaretskii @ 2023-08-22 12:58 UTC (permalink / raw) To: Ihor Radchenko, Yuan Fu; +Cc: 65451 > From: Ihor Radchenko <yantar92@posteo.net> > Cc: 65451@debbugs.gnu.org > Date: Tue, 22 Aug 2023 12:42:18 +0000 > > Eli Zaretskii <eliz@gnu.org> writes: > > >> The order of after-change notifications thus do not correspond to the > >> order of buffer changes. Org mode is relying upon the correct change > >> order to update parser cache with buffer modifications. > > > > I think Org mode is relying on something it should not. > > Then, I'd like to point back to the previous discussion where I asked to > expose to Elisp information about buffer changes available to > tree-sitter. > https://yhetil.org/emacs-devel/83tu8jq2vl.fsf@gnu.org/ I don't want to do that, sorry. Not without a good understanding of what exactly do you need from that and in what way. If we will expose anything, it will have to be the minimum possible exposure, not the maximum, so I would like to understand this very well before I agree to any change in this direction. > If there is no guarantee that buffer modifications are not signaled in > order, on-the-fly updates of the parsed AST will become impossible for > Org, even if using markers (as you suggested in the linked thread). > > In fact, I am not sure if tree-sitter will behave correctly if it is > signaled changes in incorrect order. I will defer to Yuan, but tree-sitter doesn't use these hooks, we call its functions directly from insdel.c where needed. This makes sense for a library to which we link and whose interface code we control, but giving such access to Lisp (and Org on top of that) is out of the question. We don't even give such access to modules. ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2023-08-22 12:58 ` Eli Zaretskii @ 2023-08-22 13:41 ` Ihor Radchenko 2023-08-22 16:02 ` Eli Zaretskii 0 siblings, 1 reply; 34+ messages in thread From: Ihor Radchenko @ 2023-08-22 13:41 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Yuan Fu, 65451 Eli Zaretskii <eliz@gnu.org> writes: >> Then, I'd like to point back to the previous discussion where I asked to >> expose to Elisp information about buffer changes available to >> tree-sitter. >> https://yhetil.org/emacs-devel/83tu8jq2vl.fsf@gnu.org/ > > I don't want to do that, sorry. Not without a good understanding of > what exactly do you need from that and in what way. If we will expose > anything, it will have to be the minimum possible exposure, not the > maximum, so I would like to understand this very well before I agree > to any change in this direction. Org wants to do the same thing tree-sitter does - keep parsed AST in sync with buffer modifications without having to re-parse the whole buffer. So, we basically need the same information tree-sitter needs - the sequence of buffer text changes, in their order. Note that the markers discussed in the thread I linked are not sufficient. When editing near AST node boundaries, even if the boundaries are represented by markers, we have to re-parse the AST around to account for the possible structural changes. So, information about buffer edits is still required. >> In fact, I am not sure if tree-sitter will behave correctly if it is >> signaled changes in incorrect order. > > I will defer to Yuan, but tree-sitter doesn't use these hooks, we call > its functions directly from insdel.c where needed. This makes sense > for a library to which we link and whose interface code we control, > but giving such access to Lisp (and Org on top of that) is out of the > question. We don't even give such access to modules. I hope that we can solve this issue one way or another. This currently breaks the very core functionality of Org. Every part of Org relies on it to obtain reasonable performance. Prior to using cache, we had orders of magnitude slowdowns. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2023-08-22 13:41 ` Ihor Radchenko @ 2023-08-22 16:02 ` Eli Zaretskii 2023-08-23 8:52 ` Ihor Radchenko 0 siblings, 1 reply; 34+ messages in thread From: Eli Zaretskii @ 2023-08-22 16:02 UTC (permalink / raw) To: Ihor Radchenko; +Cc: casouri, 65451 > From: Ihor Radchenko <yantar92@posteo.net> > Cc: Yuan Fu <casouri@gmail.com>, 65451@debbugs.gnu.org > Date: Tue, 22 Aug 2023 13:41:17 +0000 > > Eli Zaretskii <eliz@gnu.org> writes: > > >> Then, I'd like to point back to the previous discussion where I asked to > >> expose to Elisp information about buffer changes available to > >> tree-sitter. > >> https://yhetil.org/emacs-devel/83tu8jq2vl.fsf@gnu.org/ > > > > I don't want to do that, sorry. Not without a good understanding of > > what exactly do you need from that and in what way. If we will expose > > anything, it will have to be the minimum possible exposure, not the > > maximum, so I would like to understand this very well before I agree > > to any change in this direction. > > Org wants to do the same thing tree-sitter does - keep parsed AST in > sync with buffer modifications without having to re-parse the whole > buffer. So, we basically need the same information tree-sitter needs - > the sequence of buffer text changes, in their order. We don't expose the data you want to tree-sitter in Lisp. What is exposed to Lisp are the parser and parse-tree objects that we build (in C) based on tree-sitter parsing results. When the buffer is modified, the information about the modifications is used internally by Emacs, in C code, to find and update the relevant parsers, and for that we call the tree-sitter functions involved in this process. See the function treesit_record_change which does that, and which is called from C when buffer text changes in a way relevant to treesit.el functionalities. (Note that some changes of buffer text are not visible even to tree-sitter, because we decided they are not relevant, for now.) > Note that the markers discussed in the thread I linked are not > sufficient. When editing near AST node boundaries, even if the > boundaries are represented by markers, we have to re-parse the AST > around to account for the possible structural changes. So, information > about buffer edits is still required. If tracking markers is not enough, then I wonder how the information from the lower levels, which is basically the same but noisier, will be able to help you. > >> In fact, I am not sure if tree-sitter will behave correctly if it is > >> signaled changes in incorrect order. > > > > I will defer to Yuan, but tree-sitter doesn't use these hooks, we call > > its functions directly from insdel.c where needed. This makes sense > > for a library to which we link and whose interface code we control, > > but giving such access to Lisp (and Org on top of that) is out of the > > question. We don't even give such access to modules. > > I hope that we can solve this issue one way or another. This currently > breaks the very core functionality of Org. Every part of Org relies on > it to obtain reasonable performance. Prior to using cache, we had orders > of magnitude slowdowns. If you can arrange your design such that Lisp sees only AST-specific objects affected by the modifications in buffer text, then I believe we will have a good chance of finding a satisfactory solution. If that requires to have some of your code in C (preferably, generalized to some extent), then so be it. You see, I think the buffer-change hooks we have are already too much: Lisp programs abuse them all the time (you can see a good example in the bug which I mentioned up-thread, and which led to the change you are now complaining about). Doing more of that is not very wise, to say the least. Moreover, I think the solution you think you want you actually _don't_ want, because it will overwhelm you with changes that are not relevant to your purposes. You can see a clear evidence to that in the fact that treesit_record_change is called only in several strategical places, not everywhere where we change buffer text, and not at the lowest level of such changes. There's a reason to that. ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2023-08-22 16:02 ` Eli Zaretskii @ 2023-08-23 8:52 ` Ihor Radchenko 2023-08-23 17:58 ` Eli Zaretskii 0 siblings, 1 reply; 34+ messages in thread From: Ihor Radchenko @ 2023-08-23 8:52 UTC (permalink / raw) To: Eli Zaretskii; +Cc: casouri, 65451 Eli Zaretskii <eliz@gnu.org> writes: >> Org wants to do the same thing tree-sitter does - keep parsed AST in >> sync with buffer modifications without having to re-parse the whole >> buffer. So, we basically need the same information tree-sitter needs - >> the sequence of buffer text changes, in their order. > > We don't expose the data you want to tree-sitter in Lisp. What is > exposed to Lisp are the parser and parse-tree objects that we build > (in C) based on tree-sitter parsing results. I understand. The difference is that Org mode tries to implement the tree sitter algorithm (not exactly, but similar) in Elisp. > ... When the buffer is > modified, the information about the modifications is used internally > by Emacs, in C code, to find and update the relevant parsers, and for > that we call the tree-sitter functions involved in this process. See > the function treesit_record_change which does that, and which is > called from C when buffer text changes in a way relevant to treesit.el > functionalities. (Note that some changes of buffer text are not > visible even to tree-sitter, because we decided they are not relevant, > for now.) I understand. Org also does not need every single change. The changes that, when combined, do not affect the buffer text are not important to Org as well. In fact, Org even merge subsequent edits in the same, or intersecting region of text. For example, if no queries to Org AST are done between edits, Foo bar<point> baz -> Foo bar|new text|<point> baz -> Foo bar|new text| and more|<point> baz is simply treated as a single edit Foo bar<point> baz -> Foo bar|new text| and more|<point> baz > If tracking markers is not enough, then I wonder how the information > from the lower levels, which is basically the same but noisier, will > be able to help you. We do not really need information from lower levels. At every moment of time, Org parser maintains a linkage between buffer text and its parsed AST. After a buffer edit, we need to know how to update the AST corresponding to the last known buffer text state to the new (current) text state. To update the old AST we need to know which regions in the old AST (and old buffer text) were edited and their change in length. Then, we (1) remove elements in AST that are structurally affected by the edit; (2) shift elements after the edit that are not structurally affected but whose boundaries must be adjusted to accommodate for the buffer length being changed; (3) re-parse buffer text (after the edit) and fill the gap in the AST. For step (1), it is critical to have information about changed regions in the same order the edits happen. And, in theory, we might utilize before-change-functions to obtain this information (assuming that before-change-functions are always called in the same order the edits happen). But, unfortunately, before-change-functions are too noisy for this purpose because, for example, simply altering text properties also triggers before-change-functions, while no actual change in buffer text occurs in such scenario and re-parsing AST is unnecessary. So, we instead have to use after-change-functions and work out the original changed region boundaties using beg_changed end_changed pre-change_lentgh -> beg_before_change = beg_changed; end_before_change = end_changed - pre-change_length This calculation of the original changed region becomes incorrect when the order of after-change-functions is not the same as the editing order - this bug report. Step (2) is technically not necessary if we use markers, but we cannot do it yet for performance reasons. And without markers, we again need to know the change in length of th edited region: end_changed - beg_changed - pre-change_lentgh. This calculation is also incorrect when after-change-function is not the same as the editing order. All the above information does not have to be granular. We are good as long as nothing queries the AST between the edits. For example, internal details of edit sequence done by replace-match are not important because Org AST will not be queried from Emacs C internals. >> I hope that we can solve this issue one way or another. This currently >> breaks the very core functionality of Org. Every part of Org relies on >> it to obtain reasonable performance. Prior to using cache, we had orders >> of magnitude slowdowns. > > If you can arrange your design such that Lisp sees only AST-specific > objects affected by the modifications in buffer text, then I believe > we will have a good chance of finding a satisfactory solution. If > that requires to have some of your code in C (preferably, generalized > to some extent), then so be it. The problem is that whether an edit requires re-parsing an AST object or not depends on the object type. For example, :DRAWER: Text inside drawer. :END: Corresponds to the following simplified AST: (drawer (:name "DRAWER") (paragraph)) Then, changing it to :DR: ... :END: will require updating 'drawer' AST node: (drawer (:name "DR") (paragraph)) while :DRAWER: Text inside drawer. :ND: will invalidate drawer and turn the whole thing into (paragraph) and :DRAWER: Text inside <more text> drawer. :END: will not affect the drawer, just inner paragraph. Further, :DRAWER: Text inside * Heading drawer. :END: will split the whole thing into (paragraph) (heading (paragraph)) Basically, I do not see an obvious way to abstract AST away into C without implementing Org parser in C as well. With an exception of exposing treesit_record_change to Elisp, so that we can get information about region before and after the edit when updating the AST - that's what I had in mind in https://yhetil.org/emacs-devel/83tu8jq2vl.fsf@gnu.org/ We would need something like after-change-text-functions that will only trigger after actual text edits and provide information about region boundaries before and after the edit. > Moreover, I think the solution you think you want you actually _don't_ > want, because it will overwhelm you with changes that are not relevant > to your purposes. You can see a clear evidence to that in the fact > that treesit_record_change is called only in several strategical > places, not everywhere where we change buffer text, and not at the > lowest level of such changes. There's a reason to that. Indeed. As I tried to explain, the problem for me is not the granularity of changes, but their ordering. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2023-08-23 8:52 ` Ihor Radchenko @ 2023-08-23 17:58 ` Eli Zaretskii 2023-08-24 7:46 ` Ihor Radchenko 0 siblings, 1 reply; 34+ messages in thread From: Eli Zaretskii @ 2023-08-23 17:58 UTC (permalink / raw) To: Ihor Radchenko; +Cc: casouri, 65451 > From: Ihor Radchenko <yantar92@posteo.net> > Cc: casouri@gmail.com, 65451@debbugs.gnu.org > Date: Wed, 23 Aug 2023 08:52:30 +0000 > > To update the old AST we need to know which regions in the old AST (and > old buffer text) were edited and their change in length. Then, we (1) > remove elements in AST that are structurally affected by the edit; > (2) shift elements after the edit that are not structurally affected but > whose boundaries must be adjusted to accommodate for the buffer length > being changed; (3) re-parse buffer text (after the edit) and fill the > gap in the AST. > > For step (1), it is critical to have information about changed regions > in the same order the edits happen. And, in theory, we might utilize > before-change-functions to obtain this information (assuming that > before-change-functions are always called in the same order the edits > happen). But, unfortunately, before-change-functions are too noisy for > this purpose because, for example, simply altering text properties also > triggers before-change-functions, while no actual change in buffer text > occurs in such scenario and re-parsing AST is unnecessary. So, we > instead have to use after-change-functions and work out the original > changed region boundaties using > beg_changed end_changed pre-change_lentgh -> > beg_before_change = beg_changed; > end_before_change = end_changed - pre-change_length > This calculation of the original changed region becomes incorrect when > the order of after-change-functions is not the same as the editing order > - this bug report. If these measures still don't help you enough, perhaps the conclusion is that it isn't feasible to implement text parsers in Lisp, at least as long as you want all those micro-optimizations of knowing exactly which parts of the buffer text were modified (as opposed to only know how many characters at the beginning and at the end of the buffer remain unchanged, and reparse the rest). Maybe it must be done in C, if we want Emacs to remain a relatively safe environment. > Indeed. As I tried to explain, the problem for me is not the granularity > of changes, but their ordering. I still don't understand why the ordering matters, but I do know that you cannot rely on it, and I hope I explained why. ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2023-08-23 17:58 ` Eli Zaretskii @ 2023-08-24 7:46 ` Ihor Radchenko 2023-08-24 8:08 ` Eli Zaretskii 0 siblings, 1 reply; 34+ messages in thread From: Ihor Radchenko @ 2023-08-24 7:46 UTC (permalink / raw) To: Eli Zaretskii; +Cc: casouri, 65451 Eli Zaretskii <eliz@gnu.org> writes: > If these measures still don't help you enough, perhaps the conclusion > is that it isn't feasible to implement text parsers in Lisp, at least > as long as you want all those micro-optimizations of knowing exactly > which parts of the buffer text were modified (as opposed to only know > how many characters at the beginning and at the end of the buffer > remain unchanged, and reparse the rest). Maybe it must be done in C, > if we want Emacs to remain a relatively safe environment. Do I understand correctly that `treesit_record_change' is called __less frequently__ compared to before-change-functions and after-change-functions? If yes, I do not see how exposing it to Elisp will make things any worse than already available `before-change-functions'/`after-change-functions'. Elisp code that does not care about text property changes will not be forced to use `before/after-change-functions' (because no other option) and could be called less frequently. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2023-08-24 7:46 ` Ihor Radchenko @ 2023-08-24 8:08 ` Eli Zaretskii 2023-08-24 11:24 ` Ihor Radchenko 0 siblings, 1 reply; 34+ messages in thread From: Eli Zaretskii @ 2023-08-24 8:08 UTC (permalink / raw) To: Ihor Radchenko; +Cc: casouri, 65451 > From: Ihor Radchenko <yantar92@posteo.net> > Cc: casouri@gmail.com, 65451@debbugs.gnu.org > Date: Thu, 24 Aug 2023 07:46:06 +0000 > > Eli Zaretskii <eliz@gnu.org> writes: > > > If these measures still don't help you enough, perhaps the conclusion > > is that it isn't feasible to implement text parsers in Lisp, at least > > as long as you want all those micro-optimizations of knowing exactly > > which parts of the buffer text were modified (as opposed to only know > > how many characters at the beginning and at the end of the buffer > > remain unchanged, and reparse the rest). Maybe it must be done in C, > > if we want Emacs to remain a relatively safe environment. > > Do I understand correctly that `treesit_record_change' is called > __less frequently__ compared to before-change-functions and > after-change-functions? No, treesit_record_change is called at a lower level than buffer-change hooks are, and therefore in some cases the hooks will not be called, but treesit_record_change will be. The frequency might be lower, but only because treesit_record_change is called once per change; there's no separate "before" and "after" calls. In any case, the correspondence is not 1:1, because they are called on different levels. > If yes, I do not see how exposing it to Elisp will make things any > worse than already available > `before-change-functions'/`after-change-functions'. Exposing buffer text changes to Lisp is inherently dangerous, because Lisp code can do anything and everything in these hooks, and many packages already do. The fact that we allow this via those two hooks is unfortunate as it is, but adding more opportunities for Lisp to do potentially dangerous stuff, and doing that on a lower level, where the buffer object is sometimes in a state that is not 100% consistent, is unwise, to say the least. It will make Emacs much less stable. No, thanks. ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2023-08-24 8:08 ` Eli Zaretskii @ 2023-08-24 11:24 ` Ihor Radchenko 2023-08-24 12:08 ` Eli Zaretskii 0 siblings, 1 reply; 34+ messages in thread From: Ihor Radchenko @ 2023-08-24 11:24 UTC (permalink / raw) To: Eli Zaretskii; +Cc: casouri, 65451 Eli Zaretskii <eliz@gnu.org> writes: > Exposing buffer text changes to Lisp is inherently dangerous, because > Lisp code can do anything and everything in these hooks, and many > packages already do. The fact that we allow this via those two hooks > is unfortunate as it is, but adding more opportunities for Lisp to do > potentially dangerous stuff, and doing that on a lower level, where > the buffer object is sometimes in a state that is not 100% consistent, > is unwise, to say the least. It will make Emacs much less stable. > No, thanks. I can see the danger running lisp code while buffer state is transient. Would it be acceptable to accumulate treesit_record_change transactions into a queue, combine them later, and run the Elisp hook only when it is safe (around the same place `after-change-functions' is called, but only when actual edit is made to the buffer text)? That way, we can have a hook that will run strictly less frequently compared to `after-change-functions'. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2023-08-24 11:24 ` Ihor Radchenko @ 2023-08-24 12:08 ` Eli Zaretskii 2023-08-24 13:27 ` Ihor Radchenko 0 siblings, 1 reply; 34+ messages in thread From: Eli Zaretskii @ 2023-08-24 12:08 UTC (permalink / raw) To: Ihor Radchenko; +Cc: casouri, 65451 > From: Ihor Radchenko <yantar92@posteo.net> > Cc: casouri@gmail.com, 65451@debbugs.gnu.org > Date: Thu, 24 Aug 2023 11:24:37 +0000 > > Eli Zaretskii <eliz@gnu.org> writes: > > > Exposing buffer text changes to Lisp is inherently dangerous, because > > Lisp code can do anything and everything in these hooks, and many > > packages already do. The fact that we allow this via those two hooks > > is unfortunate as it is, but adding more opportunities for Lisp to do > > potentially dangerous stuff, and doing that on a lower level, where > > the buffer object is sometimes in a state that is not 100% consistent, > > is unwise, to say the least. It will make Emacs much less stable. > > No, thanks. > > I can see the danger running lisp code while buffer state is transient. > > Would it be acceptable to accumulate treesit_record_change transactions > into a queue, combine them later, and run the Elisp hook only when it is > safe (around the same place `after-change-functions' is called, but only > when actual edit is made to the buffer text)? > > That way, we can have a hook that will run strictly less frequently > compared to `after-change-functions'. When exactly do you need this to run? At the same time as after-change-functions doesn't sound like a good idea to me, but I think you don't need to run it there. What about running just before redisplay kicks in? Or how about explaining how is the (updated) AST used, i.e. who are the clients of the AST updates, and what do they do when the AST changes? ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2023-08-24 12:08 ` Eli Zaretskii @ 2023-08-24 13:27 ` Ihor Radchenko 2023-08-24 14:53 ` Eli Zaretskii 0 siblings, 1 reply; 34+ messages in thread From: Ihor Radchenko @ 2023-08-24 13:27 UTC (permalink / raw) To: Eli Zaretskii; +Cc: casouri, 65451 Eli Zaretskii <eliz@gnu.org> writes: >> Would it be acceptable to accumulate treesit_record_change transactions >> into a queue, combine them later, and run the Elisp hook only when it is >> safe (around the same place `after-change-functions' is called, but only >> when actual edit is made to the buffer text)? >> >> That way, we can have a hook that will run strictly less frequently >> compared to `after-change-functions'. > > When exactly do you need this to run? At the same time as > after-change-functions doesn't sound like a good idea to me, but I > think you don't need to run it there. What about running just before > redisplay kicks in? Usually, we need to update AST when some other Elisp code needs Org element API. And we update AST on idle timer to speed things up. What if we use no hooks at all? Instead, the edit transactions are accumulated in a list that can be examined and processed by Elisp code as needed. Elements of the list will be like [:buffer-chars-modified-tick :region-beginning :region-end-before-edit :region-end-after-edit] -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2023-08-24 13:27 ` Ihor Radchenko @ 2023-08-24 14:53 ` Eli Zaretskii 2023-08-25 6:37 ` Eli Zaretskii 2023-08-25 8:09 ` Ihor Radchenko 0 siblings, 2 replies; 34+ messages in thread From: Eli Zaretskii @ 2023-08-24 14:53 UTC (permalink / raw) To: Ihor Radchenko; +Cc: casouri, 65451 > From: Ihor Radchenko <yantar92@posteo.net> > Cc: casouri@gmail.com, 65451@debbugs.gnu.org > Date: Thu, 24 Aug 2023 13:27:02 +0000 > > Eli Zaretskii <eliz@gnu.org> writes: > > > When exactly do you need this to run? At the same time as > > after-change-functions doesn't sound like a good idea to me, but I > > think you don't need to run it there. What about running just before > > redisplay kicks in? > > Usually, we need to update AST when some other Elisp code needs Org > element API. And we update AST on idle timer to speed things up. > > What if we use no hooks at all? Instead, the edit transactions are > accumulated in a list that can be examined and processed by Elisp code > as needed. If you need that from timers, then yes, all you need is access from the timer function to a data structure that holds the accumulated transactions. Timers run approximately at the same time and under the same conditions as redisplay, so this mechanism will indeed ensure this data is accessed when Emacs is in a consistent state, and it is safe to access and use this data. > Elements of the list will be like > [:buffer-chars-modified-tick :region-beginning :region-end-before-edit :region-end-after-edit] If you really need buffer-chars-modified-tick, you will have to verify that it is updated before calling the function which updates the "transactions list". ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2023-08-24 14:53 ` Eli Zaretskii @ 2023-08-25 6:37 ` Eli Zaretskii 2023-08-25 9:09 ` Ihor Radchenko 2023-08-25 8:09 ` Ihor Radchenko 1 sibling, 1 reply; 34+ messages in thread From: Eli Zaretskii @ 2023-08-25 6:37 UTC (permalink / raw) To: yantar92; +Cc: casouri, 65451 > Cc: casouri@gmail.com, 65451@debbugs.gnu.org > Date: Thu, 24 Aug 2023 17:53:26 +0300 > From: Eli Zaretskii <eliz@gnu.org> > > If you need that from timers, then yes, all you need is access from > the timer function to a data structure that holds the accumulated > transactions. Timers run approximately at the same time and under the > same conditions as redisplay, so this mechanism will indeed ensure > this data is accessed when Emacs is in a consistent state, and it is > safe to access and use this data. > > > Elements of the list will be like > > [:buffer-chars-modified-tick :region-beginning :region-end-before-edit :region-end-after-edit] > > If you really need buffer-chars-modified-tick, you will have to verify > that it is updated before calling the function which updates the > "transactions list". Thinking about this some more, we will need to consider whether this list of accumulated transactions is ever compacted by deleting old transactions, or we let it grow indefinitely. If the former, we should consider the case where more than one feature wants to track buffer edits (so it is impossible to remove entries once they have been processed by a single consumer). ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2023-08-25 6:37 ` Eli Zaretskii @ 2023-08-25 9:09 ` Ihor Radchenko 2023-08-26 7:10 ` Eli Zaretskii 0 siblings, 1 reply; 34+ messages in thread From: Ihor Radchenko @ 2023-08-25 9:09 UTC (permalink / raw) To: Eli Zaretskii; +Cc: casouri, 65451 Eli Zaretskii <eliz@gnu.org> writes: > Thinking about this some more, we will need to consider whether this > list of accumulated transactions is ever compacted by deleting old > transactions, or we let it grow indefinitely. If the former, we > should consider the case where more than one feature wants to track > buffer edits (so it is impossible to remove entries once they have > been processed by a single consumer). What I propose is actually quite similar to `buffer-undo-list'. But a bit less generic - (apply FUN-NAME ARGS) entries cannot be handled outside the narrow scope of `undo'. Similar to `buffer-undo-list' it needs to be compacted. To not lose the information when the edit history is compacted, there may be a hook executed right before the compaction, so that all the users can update their state as needed. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2023-08-25 9:09 ` Ihor Radchenko @ 2023-08-26 7:10 ` Eli Zaretskii 2023-08-27 8:13 ` Ihor Radchenko 0 siblings, 1 reply; 34+ messages in thread From: Eli Zaretskii @ 2023-08-26 7:10 UTC (permalink / raw) To: Ihor Radchenko; +Cc: casouri, 65451 > From: Ihor Radchenko <yantar92@posteo.net> > Cc: casouri@gmail.com, 65451@debbugs.gnu.org > Date: Fri, 25 Aug 2023 09:09:11 +0000 > > Eli Zaretskii <eliz@gnu.org> writes: > > > Thinking about this some more, we will need to consider whether this > > list of accumulated transactions is ever compacted by deleting old > > transactions, or we let it grow indefinitely. If the former, we > > should consider the case where more than one feature wants to track > > buffer edits (so it is impossible to remove entries once they have > > been processed by a single consumer). > > What I propose is actually quite similar to `buffer-undo-list'. > But a bit less generic - (apply FUN-NAME ARGS) entries cannot be handled > outside the narrow scope of `undo'. > Similar to `buffer-undo-list' it needs to be compacted. Not sure what this means in practice. the entries in the list we are discussing will be very different from the entries in buffer-undo-list. > To not lose the information when the edit history is compacted, there > may be a hook executed right before the compaction, so that all the > users can update their state as needed. If the compaction will run from GC, then we cannot safely call Lisp hooks at that time. ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2023-08-26 7:10 ` Eli Zaretskii @ 2023-08-27 8:13 ` Ihor Radchenko 2023-08-27 8:29 ` Eli Zaretskii 0 siblings, 1 reply; 34+ messages in thread From: Ihor Radchenko @ 2023-08-27 8:13 UTC (permalink / raw) To: Eli Zaretskii; +Cc: casouri, 65451 Eli Zaretskii <eliz@gnu.org> writes: >> What I propose is actually quite similar to `buffer-undo-list'. >> But a bit less generic - (apply FUN-NAME ARGS) entries cannot be handled >> outside the narrow scope of `undo'. >> Similar to `buffer-undo-list' it needs to be compacted. > > Not sure what this means in practice. the entries in the list we are > discussing will be very different from the entries in > buffer-undo-list. What I meant is that similar principles with undo-limit-like variables may apply. >> To not lose the information when the edit history is compacted, there >> may be a hook executed right before the compaction, so that all the >> users can update their state as needed. > > If the compaction will run from GC, then we cannot safely call Lisp > hooks at that time. Fair point. Then, what about compacting the "edit list" more frequently, so that we do not need to worry about its size? But I am not sure what frequency will be safe. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2023-08-27 8:13 ` Ihor Radchenko @ 2023-08-27 8:29 ` Eli Zaretskii 2023-08-29 7:39 ` Ihor Radchenko 0 siblings, 1 reply; 34+ messages in thread From: Eli Zaretskii @ 2023-08-27 8:29 UTC (permalink / raw) To: Ihor Radchenko; +Cc: casouri, 65451 > From: Ihor Radchenko <yantar92@posteo.net> > Cc: casouri@gmail.com, 65451@debbugs.gnu.org > Date: Sun, 27 Aug 2023 08:13:38 +0000 > > Eli Zaretskii <eliz@gnu.org> writes: > > >> What I propose is actually quite similar to `buffer-undo-list'. > >> But a bit less generic - (apply FUN-NAME ARGS) entries cannot be handled > >> outside the narrow scope of `undo'. > >> Similar to `buffer-undo-list' it needs to be compacted. > > > > Not sure what this means in practice. the entries in the list we are > > discussing will be very different from the entries in > > buffer-undo-list. > > What I meant is that similar principles with undo-limit-like variables > may apply. Well, the devil is in the details ;-) > >> To not lose the information when the edit history is compacted, there > >> may be a hook executed right before the compaction, so that all the > >> users can update their state as needed. > > > > If the compaction will run from GC, then we cannot safely call Lisp > > hooks at that time. > > Fair point. > Then, what about compacting the "edit list" more frequently, so that we > do not need to worry about its size? But I am not sure what frequency > will be safe. Something like that, yes. But we need to invent a protocol which would allow several clients to consume the list safely and without the risk of missing edits. ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2023-08-27 8:29 ` Eli Zaretskii @ 2023-08-29 7:39 ` Ihor Radchenko 0 siblings, 0 replies; 34+ messages in thread From: Ihor Radchenko @ 2023-08-29 7:39 UTC (permalink / raw) To: Eli Zaretskii; +Cc: casouri, 65451 Eli Zaretskii <eliz@gnu.org> writes: >> Then, what about compacting the "edit list" more frequently, so that we >> do not need to worry about its size? But I am not sure what frequency >> will be safe. > > Something like that, yes. But we need to invent a protocol which > would allow several clients to consume the list safely and without the > risk of missing edits. I can think of two approaches: 1. There will be a new buffer-local variable - `buffer-edit-history' that will hold recent `buffer-edit-history-limit' edits. This way, Elisp functions will be able to examine it any time they need to. In addition, there will be `after-edit-functions' hook that will be called after `buffer-edit-history-limit' is exceeded. Before the hook is called, `buffer-edit-history' is truncated. The hook functions will be called with a single argument - list of edits that have been removed from the `buffer-undo-history'. That way, Elisp will be able to process edits that will disappear from the `buffer-edit-history'. Each entry in `buffer-edit-history' will be a list of (beg end_before end_after counter), describing region boundaries before and after the edit + a counter that can be used to keep track of processed positions. The counter will be useful for the code that processes `buffer-edit-history' independently, outside `after-edit-functions', and may need to skip already processed elements. (I initially though that we can simply hold `buffer-chars-modified-tick' here, but it is not necessary to hold `buffer-chars-modified-tick' specifically - just something to indicate "epoch" in the edit history). The downside of exposing `buffer-edit-history' is that some bad-written Elisp may be tempted to hold a pointer to a cons cell in `buffer-edit-history', thus preventing GC. 2. We can have `after-edit-functions' being called once for each edit event with (beg end_before end_after) arguments. To avoid skipping edits, in addition to Emacs sometimes calling the hook, we should allow Elisp to trigger the hook early, by calling `process-buffer-edits'. This way, synchronization can be ensured. The downside here is when multiple consumers are using `after-edit-functions' - synchronization (`process-buffer-edits') requested by one consumer will also trigger all other consumers, potentially creating extra overheads. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2023-08-24 14:53 ` Eli Zaretskii 2023-08-25 6:37 ` Eli Zaretskii @ 2023-08-25 8:09 ` Ihor Radchenko 2023-08-25 10:25 ` Eli Zaretskii 1 sibling, 1 reply; 34+ messages in thread From: Ihor Radchenko @ 2023-08-25 8:09 UTC (permalink / raw) To: Eli Zaretskii; +Cc: casouri, 65451 Eli Zaretskii <eliz@gnu.org> writes: >> Elements of the list will be like >> [:buffer-chars-modified-tick :region-beginning :region-end-before-edit :region-end-after-edit] > > If you really need buffer-chars-modified-tick, you will have to verify > that it is updated before calling the function which updates the > "transactions list". Do you mean that buffer-chars-modified-tick may not be yet updated at the time treesit_record_change is called? Then, treesit_record_change only uses byte positions. Will it be possible to record buffer point positions, or may it be problematic? -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2023-08-25 8:09 ` Ihor Radchenko @ 2023-08-25 10:25 ` Eli Zaretskii 2023-08-25 10:49 ` Ihor Radchenko 0 siblings, 1 reply; 34+ messages in thread From: Eli Zaretskii @ 2023-08-25 10:25 UTC (permalink / raw) To: Ihor Radchenko; +Cc: casouri, 65451 > From: Ihor Radchenko <yantar92@posteo.net> > Cc: casouri@gmail.com, 65451@debbugs.gnu.org > Date: Fri, 25 Aug 2023 08:09:41 +0000 > > Eli Zaretskii <eliz@gnu.org> writes: > > >> Elements of the list will be like > >> [:buffer-chars-modified-tick :region-beginning :region-end-before-edit :region-end-after-edit] > > > > If you really need buffer-chars-modified-tick, you will have to verify > > that it is updated before calling the function which updates the > > "transactions list". > > Do you mean that buffer-chars-modified-tick may not be yet updated at > the time treesit_record_change is called? Yes. You can audit the code and see it for yourself. > Then, treesit_record_change only uses byte positions. Will it be > possible to record buffer point positions, or may it be problematic? Sorry, I don't understand what you are saying here and how this is relevant to the issue at hand and to my comment in particular. ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2023-08-25 10:25 ` Eli Zaretskii @ 2023-08-25 10:49 ` Ihor Radchenko 0 siblings, 0 replies; 34+ messages in thread From: Ihor Radchenko @ 2023-08-25 10:49 UTC (permalink / raw) To: Eli Zaretskii; +Cc: casouri, 65451 Eli Zaretskii <eliz@gnu.org> writes: >> Then, treesit_record_change only uses byte positions. Will it be >> possible to record buffer point positions, or may it be problematic? > > Sorry, I don't understand what you are saying here and how this is > relevant to the issue at hand and to my comment in particular. Never mind. I am jumping ahead of the discussion. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2023-08-22 12:22 ` Eli Zaretskii 2023-08-22 12:42 ` Ihor Radchenko @ 2024-03-30 13:51 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-03-30 14:11 ` Eli Zaretskii 2024-03-31 3:02 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 1 sibling, 2 replies; 34+ messages in thread From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-03-30 13:51 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Alan Mackenzie, Ihor Radchenko, 65451 >> ⛔ Warning (emacs): Before: 1278 1281 >> ⛔ Warning (emacs): Before: 1278 1288 >> ⛔ Warning (emacs): After: 1278 1288 delta: 0 >> ⛔ Warning (emacs): After: 1278 1288 delta: 7 >> >> Note how "After: 1278 1288 delta: 0" reports a change of "utf-8-unix" >> that did not alter the buffer text size. It is trigerred _before_ >> "After: 1278 1288 delta: 7" that corresponds to replacing "UTF" with >> "utf-8-unix". Hmm... yes, that's bad. Alan, have you looked at this? I suspect the best option in the above case is to inhibit the inner calls to before/after (assuming we're sure they change only the "new text"), so we'd be down to: ⛔ Warning (emacs): Before: 1278 1281 ⛔ Warning (emacs): After: 1278 1288 delta: 7 > I think Org mode is relying on something it should not. This > particular use case aside, Emacs is allowed to call a function that > changes the buffer from a function that itself changes the buffer, and > it is allowed to call that inner function _before_ it did all the > changes it intended to do. AFAIK the above sequences breaks the promise we make about `before-change-functions` and `after-change-functions`. Almost all the non-trivial users of those hooks (i.e. basically those that need to use both hooks) have extra sanity and raise the heads up in despair when faced with things like the above (my `track-changes.el` lacks such sanity checks, but that's because it's a PoC). Stefan ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2024-03-30 13:51 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-03-30 14:11 ` Eli Zaretskii 2024-03-30 15:38 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-03-31 3:02 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 1 sibling, 1 reply; 34+ messages in thread From: Eli Zaretskii @ 2024-03-30 14:11 UTC (permalink / raw) To: Stefan Monnier; +Cc: acm, yantar92, 65451 > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: Ihor Radchenko <yantar92@posteo.net>, 65451@debbugs.gnu.org, Alan > Mackenzie <acm@muc.de> > Date: Sat, 30 Mar 2024 09:51:59 -0400 > > > I think Org mode is relying on something it should not. This > > particular use case aside, Emacs is allowed to call a function that > > changes the buffer from a function that itself changes the buffer, and > > it is allowed to call that inner function _before_ it did all the > > changes it intended to do. > > AFAIK the above sequences breaks the promise we make about > `before-change-functions` and `after-change-functions`. > > Almost all the non-trivial users of those hooks (i.e. basically those > that need to use both hooks) have extra sanity and raise the heads up in > despair when faced with things like the above (my `track-changes.el` > lacks such sanity checks, but that's because it's a PoC). I still stand by my opinion: Org is relying on something it cannot rely upon, not as long as a function that changes a buffer can be called from another function which changes the same buffer. I don't see how we can avoid breaking code which relies on such assumptions, not in general anyway. ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2024-03-30 14:11 ` Eli Zaretskii @ 2024-03-30 15:38 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-03-30 16:47 ` Eli Zaretskii 0 siblings, 1 reply; 34+ messages in thread From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-03-30 15:38 UTC (permalink / raw) To: Eli Zaretskii; +Cc: acm, yantar92, 65451 > I still stand by my opinion: Org is relying on something it cannot > rely upon, not as long as a function that changes a buffer can be > called from another function which changes the same buffer. `*-change-functions` should not modify the buffer (we could try and enforce this, tho in my experience those that do will get punished pretty quickly already), so the only cases I can think of where "a function that changes a buffer can be called from another function which changes the same buffer" is when both of those functions are in our C code and we should have enough control to fix those cases. > I don't see how we can avoid breaking code which relies on such > assumptions, not in general anyway. All sophisticated-enough users of `*-change-functions` rely on this (and come with sanity checks to detect the problem and fallback on an expensive recovery when needed, because indeed we tend to break our promises 🙁). So we really should try and fix those. Alan did convince me that we should treat them as bugs and that we should try and fix them. Stefan ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2024-03-30 15:38 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-03-30 16:47 ` Eli Zaretskii 2024-03-31 3:04 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 0 siblings, 1 reply; 34+ messages in thread From: Eli Zaretskii @ 2024-03-30 16:47 UTC (permalink / raw) To: Stefan Monnier; +Cc: acm, yantar92, 65451 > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: yantar92@posteo.net, 65451@debbugs.gnu.org, acm@muc.de > Date: Sat, 30 Mar 2024 11:38:24 -0400 > > > I still stand by my opinion: Org is relying on something it cannot > > rely upon, not as long as a function that changes a buffer can be > > called from another function which changes the same buffer. > > `*-change-functions` should not modify the buffer That's not what happened in the case described in that bug, AFAIR. Simply a buffer change started, then, before it ended, another nested change was started. > so the only cases I can think of where "a function that changes a > buffer can be called from another function which changes the same > buffer" is when both of those functions are in our C code and we > should have enough control to fix those cases. You forget the various hooks, other than buffer modification hooks. > Alan did convince me that we should treat them as bugs and that we > should try and fix them. He didn't convince me. ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2024-03-30 16:47 ` Eli Zaretskii @ 2024-03-31 3:04 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 0 siblings, 0 replies; 34+ messages in thread From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-03-31 3:04 UTC (permalink / raw) To: Eli Zaretskii; +Cc: acm, yantar92, 65451 >> `*-change-functions` should not modify the buffer > That's not what happened in the case described in that bug, AFAIR. No, indeed. >> so the only cases I can think of where "a function that changes a >> buffer can be called from another function which changes the same >> buffer" is when both of those functions are in our C code and we >> should have enough control to fix those cases. > You forget the various hooks, other than buffer modification hooks. If we have to run them some time between `before-c-f` and `after-c-f`, then they should not modify the buffer, just like the `*-change-functions` hooks, >> Alan did convince me that we should treat them as bugs and that we >> should try and fix them. > He didn't convince me. 🙂 Stefan ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2024-03-30 13:51 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-03-30 14:11 ` Eli Zaretskii @ 2024-03-31 3:02 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-03-31 6:06 ` Eli Zaretskii ` (2 more replies) 1 sibling, 3 replies; 34+ messages in thread From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-03-31 3:02 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Alan Mackenzie, Ihor Radchenko, 65451 > I suspect the best option in the above case is to inhibit the inner > calls to before/after (assuming we're sure they change only the "new > text"), so we'd be down to: > > ⛔ Warning (emacs): Before: 1278 1281 > ⛔ Warning (emacs): After: 1278 1288 delta: 7 A simpler option is the patch below. Stefan diff --git a/src/search.c b/src/search.c index 2ad565fadde..83ff7120e43 100644 --- a/src/search.c +++ b/src/search.c @@ -2769,6 +2769,8 @@ DEFUN ("replace-match", Freplace_match, Sreplace_match, 1, 5, 0, /* Replace the old text with the new in the cleanest possible way. */ replace_range (sub_start, sub_end, newtext, 1, 0, 1, true, true); + signal_after_change (sub_start, sub_end - sub_start, SCHARS (newtext)); + if (case_action == all_caps) Fupcase_region (make_fixnum (search_regs.start[sub]), make_fixnum (newpoint), @@ -2792,7 +2794,6 @@ DEFUN ("replace-match", Freplace_match, Sreplace_match, 1, 5, 0, /* Now move point "officially" to the end of the inserted replacement. */ move_if_not_intangible (newpoint); - signal_after_change (sub_start, sub_end - sub_start, SCHARS (newtext)); update_compositions (sub_start, newpoint, CHECK_BORDER); return Qnil; ^ permalink raw reply related [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2024-03-31 3:02 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-03-31 6:06 ` Eli Zaretskii 2024-03-31 13:57 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-04-07 18:19 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-04-07 18:19 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2 siblings, 1 reply; 34+ messages in thread From: Eli Zaretskii @ 2024-03-31 6:06 UTC (permalink / raw) To: Stefan Monnier; +Cc: acm, yantar92, 65451 > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: Ihor Radchenko <yantar92@posteo.net>, 65451@debbugs.gnu.org, Alan > Mackenzie <acm@muc.de> > Date: Sat, 30 Mar 2024 23:02:14 -0400 > > > I suspect the best option in the above case is to inhibit the inner > > calls to before/after (assuming we're sure they change only the "new > > text"), so we'd be down to: > > > > ⛔ Warning (emacs): Before: 1278 1281 > > ⛔ Warning (emacs): After: 1278 1288 delta: 7 > > A simpler option is the patch below. Doesn't that miss the changes done by upcase-region? Also, what about point not being after the inserted replacement at that place? ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2024-03-31 6:06 ` Eli Zaretskii @ 2024-03-31 13:57 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 0 siblings, 0 replies; 34+ messages in thread From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-03-31 13:57 UTC (permalink / raw) To: Eli Zaretskii; +Cc: acm, yantar92, 65451 >> > I suspect the best option in the above case is to inhibit the inner >> > calls to before/after (assuming we're sure they change only the "new >> > text"), so we'd be down to: >> > >> > ⛔ Warning (emacs): Before: 1278 1281 >> > ⛔ Warning (emacs): After: 1278 1288 delta: 7 >> >> A simpler option is the patch below. > > Doesn't that miss the changes done by upcase-region? No: `upcase-region` runs its own `before/after-change-functions` (indeed, these were the problematic nested ones which break the order). > Also, what about point not being after the inserted replacement at > that place? `*-change-functions` can't rely on the position of point so that's not an issue. Stefan ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2024-03-31 3:02 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-03-31 6:06 ` Eli Zaretskii @ 2024-04-07 18:19 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-04-08 19:10 ` Ihor Radchenko 2024-04-07 18:19 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2 siblings, 1 reply; 34+ messages in thread From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-04-07 18:19 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Alan Mackenzie, Ihor Radchenko, 65451 > A simpler option is the patch below. Pushed to `master` along with a corresponding test. The test uses a "sanity check" suite which encodes the promises we make after before/after-change-functions. [ My Emacs now runs those sanity checks in all non-temp buffers. 🙂 ] Stefan ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2024-04-07 18:19 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-04-08 19:10 ` Ihor Radchenko 0 siblings, 0 replies; 34+ messages in thread From: Ihor Radchenko @ 2024-04-08 19:10 UTC (permalink / raw) To: Stefan Monnier; +Cc: Alan Mackenzie, Eli Zaretskii, 65451 Stefan Monnier <monnier@iro.umontreal.ca> writes: >> A simpler option is the patch below. > > Pushed to `master` along with a corresponding test. > The test uses a "sanity check" suite which encodes the promises we make > after before/after-change-functions. I confirm the fix on the latest master. Thanks! -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92> ^ permalink raw reply [flat|nested] 34+ messages in thread
* bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made 2024-03-31 3:02 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-03-31 6:06 ` Eli Zaretskii 2024-04-07 18:19 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-04-07 18:19 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2 siblings, 0 replies; 34+ messages in thread From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-04-07 18:19 UTC (permalink / raw) To: 65451-done Closing, Stefan ^ permalink raw reply [flat|nested] 34+ messages in thread
end of thread, other threads:[~2024-04-08 19:10 UTC | newest] Thread overview: 34+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-08-22 9:30 bug#65451: 30.0.50; `after-change-functions' are not triggered in the same order the changes are made Ihor Radchenko 2023-08-22 12:22 ` Eli Zaretskii 2023-08-22 12:42 ` Ihor Radchenko 2023-08-22 12:58 ` Eli Zaretskii 2023-08-22 13:41 ` Ihor Radchenko 2023-08-22 16:02 ` Eli Zaretskii 2023-08-23 8:52 ` Ihor Radchenko 2023-08-23 17:58 ` Eli Zaretskii 2023-08-24 7:46 ` Ihor Radchenko 2023-08-24 8:08 ` Eli Zaretskii 2023-08-24 11:24 ` Ihor Radchenko 2023-08-24 12:08 ` Eli Zaretskii 2023-08-24 13:27 ` Ihor Radchenko 2023-08-24 14:53 ` Eli Zaretskii 2023-08-25 6:37 ` Eli Zaretskii 2023-08-25 9:09 ` Ihor Radchenko 2023-08-26 7:10 ` Eli Zaretskii 2023-08-27 8:13 ` Ihor Radchenko 2023-08-27 8:29 ` Eli Zaretskii 2023-08-29 7:39 ` Ihor Radchenko 2023-08-25 8:09 ` Ihor Radchenko 2023-08-25 10:25 ` Eli Zaretskii 2023-08-25 10:49 ` Ihor Radchenko 2024-03-30 13:51 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-03-30 14:11 ` Eli Zaretskii 2024-03-30 15:38 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-03-30 16:47 ` Eli Zaretskii 2024-03-31 3:04 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-03-31 3:02 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-03-31 6:06 ` Eli Zaretskii 2024-03-31 13:57 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-04-07 18:19 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2024-04-08 19:10 ` Ihor Radchenko 2024-04-07 18:19 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
Code repositories for project(s) associated with this public inbox https://git.savannah.gnu.org/cgit/emacs.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).