Lisp primitives and their calling of the change hooks

unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed

* Lisp primitives and their calling of the change hooks
@ 2018-01-03 12:45 Alan Mackenzie
  2018-01-03 21:51 ` Stefan Monnier
  0 siblings, 1 reply; 66+ messages in thread
From: Alan Mackenzie @ 2018-01-03 12:45 UTC (permalink / raw)
  To: emacs-devel

Hello, Emacs.

I've just had an interesting few days investigating our privitives'
calling of before-change-functions and after-change-functions.

In an ideal world, each primitive would call each of those hooks exactly
once.  However, we're not in an ideal world, and there are primitives
which perform several (or many) distinct changes (e.g. transpose-regions
or subst-char-in-region) where a single pair of hook calls wouldn't make
sense.

TL;DR: There are several, but not many, primitives where
before-change-functions doesn't match after-change-functions.

Here is how I went about this:
1. Extract a list of external functions in insdel.c from the section in
lisp.h which declares them.  Form a regexp from this list, and convert it
into a grep regexp (by replacing '?' by '\?'.

2. grep -l *.c with this regexp.  This gave these files:
    buffer.c
    callproc.c
    casefiddle.c
    cmds.c
    coding.c
    decompress.c
    editfns.c
    emacs.c
    fileio.c
    fns.c
    indent.c
    insdel.c
    print.c
    process.c
    search.c
    textprop.c
    xdisp.c
    xml.c

3. Using GNU cflow, a utility which creates call graphs for C files,
create a reverse call graph (i.e. an "is called by" graph) for the 18 C
files.

4. Analyse this graph with an elisp script to find all functions which,
directly or indirectly, call signal_before_change or signal_after_change.

5. Filter this list of functions to leave only the lisp primitives (i.e.
functions starting with "F"), and convert to Lisp names.  Edit this list
by hand to remove those primitives which don't change the buffer (most of
them were removed).  This left the following list:
    (add-face-text-property)
    (add-text-properties)
    (base64-decode-region)
    (base64-encode-region)
    (capitalize-region)

    (capitalize-word)
    (delete-and-extract-region)
    (delete-char)
    (delete-field)
    (delete-region)

    (downcase-region)
    (downcase-word)
    (erase-buffer)
    (indent-to)
    (insert)

    (insert-and-inherit)
    (insert-before-markers)
    (insert-buffer-substring)
    (insert-byte)
    (insert-char)

    (insert-file-contents)
    (move-to-column)
    (princ)
    (print)
    (put-text-property)

    (remove-list-of-text-properties)
    (remove-text-properties)
    (replace-buffer-contents)
    (replace-match)
    (self-insert-command)

    (set-buffer-multibyte)
    (set-text-properties)
    (subst-char-in-region)
    (terpri)
    (translate-region-internal)

    (transpose-regions)
    (upcase-initials-region)
    (upcase-region)
    (upcase-word)
    (write-char)

    (zlib-decompress-region)

6. Write and run a script which executes each of these primitives whilst
counting the number of times it invokes before-change-hooks and
after-change-hooks.  Output messages where these numbers aren't 1 and 1.
This gave the following:
    Primitive              b-c-f calls         a-c-f calls
base64-encode-region          2                    2
base64-decode-region          2                    1
insert-file-contents          2                    2
move-to-column-1              3                    3   ; with the &optional
                                                       ; which untabifies.
replace-buffer-contents      123                  123
set-buffer-multibyte          0                    0   ; May have done nothing
tranlate-region-internal      1                   146
tranpose-regions              2                    1
upcase-initials-region        1                    0[*]
upcase-region                 1                    0[*]
zlib-decompress-region        0                    0
erase-buffer                  0                    0

[*] In upcase-... there weren't any lower case characters in the buffer
at the time.

This list is incomplete.  There were one or two other primitives which
triggered messages which I didn't write down, and now cannot reproduce.

7. Surpising is that insert-file-contents (which gave so much trouble in
summer 2016) returned balanced counts.  It seems to be mostly the special
purpose primitives, those not widely used in Lisp, which are most likely
to have strange b-c-f and a-c-f counts.  But the case-switching
primitives might be troublesome.  So might transpose-regions.

8. Possibly these things could be more accurately documented in
elisp.info.

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-03 12:45 Lisp primitives and their calling of the change hooks Alan Mackenzie
@ 2018-01-03 21:51 ` Stefan Monnier
  2018-01-04 15:51   ` Alan Mackenzie
  0 siblings, 1 reply; 66+ messages in thread
From: Stefan Monnier @ 2018-01-03 21:51 UTC (permalink / raw)
  To: emacs-devel

> 6. Write and run a script which executes each of these primitives whilst
> counting the number of times it invokes before-change-hooks and
> after-change-hooks.

FWIW, we do not try to make those numbers match (and their begin/end
specs don't necessarily match either).

What we aim to do (i.e. what defines what I would consider as a bug) is
to make sure every a-c-f is preceded by a "covering" b-c-f.  IOW, b-c-f
may be followed by any number of a-f-c (including 0) as long as those
are within the text chunk covered by the b-f-c.

Some of your results clearly indicate what I'd consider as bugs.
E.g. `erase-buffer` should call those hooks (unless the buffer was
already empty).  OTOH for upcase-region 1 call to b-c-f and 0 to a-c-f
is acceptable.  For most of the others, a deeper inspection would be
needed to figure out if there's an actual bug or if it's just
a normal occurrence.

        Stefan

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-03 21:51 ` Stefan Monnier
@ 2018-01-04 15:51   ` Alan Mackenzie
  2018-01-04 18:16     ` Stefan Monnier
  0 siblings, 1 reply; 66+ messages in thread
From: Alan Mackenzie @ 2018-01-04 15:51 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Hello, Stefan.

On Wed, Jan 03, 2018 at 16:51:29 -0500, Stefan Monnier wrote:
> > 6. Write and run a script which executes each of these primitives whilst
> > counting the number of times it invokes before-change-hooks and
> > after-change-hooks.

> FWIW, we do not try to make those numbers match (and their begin/end
> specs don't necessarily match either).

In practice, these numbers match for the vast majority of buffer
changing calls, and they match at 1-1.  They match for all the
"primitive" primitives, which are basically insert, delete, and possibly
change.

These numbers, in an ideal world, would match.  It is only because we
have "non-primitive" primitives (i.e. primitives which perform several
distinct buffer changes) that they don't.

> What we aim to do (i.e. what defines what I would consider as a bug) is
> to make sure every a-c-f is preceded by a "covering" b-c-f.  IOW, b-c-f
> may be followed by any number of a-f-c (including 0) as long as those
> are within the text chunk covered by the b-f-c.

Yes.  That applies, however, only to "compound primitives", not to the
"primitive primitives", insert and delete, which comprise nearly all the
calls in actual use, which are all 1-1.

It is an awkward state of affairs, where after-c-f's have somehow got to
"remember" that they may only be processing part of the change announced
by before-c-f.

It is also not true: insert-file-contents, in circumstances explored in
summer 2016, invokes only a-c-f, not b-c-f.

> Some of your results clearly indicate what I'd consider as bugs.
> E.g. `erase-buffer` should call those hooks (unless the buffer was
> already empty).

I've had a look at the source, and erase-buffer clearly calls the two
hooks.  I can't at the moment see what went wrong.

> OTOH for upcase-region 1 call to b-c-f and 0 to a-c-f is acceptable.

I don't really agree, but that won't change anything.  ;-(

> For most of the others, a deeper inspection would be needed to figure
> out if there's an actual bug or if it's just a normal occurrence.

We know there is a bug in insert-file-contents (See summer 2016).  I
would be surprised indeed if there weren't others, too.

A way to fix them if we were going to (which we're not), would be to
take all the b-c-f and a-c-f calls out of the "compound primitives" and
have the latter effect their actions through calling the "true
primitivies".

However, we could improve the documentation of this situation in the
eilsp manual.

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-04 15:51   ` Alan Mackenzie
@ 2018-01-04 18:16     ` Stefan Monnier
  2018-01-04 21:11       ` Alan Mackenzie
  0 siblings, 1 reply; 66+ messages in thread
From: Stefan Monnier @ 2018-01-04 18:16 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

>> FWIW, we do not try to make those numbers match (and their begin/end
>> specs don't necessarily match either).
> In practice, these numbers match for the vast majority of buffer
> changing calls, and they match at 1-1.

Yes, as discussed numerous times in the past ;-)

> These numbers, in an ideal world, would match.

Not in my ideal world, no.

> It is also not true:

As mentioned, I'd consider that as a bug.

> insert-file-contents, in circumstances explored in
> summer 2016, invokes only a-c-f, not b-c-f.

And indeed, I considered it a bug and AFAIK this is now fixed.

>> For most of the others, a deeper inspection would be needed to figure
>> out if there's an actual bug or if it's just a normal occurrence.
> We know there is a bug in insert-file-contents (See summer 2016).
                ^^
                was

> I would be surprised indeed if there weren't others, too.

I would too.

> However, we could improve the documentation of this situation in the
> elisp manual.

We currently say:

      Do @emph{not} expect the before-change hooks and the after-change
    hooks be called in balanced pairs around each buffer change.  Also
    don't expect the before-change hooks to be called for every chunk of
    text Emacs is about to delete.  These hooks are provided on the
    assumption that Lisp programs will use either before- or the
    after-change hooks, but not both, and the boundaries of the region
    where the changes happen might include more than just the actual
    changed text, or even lump together several changes done piecemeal.

which is lax enough that any behavior could be argued to be acceptable.
IOW I think it's too lax.  We should probably try and fix it to reflect
the fact that every change should be covered by the last preceding b-c-f
and should be followed by a corresponding call to a-c-f (and this
before the next call to b-c-f).


        Stefan



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-04 18:16     ` Stefan Monnier
@ 2018-01-04 21:11       ` Alan Mackenzie
  2018-01-04 21:36         ` Stefan Monnier
  2018-01-05  6:55         ` Eli Zaretskii
  0 siblings, 2 replies; 66+ messages in thread
From: Alan Mackenzie @ 2018-01-04 21:11 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Hello, Stefan.

On Thu, Jan 04, 2018 at 13:16:23 -0500, Stefan Monnier wrote:

> As mentioned, I'd consider that [insert-file-contents not calling
> b-c-f] as a bug.

> > insert-file-contents, in circumstances explored in
> > summer 2016, invokes only a-c-f, not b-c-f.

> And indeed, I considered it a bug and AFAIK this is now fixed.

Hey!  It is indeed fixed.  Thanks!  I didn't know that.

[ .... ]

> > I would be surprised indeed if there weren't others, too.

> I would too.

> > However, we could improve the documentation of this situation in the
> > elisp manual.

> We currently say:

>       Do @emph{not} expect the before-change hooks and the after-change
>     hooks be called in balanced pairs around each buffer change.  Also
>     don't expect the before-change hooks to be called for every chunk of
>     text Emacs is about to delete.  These hooks are provided on the
>     assumption that Lisp programs will use either before- or the
>     after-change hooks, but not both, and the boundaries of the region
>     where the changes happen might include more than just the actual
>     changed text, or even lump together several changes done piecemeal.

> which is lax enough that any behavior could be argued to be acceptable.
> IOW I think it's too lax.  We should probably try and fix it to reflect
> the fact that every change should be covered by the last preceding b-c-f
> and should be followed by a corresponding call to a-c-f (and this
> before the next call to b-c-f).

Is that quite right?  The upcase-region call in my test had no a-c-f
call, almost certainly because there were no lower case letters in the
buffer at the time.  From your answers in this thread, I'm thinking that
every primitive-call which could change the buffer will have exactly one
b-c-f and zero or more a-c-f's.

How about something like this to replace that paragraph from the elisp
manual?

    The primitives which atomically insert or delete a contiguous chunk
    of text into or from a buffer will call `before-change-functions'
    and `after-change-functions' in balanced pairs, once for each
    change.  The arguments to these hooks will exactly delimit the
    change being made.  Calls to these primitives comprise the vast bulk
    of buffer changes.

    Other, more complex primitives aim to call `before-change-functions'
    once before making any changes, then to call
    `after-change-functions' zero, one, or several times, depending on
    how many individual changes the primitive makes.  The `BEG' and
    `END' arguments to `before-change-functions' will enclose a region
    in which the individual changes are made, but won't necessarily be
    the minimal such region.  The `BEG', `END', and `OLD-LEN' arguments
    to each successive call of `after-change-functions' will accurately
    delimit the current change.

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-04 21:11       ` Alan Mackenzie
@ 2018-01-04 21:36         ` Stefan Monnier
  2018-01-06 15:18           ` Alan Mackenzie
  2018-01-05  6:55         ` Eli Zaretskii
  1 sibling, 1 reply; 66+ messages in thread
From: Stefan Monnier @ 2018-01-04 21:36 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

>> which is lax enough that any behavior could be argued to be acceptable.
>> IOW I think it's too lax.  We should probably try and fix it to reflect
>> the fact that every change should be covered by the last preceding b-c-f
>> and should be followed by a corresponding call to a-c-f (and this
>> before the next call to b-c-f).
> Is that quite right?

Probably not quite.

> The upcase-region call in my test had no a-c-f call, almost certainly
> because there were no lower case letters in the buffer at the time.

Indeed, there were no changes, so no need to call a-c-f.

> From your answers in this thread, I'm thinking that every
> primitive-call which could change the buffer will have exactly one
> b-c-f and zero or more a-c-f's.

Sounds about right, tho I expect some primitives might just call insert
and delete a few times, thus calling b-c-f several times.

> How about something like this to replace that paragraph from the elisp
> manual?
>
>     The primitives which atomically insert or delete a contiguous chunk
>     of text into or from a buffer will call `before-change-functions'
>     and `after-change-functions' in balanced pairs, once for each
>     change.  The arguments to these hooks will exactly delimit the
>     change being made.  Calls to these primitives comprise the vast bulk
>     of buffer changes.
>
>     Other, more complex primitives aim to call `before-change-functions'
>     once before making any changes, then to call
>     `after-change-functions' zero, one, or several times, depending on
>     how many individual changes the primitive makes.  The `BEG' and
>     `END' arguments to `before-change-functions' will enclose a region
>     in which the individual changes are made, but won't necessarily be
>     the minimal such region.  The `BEG', `END', and `OLD-LEN' arguments
>     to each successive call of `after-change-functions' will accurately
>     delimit the current change.

Looks good to me, thank you.

I think in the case of subst-chars-in-region we only call a-c-f one time
(but with tighter bounds than those of the preceding b-c-f) rather than
once per character that's substituted, so maybe "The `BEG', `END', and
`OLD-LEN' arguments to each successive call of `after-change-functions'
will accurately delimit the current change" promises a bit more than we
deliver, although it depends on how we interpret "current change".

In any case, the above is much better than what we have now and I think
it gives a pretty good rendition of our intention.


        Stefan



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-04 21:11       ` Alan Mackenzie
  2018-01-04 21:36         ` Stefan Monnier
@ 2018-01-05  6:55         ` Eli Zaretskii
  2018-01-05 11:41           ` Alan Mackenzie
  2018-01-05 16:50           ` Stefan Monnier
  1 sibling, 2 replies; 66+ messages in thread
From: Eli Zaretskii @ 2018-01-05  6:55 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: monnier, emacs-devel

> Date: Thu, 4 Jan 2018 21:11:54 +0000
> From: Alan Mackenzie <acm@muc.de>
> Cc: emacs-devel@gnu.org
> 
>     The primitives which atomically insert or delete a contiguous chunk
>     of text into or from a buffer will call `before-change-functions'
>     and `after-change-functions' in balanced pairs, once for each
>     change.  The arguments to these hooks will exactly delimit the
>     change being made.  Calls to these primitives comprise the vast bulk
>     of buffer changes.
> 
>     Other, more complex primitives aim to call `before-change-functions'
>     once before making any changes, then to call
>     `after-change-functions' zero, one, or several times, depending on
>     how many individual changes the primitive makes.  The `BEG' and
>     `END' arguments to `before-change-functions' will enclose a region
>     in which the individual changes are made, but won't necessarily be
>     the minimal such region.  The `BEG', `END', and `OLD-LEN' arguments
>     to each successive call of `after-change-functions' will accurately
>     delimit the current change.

How will the reader know to distinguish between these two classes of
primitives?  Without such an ability, the extra accuracy in this text
is not useful.




^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-05  6:55         ` Eli Zaretskii
@ 2018-01-05 11:41           ` Alan Mackenzie
  2018-01-05 13:00             ` Eli Zaretskii
  2018-01-05 16:50           ` Stefan Monnier
  1 sibling, 1 reply; 66+ messages in thread
From: Alan Mackenzie @ 2018-01-05 11:41 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

Hello, Eli.

On Fri, Jan 05, 2018 at 08:55:20 +0200, Eli Zaretskii wrote:
> > Date: Thu, 4 Jan 2018 21:11:54 +0000
> > From: Alan Mackenzie <acm@muc.de>
> > Cc: emacs-devel@gnu.org

> >     The primitives which atomically insert or delete a contiguous chunk
> >     of text into or from a buffer will call `before-change-functions'
> >     and `after-change-functions' in balanced pairs, once for each
> >     change.  The arguments to these hooks will exactly delimit the
> >     change being made.  Calls to these primitives comprise the vast bulk
> >     of buffer changes.

> >     Other, more complex primitives aim to call `before-change-functions'
> >     once before making any changes, then to call
> >     `after-change-functions' zero, one, or several times, depending on
> >     how many individual changes the primitive makes.  The `BEG' and
> >     `END' arguments to `before-change-functions' will enclose a region
> >     in which the individual changes are made, but won't necessarily be
> >     the minimal such region.  The `BEG', `END', and `OLD-LEN' arguments
> >     to each successive call of `after-change-functions' will accurately
> >     delimit the current change.

> How will the reader know to distinguish between these two classes of
> primitives?  Without such an ability, the extra accuracy in this text
> is not useful.

A good point.  Wherein lies the difference, from a programmers point of
view?  Briefly I think that we have a "complex primitive" when we don't
have an exact match between one b-c-f and one a-c-f.  How about adding
the following paragraph to what comes above:

    The "complex primitive" case can be distinguised from the "atomic
    primitive" case because either the call to `after-change-functions'
    is missing (i.e. there are two consecutive calls to
    `before-change-functions'), or in the first call to
    `after-change-functions', `OLD-LEN' is less then `END' - `BEG' in
    `before-change-functions'.

The above leaves unsaid what happens when a "complex primitive" happens
to call b-c-f and a-c-f as though it were an "atomic primitive".  This
doesn't seem important enough to take up the space.

Personally, I think that when we come to rationalise and refactor
insdel.c and related files sometime in the medium future, we should
arrange to have b-c-f and a-c-f called only as "atomic" changes.  There
is no longer any need to optimise the calling of these hooks, and the
irregularites of these optimisations imposes an overhead on the use of
these hooks.

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-05 11:41           ` Alan Mackenzie
@ 2018-01-05 13:00             ` Eli Zaretskii
  2018-01-05 13:34               ` Alan Mackenzie
  0 siblings, 1 reply; 66+ messages in thread
From: Eli Zaretskii @ 2018-01-05 13:00 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: monnier, emacs-devel

> Date: Fri, 5 Jan 2018 11:41:07 +0000
> Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
>     The "complex primitive" case can be distinguised from the "atomic
>     primitive" case because either the call to `after-change-functions'
>     is missing (i.e. there are two consecutive calls to
>     `before-change-functions'), or in the first call to
>     `after-change-functions', `OLD-LEN' is less then `END' - `BEG' in
>     `before-change-functions'.
> 
> The above leaves unsaid what happens when a "complex primitive" happens
> to call b-c-f and a-c-f as though it were an "atomic primitive".

It also provides no way to know, up front, whether a given primitive
I'm about to call, is one or the other.  IMO, we need some way of
doing that, if we want to document this distinction.



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-05 13:00             ` Eli Zaretskii
@ 2018-01-05 13:34               ` Alan Mackenzie
  2018-01-05 14:08                 ` Eli Zaretskii
  0 siblings, 1 reply; 66+ messages in thread
From: Alan Mackenzie @ 2018-01-05 13:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

Hello, Eli.

On Fri, Jan 05, 2018 at 15:00:21 +0200, Eli Zaretskii wrote:
> > Date: Fri, 5 Jan 2018 11:41:07 +0000
> > Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
> > From: Alan Mackenzie <acm@muc.de>

> >     The "complex primitive" case can be distinguised from the "atomic
> >     primitive" case because either the call to `after-change-functions'
> >     is missing (i.e. there are two consecutive calls to
> >     `before-change-functions'), or in the first call to
> >     `after-change-functions', `OLD-LEN' is less then `END' - `BEG' in
> >     `before-change-functions'.

> > The above leaves unsaid what happens when a "complex primitive" happens
> > to call b-c-f and a-c-f as though it were an "atomic primitive".

> It also provides no way to know, up front, whether a given primitive
> I'm about to call, is one or the other.  IMO, we need some way of
> doing that, if we want to document this distinction.

Do we really need this level of detail?  My idea was to enable users of
b-c-f and a-c-f to predict what they're going to be being hit with.

There are two patterns of handling b/a-c-f, the "atomic" and the
"complex".  My above proposal documents enough for somebody using
b/a-c-f to be able to handle the "atomic" and "complex" uses.

Why does that hacker need to know exactly what each buffer-changing
primitive does, or which falls into which category?  Surely it is enough
that she handle the b/a-c-f calls appropriately.

What am I missing here?

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-05 13:34               ` Alan Mackenzie
@ 2018-01-05 14:08                 ` Eli Zaretskii
  2018-01-05 15:54                   ` Alan Mackenzie
  0 siblings, 1 reply; 66+ messages in thread
From: Eli Zaretskii @ 2018-01-05 14:08 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: monnier, emacs-devel

> Date: Fri, 5 Jan 2018 13:34:48 +0000
> Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> On Fri, Jan 05, 2018 at 15:00:21 +0200, Eli Zaretskii wrote:
> > > Date: Fri, 5 Jan 2018 11:41:07 +0000
> > > Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
> > > From: Alan Mackenzie <acm@muc.de>
> 
> > >     The "complex primitive" case can be distinguised from the "atomic
> > >     primitive" case because either the call to `after-change-functions'
> > >     is missing (i.e. there are two consecutive calls to
> > >     `before-change-functions'), or in the first call to
> > >     `after-change-functions', `OLD-LEN' is less then `END' - `BEG' in
> > >     `before-change-functions'.
> 
> > > The above leaves unsaid what happens when a "complex primitive" happens
> > > to call b-c-f and a-c-f as though it were an "atomic primitive".
> 
> > It also provides no way to know, up front, whether a given primitive
> > I'm about to call, is one or the other.  IMO, we need some way of
> > doing that, if we want to document this distinction.
> 
> Do we really need this level of detail?  My idea was to enable users of
> b-c-f and a-c-f to predict what they're going to be being hit with.
> 
> There are two patterns of handling b/a-c-f, the "atomic" and the
> "complex".  My above proposal documents enough for somebody using
> b/a-c-f to be able to handle the "atomic" and "complex" uses.
> [...]
> What am I missing here?

Maybe it's me that is missing something.  You first say above that you
want to "enable users of b-c-f and a-c-f to predict what they're going
to be being hit with", which is exactly my concern, but then provide a
recipe that AFAIU only works post-factum, i.e. the user can only know
whether they called an "atomic" or a "complex" primitive by analyzing
the calls to the 2 hooks as result of calling the primitive.  If
that's indeed what you are saying, IMO it's not a useful criterion,
because generally when I read documentation, I shouldn't be required
to write code in order to interpret the documentation.

> Why does that hacker need to know exactly what each buffer-changing
> primitive does, or which falls into which category?  Surely it is enough
> that she handle the b/a-c-f calls appropriately.

How can she handle these calls correctly unless she knows which of the
hooks will be called by a given primitive, and whether these calls
will be balanced?  And if she doesn't need to know that, then why do
we have to tell here these details about the 2 classes of primitives?

IOW, accurate information is only useful if one knows exactly how to
apply it to the practical case in hand.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-05 14:08                 ` Eli Zaretskii
@ 2018-01-05 15:54                   ` Alan Mackenzie
  0 siblings, 0 replies; 66+ messages in thread
From: Alan Mackenzie @ 2018-01-05 15:54 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

Hello, Eli.

On Fri, Jan 05, 2018 at 16:08:31 +0200, Eli Zaretskii wrote:
> > Date: Fri, 5 Jan 2018 13:34:48 +0000
> > Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
> > From: Alan Mackenzie <acm@muc.de>

> > On Fri, Jan 05, 2018 at 15:00:21 +0200, Eli Zaretskii wrote:
> > > > Date: Fri, 5 Jan 2018 11:41:07 +0000
> > > > Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
> > > > From: Alan Mackenzie <acm@muc.de>

> > > >     The "complex primitive" case can be distinguised from the "atomic
> > > >     primitive" case because either the call to `after-change-functions'
> > > >     is missing (i.e. there are two consecutive calls to
> > > >     `before-change-functions'), or in the first call to
> > > >     `after-change-functions', `OLD-LEN' is less then `END' - `BEG' in
> > > >     `before-change-functions'.

> > > > The above leaves unsaid what happens when a "complex primitive" happens
> > > > to call b-c-f and a-c-f as though it were an "atomic primitive".

> > > It also provides no way to know, up front, whether a given primitive
> > > I'm about to call, is one or the other.  IMO, we need some way of
> > > doing that, if we want to document this distinction.

> > Do we really need this level of detail?  My idea was to enable users of
> > b-c-f and a-c-f to predict what they're going to be being hit with.

> > There are two patterns of handling b/a-c-f, the "atomic" and the
> > "complex".  My above proposal documents enough for somebody using
> > b/a-c-f to be able to handle the "atomic" and "complex" uses.
> > [...]
> > What am I missing here?

> Maybe it's me that is missing something.  You first say above that you
> want to "enable users of b-c-f and a-c-f to predict what they're going
> to be being hit with", which is exactly my concern, but then provide a
> recipe that AFAIU only works post-factum, i.e. the user can only know
> whether they called an "atomic" or a "complex" primitive by analyzing
> the calls to the 2 hooks as result of calling the primitive.  If
> that's indeed what you are saying, IMO it's not a useful criterion,
> because generally when I read documentation, I shouldn't be required
> to write code in order to interpret the documentation.

I think I understand what you're getting at now: that Lisp hackers will
be using these "complex" primitives in their code, and hence need to
know the b/a-c-f calling details in detail for each such primitive.
I don't think people writing modes will be using the "complex" buffer
changing primitives explicitly, at least not very much.  There are no
such calls of these primitives in CC Mode (as far as I know).

But the Lisp code will need to handle any "complex" primitives the user
throws at it, e.g. upcase-region (C-x C-u).  For this purpose, it is
only necessary to know what sequences of b/a-c-f are foreseen, so as to
be able to handle them.

> > Why does that hacker need to know exactly what each buffer-changing
> > primitive does, or which falls into which category?  Surely it is enough
> > that she handle the b/a-c-f calls appropriately.

> How can she handle these calls correctly unless she knows which of the
> hooks will be called by a given primitive, and whether these calls
> will be balanced?  And if she doesn't need to know that, then why do
> we have to tell her these details about the 2 classes of primitives?

Perhaps my idea of describing the primitives' use of b/a-c-f in the two
categories "atomic" and "complex" would create more confusion than it
would alleviate.  The "atomic" primitives constitute the overwhelming
bulk of those actually called at run time, so it seemed sensible to
describe this common, simple case separately.  Maybe this isn't the
case.

> IOW, accurate information is only useful if one knows exactly how to
> apply it to the practical case in hand.

I thought the proposed text was adequate to instruct hackers how to
write b/a-c-f's to handle _any_ existing primitives.

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-05  6:55         ` Eli Zaretskii
  2018-01-05 11:41           ` Alan Mackenzie
@ 2018-01-05 16:50           ` Stefan Monnier
  2018-01-05 17:38             ` Alan Mackenzie
  2018-01-05 19:53             ` Eli Zaretskii
  1 sibling, 2 replies; 66+ messages in thread
From: Stefan Monnier @ 2018-01-05 16:50 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Alan Mackenzie, emacs-devel

> How will the reader know to distinguish between these two classes of
> primitives?

He won't and shouldn't attempt to (the boundary between those two is an
internal implementation detail that is subject to change).

> Without such an ability, the extra accuracy in this text
> is not useful.

I find it useful in order to explain why naively observing the behavior
may give one the impression that all b-c-f and a-c-f calls are
"balanced".

Maybe the first paragraph should be reworded a bit so it doesn't sound
like a promise of behavior?  How 'bout:

    The vast bulk of buffer changes will call `before-change-functions'
    and `after-change-functions' in balanced pairs, once for each
    change where the arguments to these hooks will exactly delimit the
    change being made.  Yet, hook functions should not rely on this
    being always the case:

    Other, more complex primitives may call `before-change-functions'
    once before making changes and then call `after-change-functions'
    zero, one, or several times, depending on how many individual
    changes the primitive makes.  The `BEG' and `END' arguments to
    `before-change-functions' will enclose a region in which the
    individual changes are made, but won't necessarily be the minimal
    such region.  The `BEG', `END', and `OLD-LEN' arguments to each
    successive call of `after-change-functions' will more accurately
    delimit the current change.


-- Stefan



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-05 16:50           ` Stefan Monnier
@ 2018-01-05 17:38             ` Alan Mackenzie
  2018-01-05 18:09               ` Stefan Monnier
  2018-01-05 19:53             ` Eli Zaretskii
  1 sibling, 1 reply; 66+ messages in thread
From: Alan Mackenzie @ 2018-01-05 17:38 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, emacs-devel

Hello, Stefan.

On Fri, Jan 05, 2018 at 11:50:56 -0500, Stefan Monnier wrote:
> > How will the reader know to distinguish between these two classes of
> > primitives?

> He won't and shouldn't attempt to (the boundary between those two is an
> internal implementation detail that is subject to change).

> > Without such an ability, the extra accuracy in this text
> > is not useful.

> I find it useful in order to explain why naively observing the behavior
> may give one the impression that all b-c-f and a-c-f calls are
> "balanced".

> Maybe the first paragraph should be reworded a bit so it doesn't sound
> like a promise of behavior?  How 'bout:

>     The vast bulk of buffer changes will call `before-change-functions'
>     and `after-change-functions' in balanced pairs, once for each
>     change where the arguments to these hooks will exactly delimit the
>     change being made.  Yet, hook functions should not rely on this
>     being always the case:

>     Other, more complex primitives may call `before-change-functions'
>     once before making changes and then call `after-change-functions'
>     zero, one, or several times, depending on how many individual
>     changes the primitive makes.  The `BEG' and `END' arguments to
>     `before-change-functions' will enclose a region in which the
>     individual changes are made, but won't necessarily be the minimal
>     such region.  The `BEG', `END', and `OLD-LEN' arguments to each
>     successive call of `after-change-functions' will more accurately
>     delimit the current change.

I like that, in general.  :-)  It gets rid of the awkward terms "atomic"
and "complex" which were more trouble than they were worth.

Just two tiny amendments:
(i) I think a comma is needed in the first paragraph after "in balanced
pairs, once for each change".

(ii) The "may" at the start of the second paragraph is not wanted.  It
suggests that b-c-f is optional.  Simply "Other, more complex primitives
call `b-c-f' once before ....".

> -- Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-05 17:38             ` Alan Mackenzie
@ 2018-01-05 18:09               ` Stefan Monnier
  0 siblings, 0 replies; 66+ messages in thread
From: Stefan Monnier @ 2018-01-05 18:09 UTC (permalink / raw)
  To: emacs-devel

> Just two tiny amendments:
> (i) I think a comma is needed in the first paragraph after "in balanced
> pairs, once for each change".

I'll trust you on that.  My punctuation fu is weak.

> (ii) The "may" at the start of the second paragraph is not wanted.  It
> suggests that b-c-f is optional.  Simply "Other, more complex primitives
> call `b-c-f' once before ....".

Sounds good.


        Stefan




^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-05 16:50           ` Stefan Monnier
  2018-01-05 17:38             ` Alan Mackenzie
@ 2018-01-05 19:53             ` Eli Zaretskii
  2018-01-05 22:28               ` Stefan Monnier
  1 sibling, 1 reply; 66+ messages in thread
From: Eli Zaretskii @ 2018-01-05 19:53 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: acm, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Alan Mackenzie <acm@muc.de>,  emacs-devel@gnu.org
> Date: Fri, 05 Jan 2018 11:50:56 -0500
> 
> I find it useful in order to explain why naively observing the behavior
> may give one the impression that all b-c-f and a-c-f calls are
> "balanced".

We don't normally include such "preemptive" explanations in the
manual.  If the text doesn't say one should expect balanced calls, the
reader has no reason to expect balanced calls.  The current text even
makes a point of saying that explicitly.

>     The vast bulk of buffer changes will call `before-change-functions'
>     and `after-change-functions' in balanced pairs, once for each
>     change where the arguments to these hooks will exactly delimit the
>     change being made.  Yet, hook functions should not rely on this
>     being always the case:
> 
>     Other, more complex primitives may call `before-change-functions'
>     once before making changes and then call `after-change-functions'
>     zero, one, or several times, depending on how many individual
>     changes the primitive makes.  The `BEG' and `END' arguments to
>     `before-change-functions' will enclose a region in which the
>     individual changes are made, but won't necessarily be the minimal
>     such region.  The `BEG', `END', and `OLD-LEN' arguments to each
>     successive call of `after-change-functions' will more accurately
>     delimit the current change.

This basically says the calls are mostly balanced, but don't rely on
that, because sometimes they aren't".  The text about
after-change-functions being called zero or more times adds
non-trivial information, but what is its practical usefulness?  Same
with the text about BEG and END.

Maybe I don't understand what are we trying to accomplish with these
changes, and that's why I fail to see why the proposed changes are for
the better.



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-05 19:53             ` Eli Zaretskii
@ 2018-01-05 22:28               ` Stefan Monnier
  2018-01-06  9:05                 ` Eli Zaretskii
  0 siblings, 1 reply; 66+ messages in thread
From: Stefan Monnier @ 2018-01-05 22:28 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: acm, emacs-devel

> We don't normally include such "preemptive" explanations in the
> manual.  If the text doesn't say one should expect balanced calls, the
> reader has no reason to expect balanced calls.

I think the name of the hooks suggests such a balance, and actual
experimentation can very easily lead the user to think that
they're balanced.  Alan may not be the "average Emacs coder", but
I think his experience is not completely unexpected.

> The current text even makes a point of saying that explicitly.

Indeed, and this discussion wants to replace this with something a bit
more specific.

> This basically says the calls are mostly balanced, but don't rely on
> that, because sometimes they aren't".

It says a bit more because it describes the way in which they're
not balanced.

> The text about after-change-functions being called zero or more times
> adds non-trivial information, but what is its practical usefulness?

It says that subsequent calls to a-c-f aren't calls with a missing b-c-f
but that they "belong" to (I think of it as "they are covered by") the
last b-c-f.

> Same with the text about BEG and END.

Not sure how important it is, but I think it can help the coder have
a mental model of the kind of unbalancedness that can occur.

> Maybe I don't understand what are we trying to accomplish with these
> changes, and that's why I fail to see why the proposed changes are for
> the better.

The current text basically says "don't rely on them being balanced" but
doesn't say what the coder can rely on if he wants to share information
between a-c-f and b-c-f.

The new text tries to be sufficiently loose that if Emacs doesn't obey
it it's actually a bug, yet sufficiently precise that an Elisp coder
can make use of it to reliably share information between a-c-f and
b-c-f.

        Stefan

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-05 22:28               ` Stefan Monnier
@ 2018-01-06  9:05                 ` Eli Zaretskii
  2018-01-06 15:26                   ` Stefan Monnier
  0 siblings, 1 reply; 66+ messages in thread
From: Eli Zaretskii @ 2018-01-06  9:05 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: acm, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: acm@muc.de,  emacs-devel@gnu.org
> Date: Fri, 05 Jan 2018 17:28:15 -0500
> 
> > Maybe I don't understand what are we trying to accomplish with these
> > changes, and that's why I fail to see why the proposed changes are for
> > the better.
> 
> The current text basically says "don't rely on them being balanced" but
> doesn't say what the coder can rely on if he wants to share information
> between a-c-f and b-c-f.
> 
> The new text tries to be sufficiently loose that if Emacs doesn't obey
> it it's actually a bug, yet sufficiently precise that an Elisp coder
> can make use of it to reliably share information between a-c-f and
> b-c-f.

Can you describe a practical situation where an Elisp coder could use
the new text to some practical benefit, i.e. to change her
implementation to be better/more resilient (as opposed to just
enhancing her understanding of this stuff)?  I guess I don't see how
such practical benefits would be possible with the new text.



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-04 21:36         ` Stefan Monnier
@ 2018-01-06 15:18           ` Alan Mackenzie
  2018-01-06 15:51             ` Stefan Monnier
  0 siblings, 1 reply; 66+ messages in thread
From: Alan Mackenzie @ 2018-01-06 15:18 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Hello, Stefan.

On Thu, Jan 04, 2018 at 16:36:42 -0500, Stefan Monnier wrote:
> >> which is lax enough that any behavior could be argued to be acceptable.
> >> IOW I think it's too lax.  We should probably try and fix it to reflect
> >> the fact that every change should be covered by the last preceding b-c-f
> >> and should be followed by a corresponding call to a-c-f (and this
> >> before the next call to b-c-f).
> > Is that quite right?

> Probably not quite.

> > The upcase-region call in my test had no a-c-f call, almost certainly
> > because there were no lower case letters in the buffer at the time.

> Indeed, there were no changes, so no need to call a-c-f.

> > From your answers in this thread, I'm thinking that every
> > primitive-call which could change the buffer will have exactly one
> > b-c-f and zero or more a-c-f's.

> Sounds about right, tho I expect some primitives might just call insert
> and delete a few times, thus calling b-c-f several times.

> > How about something like this to replace that paragraph from the elisp
> > manual?

> >     The primitives which atomically insert or delete a contiguous chunk
> >     of text into or from a buffer will call `before-change-functions'
> >     and `after-change-functions' in balanced pairs, once for each
> >     change.  The arguments to these hooks will exactly delimit the
> >     change being made.  Calls to these primitives comprise the vast bulk
> >     of buffer changes.

> >     Other, more complex primitives aim to call `before-change-functions'
> >     once before making any changes, then to call
> >     `after-change-functions' zero, one, or several times, depending on
> >     how many individual changes the primitive makes.  The `BEG' and
> >     `END' arguments to `before-change-functions' will enclose a region
> >     in which the individual changes are made, but won't necessarily be
> >     the minimal such region.  The `BEG', `END', and `OLD-LEN' arguments
> >     to each successive call of `after-change-functions' will accurately
> >     delimit the current change.

> Looks good to me, thank you.

I've found a discrepancy.  Just one.  In (transpose-regions 1 10 11 20),
the hook calls are, in order, ((1 10) (11 20) (1 20 19)).  The two
consecutive b-c-f's happen when the two regions are of equal size and
non-contiguous.

The cause of this is not hard to find: in Ftranspose_region, editfns.c
L5204, there are two calls to modify_text on consecutive lines.  This
seems to be some sort of optimisation.  It is not done elsewhere in
Ftranspose_region.  I dare say this could be fixed easily.

> I think in the case of subst-chars-in-region we only call a-c-f one time
> (but with tighter bounds than those of the preceding b-c-f) rather than
> once per character that's substituted, so maybe "The `BEG', `END', and
> `OLD-LEN' arguments to each successive call of `after-change-functions'
> will accurately delimit the current change" promises a bit more than we
> deliver, although it depends on how we interpret "current change".

> In any case, the above is much better than what we have now and I think
> it gives a pretty good rendition of our intention.

Perhaps for Emacs-27, if we want to fix transpose-regions.

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-06  9:05                 ` Eli Zaretskii
@ 2018-01-06 15:26                   ` Stefan Monnier
  0 siblings, 0 replies; 66+ messages in thread
From: Stefan Monnier @ 2018-01-06 15:26 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: acm, emacs-devel

> Can you describe a practical situation where an Elisp coder could use
> the new text to some practical benefit, i.e. to change her
> implementation to be better/more resilient (as opposed to just
> enhancing her understanding of this stuff)?  I guess I don't see how
> such practical benefits would be possible with the new text.

In CC-mode, Alan wants to store information about the state of the
buffer before a change and then use this info after the change (IIUC
this is mostly to try and avoid recomputing parsing info about the rest
of the buffer).  With the current description, all he can say is "in
practice my hack works 99% of the time, and the doc says that
I basically can't bring it to 100%".  With the new description, it
should be possible for him to bring it to 100% (as you know I think he'd
be better off using an approach like that of syntax-ppss, but that
doesn't mean we shouldn't try to make it possible to do it reliably his
way).

        Stefan

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-06 15:18           ` Alan Mackenzie
@ 2018-01-06 15:51             ` Stefan Monnier
  2018-01-06 16:18               ` Eli Zaretskii
  2018-01-06 20:24               ` Alan Mackenzie
  0 siblings, 2 replies; 66+ messages in thread
From: Stefan Monnier @ 2018-01-06 15:51 UTC (permalink / raw)
  To: emacs-devel

> I've found a discrepancy.  Just one.  In (transpose-regions 1 10 11 20),
> the hook calls are, in order, ((1 10) (11 20) (1 20 19)).  The two
> consecutive b-c-f's happen when the two regions are of equal size and
> non-contiguous.

Sounds like a bug alright.

> The cause of this is not hard to find: in Ftranspose_region, editfns.c
> L5204, there are two calls to modify_text on consecutive lines.  This
> seems to be some sort of optimisation.  It is not done elsewhere in
> Ftranspose_region.  I dare say this could be fixed easily.

Feel free to fix it (on master).

> Perhaps for Emacs-27, if we want to fix transpose-regions.

AFAIC this text can be installed in emacs-26 even if there's still a known
bug where we don't behave as promised.  For all we know, there are yet
more (not yet known) such bugs anyway.


        Stefan




^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-06 15:51             ` Stefan Monnier
@ 2018-01-06 16:18               ` Eli Zaretskii
  2018-01-06 19:06                 ` Alan Mackenzie
  2018-01-06 20:24               ` Alan Mackenzie
  1 sibling, 1 reply; 66+ messages in thread
From: Eli Zaretskii @ 2018-01-06 16:18 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Sat, 06 Jan 2018 10:51:30 -0500
> 
> AFAIC this text can be installed in emacs-26 even if there's still a known
> bug where we don't behave as promised.  For all we know, there are yet
> more (not yet known) such bugs anyway.

FWIW, I don't object to changing the manual on the release branch,
although I'm still not convinced it will be more helpful in practice.



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-06 16:18               ` Eli Zaretskii
@ 2018-01-06 19:06                 ` Alan Mackenzie
  0 siblings, 0 replies; 66+ messages in thread
From: Alan Mackenzie @ 2018-01-06 19:06 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Stefan Monnier, emacs-devel

Hello, Eli.

On Sat, Jan 06, 2018 at 18:18:09 +0200, Eli Zaretskii wrote:
> > From: Stefan Monnier <monnier@iro.umontreal.ca>
> > Date: Sat, 06 Jan 2018 10:51:30 -0500

> > AFAIC this text can be installed in emacs-26 even if there's still a known
> > bug where we don't behave as promised.  For all we know, there are yet
> > more (not yet known) such bugs anyway.

> FWIW, I don't object to changing the manual on the release branch,
> although I'm still not convinced it will be more helpful in practice.

Thanks.  I've just committed this change to the emacs-26 branch.

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-06 15:51             ` Stefan Monnier
  2018-01-06 16:18               ` Eli Zaretskii
@ 2018-01-06 20:24               ` Alan Mackenzie
  2018-01-07 11:36                 ` Alan Mackenzie
  1 sibling, 1 reply; 66+ messages in thread
From: Alan Mackenzie @ 2018-01-06 20:24 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Hello, Stefan.

On Sat, Jan 06, 2018 at 10:51:30 -0500, Stefan Monnier wrote:
> > I've found a discrepancy.  Just one.  In (transpose-regions 1 10 11 20),
> > the hook calls are, in order, ((1 10) (11 20) (1 20 19)).  The two
> > consecutive b-c-f's happen when the two regions are of equal size and
> > non-contiguous.

> Sounds like a bug alright.

> > The cause of this is not hard to find: in Ftranspose_region, editfns.c
> > L5204, there are two calls to modify_text on consecutive lines.  This
> > seems to be some sort of optimisation.  It is not done elsewhere in
> > Ftranspose_region.  I dare say this could be fixed easily.

> Feel free to fix it (on master).

Done.

[ .... ]

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-06 20:24               ` Alan Mackenzie
@ 2018-01-07 11:36                 ` Alan Mackenzie
  2018-01-07 11:49                   ` Eli Zaretskii
  0 siblings, 1 reply; 66+ messages in thread
From: Alan Mackenzie @ 2018-01-07 11:36 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, emacs-devel

Hello, Stefan.

On Sat, Jan 06, 2018 at 20:24:03 +0000, Alan Mackenzie wrote:
> On Sat, Jan 06, 2018 at 10:51:30 -0500, Stefan Monnier wrote:
> > > I've found a discrepancy.  Just one.  In (transpose-regions 1 10 11 20),
> > > the hook calls are, in order, ((1 10) (11 20) (1 20 19)).  The two
> > > consecutive b-c-f's happen when the two regions are of equal size and
> > > non-contiguous.

> > Sounds like a bug alright.

> > > The cause of this is not hard to find: in Ftranspose_region, editfns.c
> > > L5204, there are two calls to modify_text on consecutive lines.  This
> > > seems to be some sort of optimisation.  It is not done elsewhere in
> > > Ftranspose_region.  I dare say this could be fixed easily.

> > Feel free to fix it (on master).

> Done.

> [ .... ]

> >         Stefan

I've just taken the liberty of fixing (in master) another such bug in
base64-decode-region, where the a-c-f call after a text insertion was
missing.

I'm aware of exactly one further bug of this nature: in
zlib-decompress-region, there are no calls to the change hooks at all.
To fix this would be somewhat more involved.  Should it be fixed?

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-07 11:36                 ` Alan Mackenzie
@ 2018-01-07 11:49                   ` Eli Zaretskii
  2018-01-07 12:08                     ` Alan Mackenzie
  0 siblings, 1 reply; 66+ messages in thread
From: Eli Zaretskii @ 2018-01-07 11:49 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: monnier, emacs-devel

> Date: Sun, 7 Jan 2018 11:36:28 +0000
> Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> I'm aware of exactly one further bug of this nature: in
> zlib-decompress-region, there are no calls to the change hooks at all.

zlib-decompress-region calls del_range, which calls the hooks.  Or am
I missing something?



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-07 11:49                   ` Eli Zaretskii
@ 2018-01-07 12:08                     ` Alan Mackenzie
  2018-01-07 13:56                       ` Alan Mackenzie
  0 siblings, 1 reply; 66+ messages in thread
From: Alan Mackenzie @ 2018-01-07 12:08 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

Hello, Eli.

On Sun, Jan 07, 2018 at 13:49:17 +0200, Eli Zaretskii wrote:
> > Date: Sun, 7 Jan 2018 11:36:28 +0000
> > Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org
> > From: Alan Mackenzie <acm@muc.de>

> > I'm aware of exactly one further bug of this nature: in
> > zlib-decompress-region, there are no calls to the change hooks at all.

> zlib-decompress-region calls del_range, which calls the hooks.  Or am
> I missing something?

Yes, you're right, sorry.  There are change-hook calls for the second,
deletion, part of the operation.  No hooks are called for the first
part, the insertion of the decompressed text into the buffer.  So
zlib-decompress-region is still not right.

I'm going to take a careful look for any more such primitives, which
have one balanced pair of change hook calls, yet these calls only cover
part of the operation.

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-07 12:08                     ` Alan Mackenzie
@ 2018-01-07 13:56                       ` Alan Mackenzie
  2018-01-07 15:21                         ` [SUSPECTED SPAM] " Stefan Monnier
  2018-01-07 16:47                         ` Eli Zaretskii
  0 siblings, 2 replies; 66+ messages in thread
From: Alan Mackenzie @ 2018-01-07 13:56 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

Hello again, Eli.

On Sun, Jan 07, 2018 at 12:08:59 +0000, Alan Mackenzie wrote:
> On Sun, Jan 07, 2018 at 13:49:17 +0200, Eli Zaretskii wrote:
> > > Date: Sun, 7 Jan 2018 11:36:28 +0000
> > > Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org
> > > From: Alan Mackenzie <acm@muc.de>

> > > I'm aware of exactly one further bug of this nature: in
> > > zlib-decompress-region, there are no calls to the change hooks at all.

> > zlib-decompress-region calls del_range, which calls the hooks.  Or am
> > I missing something?

> Yes, you're right, sorry.  There are change-hook calls for the second,
> deletion, part of the operation.  No hooks are called for the first
> part, the insertion of the decompressed text into the buffer.  So
> zlib-decompress-region is still not right.

> I'm going to take a careful look for any more such primitives, which
> have one balanced pair of change hook calls, yet these calls only cover
> part of the operation.

There appear to be no more such primitives.

Should I try to fix zlib-decompress-region?

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 66+ messages in thread

* [SUSPECTED SPAM] Re: Lisp primitives and their calling of the change hooks
  2018-01-07 13:56                       ` Alan Mackenzie
@ 2018-01-07 15:21                         ` Stefan Monnier
  2018-01-07 16:47                         ` Eli Zaretskii
  1 sibling, 0 replies; 66+ messages in thread
From: Stefan Monnier @ 2018-01-07 15:21 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Eli Zaretskii, emacs-devel

> Should I try to fix zlib-decompress-region?

I would appreciate it, yes,


        Stefan



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-07 13:56                       ` Alan Mackenzie
  2018-01-07 15:21                         ` [SUSPECTED SPAM] " Stefan Monnier
@ 2018-01-07 16:47                         ` Eli Zaretskii
  2018-01-07 17:50                           ` Stefan Monnier
  2018-01-07 17:54                           ` Alan Mackenzie
  1 sibling, 2 replies; 66+ messages in thread
From: Eli Zaretskii @ 2018-01-07 16:47 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: monnier, emacs-devel

> Date: Sun, 7 Jan 2018 13:56:29 +0000
> From: Alan Mackenzie <acm@muc.de>
> Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
> 
> > > zlib-decompress-region calls del_range, which calls the hooks.  Or am
> > > I missing something?
> 
> > Yes, you're right, sorry.  There are change-hook calls for the second,
> > deletion, part of the operation.  No hooks are called for the first
> > part, the insertion of the decompressed text into the buffer.  So
> > zlib-decompress-region is still not right.
> 
> > I'm going to take a careful look for any more such primitives, which
> > have one balanced pair of change hook calls, yet these calls only cover
> > part of the operation.
> 
> There appear to be no more such primitives.
> 
> Should I try to fix zlib-decompress-region?

How do you want to fix it?  That function inserts the decompressed
stuff, then deletes the original compressed stuff.  What do you want
that to produce in terms of the change hooks?



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-07 16:47                         ` Eli Zaretskii
@ 2018-01-07 17:50                           ` Stefan Monnier
  2018-01-07 17:58                             ` Eli Zaretskii
  2018-01-07 17:54                           ` Alan Mackenzie
  1 sibling, 1 reply; 66+ messages in thread
From: Stefan Monnier @ 2018-01-07 17:50 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Alan Mackenzie, emacs-devel

>> Should I try to fix zlib-decompress-region?
> How do you want to fix it?  That function inserts the decompressed
> stuff, then deletes the original compressed stuff.  What do you want
> that to produce in terms of the change hooks?

Either one b-c-f at the beginning and and one a-c-f at the end,
or one b-c-f/a-c-f pair for the insertion followed by another for
the deletion.

It doesn't matter much which.


        Stefan



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-07 16:47                         ` Eli Zaretskii
  2018-01-07 17:50                           ` Stefan Monnier
@ 2018-01-07 17:54                           ` Alan Mackenzie
  2018-01-07 18:05                             ` Eli Zaretskii
  1 sibling, 1 reply; 66+ messages in thread
From: Alan Mackenzie @ 2018-01-07 17:54 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

Hello, Eli.

On Sun, Jan 07, 2018 at 18:47:41 +0200, Eli Zaretskii wrote:
> > Date: Sun, 7 Jan 2018 13:56:29 +0000
> > From: Alan Mackenzie <acm@muc.de>
> > Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org

> > > > zlib-decompress-region calls del_range, which calls the hooks.  Or am
> > > > I missing something?

> > > Yes, you're right, sorry.  There are change-hook calls for the second,
> > > deletion, part of the operation.  No hooks are called for the first
> > > part, the insertion of the decompressed text into the buffer.  So
> > > zlib-decompress-region is still not right.

> > > I'm going to take a careful look for any more such primitives, which
> > > have one balanced pair of change hook calls, yet these calls only cover
> > > part of the operation.

> > There appear to be no more such primitives.

> > Should I try to fix zlib-decompress-region?

> How do you want to fix it?  That function inserts the decompressed
> stuff, then deletes the original compressed stuff.  What do you want
> that to produce in terms of the change hooks?

IIUC, zlib-decompress-region decompresses in chunks into the gap,
expanding the gap after each chunk.

The strategy of one overarching before-change-functions followed by lots
of little after-change-functions can't really work here, since we don't
know in advance how big the region being written to is.  In fact, I think
this strategy is used only when the individual changes don't change the
buffer's size, e.g. when doing upcase-region, or
translate-region-internal.  We don't have this here.

One way of doing it would be to have a balanced pair of b/a-c-f for each
chunk of text inserted.  The other way would be having a single b/a-c-f
pair.

For reasons I can't articulate, I think the single pair of hooks would be
better; less "noisy", perhaps.  But either strategy would work well.

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-07 17:50                           ` Stefan Monnier
@ 2018-01-07 17:58                             ` Eli Zaretskii
  2018-01-07 19:04                               ` Stefan Monnier
  0 siblings, 1 reply; 66+ messages in thread
From: Eli Zaretskii @ 2018-01-07 17:58 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: acm, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Alan Mackenzie <acm@muc.de>,  emacs-devel@gnu.org
> Date: Sun, 07 Jan 2018 12:50:26 -0500
> 
> >> Should I try to fix zlib-decompress-region?
> > How do you want to fix it?  That function inserts the decompressed
> > stuff, then deletes the original compressed stuff.  What do you want
> > that to produce in terms of the change hooks?
> 
> Either one b-c-f at the beginning and and one a-c-f at the end,
> or one b-c-f/a-c-f pair for the insertion followed by another for
> the deletion.
> 
> It doesn't matter much which.

I think only the former makes sense, because this function is actually
a replacement function.

Also note that zlib-decompress-region works only in unibyte buffers,
so in practice almost every caller will immediately call
decode-coding-region or its ilk, which calls the hooks again.  But
unlike zlib-decompress-region, the decoding stuff will be able to0
report character positions, not byte positions.



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-07 17:54                           ` Alan Mackenzie
@ 2018-01-07 18:05                             ` Eli Zaretskii
  0 siblings, 0 replies; 66+ messages in thread
From: Eli Zaretskii @ 2018-01-07 18:05 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: monnier, emacs-devel

> Date: Sun, 7 Jan 2018 17:54:18 +0000
> Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> One way of doing it would be to have a balanced pair of b/a-c-f for each
> chunk of text inserted.  The other way would be having a single b/a-c-f
> pair.
> 
> For reasons I can't articulate, I think the single pair of hooks would be
> better; less "noisy", perhaps.  But either strategy would work well.

I like the single pair alternative better, also because no other
caller of insert_from_gap calls the hooks before and after that call.
It's too low-level an action to bother users with it.



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-07 17:58                             ` Eli Zaretskii
@ 2018-01-07 19:04                               ` Stefan Monnier
  2018-01-07 19:48                                 ` Alan Mackenzie
  0 siblings, 1 reply; 66+ messages in thread
From: Stefan Monnier @ 2018-01-07 19:04 UTC (permalink / raw)
  To: emacs-devel

> I think only the former makes sense, because this function is actually
> a replacement function.

When we perform a "replacement" from Elisp, we do it with insert +
delete-region so there are two calls to each of b-c-f and a-c-f.
So two calls also makes sense.

We could even have one call to at the beginning b-c-f followed by two
calls to a-c-f (one right after inserting the new text and one after
deleting the old text).  That would also be acceptable.

And in reality, this choice doesn't really matter (as evidenced by the
fact that noone noticed those calls were missing until now).

We should just pick the one that's easier to implement (and in case we
still a preference because the implementation is just as easy, the
"single calls to b-c-f/a-c-f" is of course the better option).

> Also note that zlib-decompress-region works only in unibyte buffers,
> so in practice almost every caller will immediately call
> decode-coding-region or its ilk, which calls the hooks again.  But
> unlike zlib-decompress-region, the decoding stuff will be able to0
> report character positions, not byte positions.

In practice, I'd also expect both b-c-f and a-c-f to be nil (or
equivalent) when we call zlib-decompress-region.

        Stefan

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-07 19:04                               ` Stefan Monnier
@ 2018-01-07 19:48                                 ` Alan Mackenzie
  2018-01-07 19:58                                   ` Eli Zaretskii
  2018-01-08  4:29                                   ` Stefan Monnier
  0 siblings, 2 replies; 66+ messages in thread
From: Alan Mackenzie @ 2018-01-07 19:48 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, emacs-devel


Hello, Stefan and Eli.

On Sun, Jan 07, 2018 at 14:04:46 -0500, Stefan Monnier wrote:
> > I think only the former makes sense, because this function is actually
> > a replacement function.

> When we perform a "replacement" from Elisp, we do it with insert +
> delete-region so there are two calls to each of b-c-f and a-c-f.
> So two calls also makes sense.

> We could even have one call to at the beginning b-c-f followed by two
> calls to a-c-f (one right after inserting the new text and one after
> deleting the old text).  That would also be acceptable.

> And in reality, this choice doesn't really matter (as evidenced by the
> fact that noone noticed those calls were missing until now).

> We should just pick the one that's easier to implement (and in case we
> still a preference because the implementation is just as easy, the
> "single calls to b-c-f/a-c-f" is of course the better option).

Yes, that's what I've done.  Each zlib-decompress-region now has two
separate pairs of b/a-c-f, one for insert, then the other for delete.
This is the way that other "coding" primitives, like
base64-encode-region work.

> > Also note that zlib-decompress-region works only in unibyte buffers,
> > so in practice almost every caller will immediately call
> > decode-coding-region or its ilk, which calls the hooks again.  But
> > unlike zlib-decompress-region, the decoding stuff will be able to0
> > report character positions, not byte positions.

> In practice, I'd also expect both b-c-f and a-c-f to be nil (or
> equivalent) when we call zlib-decompress-region.

Yes.  But it's nice to have things consistent.  :-)

Anyhow, here's the patch, which appears to work as described above.  Any
comments?  Otherwise I'll commit it.



diff --git a/src/decompress.c b/src/decompress.c
index 41de6da1dd..07545544aa 100644
--- a/src/decompress.c
+++ b/src/decompress.c
@@ -141,6 +141,10 @@ This function can be called only in unibyte buffers.  */)
      the same.  */
   istart = XINT (start);
   iend = XINT (end);
+
+  /* Do the following before manipulating the gap. */
+  modify_text (iend, iend);
+
   move_gap_both (iend, iend);
 
   stream.zalloc = Z_NULL;
@@ -190,6 +194,8 @@ This function can be called only in unibyte buffers.  */)
     }
   while (inflate_status == Z_OK);
 
+  signal_after_change (iend, 0, unwind_data.nbytes);
+
   if (inflate_status != Z_STREAM_END)
     return unbind_to (count, Qnil);
 


>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply related	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-07 19:48                                 ` Alan Mackenzie
@ 2018-01-07 19:58                                   ` Eli Zaretskii
  2018-01-07 21:10                                     ` Alan Mackenzie
  2018-01-08  4:29                                   ` Stefan Monnier
  1 sibling, 1 reply; 66+ messages in thread
From: Eli Zaretskii @ 2018-01-07 19:58 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: monnier, emacs-devel

> Date: Sun, 7 Jan 2018 19:48:45 +0000
> Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> Anyhow, here's the patch, which appears to work as described above.  Any
> comments?  Otherwise I'll commit it.

Like I said, I think we should have only one pair of hooks, like we
have in replace_range.



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-07 19:58                                   ` Eli Zaretskii
@ 2018-01-07 21:10                                     ` Alan Mackenzie
  2018-01-08  3:41                                       ` Eli Zaretskii
  0 siblings, 1 reply; 66+ messages in thread
From: Alan Mackenzie @ 2018-01-07 21:10 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

Hello, Eli.

On Sun, Jan 07, 2018 at 21:58:33 +0200, Eli Zaretskii wrote:
> > Date: Sun, 7 Jan 2018 19:48:45 +0000
> > Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org
> > From: Alan Mackenzie <acm@muc.de>

> > Anyhow, here's the patch, which appears to work as described above.  Any
> > comments?  Otherwise I'll commit it.

> Like I said, I think we should have only one pair of hooks, like we
> have in replace_range.

OK.  This is slightly more awkward: del_range needs to be replaced by
del_range_2, so as not to get an extra call to signal_after_change.
There is also no a-c-f call if the decompression exits with an error.
(It would be possible but a little awkward to do this.)



diff --git a/src/decompress.c b/src/decompress.c
index 41de6da1dd..f2dde7248e 100644
--- a/src/decompress.c
+++ b/src/decompress.c
@@ -141,6 +141,10 @@ This function can be called only in unibyte buffers.  */)
      the same.  */
   istart = XINT (start);
   iend = XINT (end);
+
+  /* Do the following before manipulating the gap. */
+  modify_text (istart, iend);
+
   move_gap_both (iend, iend);
 
   stream.zalloc = Z_NULL;
@@ -196,7 +200,10 @@ This function can be called only in unibyte buffers.  */)
   unwind_data.start = 0;
 
   /* Delete the compressed data.  */
-  del_range (istart, iend);
+  del_range_2 (istart, istart, /* byte and char offsets are the same. */
+               iend, iend, 0);
+
+  signal_after_change (istart, iend - istart, unwind_data.nbytes);
 
   return unbind_to (count, Qt);
 }


-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply related	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-07 21:10                                     ` Alan Mackenzie
@ 2018-01-08  3:41                                       ` Eli Zaretskii
  2018-01-08 19:24                                         ` Alan Mackenzie
  0 siblings, 1 reply; 66+ messages in thread
From: Eli Zaretskii @ 2018-01-08  3:41 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: monnier, emacs-devel

> Date: Sun, 7 Jan 2018 21:10:55 +0000
> Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> > Like I said, I think we should have only one pair of hooks, like we
> > have in replace_range.
> 
> OK.  This is slightly more awkward: del_range needs to be replaced by
> del_range_2, so as not to get an extra call to signal_after_change.

You need a call to update_compositions after del_range_2, to mimic
what del_range does, I think.

> There is also no a-c-f call if the decompression exits with an error.

You mean, if the user quits?  That throws to top level, so it would be
wrong to invoke any after-change hooks, and unwind_decompress will
call the hooks for the partially uncompressed data.  Do we need more?

Thanks.



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-07 19:48                                 ` Alan Mackenzie
  2018-01-07 19:58                                   ` Eli Zaretskii
@ 2018-01-08  4:29                                   ` Stefan Monnier
  1 sibling, 0 replies; 66+ messages in thread
From: Stefan Monnier @ 2018-01-08  4:29 UTC (permalink / raw)
  To: emacs-devel

> Yes, that's what I've done.  Each zlib-decompress-region now has two
> separate pairs of b/a-c-f, one for insert, then the other for delete.
> This is the way that other "coding" primitives, like
> base64-encode-region work.

LGTM.

>> In practice, I'd also expect both b-c-f and a-c-f to be nil (or
>> equivalent) when we call zlib-decompress-region.
> Yes.  But it's nice to have things consistent.  :-)

Agreed.  There's no justification for inconsistency here.

> Anyhow, here's the patch, which appears to work as described above.  Any
> comments?  Otherwise I'll commit it.

The subsequent patch which tries to reduce the number of calls to
bcf/acf seems worse because it's more complex (and the marginal benefit
doesn't seem to justify the complexity).


        Stefan




^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-08  3:41                                       ` Eli Zaretskii
@ 2018-01-08 19:24                                         ` Alan Mackenzie
  2018-01-08 21:15                                           ` Eli Zaretskii
  0 siblings, 1 reply; 66+ messages in thread
From: Alan Mackenzie @ 2018-01-08 19:24 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

Hello, Eli.

On Mon, Jan 08, 2018 at 05:41:02 +0200, Eli Zaretskii wrote:
> > Date: Sun, 7 Jan 2018 21:10:55 +0000
> > Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
> > From: Alan Mackenzie <acm@muc.de>

> > > Like I said, I think we should have only one pair of hooks, like we
> > > have in replace_range.

> > OK.  This is slightly more awkward: del_range needs to be replaced by
> > del_range_2, so as not to get an extra call to signal_after_change.

> You need a call to update_compositions after del_range_2, to mimic
> what del_range does, I think.

Yes, I think so to (though I'm only vaguely aware of what compositions
are ;-).  See below.

> > There is also no a-c-f call if the decompression exits with an error.

> You mean, if the user quits?  That throws to top level, so it would be
> wrong to invoke any after-change hooks, and unwind_decompress will
> call the hooks for the partially uncompressed data.  Do we need more?

I was thinking more of when the compressed text is corrupt and the
decompression routines report an error.  With the current version (see
below), I simulate an error with

    (zlib-decompress-region (point-min) (1- (point-max)))

, i.e. chopping the last byte off the buffer.  (I am using my Linux
configuration from /proc/config.gz for all this.)

This produces these hook calls:

    (1 22016)    (22016 118057) (22016 22016 96041)

, the first being the opening modify_text(), and the last two being the
deletion of the incomplete decompressed region by a call to del_range()
from an unwind-protect.  The (1 22016) b-c-f is thus unbalanced when this
happens.

I'm asking you to consider again having two pairs of hook calls in this
primitive (as, for example, base64-decode-region does).  That way we need
only signal the b-c-f for the deletion after the decompression has
worked, and we know we are going to follow through with the deleteion.  I
think an aborted decompression operation would also be easier to close
off with an a-c-f with this strategy.

Here's the amended patch from last night with update_compositions in:

diff --git a/src/decompress.c b/src/decompress.c
index 41de6da1dd..eebaa2eb30 100644
--- a/src/decompress.c
+++ b/src/decompress.c
@@ -24,6 +24,7 @@ along with GNU Emacs.  If not, see <https://www.gnu.org/licenses/>.  */

 #include "lisp.h"
 #include "buffer.h"
+#include "composite.h"

 #include <verify.h>

@@ -141,6 +142,10 @@ This function can be called only in unibyte buffers.  */)
      the same.  */
   istart = XINT (start);
   iend = XINT (end);
+
+  /* Do the following before manipulating the gap. */
+  modify_text (istart, iend);
+
   move_gap_both (iend, iend);

   stream.zalloc = Z_NULL;
@@ -196,7 +201,11 @@ This function can be called only in unibyte buffers.  */)
   unwind_data.start = 0;

   /* Delete the compressed data.  */
-  del_range (istart, iend);
+  del_range_2 (istart, istart, /* byte and char offsets are the same. */
+               iend, iend, 0);
+
+  signal_after_change (istart, iend - istart, unwind_data.nbytes);
+  update_compositions (istart, istart, CHECK_HEAD);

   return unbind_to (count, Qt);
 }

> Thanks.

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply related	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-08 19:24                                         ` Alan Mackenzie
@ 2018-01-08 21:15                                           ` Eli Zaretskii
  2018-01-08 22:24                                             ` Stefan Monnier
  2018-01-09 19:53                                             ` Alan Mackenzie
  0 siblings, 2 replies; 66+ messages in thread
From: Eli Zaretskii @ 2018-01-08 21:15 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: monnier, emacs-devel

> Date: Mon, 8 Jan 2018 19:24:15 +0000
> Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> > > There is also no a-c-f call if the decompression exits with an error.
> 
> > You mean, if the user quits?  That throws to top level, so it would be
> > wrong to invoke any after-change hooks, and unwind_decompress will
> > call the hooks for the partially uncompressed data.  Do we need more?
> 
> I was thinking more of when the compressed text is corrupt and the
> decompression routines report an error.

What do other primitives do when there's an error?  That was never
reported in this discussion nor discussed, AFAICT.  Up front, I see no
reason to keep any promises when that happens.

> The (1 22016) b-c-f is thus unbalanced when this happens.

If this is really important (and I don't see why it would be), you can
add a call to after-change-hooks before unbind_to of the error return.

> I'm asking you to consider again having two pairs of hook calls in this
> primitive (as, for example, base64-decode-region does).  That way we need
> only signal the b-c-f for the deletion after the decompression has
> worked, and we know we are going to follow through with the deleteion.  I
> think an aborted decompression operation would also be easier to close
> off with an a-c-f with this strategy.

Is implementation convenience the only argument for Stefan's variant?
If so, it doesn't convince me, as the difference is barely tangible.



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-08 21:15                                           ` Eli Zaretskii
@ 2018-01-08 22:24                                             ` Stefan Monnier
  2018-01-09  3:55                                               ` Eli Zaretskii
  2018-01-09 19:53                                             ` Alan Mackenzie
  1 sibling, 1 reply; 66+ messages in thread
From: Stefan Monnier @ 2018-01-08 22:24 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Alan Mackenzie, emacs-devel

> What do other primitives do when there's an error?

They follow the same rules, AFAIK (e.g. they may for example call b-c-f
and nothing else if no modifications were performed).

> Up front, I see no reason to keep any promises when that happens.

We should still obey the general rules about b-c-f and a-c-f.

>> The (1 22016) b-c-f is thus unbalanced when this happens.
> If this is really important (and I don't see why it would be),

Indeed, it doesn't look like a problem.

> Is implementation convenience the only argument for Stefan's variant?

Not just convenience but also "obviously correct", i.e. more maintainable.


        Stefan



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-08 22:24                                             ` Stefan Monnier
@ 2018-01-09  3:55                                               ` Eli Zaretskii
  2018-01-09 13:30                                                 ` Stefan Monnier
  0 siblings, 1 reply; 66+ messages in thread
From: Eli Zaretskii @ 2018-01-09  3:55 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: acm, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Mon, 08 Jan 2018 17:24:14 -0500
> Cc: Alan Mackenzie <acm@muc.de>, emacs-devel@gnu.org
> 
> > Is implementation convenience the only argument for Stefan's variant?
> 
> Not just convenience but also "obviously correct", i.e. more maintainable.

We obviously disagree about what's "correct" in this case, so this
argument doesn't convince me, exactly like my arguments didn't
convince you.

As for maintainability, I think this is beyond splitting hair, because
I cannot for the life of me see any difference in maintainability
between the two variants.



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-09  3:55                                               ` Eli Zaretskii
@ 2018-01-09 13:30                                                 ` Stefan Monnier
  2018-01-09 18:50                                                   ` Eli Zaretskii
  0 siblings, 1 reply; 66+ messages in thread
From: Stefan Monnier @ 2018-01-09 13:30 UTC (permalink / raw)
  To: emacs-devel

>> Not just convenience but also "obviously correct", i.e. more maintainable.
> We obviously disagree about what's "correct" in this case, so this
> argument doesn't convince me, exactly like my arguments didn't
> convince you.

Calling `del_range` is "obviously correct" because it's higher level and
guarantees we follow various rules, such as those of b-c-f/a-c-f.
In contract `del_range_2` is a lower-level function which requires more
care to use.  For example a maintainer such as myself would not notice
if a call to del_range_2 is missing a subsequent call to
signal_after_change and even less a call to update_compositions.

So using del_range is more obviously correct.

Having a single call to b-c-f/a-c-f rather than two is not a correctness
issue, only a "quality of implementation" issue.

> As for maintainability, I think this is beyond splitting hair, because
> I cannot for the life of me see any difference in maintainability
> between the two variants.

The fact that someone like Alan failed to notice the need for a call to
update_compositions is good enough evidence for me.

        Stefan

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-09 13:30                                                 ` Stefan Monnier
@ 2018-01-09 18:50                                                   ` Eli Zaretskii
  0 siblings, 0 replies; 66+ messages in thread
From: Eli Zaretskii @ 2018-01-09 18:50 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Tue, 09 Jan 2018 08:30:21 -0500
> 
> >> Not just convenience but also "obviously correct", i.e. more maintainable.
> > We obviously disagree about what's "correct" in this case, so this
> > argument doesn't convince me, exactly like my arguments didn't
> > convince you.
> 
> Calling `del_range` is "obviously correct" because it's higher level and
> guarantees we follow various rules, such as those of b-c-f/a-c-f.
> In contract `del_range_2` is a lower-level function which requires more
> care to use.

There are 3 other places in our sources that call del_range_2, so I
see no reason to fear of one more, or proclaim that "incorrect" or
"less obviously correct".

> > As for maintainability, I think this is beyond splitting hair, because
> > I cannot for the life of me see any difference in maintainability
> > between the two variants.
> 
> The fact that someone like Alan failed to notice the need for a call to
> update_compositions is good enough evidence for me.

No, it just is another manifestation of a known truism that patch
review is a very useful and efficient technique for catching bugs
before they happen.  We all make mistakes, and the fact that we do
doesn't necessarily mean the code we wrote is "incorrect" or
"difficult to maintain".



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-08 21:15                                           ` Eli Zaretskii
  2018-01-08 22:24                                             ` Stefan Monnier
@ 2018-01-09 19:53                                             ` Alan Mackenzie
  2018-01-09 20:05                                               ` Eli Zaretskii
  2018-01-09 20:07                                               ` Stefan Monnier
  1 sibling, 2 replies; 66+ messages in thread
From: Alan Mackenzie @ 2018-01-09 19:53 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

Hello, Eli.

On Mon, Jan 08, 2018 at 23:15:11 +0200, Eli Zaretskii wrote:
> > Date: Mon, 8 Jan 2018 19:24:15 +0000
> > Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
> > From: Alan Mackenzie <acm@muc.de>

> > > > There is also no a-c-f call if the decompression exits with an error.

> > > You mean, if the user quits?  That throws to top level, so it would be
> > > wrong to invoke any after-change hooks, and unwind_decompress will
> > > call the hooks for the partially uncompressed data.  Do we need more?

> > I was thinking more of when the compressed text is corrupt and the
> > decompression routines report an error.

> What do other primitives do when there's an error?  That was never
> reported in this discussion nor discussed, AFAICT.  Up front, I see no
> reason to keep any promises when that happens.

base64-decode-region first decodes into a mallocked area of store, and
if that has worked, inserts that area, then deletes the original.  This
gives two balanced pairs of b/a-c-f.  If there's an error in decoding,
there is neither buffer manipulation nor change hook calls.

base64-encode-region does the same, using a mallocked area, and with two
balanced pairs of b/a-c-f on success.

> > The (1 22016) b-c-f is thus unbalanced when this happens.

> If this is really important (and I don't see why it would be), you can
> add a call to after-change-hooks before unbind_to of the error return.

I think b/a-c-f calls should follow the rules whether a primitive
operation succeeds or fails.  This will enable operations partly done by
before-change-functions to be backed out of without leaving a buffer in
an inconsistent state.

> > I'm asking you to consider again having two pairs of hook calls in this
> > primitive (as, for example, base64-decode-region does).  That way we need
> > only signal the b-c-f for the deletion after the decompression has
> > worked, and we know we are going to follow through with the deleteion.  I
> > think an aborted decompression operation would also be easier to close
> > off with an a-c-f with this strategy.

> Is implementation convenience the only argument for Stefan's variant?
> If so, it doesn't convince me, as the difference is barely tangible.

The point is, that zlib-decompress-region is NOT an atomic change, and
trying to treat it like one will tie us in knots.

base64-de/encode-region ARE atomic from the buffer manipulation point of
view - the replacement is either wholly done or not done at all, and this
is known in advance.

zlib-decompress-region, by contrast, always writes the decompressed bit
into the buffer, then deletes a portion of buffer, but does not know in
advance which bit of buffer will be deleted.  (It depends on the success
of the decompression operation.)

The way I see it, to announce in advance that the original region will be
deleted (by calling b-c-f for it) is suboptimal.  If the decompression
fails, we need to balance that b-c-f with a "fake" a-c-f call.

There are several primitives which have more than one balanced b/a-c-f
call in execution: examples are the base64- ones, move-to-column (when it
replaces tabs), replace-buffer-contents, insert.  (If you are interested
enough, I can send you my test file.)

However, I think we'll agree that we've already spent enough time
debating this issue.  If you still say one b/a-c-f pair for a
(successful) zlib-decompress-region call, I will accept it, and fix the
b/a-c-f call of the unsuccessful case.

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-09 19:53                                             ` Alan Mackenzie
@ 2018-01-09 20:05                                               ` Eli Zaretskii
  2018-01-10 18:29                                                 ` Alan Mackenzie
  2018-01-09 20:07                                               ` Stefan Monnier
  1 sibling, 1 reply; 66+ messages in thread
From: Eli Zaretskii @ 2018-01-09 20:05 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: monnier, emacs-devel

> Date: Tue, 9 Jan 2018 19:53:57 +0000
> Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> However, I think we'll agree that we've already spent enough time
> debating this issue.  If you still say one b/a-c-f pair for a
> (successful) zlib-decompress-region call, I will accept it, and fix the
> b/a-c-f call of the unsuccessful case.

Yes, I still think that zlib-decompress-region is a replacement
primitive (and base64-decode-region is in error not behaving the
same).

Thanks.



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-09 19:53                                             ` Alan Mackenzie
  2018-01-09 20:05                                               ` Eli Zaretskii
@ 2018-01-09 20:07                                               ` Stefan Monnier
  2018-01-10 18:45                                                 ` Alan Mackenzie
  1 sibling, 1 reply; 66+ messages in thread
From: Stefan Monnier @ 2018-01-09 20:07 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Eli Zaretskii, emacs-devel

> The way I see it, to announce in advance that the original region will be
> deleted (by calling b-c-f for it) is suboptimal.

But not incorrect.

> If the decompression fails, we need to balance that b-c-f with
> a "fake" a-c-f call.

No, we don't.  It's still within the rules to announce an upcoming
change with b-c-f and then not to carry through (i.e. not make any
changes and not call a-c-f).


        Stefan



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-09 20:05                                               ` Eli Zaretskii
@ 2018-01-10 18:29                                                 ` Alan Mackenzie
  2018-01-12 16:40                                                   ` Alan Mackenzie
  0 siblings, 1 reply; 66+ messages in thread
From: Alan Mackenzie @ 2018-01-10 18:29 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

Hello, Eli.

On Tue, Jan 09, 2018 at 22:05:50 +0200, Eli Zaretskii wrote:
> > Date: Tue, 9 Jan 2018 19:53:57 +0000
> > Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
> > From: Alan Mackenzie <acm@muc.de>

> > However, I think we'll agree that we've already spent enough time
> > debating this issue.  If you still say one b/a-c-f pair for a
> > (successful) zlib-decompress-region call, I will accept it, and fix the
> > b/a-c-f call of the unsuccessful case.

> Yes, I still think that zlib-decompress-region is a replacement
> primitive (and base64-decode-region is in error not behaving the
> same).

You want me to fix it?  (Half joking.)

> Thanks.

OK, a patch, perhaps for the last time.  In a call to z-d-r, successful
or not, there will be exactly one b-c-f call and exactly one a-c-f call.
I think, for the failing case, this is neater than having a bare b-c-f
with nothing following it.



diff --git a/src/decompress.c b/src/decompress.c
index 41de6da1dd..6f75f821c4 100644
--- a/src/decompress.c
+++ b/src/decompress.c
@@ -24,6 +24,7 @@ along with GNU Emacs.  If not, see <https://www.gnu.org/licenses/>.  */
 
 #include "lisp.h"
 #include "buffer.h"
+#include "composite.h"
 
 #include <verify.h>
 
@@ -66,7 +67,7 @@ init_zlib_functions (void)
 \f
 struct decompress_unwind_data
 {
-  ptrdiff_t old_point, start, nbytes;
+  ptrdiff_t old_point, orig, start, nbytes;
   z_stream *stream;
 };
 
@@ -76,10 +77,19 @@ unwind_decompress (void *ddata)
   struct decompress_unwind_data *data = ddata;
   inflateEnd (data->stream);
 
-  /* Delete any uncompressed data already inserted on error.  */
+  /* Delete any uncompressed data already inserted on error, but
+     without calling the change hooks.  */
   if (data->start)
-    del_range (data->start, data->start + data->nbytes);
-
+    {
+      del_range_2 (data->start, data->start, /* byte, char offsets the same */
+                   data->start + data->nbytes, data->start + data->nbytes,
+                   0);
+      update_compositions (data->start, data->start, CHECK_HEAD);
+      /* "Balance" the before-change-functions call, which would
+         otherwise be left "hanging". */
+      signal_after_change (data->orig, data->start - data->orig,
+                           data->start - data->orig);
+    }
   /* Put point where it was, or if the buffer has shrunk because the
      compressed data is bigger than the uncompressed, at
      point-max.  */
@@ -141,6 +151,10 @@ This function can be called only in unibyte buffers.  */)
      the same.  */
   istart = XINT (start);
   iend = XINT (end);
+
+  /* Do the following before manipulating the gap. */
+  modify_text (istart, iend);
+
   move_gap_both (iend, iend);
 
   stream.zalloc = Z_NULL;
@@ -154,6 +168,7 @@ This function can be called only in unibyte buffers.  */)
   if (inflateInit2 (&stream, MAX_WBITS + 32) != Z_OK)
     return Qnil;
 
+  unwind_data.orig = istart;
   unwind_data.start = iend;
   unwind_data.stream = &stream;
   unwind_data.old_point = PT;
@@ -196,7 +211,11 @@ This function can be called only in unibyte buffers.  */)
   unwind_data.start = 0;
 
   /* Delete the compressed data.  */
-  del_range (istart, iend);
+  del_range_2 (istart, istart, /* byte and char offsets are the same. */
+               iend, iend, 0);
+
+  signal_after_change (istart, iend - istart, unwind_data.nbytes);
+  update_compositions (istart, istart, CHECK_HEAD);
 
   return unbind_to (count, Qt);
 }


-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply related	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-09 20:07                                               ` Stefan Monnier
@ 2018-01-10 18:45                                                 ` Alan Mackenzie
  2018-01-10 19:30                                                   ` Stefan Monnier
  0 siblings, 1 reply; 66+ messages in thread
From: Alan Mackenzie @ 2018-01-10 18:45 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, emacs-devel

Hello, Stefan.

On Tue, Jan 09, 2018 at 15:07:58 -0500, Stefan Monnier wrote:
> > The way I see it, to announce in advance that the original region will be
> > deleted (by calling b-c-f for it) is suboptimal.

> But not incorrect.

> > If the decompression fails, we need to balance that b-c-f with
> > a "fake" a-c-f call.

> No, we don't.  It's still within the rules to announce an upcoming
> change with b-c-f and then not to carry through (i.e. not make any
> changes and not call a-c-f).

It's debatable whether it's within the rules or not.  The rules, which
we so carefully crafted a few days ago, say, in part; "The arguments to
`before-change-functions' will enclose a region in which the individual
changes are made, ...".  There will never be any changes made in the
quasi-deleted region, so to leave it without a balanceing a-c-f call
could be construed as against the rules.

I'm not saying it's a sensible debate to have, though.  More to the
point is that the b-c-f call may have made changes to the buffer which
need to be undone.  (Somebody, sometime, is going to try calling this
primitive in a C++ Mode buffer just to see what happens.  What they
should see is no buffer change, particularly not in the text
properties.)

Although, with unbalanced b/a-c-f being permitted anyway, for a mode
whose b-c-f does make buffer changes, there'll have to be, say, an entry
in post-command-hook anyway, to catch these unbalanced calls.

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-10 18:45                                                 ` Alan Mackenzie
@ 2018-01-10 19:30                                                   ` Stefan Monnier
  2018-01-10 19:48                                                     ` Alan Mackenzie
  0 siblings, 1 reply; 66+ messages in thread
From: Stefan Monnier @ 2018-01-10 19:30 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Eli Zaretskii, emacs-devel

> It's debatable whether it's within the rules or not.  The rules, which
> we so carefully crafted a few days ago, say, in part; "The arguments to
> `before-change-functions' will enclose a region in which the individual
> changes are made, ...".  There will never be any changes made in the
> quasi-deleted region, so to leave it without a balanceing a-c-f call
> could be construed as against the rules.

If so we should clarify the rules to allow for it.  `upcase-region` will
call b-c-f before knowing whether any char will be upcased, and I think
we do want to allow that kind of behavior.


        Stefan



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-10 19:30                                                   ` Stefan Monnier
@ 2018-01-10 19:48                                                     ` Alan Mackenzie
  2018-01-10 20:33                                                       ` Stefan Monnier
  2018-01-10 22:06                                                       ` Clément Pit-Claudel
  0 siblings, 2 replies; 66+ messages in thread
From: Alan Mackenzie @ 2018-01-10 19:48 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, emacs-devel

Hello, Stefan.

On Wed, Jan 10, 2018 at 14:30:12 -0500, Stefan Monnier wrote:
> > It's debatable whether it's within the rules or not.  The rules, which
> > we so carefully crafted a few days ago, say, in part; "The arguments to
> > `before-change-functions' will enclose a region in which the individual
> > changes are made, ...".  There will never be any changes made in the
> > quasi-deleted region, so to leave it without a balanceing a-c-f call
> > could be construed as against the rules.

> If so we should clarify the rules to allow for it.

Why?  I mean, why should we allow this?  What does it gain us?

> `upcase-region` will call b-c-f before knowing whether any char will
> be upcased, and I think we do want to allow that kind of behavior.

At the moment, non-balancing b/a-c-f are used only in primitives which
can't change the length of the buffer - things like
translate-region-internal and upcase-region.  There might well be
advantages in keeping things that way, there surely are no
disadvantages.  Non-balanced change hooks necessitate special handling.

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-10 19:48                                                     ` Alan Mackenzie
@ 2018-01-10 20:33                                                       ` Stefan Monnier
  2018-01-10 21:03                                                         ` Alan Mackenzie
  2018-01-10 22:06                                                       ` Clément Pit-Claudel
  1 sibling, 1 reply; 66+ messages in thread
From: Stefan Monnier @ 2018-01-10 20:33 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Eli Zaretskii, emacs-devel

> Non-balanced change hooks necessitate special handling.

I can't imagine why it would be significantly more difficult to handle

    bcf bcf acf

than

    bcf acf acf

And I can't see why it would be easier to handle

    bcf bcf acf

only for the special case where the "missing" acf would have had the
same length as the previous bcf (i.e. comes from a primitive which
doesn't change the length of the buffer).


        Stefan



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-10 20:33                                                       ` Stefan Monnier
@ 2018-01-10 21:03                                                         ` Alan Mackenzie
  2018-01-11 13:36                                                           ` Stefan Monnier
  0 siblings, 1 reply; 66+ messages in thread
From: Alan Mackenzie @ 2018-01-10 21:03 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, emacs-devel

Hello, Stefan.

On Wed, Jan 10, 2018 at 15:33:16 -0500, Stefan Monnier wrote:
> > Non-balanced change hooks necessitate special handling.

> I can't imagine why it would be significantly more difficult to handle

>     bcf bcf acf

> than

>     bcf acf acf

> And I can't see why it would be easier to handle

>     bcf bcf acf

> only for the special case where the "missing" acf would have had the
> same length as the previous bcf (i.e. comes from a primitive which
> doesn't change the length of the buffer).

Again, what advantages (other than not changing existing C code) are
there in allowing non-balanced b/a-c-f at all?  The difference in run
time between

    bcf acf acf acf .... acf

(in the cases we currently do this) and

    bcf acf  bcf acf  bcf acf .... bcf acf

is surely negligible, and it is not at all clear that the latter (which
can have simpler hook functions) would be slower than the former (which
may need elaborate special handling).

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-10 19:48                                                     ` Alan Mackenzie
  2018-01-10 20:33                                                       ` Stefan Monnier
@ 2018-01-10 22:06                                                       ` Clément Pit-Claudel
  2018-01-10 22:20                                                         ` Alan Mackenzie
  1 sibling, 1 reply; 66+ messages in thread
From: Clément Pit-Claudel @ 2018-01-10 22:06 UTC (permalink / raw)
  To: Alan Mackenzie, Stefan Monnier; +Cc: Eli Zaretskii, emacs-devel

On 2018-01-10 14:48, Alan Mackenzie wrote:
> At the moment, non-balancing b/a-c-f are used only in primitives which
> can't change the length of the buffer - things like
> translate-region-internal and upcase-region.

Sorry if I missed previous parts of the discussion that make the following irrelevant: in a buffer containing "Straße" a call to (buffer-size) returns 6, and a call to (buffer-size) after C-x h M-x upcase-region returns 7 (because the text becomes STRASSE).  Does this contradict your assertion above? Does it matter?

(Additionally, and likely unrelatedly, the region boundaries after running upcase-region on that example seem wrong — they don't include the 'E').

Clément.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-10 22:06                                                       ` Clément Pit-Claudel
@ 2018-01-10 22:20                                                         ` Alan Mackenzie
  0 siblings, 0 replies; 66+ messages in thread
From: Alan Mackenzie @ 2018-01-10 22:20 UTC (permalink / raw)
  To: Clément Pit-Claudel; +Cc: Eli Zaretskii, Stefan Monnier, emacs-devel

Hello, Clément.

On Wed, Jan 10, 2018 at 17:06:57 -0500, Clément Pit-Claudel wrote:
> On 2018-01-10 14:48, Alan Mackenzie wrote:
> > At the moment, non-balancing b/a-c-f are used only in primitives which
> > can't change the length of the buffer - things like
> > translate-region-internal and upcase-region.

> Sorry if I missed previous parts of the discussion that make the
> following irrelevant: in a buffer containing "Straße" a call to
> (buffer-size) returns 6, and a call to (buffer-size) after C-x h M-x
> upcase-region returns 7 (because the text becomes STRASSE).  Does this
> contradict your assertion above? Does it matter?

Yes, it does contradict my assertion, and it doesn't matter much, except
to me, because it makes my arguments weaker. ;-(  But I ought to have
thought of that myself, seeing how often and how far I walk along the
things in daily life.

> (Additionally, and likely unrelatedly, the region boundaries after
> running upcase-region on that example seem wrong — they don't include
> the 'E').

That sounds like a bug indeed.  But it's too late now for me to look at
it at the moment.

> Clément.

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-10 21:03                                                         ` Alan Mackenzie
@ 2018-01-11 13:36                                                           ` Stefan Monnier
  2018-01-11 17:39                                                             ` Alan Mackenzie
  0 siblings, 1 reply; 66+ messages in thread
From: Stefan Monnier @ 2018-01-11 13:36 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Eli Zaretskii, emacs-devel

> is surely negligible, and it is not at all clear that the latter (which
> can have simpler hook functions) would be slower than the former (which
> may need elaborate special handling).

I'm very far from convinced about the "elaborate special handling".


        Stefan



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-11 13:36                                                           ` Stefan Monnier
@ 2018-01-11 17:39                                                             ` Alan Mackenzie
  2018-01-11 19:35                                                               ` Stefan Monnier
  0 siblings, 1 reply; 66+ messages in thread
From: Alan Mackenzie @ 2018-01-11 17:39 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, emacs-devel

Hello, Stefan.

On Thu, Jan 11, 2018 at 08:36:55 -0500, Stefan Monnier wrote:
> > is surely negligible, and it is not at all clear that the latter (which
> > can have simpler hook functions) would be slower than the former (which
> > may need elaborate special handling).

> I'm very far from convinced about the "elaborate special handling".

Well, you agree that there're no advantages to having unbalanced change
hooks, so what does it matter how elaborate or not the special handling
is?  It will be necessary.  If we converted all change hooks to being
balanced it would not be necessary.

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-11 17:39                                                             ` Alan Mackenzie
@ 2018-01-11 19:35                                                               ` Stefan Monnier
  2018-01-11 19:46                                                                 ` Alan Mackenzie
  0 siblings, 1 reply; 66+ messages in thread
From: Stefan Monnier @ 2018-01-11 19:35 UTC (permalink / raw)
  To: emacs-devel

>> I'm very far from convinced about the "elaborate special handling".
> Well, you agree that there're no advantages to having unbalanced change
> hooks,

No I don't.


        Stefan




^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-11 19:35                                                               ` Stefan Monnier
@ 2018-01-11 19:46                                                                 ` Alan Mackenzie
  2018-01-11 20:15                                                                   ` Stefan Monnier
  0 siblings, 1 reply; 66+ messages in thread
From: Alan Mackenzie @ 2018-01-11 19:46 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Hello, Stefan.

On Thu, Jan 11, 2018 at 14:35:06 -0500, Stefan Monnier wrote:
> >> I'm very far from convinced about the "elaborate special handling".
> > Well, you agree that there're no advantages to having unbalanced change
> > hooks,

> No I don't.

You don't?  What, then, might those advantages be, and in what
conditions would those advantages outweigh the known disadvantages in
allowing unbalanced change hooks?

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-11 19:46                                                                 ` Alan Mackenzie
@ 2018-01-11 20:15                                                                   ` Stefan Monnier
  2018-01-11 21:20                                                                     ` Alan Mackenzie
  0 siblings, 1 reply; 66+ messages in thread
From: Stefan Monnier @ 2018-01-11 20:15 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

>> >> I'm very far from convinced about the "elaborate special handling".
>> > Well, you agree that there're no advantages to having unbalanced change
>> > hooks,
>> No I don't.
> You don't?

How can you be surprised?  For several years now you've been complaining
about the lack of balance (which supposedly breaks cc-mode assumptions
or something) while I have been pushing back arguing that it's not a bug
(in itself: there are bugs, but the mere fact that it's not balanced is
not a bug).

> What, then, might those advantages be,

Implementation flexibility, of course (which apparently Eli doesn't
want to take advantage of very much).

> and in what conditions would those advantages outweigh the known
> disadvantages in allowing unbalanced change hooks?

What known disadvantages?

        Stefan

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-11 20:15                                                                   ` Stefan Monnier
@ 2018-01-11 21:20                                                                     ` Alan Mackenzie
  2018-01-11 23:42                                                                       ` Stefan Monnier
  0 siblings, 1 reply; 66+ messages in thread
From: Alan Mackenzie @ 2018-01-11 21:20 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Hello, Stefan.

On Thu, Jan 11, 2018 at 15:15:49 -0500, Stefan Monnier wrote:
> >> >> I'm very far from convinced about the "elaborate special handling".
> >> > Well, you agree that there're no advantages to having unbalanced change
> >> > hooks,
> >> No I don't.
> > You don't?

> How can you be surprised?

Not surprised.  I've just spent the last three posts, and quite a few
before that, trying to get you to tell me in detail what's desirable
about unbalanced change hooks.  I'd genuinely like to know, since I
couldn't (and still can't) see this.

[ .... ]

> > What, then, might those advantages be,

> Implementation flexibility, of course (which apparently Eli doesn't
> want to take advantage of very much).

What flexibility?  What can you do with unbalanced hooks that you can't
do when you constrain yourself to balance them?  Are you talking about
flexibility in writing primitives or in using them?  Can you give me an
example?

> > and in what conditions would those advantages outweigh the known
> > disadvantages in allowing unbalanced change hooks?

> What known disadvantages?

That b-c-f and a-c-f can't work smoothly together in partnership.
Instead, at the moment, to be rigorous, code must use non-nice artifices
to cope with the rare occurences of unbalanced change hook calls.  You
know CC Mode does this.  An example, in C++ Mode, is where the b-c-f
function removes balanced paren syntax-table text properties from
template delimiters when half of a pair is about to be deleted.  This
mechanism expects a matching a-c-f so as to put them back where needed.

Indeed, rigorous pairing of b-c-f and a-c-f would increase the
flexibility of programs using them, not decrease it.

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-11 21:20                                                                     ` Alan Mackenzie
@ 2018-01-11 23:42                                                                       ` Stefan Monnier
  2018-01-12 16:14                                                                         ` Alan Mackenzie
  0 siblings, 1 reply; 66+ messages in thread
From: Stefan Monnier @ 2018-01-11 23:42 UTC (permalink / raw)
  To: emacs-devel

>> Implementation flexibility, of course (which apparently Eli doesn't
>> want to take advantage of very much).
> What flexibility?

"Implementation flexibility" here means flexibility in
implementing primitives.

E.g. not needing to care about "what happens if I get an error after
I called b-c-f but before I actually modify the buffer"?

>> > and in what conditions would those advantages outweigh the known
>> > disadvantages in allowing unbalanced change hooks?
>> What known disadvantages?
> That b-c-f and a-c-f can't work smoothly together in partnership.

They can still work smoothly in partnership.

The assumptions and guarantees needed are slightly different but I still
haven't seen any example where the difference leads to a more complex
implementation of the b-c-f and a-c-f hooks.

> Instead, at the moment, to be rigorous, code must use non-nice artifices
> to cope with the rare occurences of unbalanced change hook calls.  You
> know CC Mode does this.  An example, in C++ Mode, is where the b-c-f
> function removes balanced paren syntax-table text properties from
> template delimiters when half of a pair is about to be deleted.  This
> mechanism expects a matching a-c-f so as to put them back where needed.

Yes, cc-mode has trouble because it was designed based on
balanced behavior.

If you had designed it based on correct assumptions you wouldn't have
removed the properties in the before part.  You'd instead have made this
removal lazily (i.e. in b-c-f just collect the info from the about-to-be
modified region (could be as simple as stashing (buffer-substring beg
end) or some buffer position indicating which template delimiters will need
to be re-analyzed), and then use it in the a-c-f).

I don't see any reason why the resulting system would be noticeably
more complex or less efficient.

Going from one to the other is extra work, but that's not the
fault of the system.

> Indeed, rigorous pairing of b-c-f and a-c-f would increase the
> flexibility of programs using them, not decrease it.

Yes, by definition.  But I still haven't seen any example where the
difference actually matters.

        Stefan

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-11 23:42                                                                       ` Stefan Monnier
@ 2018-01-12 16:14                                                                         ` Alan Mackenzie
  0 siblings, 0 replies; 66+ messages in thread
From: Alan Mackenzie @ 2018-01-12 16:14 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Hello, Stefan.

On Thu, Jan 11, 2018 at 18:42:00 -0500, Stefan Monnier wrote:
> >> Implementation flexibility, of course (which apparently Eli doesn't
> >> want to take advantage of very much).
> > What flexibility?

> "Implementation flexibility" here means flexibility in
> implementing primitives.

Which is probably not terribly important.  Few, if any, buffer changing
primitives have been implemented in the last 20 years, and it's
difficult to foresee that any new ones will be written in the next 20
years.

> E.g. not needing to care about "what happens if I get an error after
> I called b-c-f but before I actually modify the buffer"?

So the flexibility you gain by permitting non-balancing of the change
hooks is the ability to write unbalanced change hooks.  Somehow, I'm a
bit underwhelmed.

[ .... ]

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: Lisp primitives and their calling of the change hooks
  2018-01-10 18:29                                                 ` Alan Mackenzie
@ 2018-01-12 16:40                                                   ` Alan Mackenzie
  0 siblings, 0 replies; 66+ messages in thread
From: Alan Mackenzie @ 2018-01-12 16:40 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

Hello, Eli.

On Wed, Jan 10, 2018 at 18:29:09 +0000, Alan Mackenzie wrote:
> On Tue, Jan 09, 2018 at 22:05:50 +0200, Eli Zaretskii wrote:
> > > Date: Tue, 9 Jan 2018 19:53:57 +0000
> > > Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
> > > From: Alan Mackenzie <acm@muc.de>

[ .... ]

> OK, a patch, perhaps for the last time.  In a call to z-d-r, successful
> or not, there will be exactly one b-c-f call and exactly one a-c-f call.
> I think, for the failing case, this is neater than having a bare b-c-f
> with nothing following it.

[ Patch snipped. ]

I've committed the patch (which contained the
before/after-change-functions calls as agreed) to master.

I think we're finished here.  :-)

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 66+ messages in thread

end of thread, other threads:[~2018-01-12 16:40 UTC | newest]

Thread overview: 66+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-01-03 12:45 Lisp primitives and their calling of the change hooks Alan Mackenzie
2018-01-03 21:51 ` Stefan Monnier
2018-01-04 15:51   ` Alan Mackenzie
2018-01-04 18:16     ` Stefan Monnier
2018-01-04 21:11       ` Alan Mackenzie
2018-01-04 21:36         ` Stefan Monnier
2018-01-06 15:18           ` Alan Mackenzie
2018-01-06 15:51             ` Stefan Monnier
2018-01-06 16:18               ` Eli Zaretskii
2018-01-06 19:06                 ` Alan Mackenzie
2018-01-06 20:24               ` Alan Mackenzie
2018-01-07 11:36                 ` Alan Mackenzie
2018-01-07 11:49                   ` Eli Zaretskii
2018-01-07 12:08                     ` Alan Mackenzie
2018-01-07 13:56                       ` Alan Mackenzie
2018-01-07 15:21                         ` [SUSPECTED SPAM] " Stefan Monnier
2018-01-07 16:47                         ` Eli Zaretskii
2018-01-07 17:50                           ` Stefan Monnier
2018-01-07 17:58                             ` Eli Zaretskii
2018-01-07 19:04                               ` Stefan Monnier
2018-01-07 19:48                                 ` Alan Mackenzie
2018-01-07 19:58                                   ` Eli Zaretskii
2018-01-07 21:10                                     ` Alan Mackenzie
2018-01-08  3:41                                       ` Eli Zaretskii
2018-01-08 19:24                                         ` Alan Mackenzie
2018-01-08 21:15                                           ` Eli Zaretskii
2018-01-08 22:24                                             ` Stefan Monnier
2018-01-09  3:55                                               ` Eli Zaretskii
2018-01-09 13:30                                                 ` Stefan Monnier
2018-01-09 18:50                                                   ` Eli Zaretskii
2018-01-09 19:53                                             ` Alan Mackenzie
2018-01-09 20:05                                               ` Eli Zaretskii
2018-01-10 18:29                                                 ` Alan Mackenzie
2018-01-12 16:40                                                   ` Alan Mackenzie
2018-01-09 20:07                                               ` Stefan Monnier
2018-01-10 18:45                                                 ` Alan Mackenzie
2018-01-10 19:30                                                   ` Stefan Monnier
2018-01-10 19:48                                                     ` Alan Mackenzie
2018-01-10 20:33                                                       ` Stefan Monnier
2018-01-10 21:03                                                         ` Alan Mackenzie
2018-01-11 13:36                                                           ` Stefan Monnier
2018-01-11 17:39                                                             ` Alan Mackenzie
2018-01-11 19:35                                                               ` Stefan Monnier
2018-01-11 19:46                                                                 ` Alan Mackenzie
2018-01-11 20:15                                                                   ` Stefan Monnier
2018-01-11 21:20                                                                     ` Alan Mackenzie
2018-01-11 23:42                                                                       ` Stefan Monnier
2018-01-12 16:14                                                                         ` Alan Mackenzie
2018-01-10 22:06                                                       ` Clément Pit-Claudel
2018-01-10 22:20                                                         ` Alan Mackenzie
2018-01-08  4:29                                   ` Stefan Monnier
2018-01-07 17:54                           ` Alan Mackenzie
2018-01-07 18:05                             ` Eli Zaretskii
2018-01-05  6:55         ` Eli Zaretskii
2018-01-05 11:41           ` Alan Mackenzie
2018-01-05 13:00             ` Eli Zaretskii
2018-01-05 13:34               ` Alan Mackenzie
2018-01-05 14:08                 ` Eli Zaretskii
2018-01-05 15:54                   ` Alan Mackenzie
2018-01-05 16:50           ` Stefan Monnier
2018-01-05 17:38             ` Alan Mackenzie
2018-01-05 18:09               ` Stefan Monnier
2018-01-05 19:53             ` Eli Zaretskii
2018-01-05 22:28               ` Stefan Monnier
2018-01-06  9:05                 ` Eli Zaretskii
2018-01-06 15:26                   ` Stefan Monnier

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).