From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!.POSTED!not-for-mail
From: Alan Mackenzie <acm@muc.de>
Newsgroups: gmane.emacs.devel
Subject: Re: Unbalanced change hooks (part 2)
Date: Sun, 31 Jul 2016 17:28:04 +0000
Message-ID: <20160731172804.GD2205@acm.fritz.box>
References: <20160731121642.GB2205@acm.fritz.box>
 <83a8gxq288.fsf@gnu.org>
NNTP-Posting-Host: blaine.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
X-Trace: blaine.gmane.org 1469986168 14035 80.91.229.8 (31 Jul 2016 17:29:28 GMT)
X-Complaints-To: usenet@blaine.gmane.org
NNTP-Posting-Date: Sun, 31 Jul 2016 17:29:28 +0000 (UTC)
User-Agent: Mutt/1.5.24 (2015-08-30)
Cc: ofv@wanadoo.es, rcopley@gmail.com, emacs-devel@gnu.org
To: Eli Zaretskii <eliz@gnu.org>
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Sun Jul 31 19:29:12 2016
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Envelope-to: ged-emacs-devel@m.gmane.org
Original-Received: from lists.gnu.org ([208.118.235.17])
	by blaine.gmane.org with esmtp (Exim 4.84_2)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>)
	id 1bTuY3-0003Vc-PO
	for ged-emacs-devel@m.gmane.org; Sun, 31 Jul 2016 19:29:12 +0200
Original-Received: from localhost ([::1]:40297 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>)
	id 1bTuXp-0005HH-OJ
	for ged-emacs-devel@m.gmane.org; Sun, 31 Jul 2016 13:28:57 -0400
Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:51074)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from <acm@muc.de>)
	id 1bTuXi-0005HA-Vs
	for emacs-devel@gnu.org; Sun, 31 Jul 2016 13:28:52 -0400
Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <acm@muc.de>) id 1bTuXe-0007hZ-Ne
	for emacs-devel@gnu.org; Sun, 31 Jul 2016 13:28:49 -0400
Original-Received: from mail.muc.de ([193.149.48.3]:12112)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from <acm@muc.de>)
	id 1bTuXe-0007hL-Cw
	for emacs-devel@gnu.org; Sun, 31 Jul 2016 13:28:46 -0400
Original-Received: (qmail 308 invoked by uid 3782); 31 Jul 2016 17:28:45 -0000
Original-Received: from acm.muc.de (p548C7DF4.dip0.t-ipconnect.de [84.140.125.244]) by
	colin.muc.de (tmda-ofmipd) with ESMTP;
	Sun, 31 Jul 2016 19:28:43 +0200
Original-Received: (qmail 4542 invoked by uid 1000); 31 Jul 2016 17:28:04 -0000
Content-Disposition: inline
In-Reply-To: <83a8gxq288.fsf@gnu.org>
X-Delivery-Agent: TMDA/1.1.12 (Macallan)
X-Primary-Address: acm@muc.de
X-detected-operating-system: by eggs.gnu.org: FreeBSD 9.x
X-Received-From: 193.149.48.3
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/archive/html/emacs-devel/>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Original-Sender: "Emacs-devel" <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Xref: news.gmane.org gmane.emacs.devel:206270
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/206270>

Hello, Eli.

On Sun, Jul 31, 2016 at 06:03:19PM +0300, Eli Zaretskii wrote:
> > Date: Sun, 31 Jul 2016 12:16:42 +0000
> > From: Alan Mackenzie <acm@muc.de>
> > Cc: Óscar Fuentes <ofv@wanadoo.es>,
> > 	Richard Copley <rcopley@gmail.com>

> > I propose that the call to signal_before_change should be removed from
> > prepare_to_modify_buffer, and that it should be called explicitly from
> > the places it is needed.  This would allow bug #240[79]4 to be fixed.
 
> > Comments?

> Some.

> First, I don't agree with your conclusion that calls to
> before-change-functions and after-change-functions must always be
> balanced.  For starters, they aren't, and never have been.  Neither is
> it written anywhere that they should, and your interpretation of the
> ELisp manual seems to take the desired world for the real one.

The pertinent section of Elisp ("Change Hooks") says:

    These hook variables let you arrange to take notice of ALL changes
    in ALL buffers.  [My emphasis]

Your interpretation of that seems to be that each buffer change will
call _at least_ one of before-... and after-..., but not necessarily
both.  That doesn't seem sensible to me.

Further down, in `before_change_functions' (and we have the same thing
in `after_change_functions') we have:

    This variable holds a list of functions to call before any buffer
    modification.

I can't see any sensible interpretation of "any ... modification" other
than "each and every modification".  I've been relying on this for many
years.

Looked at another way, how can it be sensible to call just one of these
hooks?  Like, if you were redesigning Emacs from scratch, are there any
categories of buffer modifications in which you would call only
after-change-functions, and not before-c-f?

> My interpretation of these two variables is that they are two hooks
> provided for different kinds of features, because hooking into both
> should not generally be required.  It should suffice for an
> application that wants to react to a change in a buffer to hook into
> after-change-functions only.

No.  A program (such as CC Mode) reacts to a _change_, not merely to
what a buffer looks like after a change.  A buffer change loses
information, so in the general case, any required info needs to be
recorded by before-change-functions before it gets lost in the change,
so that it can be used by after-change-functions.

As a concrete example of this, suppose we have this in a buffer:

    #define foo() first line \
                  second line \
	          third line \
	          fourth line

, and the change consists of deleting the backslash on L2.
before-change-functions will have noted that all four lines are in the
"active area".  after-change-functions unassisted would merely see that
the "active area" consists of L1 and L2.  The info from before-change
enables after-change to operate on the entire four lines.

> But even if I agree, for the sake of argument, with your
> interpretation, let's put this issue into a proper perspective, okay?
> We have a single major mode (albeit a very important one, one that I
> use every day) that under very specific conditions apparently fails to
> discard its caches, ....

Being precise, two critical variables were dirty store from the previous
invocation of c-after-change, not having been set in the missing call to
before-change-functions.

> .... because its design assumed calls to these two hooks will always
> be balanced.  As a solution, you are proposing a change in a very
> low-level and delicate part of Emacs, which means significant changes
> to code that was basically unchanged for more than 20, if not 30
> years.

prepare_to_modify buffer has been modified lots of times in recent
years.  (I count six modifications since 2014-01.)

> This code is involved in almost every operation on buffer
> text, and so changes you propose will necessarily cause ripples all
> over Emacs.  Just look what the code you propose to change does:

>   . call barf-if-buffer-read-only
>   . set the buffer's redisplay flag
>   . affect the undo recording
>   . save the region when select-active-regions is non-nil
>   . sets deactivate-mark non-nil

I'm not proposing to change any of that at all.  What I'm proposing is
strictly limited, and could be accompished in two steps, as follows:

1. (Pure refactoring).  Remove the call to signal_before_change from
prepare_to_modify_buffer.  Insert an explicit call to s_b_c after each
invocation of p_t_m_b.  Restore the check on inhibit_modification_hooks
in s_b_c.

2. Where a call to signal_buffer_change is within an "if (prepare) ..."
construct, move it to after that construct, so that

    if (prepare)
      {
        prepare_to_modify_buffer (...);
	signal_after_change (...);
      }

would become

    if (prepare)
      prepare_to_modify_buffer (...);
    signal_after_change (...);

.

> Can you imagine how much Lisp code and basic features will be affected
> by calling this where we never did before?  All that for the benefit
> of a single major mode that fails in a single, albeit important use
> case?

> I'm sorry, but that makes very little sense to me.

Please reconsider carefully the concrete proposal I've made (above) and
judge whether it will really affect very much at all.  I can foresee
that nothing will happen except the desired invocations of
before-change-functions being made.

I'm not entirely sure that only CC Mode will be helped here.  The value
of before-change-functions in many major modes contains the function
syntax-ppss-flush-cache.  There is a good chance that some call of, say,
del_range_byte is going to leave the syntax-ppss cache stale and
invalid, a difficult situation to diagnose.

> So I suggest to take one or two steps back, and try to find a safer
> solution for the problem.  Doing so will also greatly enhance the
> chances of Emacs users to see the solution in the next Emacs release
> (whereas doing what you suggest will push it back to Emacs 26 or maybe
> even later).

> Questions:

> 1) Why does CC Mode need both hooks?

I think I've already answered this above.

> What does it do in each one of them, in the specific use case reported
> in the named bugs?

It should calculate the two variables c-new-BEG and c-new-END (which
bound the active region) in c-before-change, and these get modified in
c-after-change.  In the bug situation, these variables simply held stale
values from the previous c-after-change call (which was from
transpose-lines).  Specifically, c-after-change modified c-new-END by
subtracting old-len.  This happened to make c-new-END < c-new-BEG.  So a
subsequent re-search-forward invocation had the limit on the wrong side
of point and threw an error.

> And why isn't it enough to make only the change you proposed in part 1
> of your report?

I tried that, but it didn't work.  With the help of gdb, it was clear
that the missing before-change-functions came from an invocation of
del_range_byte with `prepare' false, rather than directly from
Finsert_file_contents.

> 2) Can this problem be fixed in CC Mode itself, without touching
> either insert-file-contents or insdel.c functions?

I honestly don't know.  That would require extensive redesign if it's
possible.  That would need something like creating a new cache to
hold the information for the entire buffer which is currently determined
in c-before-change.  A simple workaround would be to ignore (i.e. not
process) an out of sequence invocation of c-after-change, but that's
hardly ideal.  I suspect I'm going to have to do that anyway in
standalone CC Mode.

> 3) If the answer to 2) above is a categorical NO (but please provide
> enough arguments so we could be very sure it is that), then I _might_
> be convinced to consider changes to insert-file-contents only, and
> only for the case when both VISIT and REPLACE are non-nil.  That's
> because I think this combination is only used by revert-buffer and
> similar functionality (if someone knows about other callers that do
> this, please holler), ....

The actual operation was C-x C-f where the file was already visited in
Emacs, but had been modified outside of Emacs, so Finsert_file_contents
went through the code which attempts to preserve markers, etc.

> .... so the ripples will be not as wide as if we mess with insdel.c.
> (Still, even this possibility scares the living s**t out of me; it
> should do that to you as well.)  To this end, please tell which calls
> to del_range_byte in insert-file-contents are involved with this use
> case and cause trouble to CC Mode, and maybe we could change only
> those, in addition to the 'if' clause that guards the call to
> before-change-functions.

The del_range_byte call which I caught with gdb last night was, I think,
the one at fileio.c L4037.  I don't know the code well enough to say
that that's the only one which might trigger.

-- 
Alan Mackenzie (Nuremberg, Germany).