From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!.POSTED!not-for-mail
From: Eli Zaretskii <eliz@gnu.org>
Newsgroups: gmane.emacs.devel
Subject: Re: Unbalanced change hooks (part 2)
Date: Mon, 01 Aug 2016 16:07:50 +0300
Message-ID: <83shuoocwp.fsf@gnu.org>
References: <20160731121642.GB2205@acm.fritz.box> <83a8gxq288.fsf@gnu.org>
	<20160731172804.GD2205@acm.fritz.box> <834m75ptij.fsf@gnu.org>
	<20160731212635.GF2205@acm.fritz.box>
Reply-To: Eli Zaretskii <eliz@gnu.org>
NNTP-Posting-Host: blaine.gmane.org
X-Trace: blaine.gmane.org 1470056940 30192 80.91.229.8 (1 Aug 2016 13:09:00 GMT)
X-Complaints-To: usenet@blaine.gmane.org
NNTP-Posting-Date: Mon, 1 Aug 2016 13:09:00 +0000 (UTC)
Cc: ofv@wanadoo.es, rcopley@gmail.com, emacs-devel@gnu.org
To: Alan Mackenzie <acm@muc.de>
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Mon Aug 01 15:08:37 2016
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Envelope-to: ged-emacs-devel@m.gmane.org
Original-Received: from lists.gnu.org ([208.118.235.17])
	by blaine.gmane.org with esmtp (Exim 4.84_2)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>)
	id 1bUCxQ-0007OK-Vg
	for ged-emacs-devel@m.gmane.org; Mon, 01 Aug 2016 15:08:37 +0200
Original-Received: from localhost ([::1]:50184 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>)
	id 1bUCxM-0000EK-S9
	for ged-emacs-devel@m.gmane.org; Mon, 01 Aug 2016 09:08:32 -0400
Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:34742)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <eliz@gnu.org>) id 1bUCx1-00007E-61
	for emacs-devel@gnu.org; Mon, 01 Aug 2016 09:08:21 -0400
Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <eliz@gnu.org>) id 1bUCww-0004s4-P0
	for emacs-devel@gnu.org; Mon, 01 Aug 2016 09:08:10 -0400
Original-Received: from fencepost.gnu.org ([2001:4830:134:3::e]:43102)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from <eliz@gnu.org>)
	id 1bUCww-0004s0-Lg; Mon, 01 Aug 2016 09:08:06 -0400
Original-Received: from 84.94.185.246.cable.012.net.il ([84.94.185.246]:4731
	helo=home-c4e4a596f7)
	by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128)
	(Exim 4.82) (envelope-from <eliz@gnu.org>)
	id 1bUCwv-00083j-Ox; Mon, 01 Aug 2016 09:08:06 -0400
In-reply-to: <20160731212635.GF2205@acm.fritz.box> (message from Alan
	Mackenzie on Sun, 31 Jul 2016 21:26:35 +0000)
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-Received-From: 2001:4830:134:3::e
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/archive/html/emacs-devel/>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Original-Sender: "Emacs-devel" <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Xref: news.gmane.org gmane.emacs.devel:206303
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/206303>

> Date: Sun, 31 Jul 2016 21:26:35 +0000
> Cc: ofv@wanadoo.es, rcopley@gmail.com, emacs-devel@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> > IOW, what the code does, and has been doing for many moons, is in this
> > case a de-facto _definition_ of what should be done.
> 
> Extending that principle ad absurdum, Emacs has no bugs at all, since
> what it does is what it should do.

Which is true.  Of course, since Emacs developers are reasonable
people (with the possible exception of yours truly), the result is
also reasonable.

So I suggest that we abandon this semi-philosophical sub-thread, and
instead focus on the practical issue at hand.

> I'm in favour of fixing what is wrong, rather than implementing
> workarounds somewhere else.  Workarounds are rarely fully satisfactory.

I'm not suggesting a workaround.  I'm suggesting a fix for a design
flaw in CC Mode -- its assumption that before-change-functions and
after-change-functions are always invoked in balanced pairs.  CC Mode
must not depend on that assumption, because it's false.

> I think you would agree with me that having before- and after- hook
> calls rigorously paired would be the ideal state - you haven't attempted
> so far to argue that missing out a before-change-functions call is a
> good thing.

It might be ideal, but it's next to impossible in practice.  It would
prohibit or greatly complicate code that deals with buffer
modifications, because for various reasons that code might do the
changes piecemeal.

In practice, we need to make sure that every modification triggers at
least one call to before-change-functions and at least one call to
after-change-functions.  The numbers of these calls do not have to
match.

> Solving the problem in insdel.c is the Right Thing to do, because that
> is where the cause of the problem lies.  You assert that the change I
> propose would "destabilise Emacs for years to come", but you haven't
> provided any analysis of my proposed change to support that assertion.

I had my share of hacking that code and its callers.  It's impossible
to convey everything I learned while doing that in a discussion such
as this one.  I say more about this below.  If that's still not
enough, you will have to trust me on that.

> The hackers who built Emacs were and are of a higher quality than to
> build such instability into Emacs.

"Instability" here means that various features that expect certain
things to happen will instead see unexpected situations.  Emacs will
still work and be stable as a program (i.e. won't crash and probably
won't corrupt your buffers).  But various functions will exhibit more
or less subtle bugs and misfeatures.

> The code does fancy things comparing the file to be revisited with
> the current buffer contents, and only loading the minimum (in some
> sense) portion of the file, so as to preserve markers, or something like
> that.

It's an internal optimization in insert-file-contents that CC Mode
doesn't have to follow, if doing that is painful.  Of course, if you
can reliably invalidate only the cached values that refer to the
region actually replaced, it's better.  But it's only an optimization.

> Please believe me, this recording of info in c-before-change was not
> done just for its own sake.

I believe you.  Just don't assume that there will be a c-before-change
call for each c-after-change call, that's all.

Specifically, in the case in point, I believe that the "unbalanced"
call to c-after-change clearly tells you that text has been deleted,
so I think it should be enough for you to do the job of the "missing"
c-before-change call as well.

> > I said "basically unchanged".  Trivial refactoring doesn't count.  And
> > any changes in this function were in response to issues common to all
> > the callers.  The case in point isn't such a case.
> 
> What I am proposing is only a little more than trivial refactoring.

No, it's far from that.  It fundamentally changes the list of things
that are done in many operations that modify buffers in various ways,
and the order in which these things are done.  We shouldn't make such
changes at this point in Emacs development, unless we identify a clear
bug that affects all the callers of these functions.

This case is not such a case.  It is a case of a single Lisp package
that made incorrect assumptions about the way the modification hooks
are called.  So the right place to fix this is in that Lisp package.

> > >     if (prepare)
> > >       prepare_to_modify_buffer (...);
> > >     signal_after_change (...);
> > 
> > See, that's the part that scares me.  You are proposing to run code
> > where we didn't run it before.  Did you look at what
> > signal_after_change does?  It doesn't just call
> > after-change-functions, it does much more.
> 
> I'm proposing to run code designed to run before buffer changes before a
> buffer change.

You are confusingly using signal_after_change and signal_before_change
interchangeably, so there's probably quite a lot of misunderstanding.

If you meant this:

     if (prepare)
       prepare_to_modify_buffer (...);
     signal_before_change (...);

then I don't see how it could work, because prepare_to_modify_buffer
computes one of the arguments to signal_before_change, so calling the
latter without calling the former is not really a good idea.  (This is
just one example of the many subtleties hiding in these seemingly
simple, almost innocently looking functions we have in insdel.c.)

> I'm still trying to work out what they are, beyond a generalised
> opposition to any change at this level of the code.

Without a good reason, yeah, that's pretty much it.

It's not some paranoia, mind you: I've been working on fixing quite a
few subtle bugs which originated due to such changes for the last few
years, so I know what I'm talking about.  Please trust my experience
and the conclusions I drew from it.

> > We will have to agree to disagree about this.
> 
> OK.  How about I implement what I've described, and we try it out?  If
> it doesn't work, or it throws up non-trivial problems, it will be easy
> enough to revert.

Sorry, no.  What I'd like you to try instead is eliminate from CC Mode
the incorrect assumption about pairwise calls to before- and
after-change-functions.  That is the correct way towards solving this
problem.  It is unwise to waste your time (and mine) on trying to
solve this by changing how modification hooks are called, because
that's not where the problem is.

> Programs other than CC Mode use before-change-functions.  It would be
> more effort to construct an actual failure than simply to fix the
> problem.

But before-change-functions _are_ being called in the scenario of the
bugs we are discussing.  They just aren't balanced with calls to
after-change-functions, that's all.  See, in this fragment of
insert-file-contents:

      if (same_at_end != same_at_start)
	{
	  invalidate_buffer_caches (current_buffer,
				    BYTE_TO_CHAR (same_at_start),
				    same_at_end_charpos);
	  del_range_byte (same_at_start, same_at_end, 0);
	  temp = GPT;
	  eassert (same_at_start == GPT_BYTE);
	  same_at_start = GPT_BYTE;
	}
      else
	{
	  temp = same_at_end_charpos;
	}
      /* Insert from the file at the proper position.  */
      SET_PT_BOTH (temp, same_at_start);
      same_at_start_charpos
	= buf_bytepos_to_charpos (XBUFFER (conversion_buffer),
				  same_at_start - BEGV_BYTE
				  + BUF_BEG_BYTE (XBUFFER (conversion_buffer)));
      eassert (same_at_start_charpos == temp - (BEGV - BEG));
      inserted_chars
	= (buf_bytepos_to_charpos (XBUFFER (conversion_buffer),
				   same_at_start + inserted - BEGV_BYTE
				  + BUF_BEG_BYTE (XBUFFER (conversion_buffer)))
	   - same_at_start_charpos);
      /* This binding is to avoid ask-user-about-supersession-threat
	 being called in insert_from_buffer (via in
	 prepare_to_modify_buffer).  */
      specbind (intern ("buffer-file-name"), Qnil);
      insert_from_buffer (XBUFFER (conversion_buffer),
			  same_at_start_charpos, inserted_chars, 0);

the before-change-functions are called from insert_from_buffer.  So
it's not like the replacement would be missed by the
before-change-functions.

IOW, your problem is not that before-change-functions are not called
in this specific scenario to indicate the replacement.  Your problem
is that the number of calls to before-change-functions doesn't match
the number of calls to after-change-functions, because del_range_byte
calls after-change-functions one extra time.  And that's the problem
which needs to be fixed, and it should be fixed in CC Mode, because
assuming these calls are balanced is simply wrong.

> > > It should calculate the two variables c-new-BEG and c-new-END (which
> > > bound the active region) in c-before-change, and these get modified in
> > > c-after-change.
> > 
> > Sorry, I don't understand.  Why can't you compute these two values in
> > the c-after-change?
> 
> In the general case because information gets lost at a change; things
> like where the beginning of a statement was, where deleted template
> delimiters used to be, what the bounds of a macro or string used to be
> (before backslashes were deleted).

I'm sure CC Mode is well equipped to recalculate all of these if
needed.  It does that when a file is first visited.

> Or are you suggesting I implement a special after-change handler for the
> case when the before-change-functions was missing?

Could be.  I don't know CC Mode well enough to propose specific
solutions.

> This would be possible, but it would still be more work than
> ensuring that before-change-functions actually gets called.

As I show above, before-change-functions _do_ get called by
insert-file-contents in this scenario.

> > If you are going to have to do that anyway, why do we have to consider
> > such low-level changes?
> 
> Because a proper fix is the Right Thing to do.

Well, we clearly disagree about that.  I think there's no problem in
insdel.c or its callers, and the cause of this bug is the incorrect
assumption in CC Mode that before- and after-change-functions will be
called in balanced pairs.  So from my POV, TRT would be to eliminate
that assumption, which should make CC Mode more robust in the long
run, as I'm not sure this is the only case where that assumption
breaks.

> Reverting the buffer is merely the recipe to reproduce the bug.  The bug
> itself is c-before-change not being called.

See above: it _is_ being called, from insert_from_buffer.

> As I said, why not let me implement what I've suggested?  If it doesn't
> work well enough, I can revert it.

Because it will waste time and efforts on a way towards a dead end.

Thanks.