From: phillip.lord@newcastle.ac.uk (Phillip Lord)
Newsgroups: gmane.emacs.devel
Subject: Re: [Request for Mentor] subst-char-in-region
Date: Mon, 15 Dec 2014 12:15:10 +0000
Message-ID: <87a92pyur5.fsf@newcastle.ac.uk>
References: <87r3w5jdow.fsf@newcastle.ac.uk> <87bnn8j4c9.fsf@newcastle.ac.uk>
To: Stefan Monnier
Cc: emacs-devel@gnu.org
In-Reply-To: (Stefan Monnier's message of "Fri, 12 Dec 2014 11:17:22 -0500")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)

Stefan Monnier writes:

>>>> before(1,8)
>>>> after(1,4,3)
>>> That looks correct to me.
>> Why? The doc says "the positions of the beginning and end of the old
>> text to be changed" for before-change-function. But the text from 4 to 8
>> is not changed. As indeed the after-change-functions value says.
>
> Similarly if your subst-char-in-region changes "oaaao" to "xaaax" the
> aaa part isn't changed, so you could argue that we should call b-c-f and
> a-c-f twice (once per "o->x" change). But instead we call them on
> a "superset" of the actually changed text. A tighter superset is
> preferable, all other things being equal, but making a single call
> rather than many "smaller" calls also is preferable.

Sure, I understand that, and aggregating calls is fine. It's just that
the b-c-f call is wrong, or at least inconsistent with the a-c-f. For
me, being wrong isn't actually the problem; being inconsistent is.
>> Given that the change in this case is a substitution why is it not:
>> before(1,4)
>> after (1,4,3)
>> This could be calculated, of course, by subst-char-in-region, although
>> it would potentially require scanning the region twice (once to find
>> start and stop, once to actually make changes).
>
> Exactly, it doesn't seem worth scanning the region twice just to give
> a slightly tighter bound to the b-c-f.

It is possible to do with a single scan, actually. Scan from the start
to the first occurrence. Scan from the end to the last occurrence. Signal
b-c-f. Then scan between the first and last, making any changes, and
signal a-c-f. Add in a few if statements for edge cases and you're done.
The code is more complex, but running time is the same.

>> At the moment, yes, it does. I am keeping two buffers in sync by
>> transferring changes between them. It does this by removing the text in
>> "that buffer" between the before change values and replacing it with the
>> text in "this buffer" (it's slightly more complex than this, but that's
>> the basic idea).
>
> Why do it this way? Why not rely exclusively on the a-c-f?

Initially, because I wasn't sure that the a-c-f gave me all the
information that I needed. I thought it might, but I was confused by
the third argument, which is "length in bytes of the pre-change text".
Is "length in bytes" the same as "char position"? I presumed not.

The second reason is more complex. I am trying to keep two buffers in
sync; to do this, I need to be able to convert between a position in
this buffer and the corresponding position in that buffer. In the simple
case where they contain exactly the same text this is easy (it's the
same value), but this is not necessarily the case; the two buffers map
to each other, but do not need to have identical content. So, I need to
have a function to do this conversion. The difficulty is that in some
cases the conversion function uses the contents of this buffer to work
out the equivalent location in that buffer.

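Returning to the single-scan idea described earlier in this message
(scan forward to the first occurrence, scan backward to the last, then
signal and substitute in between), here is a minimal sketch of it. It is
written in Python rather than the C of the real subst-char-in-region,
and it is only a model: the function name subst_char_single_pass, the
0-based end-exclusive positions, and the two callbacks standing in for
b-c-f and a-c-f are all assumptions made for illustration, not Emacs
code.

```python
def subst_char_single_pass(text, start, end, old, new,
                           before_change, after_change):
    """Replace `old` with `new` in text[start:end], in place.

    `text` is a list of characters. `before_change(beg, end)` and
    `after_change(beg, end, old_len)` are hypothetical stand-ins for
    b-c-f and a-c-f; both receive the same tight bounds.
    """
    # Scan forward from the start to the first occurrence.
    first = start
    while first < end and text[first] != old:
        first += 1
    if first == end:
        return  # edge case: nothing to change, so neither hook runs

    # Scan backward from the end to the last occurrence. This stops at
    # `first` at the latest, because text[first] == old.
    last = end - 1
    while text[last] != old:
        last -= 1

    # Signal the "before" hook with the tight bounds, then make the
    # changes between first and last, then signal the "after" hook.
    before_change(first, last + 1)
    for i in range(first, last + 1):
        if text[i] == old:
            text[i] = new
    after_change(first, last + 1, last + 1 - first)


# Usage: "oaaao" followed by text that contains no "o".
calls = []
buf = list("oaaaobbb")
subst_char_single_pass(
    buf, 0, len(buf), "o", "x",
    lambda beg, end: calls.append(("before", beg, end)),
    lambda beg, end, old_len: calls.append(("after", beg, end, old_len)),
)
print("".join(buf))
print(calls)
```

Apart from the first and last occurrence, each character in the region
is examined once, so the cost matches a plain single pass, and the
before and after bounds agree by construction.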
When the b-c-f is called the two buffers are in sync, so this conversion
works. But when the a-c-f is called, the two buffers are not in sync
because we haven't percolated the change yet. So my conversion functions
tend to break. If I use the b-c-f values, then I can guarantee that the
two buffers are in sync. Unfortunately, that means I depend on b-c-f and
a-c-f being consistent.

> Try your code on a diff-mode buffer using commands such as
> diff-unified->context or diff-context->unified (hint, they use
> combine-after-change-calls).

Yes, these break also. Although I don't see why, since the documentation
says:

    If `before-change-functions' is non-nil, then calls to the
    after-change functions can't be deferred, so in that case this
    macro has no effect.

I had a quick look at combine-after-change-calls and
combine-after-change-execute and neither seems to check the value of
b-c-f; again, I've never learned C, so I could be wrong about the
latter.

I have got a partially working solution to this, which is that I can
check the value of this-command, and only use these values for commands
that I know to be safe (otherwise, I fall back to copying the whole
buffer). Unfortunately, this requires me to whitelist commands, which is
rather long-winded. Still, as this is mostly a performance optimisation
(albeit a significant one) and as self-insert-command is safe, this
might be the way forward.

>> In general, I have found that this works.
>
> I think you mean "usually" rather than "in general".

Usually, I find, you understand things better than I, and specifically
in this case!

Phil