From: phillip.lord@newcastle.ac.uk (Phillip Lord)
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: emacs-devel@gnu.org
Subject: Re: [Request for Mentor] subst-char-in-region
Date: Mon, 15 Dec 2014 12:15:10 +0000 [thread overview]
Message-ID: <87a92pyur5.fsf@newcastle.ac.uk> (raw)
In-Reply-To: <jwvk31wua18.fsf-monnier+emacs@gnu.org> (Stefan Monnier's message of "Fri, 12 Dec 2014 11:17:22 -0500")
Stefan Monnier <monnier@iro.umontreal.ca> writes:
>>>> before(1,8)
>>>> after(1,4,3)
>>> That looks correct to me.
>> Why? The doc says "the positions of the beginning and end of the old
>> text to be changed" for before-change-function. But the text from 4 to 8
>> is not changed. As indeed the after-change-functions value says.
>
> Similarly if your subst-char-in-region changes "oaaao" to "xaaax" the
> aaa part isn't changed, so you could argue that we should call b-c-f and
> a-c-f twice (once per "o->x" change). But instead we call them on
> a "superset" of the actually changed text. A tighter superset is
> preferable, all other things being equal, but making a single call
> rather than many "smaller" calls also is preferable.
Sure, I understand that, and aggregating calls is fine. It's just that
the b-c-f call is something that is wrong or at least inconsistent with
the a-c-f. Wrong for me isn't the problem, actually, inconsistent is.
>> Given that the change in this case is a substution why is it not:
>> before(1,4)
>> after (1,4,3)
>> This could be calculated, of course, by subst-char-in-region, although
>> it would potentially require scanning the region twice (once to find
>> start and stop, once to actually make changes).
>
> Exactly, it doesn't seem worth scanning the region twice just to give
> a slightly tighter bound to the b-c-f.
It is possible to do with a single scan, actually. Scan from the start
to the first occurance. Scan from the end to the last occurance. Signal
b-c-f. Then scan between the first and last, making any changes, and
signal a-c-f. Add in a few if statements for edge cases and you're done.
The code is more complex, but running time is the same.
>> At the moment, yes, it does. I am keeping two buffers in sync by
>> transfering changes between them. It does this by removing the text in
>> "that buffer" between the before change values and replacing it with the
>> text in "this buffer" (it's slightly more complex than this, but that's
>> the basic idea).
>
> Why do it this way? Why not rely exclusively on the a-c-f?
Initially, because I wasn't sure that the a-c-f gave me all the
information that I needed. I thought it might do, but I was confused by
the third argument which is "length in bytes of the pre-change text". Is
"length in bytes" the same as "char position". I presumed note.
The second reason is more complex. I am trying to keep two buffers in
sync; to do this, I need to be able to convert between the position in
that buffer for the position in that buffer. In the simple case where
they contain exactly the same text this is easy (it's the same value),
but this is not necessarily the case; the two buffers map to each other,
but do not need to have identical content. So, I need to have a function
to do this conversion.
The difficulty is that in some cases the conversion function uses the
contents of this buffer to work out the equivalent location in that
buffer. When the b-c-f is called the two buffers are in sync, so this
conversion works. But when the a-c-f is called, the two buffers are not
in sync because we haven't percolated the change yet. So my conversion
functions tend to break. If I use the b-c-f values, then I can guarantee
the state of the two buffers is in-sync. Unfortunately, I depend on
b-c-f and a-c-f being consistent.
> Try your code on a diff-mode buffer using commands such as
> diff-unified->context or diff-context->unified (hint, they use
> combine-after-change-calls).
Yes, these break also. Although, I don't see why, since the
documentation says....
If `before-change-functions' is non-nil, then calls to the after-change
functions can't be deferred, so in that case this macro has no effect.
I had a quick look at combine-after-change-calls and
combine-after-change-execute and neither seem to check the value of
b-c-f; again, I've never learned C, so I could be wrong about the
latter.
I have got a partially working solution to this -- which is I can check
the value of this-command, on only use these values on this-command's
that I know to be safe (otherwise, I fall back to copying the whole
buffer). Unfortunately, this requires me to white list commands--rather
long winded, although this as this is mostly a performance optimisation
(albeit a significant one) and as self-insert-command is safe this might
be the way forward.
>> In general, I have found that this works.
>
> I think you mean "usually" rather than "in general".
Usually, I find, you understand things better than I, and specifically
in this case!
Phil
next prev parent reply other threads:[~2014-12-15 12:15 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-12-12 11:43 [Request for Mentor] subst-char-in-region Phillip Lord
2014-12-12 14:35 ` Stefan Monnier
2014-12-12 15:05 ` Phillip Lord
2014-12-12 16:17 ` Stefan Monnier
2014-12-15 12:15 ` Phillip Lord [this message]
2014-12-15 14:29 ` Stefan Monnier
2014-12-16 11:30 ` Phillip Lord
2014-12-16 14:07 ` Stefan Monnier
2014-12-16 16:05 ` Phillip Lord
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://www.gnu.org/software/emacs/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87a92pyur5.fsf@newcastle.ac.uk \
--to=phillip.lord@newcastle.ac.uk \
--cc=emacs-devel@gnu.org \
--cc=monnier@iro.umontreal.ca \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).