From: Alan Mackenzie <acm@muc.de>
To: Eli Zaretskii <eliz@gnu.org>
Cc: daniel.lopez999@gmail.com, monnier@iro.umontreal.ca,
34525@debbugs.gnu.org
Subject: bug#34525: replace-regexp missing some matches
Date: Sun, 24 Feb 2019 21:00:58 +0000
Message-ID: <20190224210058.GB21808@ACM>
In-Reply-To: <83mumlnk8y.fsf@gnu.org>
Hello, Eli.
On Sun, Feb 24, 2019 at 19:56:13 +0200, Eli Zaretskii wrote:
> > Date: Sun, 24 Feb 2019 17:37:46 +0000
> > Cc: daniel.lopez999@gmail.com, 34525@debbugs.gnu.org,
> > Stefan Monnier <monnier@iro.umontreal.ca>
> > From: Alan Mackenzie <acm@muc.de>
> > The query-replace word ends up calling re-search-forward.
> > Fre_search_forward ends up calling re_search_2 (which is called
> > rpl_re_search_2 in gdb. :-( ).
> > This calls re_match_2_internal, which scans through the compiled regexp,
> > "\<Bitmap\>".
> > Up till now, we have said yes to replace the first Bitmap with
> > SharedBitmap in query-replace. Emacs is now seeking out the second
> > occurrence of Bitmap, which is on L69 of the OP's test file, and looks
> > like "Bitmap<", where the < has a syntax-table text property of (4 . 62),
> > an opening paren which matches ">".
> > re_match_2_internal finds its way to case wordbeg: to handle the "\<" of
> > the regexp. It invokes UPDATE_SYNTAX_TABLE (charpos) to get the syntax
> > for the "B" it has already found.
> > Sadly, UPDATE_SYNTAX_TABLE sets its internal structure gl_state not for
> > the current contents of position 1948, but the contents of 1948 before
> > the change at the top of the buffer (Bitmap -> SharedBitmap) was made.
> > So it picks up the syntax for the "<" rather than the "B".
> Are you saying that we've modified buffer text, but
> re_match_2_internal still holds to a C pointer to buffer text before
> the change?
I don't think that's the case. The relevant buffer pointers/sizes are
calculated (in search_buffer_re) as
    p1 = BEGV_ADDR;
    s1 = GPT_BYTE - BEGV_BYTE;
    p2 = GAP_END_ADDR;
    s2 = ZV_BYTE - GPT_BYTE;
each time before a search.
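To make the two-segment view concrete, here is a minimal sketch of how a
gap buffer hands the matcher two (pointer, size) pairs. It is a toy model,
not Emacs code: struct toy_buffer, struct segments, toy_segments and
toy_char_at are hypothetical names, and narrowing is ignored, so the
accessible region is assumed to start at the first byte of the text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy gap buffer; NOT the real Emacs struct buffer.  */
struct toy_buffer
{
  char *text;        /* start of the allocation */
  size_t gpt;        /* number of text bytes before the gap */
  size_t gap_size;   /* number of unused bytes in the gap */
  size_t z;          /* total number of text bytes, gap excluded */
};

/* The two contiguous pieces of text on either side of the gap.  */
struct segments
{
  const char *p1; size_t s1;
  const char *p2; size_t s2;
};

/* Recompute both segments from the buffer's current state.  */
static struct segments
toy_segments (const struct toy_buffer *b)
{
  struct segments seg;
  assert (b->gpt <= b->z);
  seg.p1 = b->text;                          /* compare BEGV_ADDR */
  seg.s1 = b->gpt;                           /* compare GPT_BYTE - BEGV_BYTE */
  seg.p2 = b->text + b->gpt + b->gap_size;   /* compare GAP_END_ADDR */
  seg.s2 = b->z - b->gpt;                    /* compare ZV_BYTE - GPT_BYTE */
  return seg;
}

/* Fetch the byte at 0-based text offset I, stepping over the gap.  */
static char
toy_char_at (const struct toy_buffer *b, size_t i)
{
  struct segments seg = toy_segments (b);
  assert (i < seg.s1 + seg.s2);
  return i < seg.s1 ? seg.p1[i] : seg.p2[i - seg.s1];
}
```

Because the segments are derived afresh from the buffer state on each
call, a replacement made between two searches cannot leave them pointing
at the old text, which is the reason I doubt the text pointers themselves
are stale.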
> If so, it's a simple matter of recomputing the C pointer using the
> buffer position after the change, right? We do such things in a few
> places, like coding.c, by recording the offset of the text before the
> change and reapplying it after the change.
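The offset technique described above can be sketched roughly as follows.
This is only an illustration under stated assumptions, not code taken from
coding.c: struct toy_text, toy_insert and toy_insert_keeping are
hypothetical names, and the text lives in a plain NUL-terminated heap
string rather than a gap buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy text store: NUL-terminated, LEN excludes the NUL.  */
struct toy_text
{
  char *data;
  size_t len;
};

/* Insert S at offset POS.  realloc may move DATA, so every raw pointer
   into the old text must be treated as invalid afterwards.
   Returns 0 on success, -1 on allocation failure.  */
static int
toy_insert (struct toy_text *t, size_t pos, const char *s)
{
  size_t n = strlen (s);
  assert (pos <= t->len);
  char *grown = realloc (t->data, t->len + n + 1);
  if (!grown)
    return -1;
  /* Shift the tail, including the trailing NUL, to make room.  */
  memmove (grown + pos + n, grown + pos, t->len - pos + 1);
  memcpy (grown + pos, s, n);
  t->data = grown;
  t->len += n;
  return 0;
}

/* Insert S at POS while keeping *CURSOR on the same character:
   record its offset before the change, reapply it afterwards.  */
static int
toy_insert_keeping (struct toy_text *t, size_t pos, const char *s,
                    char **cursor)
{
  size_t offset = (size_t) (*cursor - t->data);   /* record */
  assert (offset <= t->len);
  if (toy_insert (t, pos, s) != 0)
    return -1;
  if (offset >= pos)
    offset += strlen (s);       /* text at or after POS moved right */
  *cursor = t->data + offset;   /* reapply against the new base */
  return 0;
}
```

The point of the sketch is that the pointer is never carried across the
modification; only the offset is, and it is shifted when the insertion
lands at or before it.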
> > I think the glitch is in the text property interval handling code.
> > It is as though after the replacement of Bitmap by SharedBitmap, the
> > interval starting positions have not been adjusted for the extra six
> > characters.
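A toy model of that hypothesis, purely for illustration: it is not the
Emacs interval tree, and struct toy_interval, toy_syntax_at and
toy_adjust_for_insert are hypothetical names. The sketch assumes the "<"
sat at position 1948 before the replacement, which is my reading of the
description above, and uses syntax codes 2 (word) and 4 (open paren).

```c
#include <assert.h>
#include <stddef.h>

/* Syntax class codes as used in raw syntax descriptors.  */
enum { TOY_WORD = 2, TOY_OPEN = 4 };

/* Hypothetical record: positions [START, END) carry a syntax-table
   text property giving them the class SYNTAX.  */
struct toy_interval
{
  size_t start;
  size_t end;
  int syntax;
};

/* Return the syntax class at POS, or FALLBACK when no interval
   covers POS (i.e. the ordinary syntax table applies).  */
static int
toy_syntax_at (const struct toy_interval *iv, size_t n, size_t pos,
               int fallback)
{
  for (size_t i = 0; i < n; i++)
    if (iv[i].start <= pos && pos < iv[i].end)
      return iv[i].syntax;
  return fallback;
}

/* Shift interval boundaries after LEN characters are inserted at POS.
   Skipping this step is what leaves the lookup above stale.  */
static void
toy_adjust_for_insert (struct toy_interval *iv, size_t n, size_t pos,
                       size_t len)
{
  for (size_t i = 0; i < n; i++)
    {
      if (iv[i].start >= pos)
        iv[i].start += len;
      if (iv[i].end > pos)
        iv[i].end += len;
    }
}
```

With a single interval covering 1948 and six characters inserted near the
top of the buffer, a lookup at 1948 made before toy_adjust_for_insert runs
still reports the open-paren class, although the character now at that
position is the "B". After the adjustment the interval covers 1954 and the
lookup at 1948 falls back to word syntax.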
> If the code has variables that record C pointers to buffer text, those
> need to be updated after every change, or else they will become
> invalid.
> But I'm surprised we have such blatant bugs in such veteran code, ....
The bug was introduced sometime between 25.3 and 26.1. I tried to bisect
the commits between 25.2 and 26.1, but couldn't, because autogen.sh was
broken in lots of the pertinent commits, so I couldn't build these Emacs
versions.
> .... so I'm probably missing something. Can you describe the above
> again, this time showing the relevant code fragments and variables
> involved in this?
I'm afraid my gdb session is too long and chaotic to extract anything
meaningful out of. I'll have to recreate it more purposefully, to get
these results. Not tonight!
We'll get this sorted out.
> Thanks.
--
Alan Mackenzie (Nuremberg, Germany).