From: Alan Mackenzie
Newsgroups: gmane.emacs.bugs
Subject: bug#34525: replace-regexp missing some matches
Date: Sun, 24 Feb 2019 21:00:58 +0000
Message-ID: <20190224210058.GB21808@ACM>
In-Reply-To: <83mumlnk8y.fsf@gnu.org>
To: Eli Zaretskii
Cc: daniel.lopez999@gmail.com, monnier@iro.umontreal.ca, 34525@debbugs.gnu.org
X-GNU-PR-Package: emacs,cc-mode

Hello, Eli.

On Sun, Feb 24, 2019 at 19:56:13 +0200, Eli Zaretskii wrote:
> > Date: Sun, 24 Feb 2019 17:37:46 +0000
> > Cc: daniel.lopez999@gmail.com, 34525@debbugs.gnu.org,
> >   Stefan Monnier
> > From: Alan Mackenzie

> > The query-replace word ends up calling re-search-forward.
> > Fre_search_forward ends up calling re_search_2 (which is called
> > rpl_re_search_2 in gdb. :-( ).

> > This calls re_match_2_internal, which scans through the compiled regexp,
> > "\<Bitmap\>".

> > Up till now, we have said yes to replace the first Bitmap with
> > SharedBitmap in query-replace.  Emacs is now seeking out the second
> > occurrence of Bitmap, which is on L69 of the OP's test file, and looks
> > like "Bitmap<", where the < has a syntax-table text property of (4 . 62),
> > an opening paren which matches ">".

> > re_match_2_internal finds its way to case wordbeg: to handle the "\<" of
> > the regexp.  It invokes UPDATE_SYNTAX_TABLE (charpos) to get the syntax
> > for the "B" it has already found.

> > Sadly, UPDATE_SYNTAX_TABLE sets its internal structure gl_state not for
> > the current contents of position 1948, but the contents of 1948 before
> > the change at the top of the buffer (Bitmap -> SharedBitmap) was made.
> > So it picks up the syntax for the "<" rather than the "B".

> Are you saying that we've modified buffer text, but
> re_match_2_internal still holds to a C pointer to buffer text before
> the change?

I don't think that's the case.  The relevant buffer pointers/sizes are
calculated (in search_buffer_re) as

    p1 = BEGV_ADDR;
    s1 = GPT_BYTE - BEGV_BYTE;
    p2 = GAP_END_ADDR;
    s2 = ZV_BYTE - GPT_BYTE;

each time before a search.

> If so, it's a simple matter of recomputing the C pointer using the
> buffer position after the change, right?
> We do such things in a few
> places, like coding.c, by recording the offset of the text before the
> change and reapplying it after the change.

> > I think the glitch is in the text property interval handling code.
> > It is as though after the replacement of Bitmap by SharedBitmap, the
> > interval starting positions have not been adjusted for the extra six
> > characters.

> If the code has variables that record C pointers to buffer text, those
> need to be updated after every change, or else they will become
> invalid.

> But I'm surprised we have such blatant bugs in such veteran code, ....

The bug was introduced sometime between 25.3 and 26.1.  I tried to bisect
the commits between 25.2 and 26.1, but couldn't, because autogen.sh was
broken in lots of the pertinent commits, so I couldn't build these Emacs
versions.

> .... so I'm probably missing something.  Can you describe the above
> again, this time showing the relevant code fragments and variables
> involved in this?

I'm afraid my gdb session is too long and chaotic to extract anything
meaningful out of.  I'll have to recreate it more purposefully, to get
these results.  Not tonight!

We'll get this sorted out.

> Thanks.

-- 
Alan Mackenzie (Nuremberg, Germany).
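
Illustration of the suspected failure mode: the hypothesis in the message
above, that interval start positions were not shifted by the six extra
characters, can be shown with a small stand-alone C sketch.  This is not
Emacs code.  struct cached_interval, replace_at, adjust_interval and
run_demo are hypothetical names invented for the illustration, and a flat
char array stands in for the buffer (no gap, no real interval tree).  It
only shows why a position cached before the Bitmap -> SharedBitmap
replacement ends up pointing at the "B" of the next "Bitmap<" instead of
at its "<".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a cached text-property interval: it only
   remembers the buffer position where the property starts.  */
struct cached_interval
{
  size_t start;
};

/* Replace the OLD_LEN bytes at POS in the NUL-terminated string BUF with
   NEW_TEXT, and return the change in length.  The caller must make sure
   BUF has room for the result.  */
static long
replace_at (char *buf, size_t pos, size_t old_len, const char *new_text)
{
  size_t new_len = strlen (new_text);
  size_t tail = strlen (buf + pos + old_len) + 1;	/* include the NUL */

  memmove (buf + pos + new_len, buf + pos + old_len, tail);
  memcpy (buf + pos, new_text, new_len);
  return (long) new_len - (long) old_len;
}

/* Shift a cached position that lies at or after the change.  */
static void
adjust_interval (struct cached_interval *iv, size_t change_pos, long delta)
{
  if (iv->start >= change_pos)
    iv->start = (size_t) ((long) iv->start + delta);
}

/* Walk through the scenario; return 0 if it behaves as described.  */
static int
run_demo (void)
{
  char buf[64] = "Bitmap a; Bitmap<int> b;";
  struct cached_interval iv;
  long delta;

  /* Cache the position of the "<" before any change is made.  */
  iv.start = (size_t) (strchr (buf, '<') - buf);

  delta = replace_at (buf, 0, strlen ("Bitmap"), "SharedBitmap");

  /* Stale: the cached position now holds the "B" of the second "Bitmap",
     because the word and the length change are both six characters.  */
  printf ("stale:    buf[%zu] = '%c'\n", iv.start, buf[iv.start]);
  if (buf[iv.start] != 'B')
    return 1;

  /* After shifting by the length change, it points at the "<" again.  */
  adjust_interval (&iv, 0, delta);
  printf ("adjusted: buf[%zu] = '%c'\n", iv.start, buf[iv.start]);
  return buf[iv.start] == '<' ? 0 : 2;
}
```

Calling run_demo () from a main function prints the character found at the
cached position before and after the adjustment.  Whether the real bug lies
in the interval code or in how gl_state is refreshed is still open, as the
message says; the sketch only models the symptom.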