unofficial mirror of bug-gnu-emacs@gnu.org 
From: "Mattias Engdegård" <mattiase@acm.org>
To: Shigeru Fukaya <shigeru.fukaya@gmail.com>
Cc: 44861@debbugs.gnu.org
Subject: bug#44861: 27.1; [PATCH] signal in `replace-regexp-in-string'
Date: Wed, 25 Nov 2020 15:58:22 +0100
Message-ID: <97535AF5-D542-4267-A5A9-1483C32A61AC@acm.org>
In-Reply-To: <6F768DED-2E1B-4D06-A776-FFA162AC32AD@acm.org>

[-- Attachment #1: Type: text/plain, Size: 640 bytes --]

forcemerge 15107 44861
stop

Suggested patch attached. A small test suite for replace-regexp-in-string has already been pushed to master -- very rudimentary, but better than nothing -- and the patch amends it with some new relevant cases that didn't work before.

It is basically your patch but slightly optimised; it turned out that the function call and allocation overhead of the original patch made it a tad too expensive (a pity, because it was very neat). Now performance is about the same as before when the pattern contains no submatches, and slightly worse (less than 10% slower) with one submatch. That seems a fair price for correctness.
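
As a rough illustration of the idea, separate from the attached patch: re-running string-match on the isolated substring can give a different result than the match in the full string, because context-sensitive constructs such as \B or \` see different surroundings. The sketch below shifts the existing match data by the match start instead. The names demo-shift-match-data and demo-replace-at are hypothetical and exist only for this example; it handles a single match, not the full loop in replace-regexp-in-string.

```elisp
;;; Sketch only; `demo-shift-match-data' and `demo-replace-at' are
;;; hypothetical names, not functions from lisp/subr.el.

(defun demo-shift-match-data (offset)
  "Shift the current match data left by OFFSET.
Unmatched groups, recorded as nil, are left untouched."
  (let* ((md (match-data t))            ; positions as integers
         (m md))
    (while m
      (when (car m)
        (setcar m (- (car m) offset)))
      (setq m (cdr m)))
    (set-match-data md)))

(defun demo-replace-at (regexp rep string start)
  "Return the replacement text for the first match of REGEXP in STRING.
The search begins at START and REP is the replacement string.
Return nil if there is no match."
  (save-match-data
    (when (string-match regexp string start)
      (let ((mb (match-beginning 0))
            (me (match-end 0)))
        ;; Make the match data relative to the substring, rather than
        ;; re-running `string-match' on the substring.
        (demo-shift-match-data mb)
        (replace-match rep nil nil (substring string mb me))))))

;; "a\\B" matches the `a' at index 2 of "a aaaa" because another word
;; character follows it; matched against the bare substring "a", the
;; same regexp would not match at all.
(demo-replace-at "a\\B" "b" "a aaaa" 2)
```

Unlike this sketch, the patch reuses the match-data list between iterations to keep allocation down.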


[-- Attachment #2: 0001-Fix-replace-regexp-in-string-substring-match-data-tr.patch --]
[-- Type: application/octet-stream, Size: 2929 bytes --]

From 9bc8dc80be5cee517fa53e6b8f37881d4220f162 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org>
Date: Wed, 25 Nov 2020 15:32:08 +0100
Subject: [PATCH] Fix replace-regexp-in-string substring match data translation

For certain patterns, re-matching the same regexp on the matched
substring does not produce correctly translated match data
(bug#15107 and bug#44861).

Reported by Kevin Ryde and Shigeru Fukaya.

* lisp/subr.el (replace-regexp-in-string): Translate the match data
by explicit manipulation instead of trusting a call to string-match on
the matched string to do the job.
* test/lisp/subr-tests.el (subr-replace-regexp-in-string):
Add test cases.
---
 lisp/subr.el            | 17 ++++++++++++-----
 test/lisp/subr-tests.el |  6 +++++-
 2 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/lisp/subr.el b/lisp/subr.el
index 1fb0f9ab7e..0ee2199933 100644
--- a/lisp/subr.el
+++ b/lisp/subr.el
@@ -4537,7 +4537,7 @@ replace-regexp-in-string
   ;; might be reasonable to do so for long enough STRING.]
   (let ((l (length string))
 	(start (or start 0))
-	matches str mb me)
+	matches str mb me md)
     (save-match-data
       (while (and (< start l) (string-match regexp string start))
 	(setq mb (match-beginning 0)
@@ -4546,10 +4546,17 @@ replace-regexp-in-string
 	(when (= me mb) (setq me (min l (1+ mb))))
 	;; Generate a replacement for the matched substring.
 	;; Operate on only the substring to minimize string consing.
-	;; Set up match data for the substring for replacement;
-	;; presumably this is likely to be faster than munging the
-	;; match data directly in Lisp.
-	(string-match regexp (setq str (substring string mb me)))
+
+        ;; Translate the match data so that it applies to the matched substring.
+        (setq md (match-data nil md t))  ; Reuse list from previous match.
+        (let ((m md))
+          (while m
+            (when (car m)
+              (setcar m (- (car m) mb)))
+            (setq m (cdr m)))
+          (set-match-data md))
+
+        (setq str (substring string mb me))
 	(setq matches
 	      (cons (replace-match (if (stringp rep)
 				       rep
diff --git a/test/lisp/subr-tests.el b/test/lisp/subr-tests.el
index c77be511dc..67f7fc9749 100644
--- a/test/lisp/subr-tests.el
+++ b/test/lisp/subr-tests.el
@@ -545,7 +545,11 @@ subr-replace-regexp-in-string
                             (match-beginning 1) (match-end 1)))
                   "babbcaacabc")
                  "b<abbc,0,4,1,3>a<ac,0,2,1,1><abc,0,3,1,2>"))
-  )
+  ;; anchors (bug#15107, bug#44861)
+  (should (equal (replace-regexp-in-string "a\\B" "b" "a aaaa")
+                 "a bbba"))
+  (should (equal (replace-regexp-in-string "\\`\\|x" "z" "--xx--")
+                 "z--zz--")))
 
 (provide 'subr-tests)
 ;;; subr-tests.el ends here
-- 
2.21.1 (Apple Git-122.3)



Thread overview: 13+ messages
2020-11-25  4:02 bug#44861: 27.1; [PATCH] signal in `replace-regexp-in-string' Shigeru Fukaya
2020-11-25 10:58 ` Mattias Engdegård
2020-11-25 14:58   ` Mattias Engdegård [this message]
2020-11-25 21:39     ` Stefan Kangas
2020-11-26 12:57       ` Mattias Engdegård
2020-11-26 13:12         ` Lars Ingebrigtsen
2020-11-26 13:39           ` Mattias Engdegård
2020-11-26 14:03             ` Lars Ingebrigtsen
2020-11-26 14:54               ` Mattias Engdegård
2020-11-29 13:28               ` Basil L. Contovounesios
2020-11-26 13:43           ` Stefan Kangas
2020-11-26 14:03             ` Lars Ingebrigtsen
2020-11-26 14:41             ` Eli Zaretskii
