unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#57733: 29.0.50; Fix "Invalid search bound (wrong side of point)" error in replace-*-in-region
@ 2022-09-11 15:59 Kévin Le Gouguec
  2022-09-11 16:17 ` Lars Ingebrigtsen
  0 siblings, 1 reply; 4+ messages in thread
From: Kévin Le Gouguec @ 2022-09-11 15:59 UTC (permalink / raw)
  To: 57733

[-- Attachment #1: Type: text/plain, Size: 2523 bytes --]

The replace-{string,regexp}-in-region functions can throw an error if
the replacement text is large compared to the replaced text: (point)
might move beyond END, which means the underlying {,re-}search-forward
functions are called with BOUND "on the wrong side of point".

Recipe from emacs -Q:

1. Open *scratch* and type:
(with-temp-buffer
  (insert "foo bar baz")
  (replace-regexp-in-region
   "ba." "quux corge grault" (point-min) (point-max)))
2. C-j

Backtrace attached, with suggested patch (including tests) to update END
to reflect the displacement after each replacement.

I'm assuming these are the semantics we prefer, since users can't call
these functions with (> end (point-max)) to guard against that error,
but we could also just bail as soon as (> (point) end).  That would
prevent the error, but it could cause us to return early and miss some
matches.

The patch does change the semantics in the opposite situation (and the
tests check for the new behaviour): when the replacement is shorter than
the original text, it is currently possible for matches that were
originally beyond END to become reachable; the patch prevents that.  I'm
assuming that's TRT because if users say "replace no further than
THERE", they're probably talking about a point of interest (end of a
line, a function…) rather than the numeric value of THERE.


If we want this change to go in: should we call it out in the docstrings
and/or "(elisp) Search and Replace" and/or in NEWS?

FWIW: I've seen no regression with a full "make check", but I've just
written the patch, so I haven't had the time to live with it and see
what breaks.


In GNU Emacs 29.0.50 (build 1, x86_64-pc-linux-gnu, GTK+ Version
 3.24.34, cairo version 1.16.0) of 2022-09-10 built on amdahl30
Repository revision: 433fc8bebfef40130bf62f17e5349560b62e566e
Repository branch: master
Windowing system distributor 'The X.Org Foundation', version 11.0.12101004
System Description: openSUSE Tumbleweed

Configured using:
 'configure --with-cairo --with-gconf --with-sqlite3 --with-xinput2'

Configured features:
ACL CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GPM GSETTINGS HARFBUZZ JPEG
JSON LCMS2 LIBOTF LIBSELINUX LIBSYSTEMD LIBXML2 M17N_FLT MODULES NOTIFY
INOTIFY PDUMPER PNG RSVG SECCOMP SOUND SQLITE3 THREADS TIFF
TOOLKIT_SCROLL_BARS WEBP X11 XDBE XIM XINPUT2 XPM GTK3 ZLIB

Important settings:
  value of $LANG: en_US.UTF-8
  value of $XMODIFIERS: @im=local
  locale-coding-system: utf-8-unix


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Update-search-bound-while-replacing-matches-in-repla.patch --]
[-- Type: text/x-patch, Size: 4218 bytes --]

From f69158a344e5c95b0dc788bcaafcf87eda21a421 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?K=C3=A9vin=20Le=20Gouguec?= <kevin.legouguec@gmail.com>
Date: Sun, 11 Sep 2022 16:15:38 +0200
Subject: [PATCH] Update search bound while replacing matches in
 replace-*-in-region

Not doing so can lead to two surprising results:

(1) when REPLACEMENT is much larger than the original text, point
might move beyond END, which causes the search functions to error out;

(2) when REPLACEMENT is much shorter than the original text, the
search functions might find matches that the user did not intend, as
they were beyond END before the replacements happened.

* lisp/subr.el (replace-string-in-region, replace-regexp-in-region):
Update END while replacing matches.
* test/lisp/subr-tests.el (test-replace-string-in-region): Add
regression tests.
---
 lisp/subr.el            | 18 ++++++++++++------
 test/lisp/subr-tests.el | 32 ++++++++++++++++++++++++++++++--
 2 files changed, 42 insertions(+), 8 deletions(-)

diff --git a/lisp/subr.el b/lisp/subr.el
index 686189e69b..10b3bdcd8d 100644
--- a/lisp/subr.el
+++ b/lisp/subr.el
@@ -4220,12 +4220,15 @@ replace-string-in-region
     (setq end (point-max)))
   (save-excursion
     (let ((matches 0)
-          (case-fold-search nil))
+          (case-fold-search nil)
+          end-of-match)
       (goto-char start)
-      (while (search-forward string end t)
+      (while (setq end-of-match (search-forward string end t))
         (delete-region (match-beginning 0) (match-end 0))
         (insert replacement)
-        (setq matches (1+ matches)))
+        (setq matches (1+ matches)
+              end (+ end
+                     (- (point) end-of-match))))
       (and (not (zerop matches))
            matches))))
 
@@ -4255,11 +4258,14 @@ replace-regexp-in-region
     (setq end (point-max)))
   (save-excursion
     (let ((matches 0)
-          (case-fold-search nil))
+          (case-fold-search nil)
+          end-of-match)
       (goto-char start)
-      (while (re-search-forward regexp end t)
+      (while (setq end-of-match (re-search-forward regexp end t))
         (replace-match replacement t)
-        (setq matches (1+ matches)))
+        (setq matches (1+ matches)
+              end (+ end
+                     (- (point) end-of-match))))
       (and (not (zerop matches))
            matches))))
 
diff --git a/test/lisp/subr-tests.el b/test/lisp/subr-tests.el
index 4310b7291a..3011713210 100644
--- a/test/lisp/subr-tests.el
+++ b/test/lisp/subr-tests.el
@@ -968,7 +968,21 @@ test-replace-string-in-region
     (insert "Foo bar zot foobar")
     (should (= (replace-string-in-region "Foo" "new" (point-min))
                1))
-    (should (equal (buffer-string) "new bar zot foobar"))))
+    (should (equal (buffer-string) "new bar zot foobar")))
+
+  (with-temp-buffer
+    (insert "foo bar baz")
+    (should (= (replace-string-in-region "ba" "quux corge grault" (point-min))
+               2))
+    (should (equal (buffer-string)
+                    "foo quux corge graultr quux corge graultz")))
+
+  (with-temp-buffer
+    (insert "foo bar bar")
+    (should (= (replace-string-in-region " bar" "" (point-min) 8)
+               1))
+    (should (equal (buffer-string)
+                    "foo bar"))))
 
 (ert-deftest test-replace-regexp-in-region ()
   (with-temp-buffer
@@ -991,7 +1005,21 @@ test-replace-regexp-in-region
     (insert "Foo bar zot foobar")
     (should (= (replace-regexp-in-region "Fo+" "new" (point-min))
                1))
-    (should (equal (buffer-string) "new bar zot foobar"))))
+    (should (equal (buffer-string) "new bar zot foobar")))
+
+  (with-temp-buffer
+    (insert "foo bar baz")
+    (should (= (replace-regexp-in-region "ba." "quux corge grault" (point-min))
+               2))
+    (should (equal (buffer-string)
+                    "foo quux corge grault quux corge grault")))
+
+  (with-temp-buffer
+    (insert "foo bar bar")
+    (should (= (replace-regexp-in-region " bar" "" (point-min) 8)
+               1))
+    (should (equal (buffer-string)
+                    "foo bar"))))
 
 (ert-deftest test-with-existing-directory ()
   (let ((dir (make-temp-name "/tmp/not-exist-")))
-- 
2.37.3


[-- Attachment #3: invalid-search-bound.backtrace --]
[-- Type: text/plain, Size: 1754 bytes --]

Debugger entered--Lisp error: (wrong-type-argument sequencep 1)
  replace-regexp-in-string("ba." "quux corge grault" 1 12)
  (progn (insert "foo bar baz") (replace-regexp-in-string "ba." "quux corge grault" (point-min) (point-max)))
  (unwind-protect (progn (insert "foo bar baz") (replace-regexp-in-string "ba." "quux corge grault" (point-min) (point-max))) (and (buffer-name temp-buffer) (kill-buffer temp-buffer)))
  (save-current-buffer (set-buffer temp-buffer) (unwind-protect (progn (insert "foo bar baz") (replace-regexp-in-string "ba." "quux corge grault" (point-min) (point-max))) (and (buffer-name temp-buffer) (kill-buffer temp-buffer))))
  (let ((temp-buffer (generate-new-buffer " *temp*" t))) (save-current-buffer (set-buffer temp-buffer) (unwind-protect (progn (insert "foo bar baz") (replace-regexp-in-string "ba." "quux corge grault" (point-min) (point-max))) (and (buffer-name temp-buffer) (kill-buffer temp-buffer)))))
  (progn (let ((temp-buffer (generate-new-buffer " *temp*" t))) (save-current-buffer (set-buffer temp-buffer) (unwind-protect (progn (insert "foo bar baz") (replace-regexp-in-string "ba." "quux corge grault" (point-min) (point-max))) (and (buffer-name temp-buffer) (kill-buffer temp-buffer))))))
  eval((progn (let ((temp-buffer (generate-new-buffer " *temp*" t))) (save-current-buffer (set-buffer temp-buffer) (unwind-protect (progn (insert "foo bar baz") (replace-regexp-in-string "ba." "quux corge grault" (point-min) (point-max))) (and (buffer-name temp-buffer) (kill-buffer temp-buffer)))))) t)
  elisp--eval-last-sexp(t)
  eval-last-sexp(t)
  eval-print-last-sexp(nil)
  funcall-interactively(eval-print-last-sexp nil)
  call-interactively(eval-print-last-sexp nil nil)
  command-execute(eval-print-last-sexp)

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* bug#57733: 29.0.50; Fix "Invalid search bound (wrong side of point)" error in replace-*-in-region
  2022-09-11 15:59 bug#57733: 29.0.50; Fix "Invalid search bound (wrong side of point)" error in replace-*-in-region Kévin Le Gouguec
@ 2022-09-11 16:17 ` Lars Ingebrigtsen
  2022-09-11 17:09   ` Kévin Le Gouguec
  0 siblings, 1 reply; 4+ messages in thread
From: Lars Ingebrigtsen @ 2022-09-11 16:17 UTC (permalink / raw)
  To: Kévin Le Gouguec; +Cc: 57733

Kévin Le Gouguec <kevin.legouguec@gmail.com> writes:

> Backtrace attached, with suggested patch (including tests) to update END
> to reflect the displacement after each replacement.

Perhaps a simpler solution would to just be to use
`save-restriction'+`narrow-to-region' first and then don't use an END in
the search?






^ permalink raw reply	[flat|nested] 4+ messages in thread

* bug#57733: 29.0.50; Fix "Invalid search bound (wrong side of point)" error in replace-*-in-region
  2022-09-11 16:17 ` Lars Ingebrigtsen
@ 2022-09-11 17:09   ` Kévin Le Gouguec
  2022-09-11 17:55     ` Lars Ingebrigtsen
  0 siblings, 1 reply; 4+ messages in thread
From: Kévin Le Gouguec @ 2022-09-11 17:09 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 57733

[-- Attachment #1: Type: text/plain, Size: 243 bytes --]

Lars Ingebrigtsen <larsi@gnus.org> writes:

> Perhaps a simpler solution would to just be to use
> `save-restriction'+`narrow-to-region' first and then don't use an END in
> the search?

That makes a lot of sense, thanks; new patch attached.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Restrict-replace-in-region-to-the-bounds-defined-by-.patch --]
[-- Type: text/x-patch, Size: 5015 bytes --]

From b5cc63cfd89f695067e9bb3a0135aded615557f0 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?K=C3=A9vin=20Le=20Gouguec?= <kevin.legouguec@gmail.com>
Date: Sun, 11 Sep 2022 16:15:38 +0200
Subject: [PATCH] Restrict replace-*-in-region to the bounds defined by caller

And call the corresponding *search-forward functions without giving
them a bound.  The previous code can lead to two surprising outcomes:

(1) when REPLACEMENT is much larger than the original text, point
might move beyond END, which causes the search functions to error out;

(2) when REPLACEMENT is much shorter than the original text, the
search functions might find matches that the user did not intend, as
they were beyond END before the replacements happened.

Fixes bug#57733.

* lisp/subr.el (replace-string-in-region, replace-regexp-in-region):
Narrow to region before iterating over matches, instead of giving a
bound to the search functions.
* test/lisp/subr-tests.el (test-replace-string-in-region): Add
regression tests.
---
 lisp/subr.el            | 38 +++++++++++++++++++++-----------------
 test/lisp/subr-tests.el | 32 ++++++++++++++++++++++++++++++--
 2 files changed, 51 insertions(+), 19 deletions(-)

diff --git a/lisp/subr.el b/lisp/subr.el
index 686189e69b..8769fec2b9 100644
--- a/lisp/subr.el
+++ b/lisp/subr.el
@@ -4219,15 +4219,17 @@ replace-string-in-region
         (error "End after end of buffer"))
     (setq end (point-max)))
   (save-excursion
-    (let ((matches 0)
-          (case-fold-search nil))
-      (goto-char start)
-      (while (search-forward string end t)
-        (delete-region (match-beginning 0) (match-end 0))
-        (insert replacement)
-        (setq matches (1+ matches)))
-      (and (not (zerop matches))
-           matches))))
+    (goto-char start)
+    (save-restriction
+      (narrow-to-region start end)
+      (let ((matches 0)
+            (case-fold-search nil))
+        (while (search-forward string nil t)
+          (delete-region (match-beginning 0) (match-end 0))
+          (insert replacement)
+          (setq matches (1+ matches)))
+        (and (not (zerop matches))
+             matches)))))
 
 (defun replace-regexp-in-region (regexp replacement &optional start end)
   "Replace REGEXP with REPLACEMENT in the region from START to END.
@@ -4254,14 +4256,16 @@ replace-regexp-in-region
         (error "End after end of buffer"))
     (setq end (point-max)))
   (save-excursion
-    (let ((matches 0)
-          (case-fold-search nil))
-      (goto-char start)
-      (while (re-search-forward regexp end t)
-        (replace-match replacement t)
-        (setq matches (1+ matches)))
-      (and (not (zerop matches))
-           matches))))
+    (goto-char start)
+    (save-restriction
+      (narrow-to-region start end)
+      (let ((matches 0)
+            (case-fold-search nil))
+          (while (re-search-forward regexp nil t)
+          (replace-match replacement t)
+          (setq matches (1+ matches)))
+        (and (not (zerop matches))
+             matches)))))
 
 (defun yank-handle-font-lock-face-property (face start end)
   "If `font-lock-defaults' is nil, apply FACE as a `face' property.
diff --git a/test/lisp/subr-tests.el b/test/lisp/subr-tests.el
index 4310b7291a..3011713210 100644
--- a/test/lisp/subr-tests.el
+++ b/test/lisp/subr-tests.el
@@ -968,7 +968,21 @@ test-replace-string-in-region
     (insert "Foo bar zot foobar")
     (should (= (replace-string-in-region "Foo" "new" (point-min))
                1))
-    (should (equal (buffer-string) "new bar zot foobar"))))
+    (should (equal (buffer-string) "new bar zot foobar")))
+
+  (with-temp-buffer
+    (insert "foo bar baz")
+    (should (= (replace-string-in-region "ba" "quux corge grault" (point-min))
+               2))
+    (should (equal (buffer-string)
+                    "foo quux corge graultr quux corge graultz")))
+
+  (with-temp-buffer
+    (insert "foo bar bar")
+    (should (= (replace-string-in-region " bar" "" (point-min) 8)
+               1))
+    (should (equal (buffer-string)
+                    "foo bar"))))
 
 (ert-deftest test-replace-regexp-in-region ()
   (with-temp-buffer
@@ -991,7 +1005,21 @@ test-replace-regexp-in-region
     (insert "Foo bar zot foobar")
     (should (= (replace-regexp-in-region "Fo+" "new" (point-min))
                1))
-    (should (equal (buffer-string) "new bar zot foobar"))))
+    (should (equal (buffer-string) "new bar zot foobar")))
+
+  (with-temp-buffer
+    (insert "foo bar baz")
+    (should (= (replace-regexp-in-region "ba." "quux corge grault" (point-min))
+               2))
+    (should (equal (buffer-string)
+                    "foo quux corge grault quux corge grault")))
+
+  (with-temp-buffer
+    (insert "foo bar bar")
+    (should (= (replace-regexp-in-region " bar" "" (point-min) 8)
+               1))
+    (should (equal (buffer-string)
+                    "foo bar"))))
 
 (ert-deftest test-with-existing-directory ()
   (let ((dir (make-temp-name "/tmp/not-exist-")))
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* bug#57733: 29.0.50; Fix "Invalid search bound (wrong side of point)" error in replace-*-in-region
  2022-09-11 17:09   ` Kévin Le Gouguec
@ 2022-09-11 17:55     ` Lars Ingebrigtsen
  0 siblings, 0 replies; 4+ messages in thread
From: Lars Ingebrigtsen @ 2022-09-11 17:55 UTC (permalink / raw)
  To: Kévin Le Gouguec; +Cc: 57733

Kévin Le Gouguec <kevin.legouguec@gmail.com> writes:

> That makes a lot of sense, thanks; new patch attached.

Thanks; pushed to Emacs 29.





^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2022-09-11 17:55 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-11 15:59 bug#57733: 29.0.50; Fix "Invalid search bound (wrong side of point)" error in replace-*-in-region Kévin Le Gouguec
2022-09-11 16:17 ` Lars Ingebrigtsen
2022-09-11 17:09   ` Kévin Le Gouguec
2022-09-11 17:55     ` Lars Ingebrigtsen

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).