unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#67124: 26.3; query-replace Arg out of range with comma option (at end-buffer)
@ 2023-11-11 19:40 Gabriele Nicolardi
  2023-11-12  9:48 ` Eli Zaretskii
  0 siblings, 1 reply; 24+ messages in thread
From: Gabriele Nicolardi @ 2023-11-11 19:40 UTC (permalink / raw)
  To: 67124

Hi, I'm resending the bug report trying to do better than the previous one.

When I run `query-replace-regexp` command if I type `?` I get this 
explanation of the function's options:

Type Space or ‘y’ to replace one match, Delete or ‘n’ to skip to next,
RET or ‘q’ to exit, Period to replace one match and exit,
Comma to replace but not move point immediately,
C-r to enter recursive edit (C-M-c to get out again),
C-w to delete match and recursive edit,
C-l to clear the screen, redisplay, and offer same replacement again,
! to replace all remaining matches in this buffer with no more questions,
^ to move point back to previous match,
u to undo previous replacement,
U to undo all replacements,
E to edit the replacement string.
In multi-buffer replacements type ‘Y’ to replace all remaining
matches in all remaining buffers with no more questions,
‘N’ to skip to the next buffer without replacing remaining matches
in the current buffer.

I often type `,` (`comma`) to check the replacement. It happens that, 
some times, I get the error (e.g.):

match-substitute-replacement: Args out of range: #<buffer *scratch*>, 
1667, 1679

The error doesn't happen if I type `y` (or `n`)

Try this:

(query-replace ",.\\footnote{" ".\\footnote{" nil)

With the string ",.\footnote{" at the end of the buffer. To see the 
error "{" must be the last char in the buffer.

(I see often this bug because I use "narrowing" a lot)

Is it a known bug?

My emacs version is GNU Emacs 26.3.


In GNU Emacs 26.3 (build 2, x86_64-pc-linux-gnu, GTK+ Version 3.24.14)
of 2020-03-26, modified by Debian built on lcy01-amd64-020
Windowing system distributor 'The X.Org Foundation', version 11.0.12013000
System Description: Ubuntu 20.04.6 LTS

Recent messages:
Mark set
next-line: End of buffer
Mark set
funcall-interactively: End of buffer
Quit
Mark set
Entering debugger...
Mark set
previous-line: Beginning of buffer
Mark set
previous-line: Beginning of buffer
Configured using:
'configure --build x86_64-linux-gnu --prefix=/usr
--sharedstatedir=/var/lib --libexecdir=/usr/lib
--localstatedir=/var/lib --infodir=/usr/share/info
--mandir=/usr/share/man --enable-libsystemd --with-pop=yes
--enable-locallisppath=/etc/emacs:/usr/local/share/emacs/26.3/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/26.3/site-lisp:/usr/share/emacs/site-lisp 

--with-sound=alsa --without-gconf --with-mailutils --build
x86_64-linux-gnu --prefix=/usr --sharedstatedir=/var/lib
--libexecdir=/usr/lib --localstatedir=/var/lib
--infodir=/usr/share/info --mandir=/usr/share/man --enable-libsystemd
--with-pop=yes
--enable-locallisppath=/etc/emacs:/usr/local/share/emacs/26.3/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/26.3/site-lisp:/usr/share/emacs/site-lisp 

--with-sound=alsa --without-gconf --with-mailutils --with-x=yes
--with-x-toolkit=gtk3 --with-toolkit-scroll-bars 'CFLAGS=-g -O2
-fdebug-prefix-map=/build/emacs-mEZBk7/emacs-26.3+1=. 
-fstack-protector-strong
-Wformat -Werror=format-security -Wall' 'CPPFLAGS=-Wdate-time
-D_FORTIFY_SOURCE=2' 'LDFLAGS=-Wl,-Bsymbolic-functions -Wl,-z,relro''

Configured features:
XPM JPEG TIFF GIF PNG RSVG IMAGEMAGICK SOUND GPM DBUS GSETTINGS GLIB
NOTIFY ACL LIBSELINUX GNUTLS LIBXML2 FREETYPE M17N_FLT LIBOTF XFT ZLIB
TOOLKIT_SCROLL_BARS GTK3 X11 XDBE XIM THREADS LIBSYSTEMD LCMS2

Important settings:
value of $LANG: it_IT.UTF-8
locale-coding-system: utf-8-unix

Major mode: Lisp Interaction

Minor modes in effect:
tooltip-mode: t
global-eldoc-mode: t
eldoc-mode: t
electric-indent-mode: t
mouse-wheel-mode: t
tool-bar-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
line-number-mode: t
transient-mark-mode: t

Load-path shadows:
None found.

Features:
(shadow sort mail-extr emacsbug message rmc puny seq dired
dired-loaddefs format-spec rfc822 ...)

Memory information:
((conses 16 97042 8905)
(symbols 48 20561 1)
(miscs 40 86 120)
(strings 32 28945 1187)
(string-bytes 1 762727)
(vectors 16 14159)
(vector-slots 8 504582 10748)
(floats 8 56 120)
(intervals 56 316 55)
(buffers 992 12))






^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#67124: 26.3; query-replace Arg out of range with comma option (at end-buffer)
  2023-11-11 19:40 bug#67124: 26.3; query-replace Arg out of range with comma option (at end-buffer) Gabriele Nicolardi
@ 2023-11-12  9:48 ` Eli Zaretskii
  2023-11-13  3:56   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
                     ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Eli Zaretskii @ 2023-11-12  9:48 UTC (permalink / raw)
  To: Gabriele Nicolardi, Stefan Monnier; +Cc: 67124

> Date: Sat, 11 Nov 2023 20:40:27 +0100
> From: Gabriele Nicolardi <gabriele@medialab.sissa.it>
> 
> Hi, I'm resending the bug report trying to do better than the previous one.

Thanks.  (In the future, please respond to the original bug number,
instead of creating a new bug report; I've merged them now.)

> I often type `,` (`comma`) to check the replacement. It happens that, 
> some times, I get the error (e.g.):
> 
> match-substitute-replacement: Args out of range: #<buffer *scratch*>, 
> 1667, 1679
> 
> The error doesn't happen if I type `y` (or `n`)
> 
> Try this:
> 
> (query-replace ",.\\footnote{" ".\\footnote{" nil)
> 
> With the string ",.\footnote{" at the end of the buffer. To see the 
> error "{" must be the last char in the buffer.
> 
> (I see often this bug because I use "narrowing" a lot)
> 
> Is it a known bug?

The kludgey solution in the patch below should fix this.  Please see
if it indeed fixes your real-life use cases.

Stefan, any ideas for a better fix, or other comments?  AFAIU, the
root cause of the bug was that we are using integers instead of
markers there, and that is because bug#31492 needs to preserve the
original match-start position when the search string is a regexp that
matches an empty string (in which case the match-start marker moves
together with the match-end marker instead of staying put).

diff --git a/lisp/replace.el b/lisp/replace.el
index eeac734..6a3f223 100644
--- a/lisp/replace.el
+++ b/lisp/replace.el
@@ -2642,8 +2642,11 @@ replace-match-maybe-edit
   (replace-match newtext fixedcase literal)
   ;; `query-replace' undo feature needs the beginning of the match position,
   ;; but `replace-match' may change it, for instance, with a regexp like "^".
-  ;; Ensure that this function preserves the match data (Bug#31492).
-  (set-match-data match-data)
+  ;; Ensure that this function preserves the beginning of the match position
+  ;; (bug#31492).  But we need to avoid clobbering the end of the match with
+  ;; the original match-end position, since `replace-match' could have made
+  ;; that incorrect or even invalid (bug#67124).
+  (set-match-data (list (car match-data) (nth 1 (match-data))))
   ;; `replace-match' leaves point at the end of the replacement text,
   ;; so move point to the beginning when replacing backward.
   (when backward (goto-char (nth 0 match-data)))





^ permalink raw reply related	[flat|nested] 24+ messages in thread

* bug#67124: 26.3; query-replace Arg out of range with comma option (at end-buffer)
  2023-11-12  9:48 ` Eli Zaretskii
@ 2023-11-13  3:56   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-11-13 14:17     ` Eli Zaretskii
  2023-11-16 15:35     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
       [not found]   ` <ed11baa2-cf89-4a72-91d0-8f26c0af4126@medialab.sissa.it>
  2023-11-15 13:18   ` Eli Zaretskii
  2 siblings, 2 replies; 24+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-11-13  3:56 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Gabriele Nicolardi, 67124

> Stefan, any ideas for a better fix, or other comments?  AFAIU, the
> root cause of the bug was that we are using integers instead of
> markers there, and that is because bug#31492 needs to preserve the
> original match-start position when the search string is a regexp that
> matches an empty string (in which case the match-start marker moves
> together with the match-end marker instead of staying put).

> diff --git a/lisp/replace.el b/lisp/replace.el
> index eeac734..6a3f223 100644
> --- a/lisp/replace.el
> +++ b/lisp/replace.el
> @@ -2642,8 +2642,11 @@ replace-match-maybe-edit
>    (replace-match newtext fixedcase literal)
>    ;; `query-replace' undo feature needs the beginning of the match position,
>    ;; but `replace-match' may change it, for instance, with a regexp like "^".
> -  ;; Ensure that this function preserves the match data (Bug#31492).
> -  (set-match-data match-data)
> +  ;; Ensure that this function preserves the beginning of the match position
> +  ;; (bug#31492).  But we need to avoid clobbering the end of the match with
> +  ;; the original match-end position, since `replace-match' could have made
> +  ;; that incorrect or even invalid (bug#67124).
> +  (set-match-data (list (car match-data) (nth 1 (match-data))))

(nth 1 (match-data)) == (match-end 0), no?

Hmm... so here we're throwing away all the subgroup info and keeping
only the start/end, right?  It's probably OK, indeed, but I think the
comment should clarify that (and should clarify that we (well,
presumably the undo feature) need the match end in addition to the
match beginning).

Also here it's not obvious which match-data is returned by (match-data).
IIUC it's the match data as adjusted by `replace-match`.

Which makes me wonder why we don't change `replace-match` so it's also
careful to preserve the match beginning just like it preserves the match
end.

>    ;; `replace-match' leaves point at the end of the replacement text,
>    ;; so move point to the beginning when replacing backward.
>    (when backward (goto-char (nth 0 match-data)))

and (nth 0 match-data) == (match-beginning 0), no?

So, I tried the patch below, which makes sense to my superficial
understanding of the problem, but it apparently doesn't fix the problem
in the OP's recipe, so I'm clearly missing something.

In any case the C part of the change seems right (if I may say so
myself :-), at least when `i==0` and more generally when `i` denotes
a subgroup which contains the subgroup that `replace-match` has
replaced.


        Stefan


diff --git a/lisp/replace.el b/lisp/replace.el
index 6b06e48c384..acf6b8a4652 100644
--- a/lisp/replace.el
+++ b/lisp/replace.el
@@ -2642,13 +2642,9 @@ replace-match-maybe-edit
 	    noedit nil)))
   (set-match-data match-data)
   (replace-match newtext fixedcase literal)
-  ;; `query-replace' undo feature needs the beginning of the match position,
-  ;; but `replace-match' may change it, for instance, with a regexp like "^".
-  ;; Ensure that this function preserves the match data (Bug#31492).
-  (set-match-data match-data)
   ;; `replace-match' leaves point at the end of the replacement text,
   ;; so move point to the beginning when replacing backward.
-  (when backward (goto-char (nth 0 match-data)))
+  (when backward (goto-char (match-beginning 0)))
   noedit)
 
 (defvar replace-update-post-hook nil
diff --git a/src/search.c b/src/search.c
index 692d8488049..0c5e81a245d 100644
--- a/src/search.c
+++ b/src/search.c
@@ -3142,9 +3142,11 @@ update_search_regs (ptrdiff_t oldstart, ptrdiff_t oldend, ptrdiff_t newend)
 
   for (i = 0; i < search_regs.num_regs; i++)
     {
-      if (search_regs.start[i] >= oldend)
+      if (search_regs.start[i] <= oldstart)
+        ;
+      else if (search_regs.start[i] >= oldend)
         search_regs.start[i] += change;
-      else if (search_regs.start[i] > oldstart)
+      else
         search_regs.start[i] = oldstart;
       if (search_regs.end[i] >= oldend)
         search_regs.end[i] += change;






^ permalink raw reply related	[flat|nested] 24+ messages in thread

* bug#67124: 26.3; query-replace Arg out of range with comma option (at end-buffer)
       [not found]   ` <ed11baa2-cf89-4a72-91d0-8f26c0af4126@medialab.sissa.it>
@ 2023-11-13 14:06     ` Eli Zaretskii
  2023-11-13 14:21       ` Dmitry Gutov
  0 siblings, 1 reply; 24+ messages in thread
From: Eli Zaretskii @ 2023-11-13 14:06 UTC (permalink / raw)
  To: Gabriele Nicolardi; +Cc: 67124

[Please use Reply All to reply, so that the bug tracker is CC'ed.]
> Date: Sun, 12 Nov 2023 23:44:43 +0100
> From: Gabriele Nicolardi <gabriele@medialab.sissa.it>
> 
> I tested the patch on the MWE I sent to you. Thanks!
> 
> Now I'm not sure what I should do. I write elisp code used by a production team. We use Emacs to
> format LaTeX documents of scientific papers. I don't control which version of Emacs my
> collaborators use. Do you have any suggestions?

I guess either wait for the next Emacs release or build your own
Emacs?





^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#67124: 26.3; query-replace Arg out of range with comma option (at end-buffer)
  2023-11-13  3:56   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-11-13 14:17     ` Eli Zaretskii
  2023-11-16 14:29       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-11-16 15:35     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 24+ messages in thread
From: Eli Zaretskii @ 2023-11-13 14:17 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: gabriele, 67124

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Gabriele Nicolardi <gabriele@medialab.sissa.it>,  67124@debbugs.gnu.org
> Date: Sun, 12 Nov 2023 22:56:16 -0500
> 
> (nth 1 (match-data)) == (match-end 0), no?

No, because the former normally returns a marker, whereas the latter
returns a number.  And here the difference is crucial.

> Hmm... so here we're throwing away all the subgroup info and keeping
> only the start/end, right?

Yes.

> It's probably OK, indeed, but I think the comment should clarify
> that (and should clarify that we (well, presumably the undo feature)
> need the match end in addition to the match beginning).

The comment before the patched part (you can see its end in the patch)
says so, no?

> Also here it's not obvious which match-data is returned by (match-data).
> IIUC it's the match data as adjusted by `replace-match`.

Yes, and that's the root cause here: replace-match updates match-data,
but the original code then clobbered it by overwriting it with the
match-data _before_ the replace-match call.

> Which makes me wonder why we don't change `replace-match` so it's also
> careful to preserve the match beginning just like it preserves the match
> end.

AFAIU, it's a general issue with markers: when you have both
match-beginning and match-end at the same buffer position (because the
matched text is an empty string, as when the search regexp is \b or
similar, then replace-match moves them both to the end of the match,
instead of leaving one of them at the beginning of the match.

> >    ;; `replace-match' leaves point at the end of the replacement text,
> >    ;; so move point to the beginning when replacing backward.
> >    (when backward (goto-char (nth 0 match-data)))
> 
> and (nth 0 match-data) == (match-beginning 0), no?

See above: not exactly.

> So, I tried the patch below, which makes sense to my superficial
> understanding of the problem, but it apparently doesn't fix the problem
> in the OP's recipe, so I'm clearly missing something.

I don't understand the fix, so cannot help you here ;-)





^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#67124: 26.3; query-replace Arg out of range with comma option (at end-buffer)
  2023-11-13 14:06     ` Eli Zaretskii
@ 2023-11-13 14:21       ` Dmitry Gutov
  2023-11-13 14:38         ` Eli Zaretskii
  0 siblings, 1 reply; 24+ messages in thread
From: Dmitry Gutov @ 2023-11-13 14:21 UTC (permalink / raw)
  To: Eli Zaretskii, Gabriele Nicolardi; +Cc: 67124

On 13/11/2023 16:06, Eli Zaretskii wrote:
> [Please use Reply All to reply, so that the bug tracker is CC'ed.]
>> Date: Sun, 12 Nov 2023 23:44:43 +0100
>> From: Gabriele Nicolardi<gabriele@medialab.sissa.it>
>>
>> I tested the patch on the MWE I sent to you. Thanks!
>>
>> Now I'm not sure what I should do. I write elisp code used by a production team. We use Emacs to
>> format LaTeX documents of scientific papers. I don't control which version of Emacs my
>> collaborators use. Do you have any suggestions?
> I guess either wait for the next Emacs release or build your own
> Emacs?

I haven't examined the exact problem, but in such cases some people also 
either have workarounds in their own code, or supply "advices" for the 
core functions that work around the problem there.

This kind of solution requires a certain level of understanding of the 
code in question, and might be quite brittle otherwise (breaking other 
features for the same users). Not for the faint of heart, I'd say.





^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#67124: 26.3; query-replace Arg out of range with comma option (at end-buffer)
  2023-11-13 14:21       ` Dmitry Gutov
@ 2023-11-13 14:38         ` Eli Zaretskii
  2023-11-13 14:45           ` Dmitry Gutov
  0 siblings, 1 reply; 24+ messages in thread
From: Eli Zaretskii @ 2023-11-13 14:38 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: gabriele, 67124

> Date: Mon, 13 Nov 2023 16:21:43 +0200
> Cc: 67124@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> On 13/11/2023 16:06, Eli Zaretskii wrote:
> > [Please use Reply All to reply, so that the bug tracker is CC'ed.]
> >> Date: Sun, 12 Nov 2023 23:44:43 +0100
> >> From: Gabriele Nicolardi<gabriele@medialab.sissa.it>
> >>
> >> I tested the patch on the MWE I sent to you. Thanks!
> >>
> >> Now I'm not sure what I should do. I write elisp code used by a production team. We use Emacs to
> >> format LaTeX documents of scientific papers. I don't control which version of Emacs my
> >> collaborators use. Do you have any suggestions?
> > I guess either wait for the next Emacs release or build your own
> > Emacs?
> 
> I haven't examined the exact problem, but in such cases some people also 
> either have workarounds in their own code, or supply "advices" for the 
> core functions that work around the problem there.

Then maybe a simple replacement for the offending function, defined in
an init file, would also be a possible solution?





^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#67124: 26.3; query-replace Arg out of range with comma option (at end-buffer)
  2023-11-13 14:38         ` Eli Zaretskii
@ 2023-11-13 14:45           ` Dmitry Gutov
  2023-11-13 15:39             ` Gabriele Nicolardi
  0 siblings, 1 reply; 24+ messages in thread
From: Dmitry Gutov @ 2023-11-13 14:45 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gabriele, 67124

On 13/11/2023 16:38, Eli Zaretskii wrote:
>> Date: Mon, 13 Nov 2023 16:21:43 +0200
>> Cc:67124@debbugs.gnu.org
>> From: Dmitry Gutov<dmitry@gutov.dev>
>>
>> On 13/11/2023 16:06, Eli Zaretskii wrote:
>>> [Please use Reply All to reply, so that the bug tracker is CC'ed.]
>>>> Date: Sun, 12 Nov 2023 23:44:43 +0100
>>>> From: Gabriele Nicolardi<gabriele@medialab.sissa.it>
>>>>
>>>> I tested the patch on the MWE I sent to you. Thanks!
>>>>
>>>> Now I'm not sure what I should do. I write elisp code used by a production team. We use Emacs to
>>>> format LaTeX documents of scientific papers. I don't control which version of Emacs my
>>>> collaborators use. Do you have any suggestions?
>>> I guess either wait for the next Emacs release or build your own
>>> Emacs?
>> I haven't examined the exact problem, but in such cases some people also
>> either have workarounds in their own code, or supply "advices" for the
>> core functions that work around the problem there.
> Then maybe a simple replacement for the offending function, defined in
> an init file, would also be a possible solution?

Sure. Though it might be more reliable with advice, since they don't 
depend on the loading order.

The problems I had in mind would be more subtle and independent of the 
choice between these two methods. Stefan's latest patch includes changes 
in search.c, so it might be difficult to work around in just Lisp. 
Again, I haven't examined the problem itself.





^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#67124: 26.3; query-replace Arg out of range with comma option (at end-buffer)
  2023-11-13 14:45           ` Dmitry Gutov
@ 2023-11-13 15:39             ` Gabriele Nicolardi
  2023-11-16 14:45               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 24+ messages in thread
From: Gabriele Nicolardi @ 2023-11-13 15:39 UTC (permalink / raw)
  To: Dmitry Gutov, Eli Zaretskii; +Cc: 67124

Ok, I'll wait for the next release, Thank you.

Il 13/11/23 15:45, Dmitry Gutov ha scritto:
> On 13/11/2023 16:38, Eli Zaretskii wrote:
>>> Date: Mon, 13 Nov 2023 16:21:43 +0200
>>> Cc:67124@debbugs.gnu.org
>>> From: Dmitry Gutov<dmitry@gutov.dev>
>>>
>>> On 13/11/2023 16:06, Eli Zaretskii wrote:
>>>> [Please use Reply All to reply, so that the bug tracker is CC'ed.]
>>>>> Date: Sun, 12 Nov 2023 23:44:43 +0100
>>>>> From: Gabriele Nicolardi<gabriele@medialab.sissa.it>
>>>>>
>>>>> I tested the patch on the MWE I sent to you. Thanks!
>>>>>
>>>>> Now I'm not sure what I should do. I write elisp code used by a 
>>>>> production team. We use Emacs to
>>>>> format LaTeX documents of scientific papers. I don't control which 
>>>>> version of Emacs my
>>>>> collaborators use. Do you have any suggestions?
>>>> I guess either wait for the next Emacs release or build your own
>>>> Emacs?
>>> I haven't examined the exact problem, but in such cases some people 
>>> also
>>> either have workarounds in their own code, or supply "advices" for the
>>> core functions that work around the problem there.
>> Then maybe a simple replacement for the offending function, defined in
>> an init file, would also be a possible solution?
>
> Sure. Though it might be more reliable with advice, since they don't 
> depend on the loading order.
>
> The problems I had in mind would be more subtle and independent of the 
> choice between these two methods. Stefan's latest patch includes 
> changes in search.c, so it might be difficult to work around in just 
> Lisp. Again, I haven't examined the problem itself.





^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#67124: 26.3; query-replace Arg out of range with comma option (at end-buffer)
  2023-11-12  9:48 ` Eli Zaretskii
  2023-11-13  3:56   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
       [not found]   ` <ed11baa2-cf89-4a72-91d0-8f26c0af4126@medialab.sissa.it>
@ 2023-11-15 13:18   ` Eli Zaretskii
  2 siblings, 0 replies; 24+ messages in thread
From: Eli Zaretskii @ 2023-11-15 13:18 UTC (permalink / raw)
  To: gabriele; +Cc: monnier, 67124-done

> Cc: 67124@debbugs.gnu.org
> Date: Sun, 12 Nov 2023 11:48:17 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> 
> > I often type `,` (`comma`) to check the replacement. It happens that, 
> > some times, I get the error (e.g.):
> > 
> > match-substitute-replacement: Args out of range: #<buffer *scratch*>, 
> > 1667, 1679
> > 
> > The error doesn't happen if I type `y` (or `n`)
> > 
> > Try this:
> > 
> > (query-replace ",.\\footnote{" ".\\footnote{" nil)
> > 
> > With the string ",.\footnote{" at the end of the buffer. To see the 
> > error "{" must be the last char in the buffer.
> > 
> > (I see often this bug because I use "narrowing" a lot)
> > 
> > Is it a known bug?
> 
> The kludgey solution in the patch below should fix this.  Please see
> if it indeed fixes your real-life use cases.

I've now installed this on master, and I'm closing this bug.





^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#67124: 26.3; query-replace Arg out of range with comma option (at end-buffer)
  2023-11-13 14:17     ` Eli Zaretskii
@ 2023-11-16 14:29       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-11-16 14:55         ` Eli Zaretskii
  0 siblings, 1 reply; 24+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-11-16 14:29 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gabriele, 67124

[ Sorry for my late reply, somehow I missed this reply of yours when
  you sent it :-( ]

>> (nth 1 (match-data)) == (match-end 0), no?
> No, because the former normally returns a marker, whereas the latter
> returns a number.  And here the difference is crucial.

It returns a different representation of the same number, agreed.
But we pass that to `set-match-data` which won't keep that difference,
will it?

>> It's probably OK, indeed, but I think the comment should clarify
>> that (and should clarify that we (well, presumably the undo feature)
>> need the match end in addition to the match beginning).
> The comment before the patched part (you can see its end in the patch)
> says so, no?

It says that we need to adjust (match-beginning 0), while keeping
(match-end 0) but it doesn't say that it's OK that we throw away
the rest.
Probably not important, tho, you're right.

>> Also here it's not obvious which match-data is returned by (match-data).
>> IIUC it's the match data as adjusted by `replace-match`.
> Yes, and that's the root cause here: replace-match updates match-data,
> but the original code then clobbered it by overwriting it with the
> match-data _before_ the replace-match call.

I think the comment should mention it.

>> Which makes me wonder why we don't change `replace-match` so it's also
>> careful to preserve the match beginning just like it preserves the match
>> end.
> AFAIU, it's a general issue with markers:

But the match data adjusted by `replace-match` is not made of markers.
It's plain integers which are manually adjusted (by the code which
I changed in my patch).

> when you have both match-beginning and match-end at the same buffer
> position (because the matched text is an empty string, as when the
> search regexp is \b or similar, then replace-match moves them both to
> the end of the match, instead of leaving one of them at the beginning
> of the match.

I agree if they were markers that's what would happen.  But instead the
behavior we see is because the code of `update_search_regs` mimics the
behavior of "insert-before" markers both for start and end boundaries
of subgroups.  My patch tries to make it more careful.

>> >    ;; `replace-match' leaves point at the end of the replacement text,
>> >    ;; so move point to the beginning when replacing backward.
>> >    (when backward (goto-char (nth 0 match-data)))
>> 
>> and (nth 0 match-data) == (match-beginning 0), no?
> See above: not exactly.

I believe the numerical value of (nth 0 match-data) will be the same at
this point as that of (nth 0 (match-data)) because we just passed that
very value to `set-match-data`, and that is always equal
(numerically) to (match-beginning 0).
Since the only thing we do with that value is pass it immediately to
`goto-char`, it makes no difference if it's a marker or an integer, no?

What am I missing?

>> So, I tried the patch below, which makes sense to my superficial
>> understanding of the problem, but it apparently doesn't fix the problem
>> in the OP's recipe, so I'm clearly missing something.
> I don't understand the fix, so cannot help you here ;-)

:-(


        Stefan






^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#67124: 26.3; query-replace Arg out of range with comma option (at end-buffer)
  2023-11-13 15:39             ` Gabriele Nicolardi
@ 2023-11-16 14:45               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-11-16 16:27                 ` Gabriele Nicolardi
  0 siblings, 1 reply; 24+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-11-16 14:45 UTC (permalink / raw)
  To: Gabriele Nicolardi; +Cc: Dmitry Gutov, Eli Zaretskii, 67124

> Ok, I'll wait for the next release, Thank you.

I think you might be able to use the following as a temporary fix which
should also be harmless when not useful:

    (advice-add 'replace-match-maybe-edit :before
                #'my-workaround-for-bug67124)
    (defun my-workaround-for-bug67124 (&optional _newtext _fixedcase
                                       _literal _noedit match-data
                                       &rest _)
      (when (and (integerp (car-safe match-data))
                 (integerp (car-safe (cdr-safe match-data))))
        ;; Make sure the "end" part of `match-data` is a marker
        ;; so it can't be pointing outside the buffer no matter
        ;; what the replacement does.
        (setf (nth 1 match-data)
              (copy-marker (nth 1 match-data)))))


-- Stefan


> Il 13/11/23 15:45, Dmitry Gutov ha scritto:
>> On 13/11/2023 16:38, Eli Zaretskii wrote:
>>>> Date: Mon, 13 Nov 2023 16:21:43 +0200
>>>> Cc:67124@debbugs.gnu.org
>>>> From: Dmitry Gutov<dmitry@gutov.dev>
>>>>
>>>> On 13/11/2023 16:06, Eli Zaretskii wrote:
>>>>> [Please use Reply All to reply, so that the bug tracker is CC'ed.]
>>>>>> Date: Sun, 12 Nov 2023 23:44:43 +0100
>>>>>> From: Gabriele Nicolardi<gabriele@medialab.sissa.it>
>>>>>>
>>>>>> I tested the patch on the MWE I sent to you. Thanks!
>>>>>>
>>>>>> Now I'm not sure what I should do. I write elisp code used by
>>>>>> a production team. We use Emacs to
>>>>>> format LaTeX documents of scientific papers. I don't control which
>>>>>> version of Emacs my
>>>>>> collaborators use. Do you have any suggestions?
>>>>> I guess either wait for the next Emacs release or build your own
>>>>> Emacs?
>>>> I haven't examined the exact problem, but in such cases some people also
>>>> either have workarounds in their own code, or supply "advices" for the
>>>> core functions that work around the problem there.
>>> Then maybe a simple replacement for the offending function, defined in
>>> an init file, would also be a possible solution?
>>
>> Sure. Though it might be more reliable with advice, since they don't
>> depend on the loading order.
>>
>> The problems I had in mind would be more subtle and independent of the
>> choice between these two methods. Stefan's latest patch includes changes
>> in search.c, so it might be difficult to work around in just Lisp. Again,
>> I haven't examined the problem itself.






^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#67124: 26.3; query-replace Arg out of range with comma option (at end-buffer)
  2023-11-16 14:29       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-11-16 14:55         ` Eli Zaretskii
  2023-11-16 16:15           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 24+ messages in thread
From: Eli Zaretskii @ 2023-11-16 14:55 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: gabriele, 67124

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: gabriele@medialab.sissa.it,  67124@debbugs.gnu.org
> Date: Thu, 16 Nov 2023 09:29:22 -0500
> 
> >> >    ;; `replace-match' leaves point at the end of the replacement text,
> >> >    ;; so move point to the beginning when replacing backward.
> >> >    (when backward (goto-char (nth 0 match-data)))
> >> 
> >> and (nth 0 match-data) == (match-beginning 0), no?
> > See above: not exactly.
> 
> I believe the numerical value of (nth 0 match-data) will be the same at
> this point as that of (nth 0 (match-data)) because we just passed that
> very value to `set-match-data`, and that is always equal
> (numerically) to (match-beginning 0).
> Since the only thing we do with that value is pass it immediately to
> `goto-char`, it makes no difference if it's a marker or an integer, no?
> 
> What am I missing?

What are you trying to understand?  The cause of the bug is that we
force the match data back to the buffer position before the replace.
As long as that is not removed, the bug will remain.

What my kludge did is simply use a marker, so the adjusted position is
not clobbered.





^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#67124: 26.3; query-replace Arg out of range with comma option (at end-buffer)
  2023-11-13  3:56   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-11-13 14:17     ` Eli Zaretskii
@ 2023-11-16 15:35     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-11-16 16:51       ` Eli Zaretskii
  1 sibling, 1 reply; 24+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-11-16 15:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Gabriele Nicolardi, 67124

> Which makes me wonder why we don't change `replace-match` so it's also
> careful to preserve the match beginning just like it preserves the match
> end.
[...]
> So, I tried the patch below, which makes sense to my superficial
> understanding of the problem, but it apparently doesn't fix the problem
> in the OP's recipe, so I'm clearly missing something.

Hmm... not sure what happened, because I just tried it again and it does
seem to fix the OP's problem.

So I suggest the patch below (adjusted to the new code on `master`).


        Stefan


diff --git a/lisp/replace.el b/lisp/replace.el
index 7fec54ecb27..acf6b8a4652 100644
--- a/lisp/replace.el
+++ b/lisp/replace.el
@@ -2642,16 +2642,9 @@ replace-match-maybe-edit
 	    noedit nil)))
   (set-match-data match-data)
   (replace-match newtext fixedcase literal)
-  ;; `query-replace' undo feature needs the beginning of the match position,
-  ;; but `replace-match' may change it, for instance, with a regexp like "^".
-  ;; Ensure that this function preserves the beginning of the match position
-  ;; (bug#31492).  But we need to avoid clobbering the end of the match with
-  ;; the original match-end position, since `replace-match' could have made
-  ;; that incorrect or even invalid (bug#67124).
-  (set-match-data (list (car match-data) (nth 1 (match-data))))
   ;; `replace-match' leaves point at the end of the replacement text,
   ;; so move point to the beginning when replacing backward.
-  (when backward (goto-char (nth 0 match-data)))
+  (when backward (goto-char (match-beginning 0)))
   noedit)
 
 (defvar replace-update-post-hook nil
diff --git a/src/search.c b/src/search.c
index 692d8488049..177bb720641 100644
--- a/src/search.c
+++ b/src/search.c
@@ -3142,9 +3142,16 @@ update_search_regs (ptrdiff_t oldstart, ptrdiff_t oldend, ptrdiff_t newend)
 
   for (i = 0; i < search_regs.num_regs; i++)
     {
-      if (search_regs.start[i] >= oldend)
+      if (search_regs.start[i] <= oldstart)
+        /* If the subgroup that `replace-match` is modifying encloses the
+           subgroup `i`, then its `start` position should stay unchanged.
+           That's always true for subgroup 0.
+           For other subgroups it depends on details we don't have, so
+           we optimistically assume that it also holds for them.  */
+        ;
+      else if (search_regs.start[i] >= oldend)
         search_regs.start[i] += change;
-      else if (search_regs.start[i] > oldstart)
+      else
         search_regs.start[i] = oldstart;
       if (search_regs.end[i] >= oldend)
         search_regs.end[i] += change;






^ permalink raw reply related	[flat|nested] 24+ messages in thread

* bug#67124: 26.3; query-replace Arg out of range with comma option (at end-buffer)
  2023-11-16 14:55         ` Eli Zaretskii
@ 2023-11-16 16:15           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-11-16 16:56             ` Eli Zaretskii
  0 siblings, 1 reply; 24+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-11-16 16:15 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gabriele, 67124

>> >> >    ;; `replace-match' leaves point at the end of the replacement text,
>> >> >    ;; so move point to the beginning when replacing backward.
>> >> >    (when backward (goto-char (nth 0 match-data)))
>> >> 
>> >> and (nth 0 match-data) == (match-beginning 0), no?
>> > See above: not exactly.
>> 
>> I believe the numerical value of (nth 0 match-data) will be the same at
>> this point as that of (nth 0 (match-data)) because we just passed that
>> very value to `set-match-data`, and that is always equal
>> (numerically) to (match-beginning 0).
>> Since the only thing we do with that value is pass it immediately to
>> `goto-char`, it makes no difference if it's a marker or an integer, no?
>> 
>> What am I missing?
>
> What are you trying to understand?

Why you're saying that in

    ;; `replace-match' leaves point at the end of the replacement text,
    ;; so move point to the beginning when replacing backward.
    (when backward (goto-char (nth 0 match-data)))

it is not true that

    and (nth 0 match-data) == (match-beginning 0), no?

Note: this is not directly related to your patch, since your patch did
not touch that line, AFAICT.

> What my kludge did is simply use a marker, so the adjusted position is
> not clobbered.

I don't see that.  E.g. if you change your code from

    (set-match-data (list (car match-data) (nth 1 (match-data))))
to
    (set-match-data (list (car match-data) (nth 1 (match-data t))))

it fixes the problem just as well, AFAICT.  And it could even be
replaced with:

    (set-match-data (cons (car match-data) (cdr (match-data t))))

My understand is that what you patch does is use the match end as
adjusted by `replace-match` rather than the match end as provided by the
`match-end` argument (because that argument contains integers and hence
ends up pointing to the wrong place after the buffer was modified by
`replace-match`).

Basically, the purpose of the above `set-match-data` originally was to
correct the (match-beginning 0) info because `replace-match` adjusted it
incorrectly (i.e. it works around a bug in `update_search_regs`), but
while doing that it ended up messing up the (match-end 0), so your patch
refines that code so as to leave (match-end 0) unchanged while
correcting (match-beginning 0).

"use a marker kludge" would sound like an appropriate description of the
`advice-add` I sent to Grabriel, on the other hand.


        Stefan






^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#67124: 26.3; query-replace Arg out of range with comma option (at end-buffer)
  2023-11-16 14:45               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-11-16 16:27                 ` Gabriele Nicolardi
  0 siblings, 0 replies; 24+ messages in thread
From: Gabriele Nicolardi @ 2023-11-16 16:27 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Dmitry Gutov, Eli Zaretskii, 67124

Il 16/11/23 15:45, Stefan Monnier ha scritto:
>> Ok, I'll wait for the next release, Thank you.
> I think you might be able to use the following as a temporary fix which
> should also be harmless when not useful:
>
>      (advice-add 'replace-match-maybe-edit :before
>                  #'my-workaround-for-bug67124)
>      (defun my-workaround-for-bug67124 (&optional _newtext _fixedcase
>                                         _literal _noedit match-data
>                                         &rest _)
>        (when (and (integerp (car-safe match-data))
>                   (integerp (car-safe (cdr-safe match-data))))
>          ;; Make sure the "end" part of `match-data` is a marker
>          ;; so it can't be pointing outside the buffer no matter
>          ;; what the replacement does.
>          (setf (nth 1 match-data)
>                (copy-marker (nth 1 match-data)))))
>
>
> -- Stefan

Thank you! I'll study the code and l'll try it!

Gabriele

>
>> Il 13/11/23 15:45, Dmitry Gutov ha scritto:
>>> On 13/11/2023 16:38, Eli Zaretskii wrote:
>>>>> Date: Mon, 13 Nov 2023 16:21:43 +0200
>>>>> Cc:67124@debbugs.gnu.org
>>>>> From: Dmitry Gutov<dmitry@gutov.dev>
>>>>>
>>>>> On 13/11/2023 16:06, Eli Zaretskii wrote:
>>>>>> [Please use Reply All to reply, so that the bug tracker is CC'ed.]
>>>>>>> Date: Sun, 12 Nov 2023 23:44:43 +0100
>>>>>>> From: Gabriele Nicolardi<gabriele@medialab.sissa.it>
>>>>>>>
>>>>>>> I tested the patch on the MWE I sent to you. Thanks!
>>>>>>>
>>>>>>> Now I'm not sure what I should do. I write elisp code used by
>>>>>>> a production team. We use Emacs to
>>>>>>> format LaTeX documents of scientific papers. I don't control which
>>>>>>> version of Emacs my
>>>>>>> collaborators use. Do you have any suggestions?
>>>>>> I guess either wait for the next Emacs release or build your own
>>>>>> Emacs?
>>>>> I haven't examined the exact problem, but in such cases some people also
>>>>> either have workarounds in their own code, or supply "advices" for the
>>>>> core functions that work around the problem there.
>>>> Then maybe a simple replacement for the offending function, defined in
>>>> an init file, would also be a possible solution?
>>> Sure. Though it might be more reliable with advice, since they don't
>>> depend on the loading order.
>>>
>>> The problems I had in mind would be more subtle and independent of the
>>> choice between these two methods. Stefan's latest patch includes changes
>>> in search.c, so it might be difficult to work around in just Lisp. Again,
>>> I haven't examined the problem itself.





^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#67124: 26.3; query-replace Arg out of range with comma option (at end-buffer)
  2023-11-16 15:35     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-11-16 16:51       ` Eli Zaretskii
  2023-11-16 18:23         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 24+ messages in thread
From: Eli Zaretskii @ 2023-11-16 16:51 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: gabriele, 67124

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Gabriele Nicolardi <gabriele@medialab.sissa.it>,  67124@debbugs.gnu.org
> Date: Thu, 16 Nov 2023 10:35:26 -0500
> 
> > So, I tried the patch below, which makes sense to my superficial
> > understanding of the problem, but it apparently doesn't fix the problem
> > in the OP's recipe, so I'm clearly missing something.
> 
> Hmm... not sure what happened, because I just tried it again and it does
> seem to fix the OP's problem.

Are you sure?

> So I suggest the patch below (adjusted to the new code on `master`).
> 
> 
>         Stefan
> 
> 
> diff --git a/lisp/replace.el b/lisp/replace.el
> index 7fec54ecb27..acf6b8a4652 100644
> --- a/lisp/replace.el
> +++ b/lisp/replace.el
> @@ -2642,16 +2642,9 @@ replace-match-maybe-edit
>  	    noedit nil)))
>    (set-match-data match-data)

Why do we still need this line?

>    ;; `replace-match' leaves point at the end of the replacement text,
>    ;; so move point to the beginning when replacing backward.
> -  (when backward (goto-char (nth 0 match-data)))
> +  (when backward (goto-char (match-beginning 0)))

Why this unrelated change? is something wrong with the original code?

> --- a/src/search.c
> +++ b/src/search.c
> @@ -3142,9 +3142,16 @@ update_search_regs (ptrdiff_t oldstart, ptrdiff_t oldend, ptrdiff_t newend)
>  
>    for (i = 0; i < search_regs.num_regs; i++)
>      {
> -      if (search_regs.start[i] >= oldend)
> +      if (search_regs.start[i] <= oldstart)
> +        /* If the subgroup that `replace-match` is modifying encloses the
> +           subgroup `i`, then its `start` position should stay unchanged.
> +           That's always true for subgroup 0.

I've read this part of the comment many times, and I still don't
understand what you are trying to convey there, and thus don't
understand the new 'if' clause you added.  In particular, how come
this was never something brought to our attention? the comment seems
to imply that replace-match was somehow badly broken since forever.

And please don't use Markdown's markup in our code.

Still confused...





^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#67124: 26.3; query-replace Arg out of range with comma option (at end-buffer)
  2023-11-16 16:15           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-11-16 16:56             ` Eli Zaretskii
  2023-11-16 18:01               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 24+ messages in thread
From: Eli Zaretskii @ 2023-11-16 16:56 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: gabriele, 67124

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: gabriele@medialab.sissa.it,  67124@debbugs.gnu.org
> Date: Thu, 16 Nov 2023 11:15:56 -0500
> 
> > What are you trying to understand?
> 
> Why you're saying that in
> 
>     ;; `replace-match' leaves point at the end of the replacement text,
>     ;; so move point to the beginning when replacing backward.
>     (when backward (goto-char (nth 0 match-data)))
> 
> it is not true that
> 
>     and (nth 0 match-data) == (match-beginning 0), no?

Because of markers vs positions, as I've tried to explain.  The
difference is minor, of course.

> > What my kludge did is simply use a marker, so the adjusted position is
> > not clobbered.
> 
> I don't see that.  E.g. if you change your code from
> 
>     (set-match-data (list (car match-data) (nth 1 (match-data))))
> to
>     (set-match-data (list (car match-data) (nth 1 (match-data t))))
> 
> it fixes the problem just as well, AFAICT.

Yes, but match-data (the function) returns updated positions, which
behave like markers across the replace-match call.

> My understand is that what you patch does is use the match end as
> adjusted by `replace-match` rather than the match end as provided by the
> `match-end` argument (because that argument contains integers and hence
> ends up pointing to the wrong place after the buffer was modified by
> `replace-match`).

Yes.

> Basically, the purpose of the above `set-match-data` originally was to
> correct the (match-beginning 0) info because `replace-match` adjusted it
> incorrectly (i.e. it works around a bug in `update_search_regs`), but
> while doing that it ended up messing up the (match-end 0), so your patch
> refines that code so as to leave (match-end 0) unchanged while
> correcting (match-beginning 0).

Yes.





^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#67124: 26.3; query-replace Arg out of range with comma option (at end-buffer)
  2023-11-16 16:56             ` Eli Zaretskii
@ 2023-11-16 18:01               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-11-16 18:34                 ` Eli Zaretskii
  0 siblings, 1 reply; 24+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-11-16 18:01 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gabriele, 67124

>> > What are you trying to understand?
>> 
>> Why you're saying that in
>> 
>>     ;; `replace-match' leaves point at the end of the replacement text,
>>     ;; so move point to the beginning when replacing backward.
>>     (when backward (goto-char (nth 0 match-data)))
>> 
>> it is not true that
>> 
>>     and (nth 0 match-data) == (match-beginning 0), no?
>
> Because of markers vs positions, as I've tried to explain.  The
> difference is minor, of course.

But `goto-char` doesn't care about that difference, and no buffer will
be changed between the time we call `match-data` (thus creating the
markers) and the time we use those markers, so going through markers is
just extra work for no benefit.

IOW, I still don't understand how "markers vs positions" is relevant in

    (when backward (goto-char (nth 0 match-data)))

>> > What my kludge did is simply use a marker, so the adjusted position is
>> > not clobbered.
>> 
>> I don't see that.  E.g. if you change your code from
>> 
>>     (set-match-data (list (car match-data) (nth 1 (match-data))))
>> to
>>     (set-match-data (list (car match-data) (nth 1 (match-data t))))
>> 
>> it fixes the problem just as well, AFAICT.
>
> Yes, but match-data (the function) returns updated positions, which
> behave like markers across the replace-match call.

But the positions have already been updated.  So

    (match-data t)

would also return those updated positions, even though it doesn't
use markers.


        Stefan






^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#67124: 26.3; query-replace Arg out of range with comma option (at end-buffer)
  2023-11-16 16:51       ` Eli Zaretskii
@ 2023-11-16 18:23         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-11-18 10:18           ` Eli Zaretskii
  0 siblings, 1 reply; 24+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-11-16 18:23 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gabriele, 67124

>> > So, I tried the patch below, which makes sense to my superficial
>> > understanding of the problem, but it apparently doesn't fix the problem
>> > in the OP's recipe, so I'm clearly missing something.
>> Hmm... not sure what happened, because I just tried it again and it does
>> seem to fix the OP's problem.
> Are you sure?

Well, since it didn't first time I tried, no, I'm not sure.
But the more I look at it, the more I'm sure that it was just a pilot
error first time around.

>> diff --git a/lisp/replace.el b/lisp/replace.el
>> index 7fec54ecb27..acf6b8a4652 100644
>> --- a/lisp/replace.el
>> +++ b/lisp/replace.el
>> @@ -2642,16 +2642,9 @@ replace-match-maybe-edit
>>  	    noedit nil)))
>>    (set-match-data match-data)
>
> Why do we still need this line?

IIUC this line is there because the code between the original regexp
match and the `replace-match` itself is large and thus likely to mess
the match data.

>>    ;; `replace-match' leaves point at the end of the replacement text,
>>    ;; so move point to the beginning when replacing backward.
>> -  (when backward (goto-char (nth 0 match-data)))
>> +  (when backward (goto-char (match-beginning 0)))
>
> Why this unrelated change? is something wrong with the original code?

No, indeed, it's unrelated and not really better.

>> --- a/src/search.c
>> +++ b/src/search.c
>> @@ -3142,9 +3142,16 @@ update_search_regs (ptrdiff_t oldstart, ptrdiff_t oldend, ptrdiff_t newend)
>>  
>>    for (i = 0; i < search_regs.num_regs; i++)
>>      {
>> -      if (search_regs.start[i] >= oldend)
>> +      if (search_regs.start[i] <= oldstart)
>> +        /* If the subgroup that `replace-match` is modifying encloses the
>> +           subgroup `i`, then its `start` position should stay unchanged.
>> +           That's always true for subgroup 0.
>
> I've read this part of the comment many times, and I still don't
> understand what you are trying to convey there, and thus don't
> understand the new 'if' clause you added.  In particular, how come
> this was never something brought to our attention? the comment seems
> to imply that replace-match was somehow badly broken since forever.

`replace-match` goes through the trouble of trying to keep the
match-data valid by moving the subgroup positions.

The way it does it currently is sometimes flat out incorrect and other
times debatable.  For end positions, we do:

      if (search_regs.end[i] >= oldend)
        search_regs.end[i] += change;
      else if (search_regs.end[i] > oldstart)
        search_regs.end[i] = oldstart;

which means that positions beyond (or at the end of) the change are
moved appropriately, positions before (or at the beginning of) the change
are left untouched and positions strictly within the change are moved to
the beginning of the change.
If the position is both at start and end at the same time, we move them,
assuming that they're more at the end than at the start.  For subgroup
0, this is definitely correct.  For other subgroups, it depends on the
inclusion or relative position of that subgroup with the subgroup that
`replace-match` is replacing.  E.g. if I matched
"\\(\\)\\(\\(\\)\\)\\(\\)" and I replace subgroup 3 with "hi", then the
ideal behavior would be to move the end of subgroups 0, 2, 3, and 4 but
not 1.

For start positions, we do the same:

      if (search_regs.start[i] >= oldend)
        search_regs.start[i] += change;
      else if (search_regs.start[i] > oldstart)
        search_regs.start[i] = oldstart;

When oldstart < oldend this works OK, but when they're equal, moving the
start position is definitely wrong for subgroup 0 and is often wrong for
others as well.  In the above regexp example, we should ideally move the
start position of subgroup 4 only and we should leave the other
ones unchanged.

`update_search_regs` doesn't have enough information to figure out which
subgroup is before/after/around/within some other subgroup and it isn't
even told which subgroup is being updated by `replace-match`, so the
best it can do is to either move them all or move none.  The current
code moves them all, which is always wrong for start[0].
My patch intends to make the other choice so it will get it wrong for
start[4] in the above example (just like the current code gets it wrong
for end[1]) but it will get it right for the other ones.

> And please don't use Markdown's markup in our code.

Oops, thanks for catching it.


        Stefan






^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#67124: 26.3; query-replace Arg out of range with comma option (at end-buffer)
  2023-11-16 18:01               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-11-16 18:34                 ` Eli Zaretskii
  0 siblings, 0 replies; 24+ messages in thread
From: Eli Zaretskii @ 2023-11-16 18:34 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: gabriele, 67124

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: gabriele@medialab.sissa.it,  67124@debbugs.gnu.org
> Date: Thu, 16 Nov 2023 13:01:27 -0500
> 
> >> > What are you trying to understand?
> >> 
> >> Why you're saying that in
> >> 
> >>     ;; `replace-match' leaves point at the end of the replacement text,
> >>     ;; so move point to the beginning when replacing backward.
> >>     (when backward (goto-char (nth 0 match-data)))
> >> 
> >> it is not true that
> >> 
> >>     and (nth 0 match-data) == (match-beginning 0), no?
> >
> > Because of markers vs positions, as I've tried to explain.  The
> > difference is minor, of course.
> 
> But `goto-char` doesn't care about that difference, and no buffer will
> be changed between the time we call `match-data` (thus creating the
> markers) and the time we use those markers, so going through markers is
> just extra work for no benefit.
> 
> IOW, I still don't understand how "markers vs positions" is relevant in
> 
>     (when backward (goto-char (nth 0 match-data)))
> 
> >> > What my kludge did is simply use a marker, so the adjusted position is
> >> > not clobbered.
> >> 
> >> I don't see that.  E.g. if you change your code from
> >> 
> >>     (set-match-data (list (car match-data) (nth 1 (match-data))))
> >> to
> >>     (set-match-data (list (car match-data) (nth 1 (match-data t))))
> >> 
> >> it fixes the problem just as well, AFAICT.
> >
> > Yes, but match-data (the function) returns updated positions, which
> > behave like markers across the replace-match call.
> 
> But the positions have already been updated.  So
> 
>     (match-data t)
> 
> would also return those updated positions, even though it doesn't
> use markers.

We are both saying the same things, just past each other.  There's no
disagreement or argument here between us.





^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#67124: 26.3; query-replace Arg out of range with comma option (at end-buffer)
  2023-11-16 18:23         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-11-18 10:18           ` Eli Zaretskii
  2023-11-18 16:44             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 24+ messages in thread
From: Eli Zaretskii @ 2023-11-18 10:18 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: gabriele, 67124

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: gabriele@medialab.sissa.it,  67124@debbugs.gnu.org
> Date: Thu, 16 Nov 2023 13:23:48 -0500
> 
> >> --- a/src/search.c
> >> +++ b/src/search.c
> >> @@ -3142,9 +3142,16 @@ update_search_regs (ptrdiff_t oldstart, ptrdiff_t oldend, ptrdiff_t newend)
> >>  
> >>    for (i = 0; i < search_regs.num_regs; i++)
> >>      {
> >> -      if (search_regs.start[i] >= oldend)
> >> +      if (search_regs.start[i] <= oldstart)
> >> +        /* If the subgroup that `replace-match` is modifying encloses the
> >> +           subgroup `i`, then its `start` position should stay unchanged.
> >> +           That's always true for subgroup 0.
> >
> > I've read this part of the comment many times, and I still don't
> > understand what you are trying to convey there, and thus don't
> > understand the new 'if' clause you added.  In particular, how come
> > this was never something brought to our attention? the comment seems
> > to imply that replace-match was somehow badly broken since forever.
> 
> `replace-match` goes through the trouble of trying to keep the
> match-data valid by moving the subgroup positions.
> 
> The way it does it currently is sometimes flat out incorrect and other
> times debatable.  For end positions, we do:
> 
>       if (search_regs.end[i] >= oldend)
>         search_regs.end[i] += change;
>       else if (search_regs.end[i] > oldstart)
>         search_regs.end[i] = oldstart;
> 
> which means that positions beyond (or at the end of) the change are
> moved appropriately, positions before (or at the beginning of) the change
> are left untouched and positions strictly within the change are moved to
> the beginning of the change.
> If the position is both at start and end at the same time, we move them,
> assuming that they're more at the end than at the start.  For subgroup
> 0, this is definitely correct.  For other subgroups, it depends on the
> inclusion or relative position of that subgroup with the subgroup that
> `replace-match` is replacing.  E.g. if I matched
> "\\(\\)\\(\\(\\)\\)\\(\\)" and I replace subgroup 3 with "hi", then the
> ideal behavior would be to move the end of subgroups 0, 2, 3, and 4 but
> not 1.
> 
> For start positions, we do the same:
> 
>       if (search_regs.start[i] >= oldend)
>         search_regs.start[i] += change;
>       else if (search_regs.start[i] > oldstart)
>         search_regs.start[i] = oldstart;
> 
> When oldstart < oldend this works OK, but when they're equal, moving the
> start position is definitely wrong for subgroup 0 and is often wrong for
> others as well.  In the above regexp example, we should ideally move the
> start position of subgroup 4 only and we should leave the other
> ones unchanged.
> 
> `update_search_regs` doesn't have enough information to figure out which
> subgroup is before/after/around/within some other subgroup and it isn't
> even told which subgroup is being updated by `replace-match`, so the
> best it can do is to either move them all or move none.  The current
> code moves them all, which is always wrong for start[0].
> My patch intends to make the other choice so it will get it wrong for
> start[4] in the above example (just like the current code gets it wrong
> for end[1]) but it will get it right for the other ones.

Please include at least some, if not all, of this extended explanation
in the comments there.

And I think you can install your changes on master.

Thanks.





^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#67124: 26.3; query-replace Arg out of range with comma option (at end-buffer)
  2023-11-18 10:18           ` Eli Zaretskii
@ 2023-11-18 16:44             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-11-18 21:35               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 24+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-11-18 16:44 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gabriele, 67124

> Please include at least some, if not all, of this extended explanation
> in the comments there.

Will do, thanks.  Will add some tests as well, which should also clarify
the intent/problem.


        Stefan






^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#67124: 26.3; query-replace Arg out of range with comma option (at end-buffer)
  2023-11-18 16:44             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-11-18 21:35               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 24+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-11-18 21:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gabriele, 67124-done

>> Please include at least some, if not all, of this extended explanation
>> in the comments there.
> Will do, thanks.  Will add some tests as well, which should also clarify
> the intent/problem.

Done,


        Stefan






^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2023-11-18 21:35 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-11 19:40 bug#67124: 26.3; query-replace Arg out of range with comma option (at end-buffer) Gabriele Nicolardi
2023-11-12  9:48 ` Eli Zaretskii
2023-11-13  3:56   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-11-13 14:17     ` Eli Zaretskii
2023-11-16 14:29       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-11-16 14:55         ` Eli Zaretskii
2023-11-16 16:15           ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-11-16 16:56             ` Eli Zaretskii
2023-11-16 18:01               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-11-16 18:34                 ` Eli Zaretskii
2023-11-16 15:35     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-11-16 16:51       ` Eli Zaretskii
2023-11-16 18:23         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-11-18 10:18           ` Eli Zaretskii
2023-11-18 16:44             ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-11-18 21:35               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
     [not found]   ` <ed11baa2-cf89-4a72-91d0-8f26c0af4126@medialab.sissa.it>
2023-11-13 14:06     ` Eli Zaretskii
2023-11-13 14:21       ` Dmitry Gutov
2023-11-13 14:38         ` Eli Zaretskii
2023-11-13 14:45           ` Dmitry Gutov
2023-11-13 15:39             ` Gabriele Nicolardi
2023-11-16 14:45               ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-11-16 16:27                 ` Gabriele Nicolardi
2023-11-15 13:18   ` Eli Zaretskii

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).