* bug#74666: 31.0.50; Regression in replace-match with empty-adjacent groups
@ 2024-12-03 10:56 Campbell Barton
2024-12-03 14:05 ` Eli Zaretskii
2024-12-14 16:11 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
0 siblings, 2 replies; 6+ messages in thread
From: Campbell Barton @ 2024-12-03 10:56 UTC (permalink / raw)
To: 74666
Run the following file as an executable script:
emacs -Q --batch -l ./emacs-test.el
;; BEGIN SCRIPT:
#!/usr/bin/env -S emacs --batch -l
;;; emacs-git-log --- Batch check emacs-lisp. -*- lexical-binding: t -*-
(defun printf (&rest args)
(princ (apply #'format args) #'external-debugging-output))
(defun test-me (is-forward)
(let ((result ""))
(with-temp-buffer
(insert "__B_\n")
(save-match-data
(set-match-data (list 2 4 2 2 2 4))
(cond
(is-forward
(replace-match "HELLO" t t nil 1)
(replace-match "WORLD" t t nil 2))
(t
(replace-match "WORLD" t t nil 2)
(replace-match "HELLO" t t nil 1))))
(setq result (buffer-substring-no-properties (point-min)
(point-max))))
result))
(printf "A: %s" (test-me t))
(printf "B: %s" (test-me nil))
;; END SCRIPT:
In emacs 29.4 this prints:
A: _HELLOWORLD_
B: _HELLOWORLD_
In emacs 31.0.50 this prints:
A: _WORLD_
B: _HELLOWORLD_
This is a regression in 47b497b4dac91e5ea56102018223bdeb5e21a93b.
In GNU Emacs 31.0.50 (build 1, x86_64-pc-linux-gnu, GTK+ Version
3.24.43, cairo version 1.18.2) of 2024-12-03 built on
campbell-trx40aorusmaster
Repository revision: 04d035acd72823e82dd61c4894f9c09113e65dd6
Repository branch: master
System Description: Arch Linux
Configured using:
'configure --prefix=/opt/emacs --with-sound=no --disable-acl
--with-file-notification=no --with-xpm --with-native-compilation
--with-jpeg --with-tiff --with-gif --with-png
--without-compress-install --without-rsvg --without-lcms2
--without-libsystemd --without-gsettings --without-harfbuzz
--without-m17n-flt --without-libotf --without-gpm --without-dbus
--without-gsettings --without-gconf --without-selinux --without-gnutls
--without-makeinfo --without-libgmp --with-zlib --with-threads
--with-x-toolkit=gtk3 --with-xft --with-pgtk --with-tree-sitter
--with-modules 'CFLAGS=-O3 -mtune=native -march=native -pipe
-fomit-frame-pointer''
Configured features:
CAIRO FREETYPE GIF GLIB JPEG LIBXML2 MODULES NATIVE_COMP PDUMPER PGTK
PNG SECCOMP SQLITE3 THREADS TIFF TOOLKIT_SCROLL_BARS TREE_SITTER WEBP
XIM GTK3 ZLIB
Important settings:
value of $LC_MONETARY: en_US.UTF-8
value of $LC_NUMERIC: en_US.UTF-8
value of $LC_TIME: en_US.UTF-8
value of $LANG: en_US.UTF-8
value of $XMODIFIERS: @im=ibus
locale-coding-system: utf-8-unix
Major mode: Lisp Interaction
Minor modes in effect:
tooltip-mode: t
global-eldoc-mode: t
eldoc-mode: t
show-paren-mode: t
electric-indent-mode: t
mouse-wheel-mode: t
tool-bar-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
minibuffer-regexp-mode: t
line-number-mode: t
indent-tabs-mode: t
transient-mark-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
Load-path shadows:
None found.
Features:
(shadow sort mail-extr emacsbug message mailcap yank-media puny dired
dired-loaddefs rfc822 mml mml-sec password-cache epa derived epg rfc6068
epg-config gnus-util text-property-search time-date subr-x mm-decode
mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader
cl-loaddefs cl-lib sendmail rfc2047 rfc2045 ietf-drums mm-util
mail-prsvr mail-utils rmc iso-transl tooltip cconv eldoc paren electric
uniquify ediff-hook vc-hooks lisp-float-type elisp-mode mwheel
term/pgtk-win pgtk-win term/common-win touch-screen pgtk-dnd tool-bar
dnd fontset image regexp-opt fringe tabulated-list replace newcomment
text-mode lisp-mode prog-mode register page tab-bar menu-bar rfn-eshadow
isearch easymenu timer select scroll-bar mouse jit-lock font-lock syntax
font-core term/tty-colors frame minibuffer nadvice seq simple cl-generic
indonesian philippine cham georgian utf-8-lang misc-lang vietnamese
tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek
romanian slovak czech european ethiopic indian cyrillic chinese
composite emoji-zwj charscript charprop case-table epa-hook
jka-cmpr-hook help abbrev obarray oclosure cl-preloaded button loaddefs
theme-loaddefs faces cus-face macroexp files window text-properties
overlay sha1 md5 base64 format env code-pages mule custom widget keymap
hashtable-print-readable backquote threads dynamic-setting
font-render-setting cairo gtk pgtk multi-tty move-toolbar
make-network-process native-compile emacs)
Memory information:
((conses 16 53518 10389) (symbols 48 5294 0) (strings 32 18752 1367)
(string-bytes 1 607905) (vectors 16 9151)
(vector-slots 8 128466 8940) (floats 8 22 2) (intervals 56 268 3)
(buffers 992 10))
^ permalink raw reply [flat|nested] 6+ messages in thread
* bug#74666: 31.0.50; Regression in replace-match with empty-adjacent groups
2024-12-03 10:56 bug#74666: 31.0.50; Regression in replace-match with empty-adjacent groups Campbell Barton
@ 2024-12-03 14:05 ` Eli Zaretskii
2024-12-14 9:43 ` Eli Zaretskii
2024-12-14 16:11 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
1 sibling, 1 reply; 6+ messages in thread
From: Eli Zaretskii @ 2024-12-03 14:05 UTC (permalink / raw)
To: Campbell Barton, Stefan Monnier; +Cc: 74666
> Date: Tue, 3 Dec 2024 21:56:02 +1100
> From: Campbell Barton <ideasman42@gmail.com>
>
>
> Run the following file as an executable script:
>
> emacs -Q --batch -l ./emacs-test.el
>
> ;; BEGIN SCRIPT:
> #!/usr/bin/env -S emacs --batch -l
> ;;; emacs-git-log --- Batch check emacs-lisp. -*- lexical-binding: t -*-
>
> (defun printf (&rest args)
> (princ (apply #'format args) #'external-debugging-output))
>
> (defun test-me (is-forward)
> (let ((result ""))
> (with-temp-buffer
> (insert "__B_\n")
> (save-match-data
> (set-match-data (list 2 4 2 2 2 4))
> (cond
> (is-forward
> (replace-match "HELLO" t t nil 1)
> (replace-match "WORLD" t t nil 2))
> (t
> (replace-match "WORLD" t t nil 2)
> (replace-match "HELLO" t t nil 1))))
> (setq result (buffer-substring-no-properties (point-min)
> (point-max))))
> result))
>
> (printf "A: %s" (test-me t))
> (printf "B: %s" (test-me nil))
> ;; END SCRIPT:
>
> In emacs 29.4 this prints:
>
> A: _HELLOWORLD_
> B: _HELLOWORLD_
>
> In emacs 31.0.50 this prints:
>
> A: _WORLD_
> B: _HELLOWORLD_
>
> This is a regression in 47b497b4dac91e5ea56102018223bdeb5e21a93b.
Stefan, could you please take a look?
^ permalink raw reply [flat|nested] 6+ messages in thread
* bug#74666: 31.0.50; Regression in replace-match with empty-adjacent groups
2024-12-03 14:05 ` Eli Zaretskii
@ 2024-12-14 9:43 ` Eli Zaretskii
0 siblings, 0 replies; 6+ messages in thread
From: Eli Zaretskii @ 2024-12-14 9:43 UTC (permalink / raw)
To: monnier; +Cc: 74666, ideasman42
Ping!
> Cc: 74666@debbugs.gnu.org
> Date: Tue, 03 Dec 2024 16:05:40 +0200
> From: Eli Zaretskii <eliz@gnu.org>
>
> > Date: Tue, 3 Dec 2024 21:56:02 +1100
> > From: Campbell Barton <ideasman42@gmail.com>
> >
> >
> > Run the following file as an executable script:
> >
> > emacs -Q --batch -l ./emacs-test.el
> >
> > ;; BEGIN SCRIPT:
> > #!/usr/bin/env -S emacs --batch -l
> > ;;; emacs-git-log --- Batch check emacs-lisp. -*- lexical-binding: t -*-
> >
> > (defun printf (&rest args)
> > (princ (apply #'format args) #'external-debugging-output))
> >
> > (defun test-me (is-forward)
> > (let ((result ""))
> > (with-temp-buffer
> > (insert "__B_\n")
> > (save-match-data
> > (set-match-data (list 2 4 2 2 2 4))
> > (cond
> > (is-forward
> > (replace-match "HELLO" t t nil 1)
> > (replace-match "WORLD" t t nil 2))
> > (t
> > (replace-match "WORLD" t t nil 2)
> > (replace-match "HELLO" t t nil 1))))
> > (setq result (buffer-substring-no-properties (point-min)
> > (point-max))))
> > result))
> >
> > (printf "A: %s" (test-me t))
> > (printf "B: %s" (test-me nil))
> > ;; END SCRIPT:
> >
> > In emacs 29.4 this prints:
> >
> > A: _HELLOWORLD_
> > B: _HELLOWORLD_
> >
> > In emacs 31.0.50 this prints:
> >
> > A: _WORLD_
> > B: _HELLOWORLD_
> >
> > This is a regression in 47b497b4dac91e5ea56102018223bdeb5e21a93b.
>
> Stefan, could you please take a look?
>
>
>
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* bug#74666: 31.0.50; Regression in replace-match with empty-adjacent groups
2024-12-03 10:56 bug#74666: 31.0.50; Regression in replace-match with empty-adjacent groups Campbell Barton
2024-12-03 14:05 ` Eli Zaretskii
@ 2024-12-14 16:11 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-12-15 1:10 ` Campbell Barton
1 sibling, 1 reply; 6+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-12-14 16:11 UTC (permalink / raw)
To: Campbell Barton; +Cc: 74666
> (defun test-me (is-forward)
> (let ((result ""))
> (with-temp-buffer
> (insert "__B_\n")
> (save-match-data
> (set-match-data (list 2 4 2 2 2 4))
> (cond
> (is-forward
> (replace-match "HELLO" t t nil 1)
> (replace-match "WORLD" t t nil 2))
> (t
> (replace-match "WORLD" t t nil 2)
> (replace-match "HELLO" t t nil 1))))
> (setq result (buffer-substring-no-properties (point-min)
> (point-max))))
> result))
[...]
> In emacs 29.4 this prints:
>
> A: _HELLOWORLD_
> B: _HELLOWORLD_
>
> In emacs 31.0.50 this prints:
>
> A: _WORLD_
> B: _HELLOWORLD_
The problem is that the `set-match-data` doesn't give us any information
about the intended inclusion relationship between the subgroups.
I agree that the behavior you see is not the one you want if it's the
result of:
(goto-char (point-min))
(looking-at "_\\(\\)\\(_B\\)")
But OTOH it is the one we want if it is the result of:
(goto-char (point-min))
(looking-at "_\\(?2:\\(?1:\\)_B\\)")
We can try and guess the inclusion relationship based on circumstantial
evidence (e.g. a "_\\(\\)\\(_B\\)" regexp is more likely than
"_\\(?2:\\(?1:\\)_B\\)"), but that would make the code of
`update_search_regs` tricky, with various heuristics.
And we'll never handle all cases right unless we make significant
changes to the match-data (and the regexp compiler) to keep track of
inclusion relationships.
Could you give us some information about the larger context in which you
bumped into this problem?
Stefan
^ permalink raw reply [flat|nested] 6+ messages in thread
* bug#74666: 31.0.50; Regression in replace-match with empty-adjacent groups
2024-12-14 16:11 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-12-15 1:10 ` Campbell Barton
2024-12-17 3:18 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
0 siblings, 1 reply; 6+ messages in thread
From: Campbell Barton @ 2024-12-15 1:10 UTC (permalink / raw)
To: Stefan Monnier; +Cc: 74666
On 24-12-15 3:11 AM, Stefan Monnier wrote:
>> (defun test-me (is-forward)
>> (let ((result ""))
>> (with-temp-buffer
>> (insert "__B_\n")
>> (save-match-data
>> (set-match-data (list 2 4 2 2 2 4))
>> (cond
>> (is-forward
>> (replace-match "HELLO" t t nil 1)
>> (replace-match "WORLD" t t nil 2))
>> (t
>> (replace-match "WORLD" t t nil 2)
>> (replace-match "HELLO" t t nil 1))))
>> (setq result (buffer-substring-no-properties (point-min)
>> (point-max))))
>> result))
> [...]
>> In emacs 29.4 this prints:
>>
>> A: _HELLOWORLD_
>> B: _HELLOWORLD_
>>
>> In emacs 31.0.50 this prints:
>>
>> A: _WORLD_
>> B: _HELLOWORLD_
>
> The problem is that the `set-match-data` doesn't give us any information
> about the intended inclusion relationship between the subgroups.
>
> I agree that the behavior you see is not the one you want if it's the
> result of:
>
> (goto-char (point-min))
> (looking-at "_\\(\\)\\(_B\\)")
>
> But OTOH it is the one we want if it is the result of:
>
> (goto-char (point-min))
> (looking-at "_\\(?2:\\(?1:\\)_B\\)")
>
> We can try and guess the inclusion relationship based on circumstantial
> evidence (e.g. a "_\\(\\)\\(_B\\)" regexp is more likely than
> "_\\(?2:\\(?1:\\)_B\\)"), but that would make the code of
> `update_search_regs` tricky, with various heuristics.
> And we'll never handle all cases right unless we make significant
> changes to the match-data (and the regexp compiler) to keep track of
> inclusion relationships.
>
> Could you give us some information about the larger context in which you
> bumped into this problem?
On the user side - I ran into this bug when decrementing numbers broke
for me in the evil-numbers package [0]. Numbers would fail to become
negative. Decrementing 0 would become 1.
In this case, the match data is set with `set-match-data' using
calculated ranges.
Since this used to work I think it's reasonable to consider it a regression.
I've since committed a workaround to evil-numbers [1], although I'd
suspect this would impact others.
[0]: https://melpa.org/#/evil-numbers
[1]:
https://github.com/juliapath/evil-numbers/commit/f93258b706fa5cf9259e815c2d8258fcc6262804
^ permalink raw reply [flat|nested] 6+ messages in thread
* bug#74666: 31.0.50; Regression in replace-match with empty-adjacent groups
2024-12-15 1:10 ` Campbell Barton
@ 2024-12-17 3:18 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
0 siblings, 0 replies; 6+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-12-17 3:18 UTC (permalink / raw)
To: Campbell Barton; +Cc: 74666
> In this case, the match data is set with `set-match-data' using
> calculated ranges.
I guess we can take this as good thing: it means complexifying the
regexp code would be wasted. 🙂
> Since this used to work I think it's reasonable to consider it a regression.
I was not trying to say it's not a regression. Just pouting because the
old behavior was just a lucky accident and recovering it without losing
the other improvement isn't completely straightforward.
> I've since committed a workaround to evil-numbers [1], although I'd suspect
> this would impact others.
[ And the workaround relies on another lucky accident: the "easiest"
fix for the problem would break your workaround. 🙁 ]
Stefan
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-12-17 3:18 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-12-03 10:56 bug#74666: 31.0.50; Regression in replace-match with empty-adjacent groups Campbell Barton
2024-12-03 14:05 ` Eli Zaretskii
2024-12-14 9:43 ` Eli Zaretskii
2024-12-14 16:11 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-12-15 1:10 ` Campbell Barton
2024-12-17 3:18 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
Code repositories for project(s) associated with this external index
https://git.savannah.gnu.org/cgit/emacs.git
https://git.savannah.gnu.org/cgit/emacs/org-mode.git
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.