all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
* bug#74666: 31.0.50; Regression in replace-match with empty-adjacent groups
@ 2024-12-03 10:56 Campbell Barton
  2024-12-03 14:05 ` Eli Zaretskii
  2024-12-14 16:11 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 2 replies; 6+ messages in thread
From: Campbell Barton @ 2024-12-03 10:56 UTC (permalink / raw)
  To: 74666


Run the following file as an executable script:

emacs -Q --batch -l ./emacs-test.el

;; BEGIN SCRIPT:
#!/usr/bin/env -S emacs --batch -l
;;; emacs-git-log --- Batch check emacs-lisp. -*- lexical-binding: t -*-

(defun printf (&rest args)
   (princ (apply #'format args) #'external-debugging-output))

(defun test-me (is-forward)
   (let ((result ""))
     (with-temp-buffer
       (insert "__B_\n")
       (save-match-data
         (set-match-data (list 2 4 2 2 2 4))
         (cond
          (is-forward
           (replace-match "HELLO" t t nil 1)
           (replace-match "WORLD" t t nil 2))
          (t
           (replace-match "WORLD" t t nil 2)
           (replace-match "HELLO" t t nil 1))))
       (setq result (buffer-substring-no-properties (point-min) 
(point-max))))
     result))

(printf "A: %s" (test-me t))
(printf "B: %s" (test-me nil))
;; END SCRIPT:

In emacs 29.4 this prints:

A: _HELLOWORLD_
B: _HELLOWORLD_

In emacs 31.0.50 this prints:

A: _WORLD_
B: _HELLOWORLD_

This is a regression in 47b497b4dac91e5ea56102018223bdeb5e21a93b.

In GNU Emacs 31.0.50 (build 1, x86_64-pc-linux-gnu, GTK+ Version
  3.24.43, cairo version 1.18.2) of 2024-12-03 built on
  campbell-trx40aorusmaster
Repository revision: 04d035acd72823e82dd61c4894f9c09113e65dd6
Repository branch: master
System Description: Arch Linux

Configured using:
  'configure --prefix=/opt/emacs --with-sound=no --disable-acl
  --with-file-notification=no --with-xpm --with-native-compilation
  --with-jpeg --with-tiff --with-gif --with-png
  --without-compress-install --without-rsvg --without-lcms2
  --without-libsystemd --without-gsettings --without-harfbuzz
  --without-m17n-flt --without-libotf --without-gpm --without-dbus
  --without-gsettings --without-gconf --without-selinux --without-gnutls
  --without-makeinfo --without-libgmp --with-zlib --with-threads
  --with-x-toolkit=gtk3 --with-xft --with-pgtk --with-tree-sitter
  --with-modules 'CFLAGS=-O3 -mtune=native -march=native -pipe
  -fomit-frame-pointer''

Configured features:
CAIRO FREETYPE GIF GLIB JPEG LIBXML2 MODULES NATIVE_COMP PDUMPER PGTK
PNG SECCOMP SQLITE3 THREADS TIFF TOOLKIT_SCROLL_BARS TREE_SITTER WEBP
XIM GTK3 ZLIB

Important settings:
   value of $LC_MONETARY: en_US.UTF-8
   value of $LC_NUMERIC: en_US.UTF-8
   value of $LC_TIME: en_US.UTF-8
   value of $LANG: en_US.UTF-8
   value of $XMODIFIERS: @im=ibus
   locale-coding-system: utf-8-unix

Major mode: Lisp Interaction

Minor modes in effect:
   tooltip-mode: t
   global-eldoc-mode: t
   eldoc-mode: t
   show-paren-mode: t
   electric-indent-mode: t
   mouse-wheel-mode: t
   tool-bar-mode: t
   menu-bar-mode: t
   file-name-shadow-mode: t
   global-font-lock-mode: t
   font-lock-mode: t
   blink-cursor-mode: t
   minibuffer-regexp-mode: t
   line-number-mode: t
   indent-tabs-mode: t
   transient-mark-mode: t
   auto-composition-mode: t
   auto-encryption-mode: t
   auto-compression-mode: t

Load-path shadows:
None found.

Features:
(shadow sort mail-extr emacsbug message mailcap yank-media puny dired
dired-loaddefs rfc822 mml mml-sec password-cache epa derived epg rfc6068
epg-config gnus-util text-property-search time-date subr-x mm-decode
mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader
cl-loaddefs cl-lib sendmail rfc2047 rfc2045 ietf-drums mm-util
mail-prsvr mail-utils rmc iso-transl tooltip cconv eldoc paren electric
uniquify ediff-hook vc-hooks lisp-float-type elisp-mode mwheel
term/pgtk-win pgtk-win term/common-win touch-screen pgtk-dnd tool-bar
dnd fontset image regexp-opt fringe tabulated-list replace newcomment
text-mode lisp-mode prog-mode register page tab-bar menu-bar rfn-eshadow
isearch easymenu timer select scroll-bar mouse jit-lock font-lock syntax
font-core term/tty-colors frame minibuffer nadvice seq simple cl-generic
indonesian philippine cham georgian utf-8-lang misc-lang vietnamese
tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek
romanian slovak czech european ethiopic indian cyrillic chinese
composite emoji-zwj charscript charprop case-table epa-hook
jka-cmpr-hook help abbrev obarray oclosure cl-preloaded button loaddefs
theme-loaddefs faces cus-face macroexp files window text-properties
overlay sha1 md5 base64 format env code-pages mule custom widget keymap
hashtable-print-readable backquote threads dynamic-setting
font-render-setting cairo gtk pgtk multi-tty move-toolbar
make-network-process native-compile emacs)

Memory information:
((conses 16 53518 10389) (symbols 48 5294 0) (strings 32 18752 1367)
  (string-bytes 1 607905) (vectors 16 9151)
  (vector-slots 8 128466 8940) (floats 8 22 2) (intervals 56 268 3)
  (buffers 992 10))





^ permalink raw reply	[flat|nested] 6+ messages in thread

* bug#74666: 31.0.50; Regression in replace-match with empty-adjacent groups
  2024-12-03 10:56 bug#74666: 31.0.50; Regression in replace-match with empty-adjacent groups Campbell Barton
@ 2024-12-03 14:05 ` Eli Zaretskii
  2024-12-14  9:43   ` Eli Zaretskii
  2024-12-14 16:11 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 6+ messages in thread
From: Eli Zaretskii @ 2024-12-03 14:05 UTC (permalink / raw)
  To: Campbell Barton, Stefan Monnier; +Cc: 74666

> Date: Tue, 3 Dec 2024 21:56:02 +1100
> From: Campbell Barton <ideasman42@gmail.com>
> 
> 
> Run the following file as an executable script:
> 
> emacs -Q --batch -l ./emacs-test.el
> 
> ;; BEGIN SCRIPT:
> #!/usr/bin/env -S emacs --batch -l
> ;;; emacs-git-log --- Batch check emacs-lisp. -*- lexical-binding: t -*-
> 
> (defun printf (&rest args)
>    (princ (apply #'format args) #'external-debugging-output))
> 
> (defun test-me (is-forward)
>    (let ((result ""))
>      (with-temp-buffer
>        (insert "__B_\n")
>        (save-match-data
>          (set-match-data (list 2 4 2 2 2 4))
>          (cond
>           (is-forward
>            (replace-match "HELLO" t t nil 1)
>            (replace-match "WORLD" t t nil 2))
>           (t
>            (replace-match "WORLD" t t nil 2)
>            (replace-match "HELLO" t t nil 1))))
>        (setq result (buffer-substring-no-properties (point-min) 
> (point-max))))
>      result))
> 
> (printf "A: %s" (test-me t))
> (printf "B: %s" (test-me nil))
> ;; END SCRIPT:
> 
> In emacs 29.4 this prints:
> 
> A: _HELLOWORLD_
> B: _HELLOWORLD_
> 
> In emacs 31.0.50 this prints:
> 
> A: _WORLD_
> B: _HELLOWORLD_
> 
> This is a regression in 47b497b4dac91e5ea56102018223bdeb5e21a93b.

Stefan, could you please take a look?





^ permalink raw reply	[flat|nested] 6+ messages in thread

* bug#74666: 31.0.50; Regression in replace-match with empty-adjacent groups
  2024-12-03 14:05 ` Eli Zaretskii
@ 2024-12-14  9:43   ` Eli Zaretskii
  0 siblings, 0 replies; 6+ messages in thread
From: Eli Zaretskii @ 2024-12-14  9:43 UTC (permalink / raw)
  To: monnier; +Cc: 74666, ideasman42

Ping!

> Cc: 74666@debbugs.gnu.org
> Date: Tue, 03 Dec 2024 16:05:40 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> 
> > Date: Tue, 3 Dec 2024 21:56:02 +1100
> > From: Campbell Barton <ideasman42@gmail.com>
> > 
> > 
> > Run the following file as an executable script:
> > 
> > emacs -Q --batch -l ./emacs-test.el
> > 
> > ;; BEGIN SCRIPT:
> > #!/usr/bin/env -S emacs --batch -l
> > ;;; emacs-git-log --- Batch check emacs-lisp. -*- lexical-binding: t -*-
> > 
> > (defun printf (&rest args)
> >    (princ (apply #'format args) #'external-debugging-output))
> > 
> > (defun test-me (is-forward)
> >    (let ((result ""))
> >      (with-temp-buffer
> >        (insert "__B_\n")
> >        (save-match-data
> >          (set-match-data (list 2 4 2 2 2 4))
> >          (cond
> >           (is-forward
> >            (replace-match "HELLO" t t nil 1)
> >            (replace-match "WORLD" t t nil 2))
> >           (t
> >            (replace-match "WORLD" t t nil 2)
> >            (replace-match "HELLO" t t nil 1))))
> >        (setq result (buffer-substring-no-properties (point-min) 
> > (point-max))))
> >      result))
> > 
> > (printf "A: %s" (test-me t))
> > (printf "B: %s" (test-me nil))
> > ;; END SCRIPT:
> > 
> > In emacs 29.4 this prints:
> > 
> > A: _HELLOWORLD_
> > B: _HELLOWORLD_
> > 
> > In emacs 31.0.50 this prints:
> > 
> > A: _WORLD_
> > B: _HELLOWORLD_
> > 
> > This is a regression in 47b497b4dac91e5ea56102018223bdeb5e21a93b.
> 
> Stefan, could you please take a look?
> 
> 
> 
> 





^ permalink raw reply	[flat|nested] 6+ messages in thread

* bug#74666: 31.0.50; Regression in replace-match with empty-adjacent groups
  2024-12-03 10:56 bug#74666: 31.0.50; Regression in replace-match with empty-adjacent groups Campbell Barton
  2024-12-03 14:05 ` Eli Zaretskii
@ 2024-12-14 16:11 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-12-15  1:10   ` Campbell Barton
  1 sibling, 1 reply; 6+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-12-14 16:11 UTC (permalink / raw)
  To: Campbell Barton; +Cc: 74666

> (defun test-me (is-forward)
>   (let ((result ""))
>     (with-temp-buffer
>       (insert "__B_\n")
>       (save-match-data
>         (set-match-data (list 2 4 2 2 2 4))
>         (cond
>          (is-forward
>           (replace-match "HELLO" t t nil 1)
>           (replace-match "WORLD" t t nil 2))
>          (t
>           (replace-match "WORLD" t t nil 2)
>           (replace-match "HELLO" t t nil 1))))
>       (setq result (buffer-substring-no-properties (point-min)
>        (point-max))))
>     result))
[...]
> In emacs 29.4 this prints:
>
> A: _HELLOWORLD_
> B: _HELLOWORLD_
>
> In emacs 31.0.50 this prints:
>
> A: _WORLD_
> B: _HELLOWORLD_

The problem is that the `set-match-data` doesn't give us any information
about the intended inclusion relationship between the subgroups.

I agree that the behavior you see is not the one you want if it's the
result of:

    (goto-char (point-min))
    (looking-at "_\\(\\)\\(_B\\)")

But OTOH it is the one we want if it is the result of:

    (goto-char (point-min))
    (looking-at "_\\(?2:\\(?1:\\)_B\\)")

We can try and guess the inclusion relationship based on circumstantial
evidence (e.g. a "_\\(\\)\\(_B\\)" regexp is more likely than
"_\\(?2:\\(?1:\\)_B\\)"), but that would make the code of
`update_search_regs` tricky, with various heuristics.
And we'll never handle all cases right unless we make significant
changes to the match-data (and the regexp compiler) to keep track of
inclusion relationships.

Could you give us some information about the larger context in which you
bumped into this problem?


        Stefan






^ permalink raw reply	[flat|nested] 6+ messages in thread

* bug#74666: 31.0.50; Regression in replace-match with empty-adjacent groups
  2024-12-14 16:11 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-12-15  1:10   ` Campbell Barton
  2024-12-17  3:18     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 6+ messages in thread
From: Campbell Barton @ 2024-12-15  1:10 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 74666



On 24-12-15 3:11 AM, Stefan Monnier wrote:
>> (defun test-me (is-forward)
>>    (let ((result ""))
>>      (with-temp-buffer
>>        (insert "__B_\n")
>>        (save-match-data
>>          (set-match-data (list 2 4 2 2 2 4))
>>          (cond
>>           (is-forward
>>            (replace-match "HELLO" t t nil 1)
>>            (replace-match "WORLD" t t nil 2))
>>           (t
>>            (replace-match "WORLD" t t nil 2)
>>            (replace-match "HELLO" t t nil 1))))
>>        (setq result (buffer-substring-no-properties (point-min)
>>         (point-max))))
>>      result))
> [...]
>> In emacs 29.4 this prints:
>>
>> A: _HELLOWORLD_
>> B: _HELLOWORLD_
>>
>> In emacs 31.0.50 this prints:
>>
>> A: _WORLD_
>> B: _HELLOWORLD_
> 
> The problem is that the `set-match-data` doesn't give us any information
> about the intended inclusion relationship between the subgroups.
> 
> I agree that the behavior you see is not the one you want if it's the
> result of:
> 
>      (goto-char (point-min))
>      (looking-at "_\\(\\)\\(_B\\)")
> 
> But OTOH it is the one we want if it is the result of:
> 
>      (goto-char (point-min))
>      (looking-at "_\\(?2:\\(?1:\\)_B\\)")
> 
> We can try and guess the inclusion relationship based on circumstantial
> evidence (e.g. a "_\\(\\)\\(_B\\)" regexp is more likely than
> "_\\(?2:\\(?1:\\)_B\\)"), but that would make the code of
> `update_search_regs` tricky, with various heuristics.
> And we'll never handle all cases right unless we make significant
> changes to the match-data (and the regexp compiler) to keep track of
> inclusion relationships.
> 
> Could you give us some information about the larger context in which you
> bumped into this problem?

On the user side - I ran into this bug when decrementing numbers broke 
for me in the evil-numbers package [0]. Numbers would fail to become 
negative. Decrementing 0 would become 1.

In this case, the match data is set with `set-match-data' using 
calculated ranges.

Since this used to work I think it's reasonable to consider it a regression.

I've since committed a workaround to evil-numbers [1], although I'd 
suspect this would impact others.

[0]: https://melpa.org/#/evil-numbers
[1]: 
https://github.com/juliapath/evil-numbers/commit/f93258b706fa5cf9259e815c2d8258fcc6262804









^ permalink raw reply	[flat|nested] 6+ messages in thread

* bug#74666: 31.0.50; Regression in replace-match with empty-adjacent groups
  2024-12-15  1:10   ` Campbell Barton
@ 2024-12-17  3:18     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 6+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-12-17  3:18 UTC (permalink / raw)
  To: Campbell Barton; +Cc: 74666

> In this case, the match data is set with `set-match-data' using
> calculated ranges.

I guess we can take this as good thing: it means complexifying the
regexp code would be wasted.  🙂

> Since this used to work I think it's reasonable to consider it a regression.

I was not trying to say it's not a regression.  Just pouting because the
old behavior was just a lucky accident and recovering it without losing
the other improvement isn't completely straightforward.

> I've since committed a workaround to evil-numbers [1], although I'd suspect
> this would impact others.

[ And the workaround relies on another lucky accident: the "easiest"
  fix for the problem would break your workaround. 🙁  ]


        Stefan






^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-12-17  3:18 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-12-03 10:56 bug#74666: 31.0.50; Regression in replace-match with empty-adjacent groups Campbell Barton
2024-12-03 14:05 ` Eli Zaretskii
2024-12-14  9:43   ` Eli Zaretskii
2024-12-14 16:11 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-12-15  1:10   ` Campbell Barton
2024-12-17  3:18     ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors

Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.