unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#74666: 31.0.50; Regression in replace-match with empty-adjacent groups
@ 2024-12-03 10:56 Campbell Barton
  2024-12-03 14:05 ` Eli Zaretskii
  2024-12-14 16:11 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 2 replies; 4+ messages in thread
From: Campbell Barton @ 2024-12-03 10:56 UTC (permalink / raw)
  To: 74666


Run the following file as an executable script:

emacs -Q --batch -l ./emacs-test.el

;; BEGIN SCRIPT:
#!/usr/bin/env -S emacs --batch -l
;;; emacs-git-log --- Batch check emacs-lisp. -*- lexical-binding: t -*-

(defun printf (&rest args)
   (princ (apply #'format args) #'external-debugging-output))

(defun test-me (is-forward)
   (let ((result ""))
     (with-temp-buffer
       (insert "__B_\n")
       (save-match-data
         (set-match-data (list 2 4 2 2 2 4))
         (cond
          (is-forward
           (replace-match "HELLO" t t nil 1)
           (replace-match "WORLD" t t nil 2))
          (t
           (replace-match "WORLD" t t nil 2)
           (replace-match "HELLO" t t nil 1))))
       (setq result (buffer-substring-no-properties (point-min) 
(point-max))))
     result))

(printf "A: %s" (test-me t))
(printf "B: %s" (test-me nil))
;; END SCRIPT:

In emacs 29.4 this prints:

A: _HELLOWORLD_
B: _HELLOWORLD_

In emacs 31.0.50 this prints:

A: _WORLD_
B: _HELLOWORLD_

This is a regression in 47b497b4dac91e5ea56102018223bdeb5e21a93b.

In GNU Emacs 31.0.50 (build 1, x86_64-pc-linux-gnu, GTK+ Version
  3.24.43, cairo version 1.18.2) of 2024-12-03 built on
  campbell-trx40aorusmaster
Repository revision: 04d035acd72823e82dd61c4894f9c09113e65dd6
Repository branch: master
System Description: Arch Linux

Configured using:
  'configure --prefix=/opt/emacs --with-sound=no --disable-acl
  --with-file-notification=no --with-xpm --with-native-compilation
  --with-jpeg --with-tiff --with-gif --with-png
  --without-compress-install --without-rsvg --without-lcms2
  --without-libsystemd --without-gsettings --without-harfbuzz
  --without-m17n-flt --without-libotf --without-gpm --without-dbus
  --without-gsettings --without-gconf --without-selinux --without-gnutls
  --without-makeinfo --without-libgmp --with-zlib --with-threads
  --with-x-toolkit=gtk3 --with-xft --with-pgtk --with-tree-sitter
  --with-modules 'CFLAGS=-O3 -mtune=native -march=native -pipe
  -fomit-frame-pointer''

Configured features:
CAIRO FREETYPE GIF GLIB JPEG LIBXML2 MODULES NATIVE_COMP PDUMPER PGTK
PNG SECCOMP SQLITE3 THREADS TIFF TOOLKIT_SCROLL_BARS TREE_SITTER WEBP
XIM GTK3 ZLIB

Important settings:
   value of $LC_MONETARY: en_US.UTF-8
   value of $LC_NUMERIC: en_US.UTF-8
   value of $LC_TIME: en_US.UTF-8
   value of $LANG: en_US.UTF-8
   value of $XMODIFIERS: @im=ibus
   locale-coding-system: utf-8-unix

Major mode: Lisp Interaction

Minor modes in effect:
   tooltip-mode: t
   global-eldoc-mode: t
   eldoc-mode: t
   show-paren-mode: t
   electric-indent-mode: t
   mouse-wheel-mode: t
   tool-bar-mode: t
   menu-bar-mode: t
   file-name-shadow-mode: t
   global-font-lock-mode: t
   font-lock-mode: t
   blink-cursor-mode: t
   minibuffer-regexp-mode: t
   line-number-mode: t
   indent-tabs-mode: t
   transient-mark-mode: t
   auto-composition-mode: t
   auto-encryption-mode: t
   auto-compression-mode: t

Load-path shadows:
None found.

Features:
(shadow sort mail-extr emacsbug message mailcap yank-media puny dired
dired-loaddefs rfc822 mml mml-sec password-cache epa derived epg rfc6068
epg-config gnus-util text-property-search time-date subr-x mm-decode
mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader
cl-loaddefs cl-lib sendmail rfc2047 rfc2045 ietf-drums mm-util
mail-prsvr mail-utils rmc iso-transl tooltip cconv eldoc paren electric
uniquify ediff-hook vc-hooks lisp-float-type elisp-mode mwheel
term/pgtk-win pgtk-win term/common-win touch-screen pgtk-dnd tool-bar
dnd fontset image regexp-opt fringe tabulated-list replace newcomment
text-mode lisp-mode prog-mode register page tab-bar menu-bar rfn-eshadow
isearch easymenu timer select scroll-bar mouse jit-lock font-lock syntax
font-core term/tty-colors frame minibuffer nadvice seq simple cl-generic
indonesian philippine cham georgian utf-8-lang misc-lang vietnamese
tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek
romanian slovak czech european ethiopic indian cyrillic chinese
composite emoji-zwj charscript charprop case-table epa-hook
jka-cmpr-hook help abbrev obarray oclosure cl-preloaded button loaddefs
theme-loaddefs faces cus-face macroexp files window text-properties
overlay sha1 md5 base64 format env code-pages mule custom widget keymap
hashtable-print-readable backquote threads dynamic-setting
font-render-setting cairo gtk pgtk multi-tty move-toolbar
make-network-process native-compile emacs)

Memory information:
((conses 16 53518 10389) (symbols 48 5294 0) (strings 32 18752 1367)
  (string-bytes 1 607905) (vectors 16 9151)
  (vector-slots 8 128466 8940) (floats 8 22 2) (intervals 56 268 3)
  (buffers 992 10))





^ permalink raw reply	[flat|nested] 4+ messages in thread

* bug#74666: 31.0.50; Regression in replace-match with empty-adjacent groups
  2024-12-03 10:56 bug#74666: 31.0.50; Regression in replace-match with empty-adjacent groups Campbell Barton
@ 2024-12-03 14:05 ` Eli Zaretskii
  2024-12-14  9:43   ` Eli Zaretskii
  2024-12-14 16:11 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 4+ messages in thread
From: Eli Zaretskii @ 2024-12-03 14:05 UTC (permalink / raw)
  To: Campbell Barton, Stefan Monnier; +Cc: 74666

> Date: Tue, 3 Dec 2024 21:56:02 +1100
> From: Campbell Barton <ideasman42@gmail.com>
> 
> 
> Run the following file as an executable script:
> 
> emacs -Q --batch -l ./emacs-test.el
> 
> ;; BEGIN SCRIPT:
> #!/usr/bin/env -S emacs --batch -l
> ;;; emacs-git-log --- Batch check emacs-lisp. -*- lexical-binding: t -*-
> 
> (defun printf (&rest args)
>    (princ (apply #'format args) #'external-debugging-output))
> 
> (defun test-me (is-forward)
>    (let ((result ""))
>      (with-temp-buffer
>        (insert "__B_\n")
>        (save-match-data
>          (set-match-data (list 2 4 2 2 2 4))
>          (cond
>           (is-forward
>            (replace-match "HELLO" t t nil 1)
>            (replace-match "WORLD" t t nil 2))
>           (t
>            (replace-match "WORLD" t t nil 2)
>            (replace-match "HELLO" t t nil 1))))
>        (setq result (buffer-substring-no-properties (point-min) 
> (point-max))))
>      result))
> 
> (printf "A: %s" (test-me t))
> (printf "B: %s" (test-me nil))
> ;; END SCRIPT:
> 
> In emacs 29.4 this prints:
> 
> A: _HELLOWORLD_
> B: _HELLOWORLD_
> 
> In emacs 31.0.50 this prints:
> 
> A: _WORLD_
> B: _HELLOWORLD_
> 
> This is a regression in 47b497b4dac91e5ea56102018223bdeb5e21a93b.

Stefan, could you please take a look?





^ permalink raw reply	[flat|nested] 4+ messages in thread

* bug#74666: 31.0.50; Regression in replace-match with empty-adjacent groups
  2024-12-03 14:05 ` Eli Zaretskii
@ 2024-12-14  9:43   ` Eli Zaretskii
  0 siblings, 0 replies; 4+ messages in thread
From: Eli Zaretskii @ 2024-12-14  9:43 UTC (permalink / raw)
  To: monnier; +Cc: 74666, ideasman42

Ping!

> Cc: 74666@debbugs.gnu.org
> Date: Tue, 03 Dec 2024 16:05:40 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> 
> > Date: Tue, 3 Dec 2024 21:56:02 +1100
> > From: Campbell Barton <ideasman42@gmail.com>
> > 
> > 
> > Run the following file as an executable script:
> > 
> > emacs -Q --batch -l ./emacs-test.el
> > 
> > ;; BEGIN SCRIPT:
> > #!/usr/bin/env -S emacs --batch -l
> > ;;; emacs-git-log --- Batch check emacs-lisp. -*- lexical-binding: t -*-
> > 
> > (defun printf (&rest args)
> >    (princ (apply #'format args) #'external-debugging-output))
> > 
> > (defun test-me (is-forward)
> >    (let ((result ""))
> >      (with-temp-buffer
> >        (insert "__B_\n")
> >        (save-match-data
> >          (set-match-data (list 2 4 2 2 2 4))
> >          (cond
> >           (is-forward
> >            (replace-match "HELLO" t t nil 1)
> >            (replace-match "WORLD" t t nil 2))
> >           (t
> >            (replace-match "WORLD" t t nil 2)
> >            (replace-match "HELLO" t t nil 1))))
> >        (setq result (buffer-substring-no-properties (point-min) 
> > (point-max))))
> >      result))
> > 
> > (printf "A: %s" (test-me t))
> > (printf "B: %s" (test-me nil))
> > ;; END SCRIPT:
> > 
> > In emacs 29.4 this prints:
> > 
> > A: _HELLOWORLD_
> > B: _HELLOWORLD_
> > 
> > In emacs 31.0.50 this prints:
> > 
> > A: _WORLD_
> > B: _HELLOWORLD_
> > 
> > This is a regression in 47b497b4dac91e5ea56102018223bdeb5e21a93b.
> 
> Stefan, could you please take a look?
> 
> 
> 
> 





^ permalink raw reply	[flat|nested] 4+ messages in thread

* bug#74666: 31.0.50; Regression in replace-match with empty-adjacent groups
  2024-12-03 10:56 bug#74666: 31.0.50; Regression in replace-match with empty-adjacent groups Campbell Barton
  2024-12-03 14:05 ` Eli Zaretskii
@ 2024-12-14 16:11 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 0 replies; 4+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-12-14 16:11 UTC (permalink / raw)
  To: Campbell Barton; +Cc: 74666

> (defun test-me (is-forward)
>   (let ((result ""))
>     (with-temp-buffer
>       (insert "__B_\n")
>       (save-match-data
>         (set-match-data (list 2 4 2 2 2 4))
>         (cond
>          (is-forward
>           (replace-match "HELLO" t t nil 1)
>           (replace-match "WORLD" t t nil 2))
>          (t
>           (replace-match "WORLD" t t nil 2)
>           (replace-match "HELLO" t t nil 1))))
>       (setq result (buffer-substring-no-properties (point-min)
>        (point-max))))
>     result))
[...]
> In emacs 29.4 this prints:
>
> A: _HELLOWORLD_
> B: _HELLOWORLD_
>
> In emacs 31.0.50 this prints:
>
> A: _WORLD_
> B: _HELLOWORLD_

The problem is that the `set-match-data` doesn't give us any information
about the intended inclusion relationship between the subgroups.

I agree that the behavior you see is not the one you want if it's the
result of:

    (goto-char (point-min))
    (looking-at "_\\(\\)\\(_B\\)")

But OTOH it is the one we want if it is the result of:

    (goto-char (point-min))
    (looking-at "_\\(?2:\\(?1:\\)_B\\)")

We can try and guess the inclusion relationship based on circumstantial
evidence (e.g. a "_\\(\\)\\(_B\\)" regexp is more likely than
"_\\(?2:\\(?1:\\)_B\\)"), but that would make the code of
`update_search_regs` tricky, with various heuristics.
And we'll never handle all cases right unless we make significant
changes to the match-data (and the regexp compiler) to keep track of
inclusion relationships.

Could you give us some information about the larger context in which you
bumped into this problem?


        Stefan






^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2024-12-14 16:11 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-12-03 10:56 bug#74666: 31.0.50; Regression in replace-match with empty-adjacent groups Campbell Barton
2024-12-03 14:05 ` Eli Zaretskii
2024-12-14  9:43   ` Eli Zaretskii
2024-12-14 16:11 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).