From: Campbell Barton <ideasman42@gmail.com>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: 74666@debbugs.gnu.org
Subject: bug#74666: 31.0.50; Regression in replace-match with empty-adjacent groups
Date: Sun, 15 Dec 2024 12:10:26 +1100
Message-ID: <8340bf95-c716-4ebe-994e-07554b6e2165@gmail.com>
In-Reply-To: <jwvseqqwbc6.fsf-monnier+emacs@gnu.org>
On 24-12-15 3:11 AM, Stefan Monnier wrote:
>> (defun test-me (is-forward)
>>   (let ((result ""))
>>     (with-temp-buffer
>>       (insert "__B_\n")
>>       (save-match-data
>>         (set-match-data (list 2 4 2 2 2 4))
>>         (cond
>>          (is-forward
>>           (replace-match "HELLO" t t nil 1)
>>           (replace-match "WORLD" t t nil 2))
>>          (t
>>           (replace-match "WORLD" t t nil 2)
>>           (replace-match "HELLO" t t nil 1))))
>>       (setq result (buffer-substring-no-properties (point-min)
>>                                                    (point-max))))
>>     result))
> [...]
>> In emacs 29.4 this prints:
>>
>> A: _HELLOWORLD_
>> B: _HELLOWORLD_
>>
>> In emacs 31.0.50 this prints:
>>
>> A: _WORLD_
>> B: _HELLOWORLD_
>
> The problem is that the `set-match-data` doesn't give us any information
> about the intended inclusion relationship between the subgroups.
>
> I agree that the behavior you see is not the one you want if it's the
> result of:
>
> (goto-char (point-min))
> (looking-at "_\\(\\)\\(_B\\)")
>
> But OTOH it is the one we want if it is the result of:
>
> (goto-char (point-min))
> (looking-at "_\\(?2:\\(?1:\\)_B\\)")
>
> We can try and guess the inclusion relationship based on circumstantial
> evidence (e.g. a "_\\(\\)\\(_B\\)" regexp is more likely than
> "_\\(?2:\\(?1:\\)_B\\)"), but that would make the code of
> `update_search_regs` tricky, with various heuristics.
> And we'll never handle all cases right unless we make significant
> changes to the match-data (and the regexp compiler) to keep track of
> inclusion relationships.
>
> Could you give us some information about the larger context in which you
> bumped into this problem?
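
To make the ambiguity described in the quoted text concrete, here is a minimal sketch (not code from the thread) that matches both quoted regexps against the same "__B_\n" buffer and compares the recorded group positions. The helper `my-group-positions' and the two constants are hypothetical names introduced only for this illustration.

```elisp
;; Sketch only: `my-group-positions' is a hypothetical helper, not part of Emacs.
(defun my-group-positions (regexp)
  "Match REGEXP at the start of a temporary \"__B_\" buffer.
Return a list of (BEG . END) pairs for groups 0, 1 and 2, or nil
if REGEXP does not match."
  (with-temp-buffer
    (insert "__B_\n")
    (goto-char (point-min))
    (save-match-data
      (when (looking-at regexp)
        ;; Collect plain integer positions for the whole match and both groups.
        (mapcar (lambda (n) (cons (match-beginning n) (match-end n)))
                '(0 1 2))))))

;; Group 1 and group 2 are siblings.
(defconst my-sibling-groups (my-group-positions "_\\(\\)\\(_B\\)"))
;; Group 1 is nested inside group 2.
(defconst my-nested-groups (my-group-positions "_\\(?2:\\(?1:\\)_B\\)"))

;; Both evaluate to ((1 . 4) (2 . 2) (2 . 4)), so the positions alone
;; do not reveal whether group 1 lies inside group 2 or next to it.
(equal my-sibling-groups my-nested-groups)
```

Note that the whole-match range here is 1..4, whereas the `set-match-data' call in the test case uses 2..4; the subgroup positions (2 2) and (2 4) are the same in both.
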

On the user side: I ran into this bug when decrementing numbers broke
for me in the evil-numbers package [0]. Numbers would fail to become
negative; decrementing 0 would yield 1.

In this case, the match data is set with `set-match-data' using
calculated ranges.

Since this used to work, I think it's reasonable to consider it a
regression.

I've since committed a workaround to evil-numbers [1], although I
suspect this will affect others.

[0]: https://melpa.org/#/evil-numbers
[1]: https://github.com/juliapath/evil-numbers/commit/f93258b706fa5cf9259e815c2d8258fcc6262804