From: Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Newsgroups: gmane.emacs.bugs
Subject: bug#74666: 31.0.50; Regression in replace-match with empty-adjacent groups
Date: Sat, 14 Dec 2024 11:11:36 -0500
Reply-To: Stefan Monnier
To: Campbell Barton
Cc: 74666@debbugs.gnu.org
In-Reply-To: <5aad7547-5fd7-4eba-a6eb-38b1b4753dd8@gmail.com> (Campbell Barton's message of "Tue, 3 Dec 2024 21:56:02 +1100")

> (defun test-me (is-forward)
>   (let ((result ""))
>     (with-temp-buffer
>       (insert "__B_\n")
>       (save-match-data
>         (set-match-data (list 2 4 2 2 2 4))
>         (cond
>          (is-forward
>           (replace-match "HELLO" t t nil 1)
>           (replace-match "WORLD" t t nil 2))
>          (t
>           (replace-match "WORLD" t t nil 2)
>           (replace-match "HELLO" t t nil 1))))
>       (setq result (buffer-substring-no-properties (point-min)
>                                                    (point-max))))
>     result))
[...]
> In emacs 29.4 this prints:
>
> A: _HELLOWORLD_
> B: _HELLOWORLD_
>
> In emacs 31.0.50 this prints:
>
> A: _WORLD_
> B: _HELLOWORLD_

The problem is that the `set-match-data` doesn't give us any information
about the intended inclusion relationship between the subgroups.
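
To make the ambiguity concrete, here is a minimal sketch, not taken from
the Emacs sources.  `my-match-positions` is a hypothetical helper
introduced only for this illustration.  It matches a regexp against the
same "__B_\n" text as the test case and returns the match data as plain
integers.  A pattern with sibling groups and a pattern with nested groups
give the same list, so positions alone cannot tell the two shapes apart.
The whole match starts at 1 here rather than at 2 as in the
`set-match-data` call above, but the positions of groups 1 and 2 are the
same as in the report.

```elisp
;; Hypothetical helper, for illustration only (not part of Emacs).
(defun my-match-positions (regexp)
  "Return the integer match data after matching REGEXP against \"__B_\\n\"."
  (with-temp-buffer
    (insert "__B_\n")
    (goto-char (point-min))
    (save-match-data
      (and (looking-at regexp)
           ;; A non-nil argument asks for integers rather than markers.
           (match-data t)))))

;; Sibling groups: the empty group 1 sits just before group 2.
(my-match-positions "_\\(\\)\\(_B\\)")
;; => (1 4 2 2 2 4)

;; Nested groups: the empty group 1 sits at the start of group 2.
(my-match-positions "_\\(?2:\\(?1:\\)_B\\)")
;; => (1 4 2 2 2 4)
```
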
I agree that the behavior you see is not the one you want if it's the
result of:

    (goto-char (point-min))
    (looking-at "_\\(\\)\\(_B\\)")

But OTOH it is the one we want if it is the result of:

    (goto-char (point-min))
    (looking-at "_\\(?2:\\(?1:\\)_B\\)")

We can try and guess the inclusion relationship based on circumstantial
evidence (e.g. a "_\\(\\)\\(_B\\)" regexp is more likely than
"_\\(?2:\\(?1:\\)_B\\)"), but that would make the code of
`update_search_regs` tricky, with various heuristics.  And we'll never
handle all cases right unless we make significant changes to the
match-data (and the regexp compiler) to keep track of
inclusion relationships.

Could you give us some information about the larger context in which you
bumped into this problem?


        Stefan