From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?=
Newsgroups: gmane.emacs.bugs
Subject: bug#44861: 27.1; [PATCH] signal in `replace-regexp-in-string'
Date: Wed, 25 Nov 2020 15:58:22 +0100
Message-ID: <97535AF5-D542-4267-A5A9-1483C32A61AC@acm.org>
In-Reply-To: <6F768DED-2E1B-4D06-A776-FFA162AC32AD@acm.org>
References: <6F768DED-2E1B-4D06-A776-FFA162AC32AD@acm.org>
To: Shigeru Fukaya
Cc: 44861@debbugs.gnu.org
Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.17\))
Content-Type: multipart/mixed;
	boundary="Apple-Mail=_00941A56-F79B-4BEA-9413-E45E955F9583"

--Apple-Mail=_00941A56-F79B-4BEA-9413-E45E955F9583
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset=us-ascii

forcemerge 15107 44861
stop

Suggested patch attached. A small test suite for
replace-regexp-in-string has already been pushed to master -- very
rudimentary, but better than nothing -- and the patch extends it with
some relevant new cases that didn't work before.

It is basically your patch but slightly optimised; it turned out that
the function call and allocation overhead of the original patch made it
a tad too expensive (a pity, because it was very neat). Now performance
is about the same as before when the pattern contains no submatches, and
slightly worse (< 10% slower) with one submatch. That seems a fair
price for correctness.

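To illustrate why re-matching on the substring goes wrong, here is a
minimal sketch. It is not the attached patch: `demo-rematch' and
`demo-shift' are made-up names, and the sketch uses `mapcar' for
brevity where the patch mutates a reused list in place.

```elisp
;;; -*- lexical-binding: t -*-
;; Illustration only; neither function exists in subr.el.

(defun demo-rematch (regexp string start)
  "Match REGEXP in STRING from START, then re-match on the substring.
Return the bounds of group 0 as seen by the second match, or nil."
  (save-match-data
    (when (string-match regexp string start)
      (let ((str (substring string (match-beginning 0) (match-end 0))))
        ;; Old approach: trust a second `string-match' on STR.
        (when (string-match regexp str)
          (list (match-beginning 0) (match-end 0)))))))

(defun demo-shift (regexp string start)
  "Match REGEXP in STRING from START, then shift the match data.
Return the bounds of group 0 relative to the matched substring."
  (save-match-data
    (when (string-match regexp string start)
      (let ((mb (match-beginning 0))
            (md (match-data t)))
        ;; New approach: subtract MB from every non-nil position.
        (set-match-data
         (mapcar (lambda (pos) (and pos (- pos mb))) md))
        (list (match-beginning 0) (match-end 0))))))

(demo-rematch "\\`\\|x" "--xx--" 1) ; (0 0), the empty match of \`
(demo-shift   "\\`\\|x" "--xx--" 1) ; (0 1), the whole "x"
(demo-rematch "a\\B" "a aaaa" 0)    ; nil, \B fails at end of "a"
(demo-shift   "a\\B" "a aaaa" 0)    ; (0 1)
```

The first pair corresponds to the "\\`\\|x" test case in the patch and
the second to the "a\\B" case, where the second match fails outright
because the context following the match is gone.
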
--Apple-Mail=_00941A56-F79B-4BEA-9413-E45E955F9583
Content-Disposition: attachment;
	filename=0001-Fix-replace-regexp-in-string-substring-match-data-tr.patch
Content-Type: application/octet-stream; x-unix-mode=0644;
	name="0001-Fix-replace-regexp-in-string-substring-match-data-tr.patch"
Content-Transfer-Encoding: 7bit

From 9bc8dc80be5cee517fa53e6b8f37881d4220f162 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?=
Date: Wed, 25 Nov 2020 15:32:08 +0100
Subject: [PATCH] Fix replace-regexp-in-string substring match data translation

For certain patterns, re-matching the same regexp on the matched
substring does not produce correctly translated match data
(bug#15107 and bug#44861).

Reported by Kevin Ryde and Shigeru Fukaya.

* lisp/subr.el (replace-regexp-in-string): Translate the match data
by explicit manipulation instead of trusting a call to string-match on
the matched string to do the job.
* test/lisp/subr-tests.el (subr-replace-regexp-in-string):
Add test cases.
---
 lisp/subr.el            | 17 ++++++++++++-----
 test/lisp/subr-tests.el |  6 +++++-
 2 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/lisp/subr.el b/lisp/subr.el
index 1fb0f9ab7e..0ee2199933 100644
--- a/lisp/subr.el
+++ b/lisp/subr.el
@@ -4537,7 +4537,7 @@ replace-regexp-in-string
   ;; might be reasonable to do so for long enough STRING.]
   (let ((l (length string))
 	(start (or start 0))
-	matches str mb me)
+	matches str mb me md)
     (save-match-data
       (while (and (< start l) (string-match regexp string start))
 	(setq mb (match-beginning 0)
@@ -4546,10 +4546,17 @@ replace-regexp-in-string
 	(when (= me mb) (setq me (min l (1+ mb))))
 	;; Generate a replacement for the matched substring.
 	;; Operate on only the substring to minimize string consing.
-	;; Set up match data for the substring for replacement;
-	;; presumably this is likely to be faster than munging the
-	;; match data directly in Lisp.
-	(string-match regexp (setq str (substring string mb me)))
+
+        ;; Translate the match data so that it applies to the matched substring.
+        (setq md (match-data nil md t))  ; Reuse list from previous match.
+        (let ((m md))
+          (while m
+            (when (car m)
+              (setcar m (- (car m) mb)))
+            (setq m (cdr m)))
+          (set-match-data md))
+
+        (setq str (substring string mb me))
 	(setq matches
 	      (cons (replace-match (if (stringp rep)
 				       rep
diff --git a/test/lisp/subr-tests.el b/test/lisp/subr-tests.el
index c77be511dc..67f7fc9749 100644
--- a/test/lisp/subr-tests.el
+++ b/test/lisp/subr-tests.el
@@ -545,7 +545,11 @@ subr-replace-regexp-in-string
                             (match-beginning 1) (match-end 1)))
                   "babbcaacabc")
                  "ba"))
-  )
+  ;; anchors (bug#15107, bug#44861)
+  (should (equal (replace-regexp-in-string "a\\B" "b" "a aaaa")
+                 "a bbba"))
+  (should (equal (replace-regexp-in-string "\\`\\|x" "z" "--xx--")
+                 "z--zz--")))
 
 (provide 'subr-tests)
 ;;; subr-tests.el ends here
-- 
2.21.1 (Apple Git-122.3)

--Apple-Mail=_00941A56-F79B-4BEA-9413-E45E955F9583--