From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mattias Engdegård
Newsgroups: gmane.emacs.bugs
Subject: bug#33205: 26.1; unibyte/multibyte missing in rx.el
Date: Mon, 19 Nov 2018 21:07:39 +0100
Message-ID: <5203F729-8090-4453-80CC-1249DB064631@acm.org>
References: <83pnvrjqec.fsf@gnu.org> <160755c702f9b4dfc80be8b5664eb3919804bb84.camel@acm.org> <83wopyi00z.fsf@gnu.org> <83pnvjcvwc.fsf@gnu.org> <834lcsd7qu.fsf@gnu.org>
In-Reply-To: <834lcsd7qu.fsf@gnu.org>
To: Eli Zaretskii
Cc: 33205@debbugs.gnu.org
Xref: news.gmane.org gmane.emacs.bugs:152543
Mime-Version: 1.0 (Mac OS X Mail 11.5 \(3445.9.1\))
Content-Type: multipart/mixed; boundary="Apple-Mail=_359A902C-A630-45A7-8607-01EF37A86B92"

--Apple-Mail=_359A902C-A630-45A7-8607-01EF37A86B92
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=us-ascii

I tried using rx to match raw bytes. (rx (any (?\200 . ?\377))) doesn't
work, since that is translated to the corresponding Unicode range;
(any (#x3fff80 . #x3fffff)) must be used instead. Maybe that is evident,
or would it merit a mention in the doc string?

The alternative formulation (rx (any "\200-\377")) doesn't work either,
and this seems to be a bug. Looking at rx-check-any-string, a second bug
is revealed: the code uses the regex ".-." to pick out ranges, which
means that \n cannot be a range endpoint.

Perhaps you want me to open a new bug for the above? I'm attaching a
patch all the same, but you may prefer doing it differently.

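The two observations above can be reproduced outside rx.el. The following is only a minimal sketch for evaluation in *scratch*, not part of the patch; `my-rx-demo-raw-byte' is a hypothetical helper name introduced here for illustration, and it assumes the raw-byte character range #x3fff80..#x3fffff mentioned above.

```elisp
;; Illustrative sketch only; nothing here is taken from rx.el itself.

;; 1. In Emacs regexps `.' never matches a newline, so ".-." cannot
;;    recognise a range that has \n as an endpoint.
(string-match-p ".-." "a-z")      ; matches at position 0
(string-match-p ".-." "\n-\r")    ; no match, returns nil

;; 2. "\200-\377" is a unibyte string, so `aref' yields plain bytes,
;;    which are numerically equal to the code points U+0080..U+00FF.
(multibyte-string-p "\200-\377")  ; nil
(aref "\200-\377" 0)              ; #x80

;; Hypothetical helper: map a byte taken from a unibyte string to the
;; corresponding raw-byte character; ASCII values pass through unchanged.
(defun my-rx-demo-raw-byte (c)
  "Return the raw-byte character for byte C, or C itself if C is ASCII."
  (if (<= #x80 c #xff)
      (+ c #x3fff00)
    c))

(my-rx-demo-raw-byte #xff)        ; #x3fffff
```

For bytes in #x80..#xff the helper should agree with the built-in `unibyte-char-to-multibyte', which could be used instead of the explicit offset.
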
--Apple-Mail=_359A902C-A630-45A7-8607-01EF37A86B92
Content-Disposition: attachment; filename=rx-any-raw-bytes.patch
Content-Type: application/octet-stream; x-unix-mode=0644; name="rx-any-raw-bytes.patch"
Content-Transfer-Encoding: 8bit

 lisp/emacs-lisp/rx.el            | 49 ++++++++++++++++++++++------------------
 test/lisp/emacs-lisp/rx-tests.el | 22 ++++++++++++++++++
 2 files changed, 49 insertions(+), 22 deletions(-)

diff --git a/lisp/emacs-lisp/rx.el b/lisp/emacs-lisp/rx.el
index 1230df4f15..04069e8d50 100644
--- a/lisp/emacs-lisp/rx.el
+++ b/lisp/emacs-lisp/rx.el
@@ -449,28 +449,33 @@ Only both edges of each range is checked."
 
 
 (defun rx-check-any-string (str)
-  "Check string argument STR for Rx `any'."
-  (let ((i 0)
-	c1 c2 l)
-    (if (= 0 (length str))
-	(error "String arg for Rx `any' must not be empty"))
-    (while (string-match ".-." str i)
-      ;; string before range: convert it to characters
-      (if (< i (match-beginning 0))
-	  (setq l (nconc
-		   l
-		   (append (substring str i (match-beginning 0)) nil))))
-      ;; range
-      (setq i (match-end 0)
-	    c1 (aref str (match-beginning 0))
-	    c2 (aref str (1- i)))
-      (cond
-       ((< c1 c2) (setq l (nconc l (list (cons c1 c2)))))
-       ((= c1 c2) (setq l (nconc l (list c1))))))
-    ;; rest?
-    (if (< i (length str))
-	(setq l (nconc l (append (substring str i) nil))))
-    l))
+  "Turn a string argument to `any' into a list of characters and, representing
+ranges, dotted pairs of characters. The original order is not preserved."
+  (let ((decode-char
+         ;; Make sure raw bytes are decoded as such, to avoid confusion with
+         ;; U+0080..U+00FF.
+         (if (multibyte-string-p str)
+             #'identity
+           (lambda (c) (if (and (>= c #x80) (<= c #xff))
+                           (+ c #x3fff00)
+                         c))))
+        (len (length str))
+        (i 0)
+        (ret nil))
+    (while (< i len)
+      (cond ((and (< i (- len 2))
+                  (= (aref str (+ i 1)) ?-))
+             ;; Range.
+             (let ((start (funcall decode-char (aref str i)))
+                   (end   (funcall decode-char (aref str (+ i 2)))))
+               (cond ((< start end) (push (cons start end) ret))
+                     ((= start end) (push start ret)))
+               (setq i (+ i 3))))
+            (t
+             ;; Single character.
+             (push (funcall decode-char (aref str i)) ret)
+             (setq i (+ i 1)))))
+    ret))
 
 
 (defun rx-check-any (arg)
diff --git a/test/lisp/emacs-lisp/rx-tests.el b/test/lisp/emacs-lisp/rx-tests.el
index d15e3d7719..fb268c58f9 100644
--- a/test/lisp/emacs-lisp/rx-tests.el
+++ b/test/lisp/emacs-lisp/rx-tests.el
@@ -33,6 +33,28 @@
                                   (number-sequence ?< ?\])
                                   (number-sequence ?- ?:))))))
 
+(ert-deftest rx-char-any-range-nl ()
+  "Character alternatives with \n as a range endpoint."
+  (should (equal (rx (any "\n-\r"))
+                 "[\n-\r]"))
+  (should (equal (rx (any "\a-\n"))
+                 "[\a-\n]")))
+
+(ert-deftest rx-char-any-raw-byte ()
+  "Raw bytes in character alternatives."
+  ;; Separate raw characters.
+  (should (equal (string-match-p (rx (any "\326A\333B"))
+                                 "X\326\333")
+                 1))
+  ;; Range of raw characters, unibyte.
+  (should (equal (string-match-p (rx (any "\200-\377"))
+                                 "ÿA\310B")
+                 2))
+  ;; Range of raw characters, multibyte.
+  (should (equal (string-match-p (rx (any "Å\211\326-\377\177"))
+                                 "XY\355\177\327")
+                 2)))
+
 (ert-deftest rx-pcase ()
   (should (equal (pcase "a 1 2 3 1 1 b"
                    ((rx (let u (+ digit)) space
--Apple-Mail=_359A902C-A630-45A7-8607-01EF37A86B92--