From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: :alnum: broken?
Date: Wed, 26 Feb 2020 10:48:36 -0500
To: Mattias Engdegård
Cc: Paul Eggert, Stephen Leake, emacs-devel
In-Reply-To: <03A37C4B-9FE8-4A25-9851-79BC8265455E@acm.org>
References: <86wo8flqct.fsf@stephe-leake.org> <86sgj3ljf0.fsf@stephe-leake.org> <5fecc0e1-1ee2-5a89-9297-b0b9aa4a8e9c@cs.ucla.edu> <03A37C4B-9FE8-4A25-9851-79BC8265455E@acm.org>

> Patch attached. It was written for master, but I would suggest it go in emacs-27.

FWIW, you have a +1 from me, tho I don't see any urgency so I'd keep it
for `master`.

Stefan


Mattias Engdegård [2020-02-26 15:10:46] wrote:

> I just made this very mistake while adding a new regexp-error checking
> feature to xr. Needless to say I now am strongly in favour of turning it
> into a hard error.
>
>
> The error message could be improved. For the benefit of
> isearch-forward-regexp, it's probably a good idea if it doesn't start or end
> in a square bracket.
>
> From 014a7a7dce5ae23b8a47dd68eaaef0a5cb985b46 Mon Sep 17 00:00:00 2001
> From: Mattias Engdegård
> Date: Wed, 26 Feb 2020 14:46:01 +0100
> Subject: [PATCH] Signal an error for the regexp "[:alnum:]"
>
> Omitting the extra brackets is a common mistake; see discussion at
> https://lists.gnu.org/archive/html/emacs-devel/2020-02/msg00215.html
>
> * src/regex-emacs.c (reg_errcode_t, re_error_msgid): Add REG_ECLASSBR.
> (regex_compile): Check for the mistake.
> * test/src/regex-emacs-tests.el (regexp-invalid): Test.
> * etc/NEWS: Announce.
> ---
>  etc/NEWS                      |  5 +++++
>  src/regex-emacs.c             | 21 ++++++++++++++++++++-
>  test/src/regex-emacs-tests.el |  4 ++++
>  3 files changed, 29 insertions(+), 1 deletion(-)
>
> diff --git a/etc/NEWS b/etc/NEWS
> index 54aab1a5b6..404b4b9ebd 100644
> --- a/etc/NEWS
> +++ b/etc/NEWS
> @@ -190,6 +190,11 @@ Emacs now supports bignums so this old glitch is no longer needed.
>  'previous-system-time-locale' have been removed, as they were created
>  by mistake and were not useful to Lisp code.
>  
> +** The regexp mistake '[:digit:]' is now an error.
> +The correct syntax is '[[:digit:]]'.  Previously, forgetting the extra
> +brackets silently resulted in a regexp that did not at all work as
> +intended.
> +
>  
>  * Lisp Changes in Emacs 28.1
>  
> diff --git a/src/regex-emacs.c b/src/regex-emacs.c
> index 694431c95e..2648e1d6ae 100644
> --- a/src/regex-emacs.c
> +++ b/src/regex-emacs.c
> @@ -818,7 +818,8 @@ print_double_string (re_char *where, re_char *string1, ptrdiff_t size1,
>    REG_ESIZE,            /* Compiled pattern bigger than 2^16 bytes.  */
>    REG_ERPAREN,          /* Unmatched ) or \); not returned from regcomp.  */
>    REG_ERANGEX,          /* Range striding over charsets.  */
> -  REG_ESIZEBR           /* n or m too big in \{n,m\} */
> +  REG_ESIZEBR,          /* n or m too big in \{n,m\} */
> +  REG_ECLASSBR,         /* Missing [] around [:class:].  */
>  } reg_errcode_t;
>  
>  static const char *re_error_msgid[] =
> @@ -842,6 +843,7 @@ print_double_string (re_char *where, re_char *string1, ptrdiff_t size1,
>     [REG_ERPAREN] = "Unmatched ) or \\)",
>     [REG_ERANGEX ] = "Range striding over charsets",
>     [REG_ESIZEBR ] = "Invalid content of \\{\\}",
> +   [REG_ECLASSBR] = "Class syntax is [[:digit:]], not [:digit:]",
>   };
>  
>  /* For 'regs_allocated'.  */
> @@ -2000,6 +2002,23 @@ regex_compile (re_char *pattern, ptrdiff_t size,
>  
>              laststart = b;
>  
> +            /* Check for the mistake of forgetting the extra square brackets,
> +               as in "[:alpha:]".  */
> +            if (*p == ':')
> +              {
> +                re_char *q = p + 1;
> +                while (q != pend && *q != ']')
> +                  {
> +                    if (*q == ':')
> +                      {
> +                        if (q + 1 != pend && q[1] == ']' && q > p + 1)
> +                          FREE_STACK_RETURN (REG_ECLASSBR);
> +                        break;
> +                      }
> +                    q++;
> +                  }
> +              }
> +
>              /* Test '*p == '^' twice, instead of using an if
>                 statement, so we need only one BUF_PUSH.  */
>              BUF_PUSH (*p == '^' ? charset_not : charset);
> diff --git a/test/src/regex-emacs-tests.el b/test/src/regex-emacs-tests.el
> index f9372e37b1..d268b97080 100644
> --- a/test/src/regex-emacs-tests.el
> +++ b/test/src/regex-emacs-tests.el
> @@ -803,4 +803,8 @@ regexp-multibyte-unibyte
>    (should-not (string-match "å" "\xe5"))
>    (should-not (string-match "[å]" "\xe5")))
>  
> +(ert-deftest regexp-invalid ()
> +  (should-error (string-match "[:space:]" "")
> +                :type 'invalid-regexp))
> +
>  ;;; regex-emacs-tests.el ends here
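
For readers who want to try the check from the regex_compile hunk outside of Emacs, here is a minimal standalone sketch of the same scan. The function name looks_like_bare_class and the plain `const char *` arguments are assumptions made for illustration; the patch itself works on re_char pointers inside regex_compile and signals REG_ECLASSBR through FREE_STACK_RETURN. As in the patch, P is assumed to point just past the opening '[' and PEND one past the end of the pattern.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of regex-emacs.c.  P points just past
   the opening '[' of a bracket expression and PEND points one past the
   end of the pattern.  Return true if the bracket expression looks like
   a bare character class such as "[:alpha:]".  */
static bool
looks_like_bare_class (const char *p, const char *pend)
{
  assert (p != NULL && pend != NULL && p <= pend);

  /* The mistake always starts with ':' right after the '['.  */
  if (p == pend || *p != ':')
    return false;

  /* Scan up to the first ']' or the end of the pattern.  */
  for (const char *q = p + 1; q != pend && *q != ']'; q++)
    if (*q == ':')
      /* Only the first ':' after the opening one is examined.  It must
         be followed directly by ']' and enclose a non-empty name.  */
      return q + 1 != pend && q[1] == ']' && q > p + 1;

  return false;
}
```

Calling it as `looks_like_bare_class (pat + 1, pat + strlen (pat))` (with <string.h> included) on "[:alpha:]" reports the mistake, while "[[:alpha:]]", "[::]" and "[:a:b]" pass through, which matches the conditions in the patch: a ':' after the bracket, a non-empty name, and a closing ":]".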