From: Stefan Monnier <monnier@iro.umontreal.ca>
To: "Mattias Engdegård" <mattiase@acm.org>
Cc: Paul Eggert <eggert@cs.ucla.edu>,
Stephen Leake <stephen_leake@stephe-leake.org>,
emacs-devel <emacs-devel@gnu.org>
Subject: Re: :alnum: broken?
Date: Wed, 26 Feb 2020 10:48:36 -0500 [thread overview]
Message-ID: <jwvpne1xsef.fsf-monnier+emacs@gnu.org> (raw)
In-Reply-To: <03A37C4B-9FE8-4A25-9851-79BC8265455E@acm.org> ("Mattias Engdegård"'s message of "Wed, 26 Feb 2020 15:10:46 +0100")
> Patch attached. It was written for master, but I would suggest it go in emacs-27.
FWIW, you have a +1 from me, tho I don't see any urgency so I'd keep it
for `master`.
Stefan
Mattias Engdegård [2020-02-26 15:10:46] wrote:
> I just made this very mistake while adding a new regexp-error checking
> feature to xr. Needless to say I now am strongly in favour of turning it
> into a hard error.
>
>
> The error message could be improved. For the benefit of
> isearch-forward-regexp, it's probably a good idea if it doesn't start or end
> in a square bracket.
>
> From 014a7a7dce5ae23b8a47dd68eaaef0a5cb985b46 Mon Sep 17 00:00:00 2001
> From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org>
> Date: Wed, 26 Feb 2020 14:46:01 +0100
> Subject: [PATCH] Signal an error for the regexp "[:alnum:]"
>
> Omitting the extra brackets is a common mistake; see discussion at
> https://lists.gnu.org/archive/html/emacs-devel/2020-02/msg00215.html
>
> * src/regex-emacs.c (reg_errcode_t, re_error_msgid): Add REG_ECLASSBR.
> (regex_compile): Check for the mistake.
> * test/src/regex-emacs-tests.el (regexp-invalid): Test.
> * etc/NEWS: Announce.
> ---
> etc/NEWS | 5 +++++
> src/regex-emacs.c | 21 ++++++++++++++++++++-
> test/src/regex-emacs-tests.el | 4 ++++
> 3 files changed, 29 insertions(+), 1 deletion(-)
>
> diff --git a/etc/NEWS b/etc/NEWS
> index 54aab1a5b6..404b4b9ebd 100644
> --- a/etc/NEWS
> +++ b/etc/NEWS
> @@ -190,6 +190,11 @@ Emacs now supports bignums so this old glitch is no longer needed.
> 'previous-system-time-locale' have been removed, as they were created
> by mistake and were not useful to Lisp code.
>
> +** The regexp mistake '[:digit:]' is now an error.
> +The correct syntax is '[[:digit:]]'. Previously, forgetting the extra
> +brackets silently resulted in a regexp that did not at all work as
> +intended.
> +
> \f
> * Lisp Changes in Emacs 28.1
>
> diff --git a/src/regex-emacs.c b/src/regex-emacs.c
> index 694431c95e..2648e1d6ae 100644
> --- a/src/regex-emacs.c
> +++ b/src/regex-emacs.c
> @@ -818,7 +818,8 @@ print_double_string (re_char *where, re_char *string1, ptrdiff_t size1,
> REG_ESIZE, /* Compiled pattern bigger than 2^16 bytes. */
> REG_ERPAREN, /* Unmatched ) or \); not returned from regcomp. */
> REG_ERANGEX, /* Range striding over charsets. */
> - REG_ESIZEBR /* n or m too big in \{n,m\} */
> + REG_ESIZEBR, /* n or m too big in \{n,m\} */
> + REG_ECLASSBR, /* Missing [] around [:class:]. */
> } reg_errcode_t;
>
> static const char *re_error_msgid[] =
> @@ -842,6 +843,7 @@ print_double_string (re_char *where, re_char *string1, ptrdiff_t size1,
> [REG_ERPAREN] = "Unmatched ) or \\)",
> [REG_ERANGEX ] = "Range striding over charsets",
> [REG_ESIZEBR ] = "Invalid content of \\{\\}",
> + [REG_ECLASSBR] = "Class syntax is [[:digit:]], not [:digit:]",
> };
>
> /* For 'regs_allocated'. */
> @@ -2000,6 +2002,23 @@ regex_compile (re_char *pattern, ptrdiff_t size,
>
> laststart = b;
>
> + /* Check for the mistake of forgetting the extra square brackets,
> + as in "[:alpha:]". */
> + if (*p == ':')
> + {
> + re_char *q = p + 1;
> + while (q != pend && *q != ']')
> + {
> + if (*q == ':')
> + {
> + if (q + 1 != pend && q[1] == ']' && q > p + 1)
> + FREE_STACK_RETURN (REG_ECLASSBR);
> + break;
> + }
> + q++;
> + }
> + }
> +
> /* Test '*p == '^' twice, instead of using an if
> statement, so we need only one BUF_PUSH. */
> BUF_PUSH (*p == '^' ? charset_not : charset);
> diff --git a/test/src/regex-emacs-tests.el b/test/src/regex-emacs-tests.el
> index f9372e37b1..d268b97080 100644
> --- a/test/src/regex-emacs-tests.el
> +++ b/test/src/regex-emacs-tests.el
> @@ -803,4 +803,8 @@ regexp-multibyte-unibyte
> (should-not (string-match "å" "\xe5"))
> (should-not (string-match "[å]" "\xe5")))
>
> +(ert-deftest regexp-invalid ()
> + (should-error (string-match "[:space:]" "")
> + :type 'invalid-regexp))
> +
> ;;; regex-emacs-tests.el ends here
next prev parent reply other threads:[~2020-02-26 15:48 UTC|newest]
Thread overview: 77+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-02-21 18:58 :alnum: broken? Stephen Leake
2020-02-21 19:00 ` Paul Eggert
2020-02-21 19:32 ` Mattias Engdegård
2020-02-21 21:28 ` Stephen Leake
2020-02-22 1:09 ` Paul Eggert
2020-02-22 7:48 ` Eli Zaretskii
2020-02-22 21:28 ` Paul Eggert
2020-02-23 3:28 ` Eli Zaretskii
2020-02-23 10:21 ` Mattias Engdegård
2020-02-23 18:13 ` Paul Eggert
2020-02-23 18:27 ` Eli Zaretskii
2020-02-23 19:34 ` Óscar Fuentes
2020-02-23 22:12 ` Drew Adams
2020-02-25 3:57 ` Richard Stallman
2020-02-25 14:37 ` Stefan Monnier
2020-02-25 15:45 ` Drew Adams
2020-02-25 15:40 ` Drew Adams
2020-02-25 9:33 ` Andreas Schwab
2020-02-25 13:53 ` Clément Pit-Claudel
2020-02-25 15:40 ` Drew Adams
2020-02-23 18:40 ` Mattias Engdegård
2020-02-26 14:10 ` Mattias Engdegård
2020-02-26 14:54 ` Drew Adams
2020-02-26 15:48 ` Stefan Monnier [this message]
2020-02-26 21:00 ` Paul Eggert
2020-02-26 21:18 ` Mattias Engdegård
2020-02-26 21:24 ` Clément Pit-Claudel
2020-02-26 22:01 ` Mattias Engdegård
2020-02-26 22:38 ` Eli Zaretskii
2020-02-27 17:57 ` Mattias Engdegård
2020-02-27 23:17 ` Óscar Fuentes
2020-02-28 8:09 ` Eli Zaretskii
2020-02-28 8:48 ` Paul Eggert
2020-02-28 13:11 ` Eli Zaretskii
2020-02-28 17:41 ` Paul Eggert
2020-02-28 20:09 ` Eli Zaretskii
2020-02-28 20:25 ` Paul Eggert
2020-02-28 20:38 ` Eli Zaretskii
2020-02-28 21:04 ` Mattias Engdegård
2020-02-28 21:40 ` Eli Zaretskii
2020-02-29 11:43 ` Mattias Engdegård
2020-02-29 12:07 ` Eli Zaretskii
2020-02-29 14:24 ` Stefan Monnier
2020-02-29 14:14 ` Stefan Monnier
2020-02-29 17:33 ` Óscar Fuentes
2020-02-29 19:52 ` Stefan Monnier
2020-02-29 21:12 ` Óscar Fuentes
2020-02-29 22:22 ` Marcin Borkowski
2020-02-29 22:34 ` Clément Pit-Claudel
2020-03-01 22:44 ` Marcin Borkowski
2020-03-02 3:07 ` Stefan Monnier
2020-03-02 7:15 ` Marcin Borkowski
2020-03-02 7:41 ` Emanuel Berg via Emacs development discussions.
2020-03-02 16:14 ` Drew Adams
2020-03-02 16:51 ` Emanuel Berg via Emacs development discussions.
2020-03-02 7:56 ` Joost Kremers
2020-03-02 9:44 ` Štěpán Němec
2020-03-02 10:43 ` Joost Kremers
2020-03-02 13:37 ` Clément Pit-Claudel
2020-03-02 17:03 ` Stefan Monnier
2020-03-02 18:23 ` Clément Pit-Claudel
2020-02-29 23:02 ` Andrea Corallo
2020-03-01 22:41 ` Marcin Borkowski
2020-02-29 22:58 ` Stefan Monnier
2020-02-29 23:28 ` Óscar Fuentes
2020-02-27 1:33 ` Paul Eggert
2020-02-26 16:01 ` Andreas Schwab
2020-02-26 21:06 ` Mattias Engdegård
2020-02-27 8:43 ` Andreas Schwab
2020-02-27 18:05 ` Mattias Engdegård
2020-02-22 9:09 ` Eli Zaretskii
2020-02-23 3:49 ` Richard Stallman
2020-02-23 7:51 ` Paul Eggert
2020-02-23 15:07 ` Eli Zaretskii
2020-02-21 19:01 ` Noam Postavsky
2020-02-21 19:04 ` Andreas Schwab
[not found] <<86wo8flqct.fsf@stephe-leake.org>
[not found] ` <<f89e340f-9a19-50e1-4421-57fd3e235548@cs.ucla.edu>
[not found] ` <<86sgj3ljf0.fsf@stephe-leake.org>
[not found] ` <<E1j5iGS-0000r6-Id@fencepost.gnu.org>
[not found] ` <<18bd2b64-1c2f-055f-4fa0-092bdb1da531@cs.ucla.edu>
[not found] ` <<838sktibpu.fsf@gnu.org>
2020-02-23 16:52 ` Drew Adams
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://www.gnu.org/software/emacs/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=jwvpne1xsef.fsf-monnier+emacs@gnu.org \
--to=monnier@iro.umontreal.ca \
--cc=eggert@cs.ucla.edu \
--cc=emacs-devel@gnu.org \
--cc=mattiase@acm.org \
--cc=stephen_leake@stephe-leake.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).