unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
From: Stefan Monnier <monnier@iro.umontreal.ca>
To: "Mattias Engdegård" <mattiase@acm.org>
Cc: Paul Eggert <eggert@cs.ucla.edu>,
	Stephen Leake <stephen_leake@stephe-leake.org>,
	emacs-devel <emacs-devel@gnu.org>
Subject: Re: :alnum: broken?
Date: Wed, 26 Feb 2020 10:48:36 -0500	[thread overview]
Message-ID: <jwvpne1xsef.fsf-monnier+emacs@gnu.org> (raw)
In-Reply-To: <03A37C4B-9FE8-4A25-9851-79BC8265455E@acm.org> ("Mattias Engdegård"'s message of "Wed, 26 Feb 2020 15:10:46 +0100")

> Patch attached. It was written for master, but I would suggest it go in emacs-27.

FWIW, you have a +1 from me, tho I don't see any urgency so I'd keep it
for `master`.


        Stefan


Mattias Engdegård [2020-02-26 15:10:46] wrote:

> I just made this very mistake while adding a new regexp-error checking
> feature to xr. Needless to say I now am strongly in favour of turning it
> into a hard error.
>
>
> The error message could be improved. For the benefit of
> isearch-forward-regexp, it's probably a good idea if it doesn't start or end
> in a square bracket.
>
> From 014a7a7dce5ae23b8a47dd68eaaef0a5cb985b46 Mon Sep 17 00:00:00 2001
> From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org>
> Date: Wed, 26 Feb 2020 14:46:01 +0100
> Subject: [PATCH] Signal an error for the regexp "[:alnum:]"
>
> Omitting the extra brackets is a common mistake; see discussion at
> https://lists.gnu.org/archive/html/emacs-devel/2020-02/msg00215.html
>
> * src/regex-emacs.c (reg_errcode_t, re_error_msgid): Add REG_ECLASSBR.
> (regex_compile): Check for the mistake.
> * test/src/regex-emacs-tests.el (regexp-invalid): Test.
> * etc/NEWS: Announce.
> ---
>  etc/NEWS                      |  5 +++++
>  src/regex-emacs.c             | 21 ++++++++++++++++++++-
>  test/src/regex-emacs-tests.el |  4 ++++
>  3 files changed, 29 insertions(+), 1 deletion(-)
>
> diff --git a/etc/NEWS b/etc/NEWS
> index 54aab1a5b6..404b4b9ebd 100644
> --- a/etc/NEWS
> +++ b/etc/NEWS
> @@ -190,6 +190,11 @@ Emacs now supports bignums so this old glitch is no longer needed.
>  'previous-system-time-locale' have been removed, as they were created
>  by mistake and were not useful to Lisp code.
>  
> +** The regexp mistake '[:digit:]' is now an error.
> +The correct syntax is '[[:digit:]]'.  Previously, forgetting the extra
> +brackets silently resulted in a regexp that did not at all work as
> +intended.
> +
>  \f
>  * Lisp Changes in Emacs 28.1
>  
> diff --git a/src/regex-emacs.c b/src/regex-emacs.c
> index 694431c95e..2648e1d6ae 100644
> --- a/src/regex-emacs.c
> +++ b/src/regex-emacs.c
> @@ -818,7 +818,8 @@ print_double_string (re_char *where, re_char *string1, ptrdiff_t size1,
>    REG_ESIZE,		/* Compiled pattern bigger than 2^16 bytes.  */
>    REG_ERPAREN,		/* Unmatched ) or \); not returned from regcomp.  */
>    REG_ERANGEX,		/* Range striding over charsets.  */
> -  REG_ESIZEBR           /* n or m too big in \{n,m\} */
> +  REG_ESIZEBR,          /* n or m too big in \{n,m\} */
> +  REG_ECLASSBR,         /* Missing [] around [:class:].  */
>  } reg_errcode_t;
>  
>  static const char *re_error_msgid[] =
> @@ -842,6 +843,7 @@ print_double_string (re_char *where, re_char *string1, ptrdiff_t size1,
>     [REG_ERPAREN] = "Unmatched ) or \\)",
>     [REG_ERANGEX ] = "Range striding over charsets",
>     [REG_ESIZEBR ] = "Invalid content of \\{\\}",
> +   [REG_ECLASSBR] = "Class syntax is [[:digit:]], not [:digit:]",
>    };
>  
>  /* For 'regs_allocated'.  */
> @@ -2000,6 +2002,23 @@ regex_compile (re_char *pattern, ptrdiff_t size,
>  
>  	    laststart = b;
>  
> +            /* Check for the mistake of forgetting the extra square brackets,
> +               as in "[:alpha:]".  */
> +            if (*p == ':')
> +              {
> +                re_char *q = p + 1;
> +                while (q != pend && *q != ']')
> +                  {
> +                    if (*q == ':')
> +                      {
> +                        if (q + 1 != pend && q[1] == ']' && q > p + 1)
> +                          FREE_STACK_RETURN (REG_ECLASSBR);
> +                        break;
> +                      }
> +                    q++;
> +                  }
> +              }
> +
>  	    /* Test '*p == '^' twice, instead of using an if
>  	       statement, so we need only one BUF_PUSH.  */
>  	    BUF_PUSH (*p == '^' ? charset_not : charset);
> diff --git a/test/src/regex-emacs-tests.el b/test/src/regex-emacs-tests.el
> index f9372e37b1..d268b97080 100644
> --- a/test/src/regex-emacs-tests.el
> +++ b/test/src/regex-emacs-tests.el
> @@ -803,4 +803,8 @@ regexp-multibyte-unibyte
>    (should-not (string-match "å" "\xe5"))
>    (should-not (string-match "[å]" "\xe5")))
>  
> +(ert-deftest regexp-invalid ()
> +  (should-error (string-match "[:space:]" "")
> +                :type 'invalid-regexp))
> +
>  ;;; regex-emacs-tests.el ends here




  parent reply	other threads:[~2020-02-26 15:48 UTC|newest]

Thread overview: 77+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-21 18:58 :alnum: broken? Stephen Leake
2020-02-21 19:00 ` Paul Eggert
2020-02-21 19:32   ` Mattias Engdegård
2020-02-21 21:28   ` Stephen Leake
2020-02-22  1:09     ` Paul Eggert
2020-02-22  7:48       ` Eli Zaretskii
2020-02-22 21:28         ` Paul Eggert
2020-02-23  3:28           ` Eli Zaretskii
2020-02-23 10:21       ` Mattias Engdegård
2020-02-23 18:13         ` Paul Eggert
2020-02-23 18:27           ` Eli Zaretskii
2020-02-23 19:34             ` Óscar Fuentes
2020-02-23 22:12               ` Drew Adams
2020-02-25  3:57                 ` Richard Stallman
2020-02-25 14:37                   ` Stefan Monnier
2020-02-25 15:45                     ` Drew Adams
2020-02-25 15:40                   ` Drew Adams
2020-02-25  9:33                 ` Andreas Schwab
2020-02-25 13:53                   ` Clément Pit-Claudel
2020-02-25 15:40                   ` Drew Adams
2020-02-23 18:40           ` Mattias Engdegård
2020-02-26 14:10           ` Mattias Engdegård
2020-02-26 14:54             ` Drew Adams
2020-02-26 15:48             ` Stefan Monnier [this message]
2020-02-26 21:00               ` Paul Eggert
2020-02-26 21:18                 ` Mattias Engdegård
2020-02-26 21:24                   ` Clément Pit-Claudel
2020-02-26 22:01                     ` Mattias Engdegård
2020-02-26 22:38                       ` Eli Zaretskii
2020-02-27 17:57                         ` Mattias Engdegård
2020-02-27 23:17                           ` Óscar Fuentes
2020-02-28  8:09                           ` Eli Zaretskii
2020-02-28  8:48                             ` Paul Eggert
2020-02-28 13:11                               ` Eli Zaretskii
2020-02-28 17:41                                 ` Paul Eggert
2020-02-28 20:09                                   ` Eli Zaretskii
2020-02-28 20:25                                     ` Paul Eggert
2020-02-28 20:38                                       ` Eli Zaretskii
2020-02-28 21:04                                         ` Mattias Engdegård
2020-02-28 21:40                                           ` Eli Zaretskii
2020-02-29 11:43                                             ` Mattias Engdegård
2020-02-29 12:07                                               ` Eli Zaretskii
2020-02-29 14:24                                                 ` Stefan Monnier
2020-02-29 14:14                                               ` Stefan Monnier
2020-02-29 17:33                                                 ` Óscar Fuentes
2020-02-29 19:52                                                   ` Stefan Monnier
2020-02-29 21:12                                                     ` Óscar Fuentes
2020-02-29 22:22                                                       ` Marcin Borkowski
2020-02-29 22:34                                                         ` Clément Pit-Claudel
2020-03-01 22:44                                                           ` Marcin Borkowski
2020-03-02  3:07                                                             ` Stefan Monnier
2020-03-02  7:15                                                               ` Marcin Borkowski
2020-03-02  7:41                                                                 ` Emanuel Berg via Emacs development discussions.
2020-03-02 16:14                                                                   ` Drew Adams
2020-03-02 16:51                                                                     ` Emanuel Berg via Emacs development discussions.
2020-03-02  7:56                                                                 ` Joost Kremers
2020-03-02  9:44                                                                   ` Štěpán Němec
2020-03-02 10:43                                                                     ` Joost Kremers
2020-03-02 13:37                                                                 ` Clément Pit-Claudel
2020-03-02 17:03                                                                 ` Stefan Monnier
2020-03-02 18:23                                                                   ` Clément Pit-Claudel
2020-02-29 23:02                                                         ` Andrea Corallo
2020-03-01 22:41                                                           ` Marcin Borkowski
2020-02-29 22:58                                                       ` Stefan Monnier
2020-02-29 23:28                                                         ` Óscar Fuentes
2020-02-27  1:33                       ` Paul Eggert
2020-02-26 16:01             ` Andreas Schwab
2020-02-26 21:06               ` Mattias Engdegård
2020-02-27  8:43                 ` Andreas Schwab
2020-02-27 18:05                   ` Mattias Engdegård
2020-02-22  9:09     ` Eli Zaretskii
2020-02-23  3:49     ` Richard Stallman
2020-02-23  7:51       ` Paul Eggert
2020-02-23 15:07         ` Eli Zaretskii
2020-02-21 19:01 ` Noam Postavsky
2020-02-21 19:04 ` Andreas Schwab
     [not found] <<86wo8flqct.fsf@stephe-leake.org>
     [not found] ` <<f89e340f-9a19-50e1-4421-57fd3e235548@cs.ucla.edu>
     [not found]   ` <<86sgj3ljf0.fsf@stephe-leake.org>
     [not found]     ` <<E1j5iGS-0000r6-Id@fencepost.gnu.org>
     [not found]       ` <<18bd2b64-1c2f-055f-4fa0-092bdb1da531@cs.ucla.edu>
     [not found]         ` <<838sktibpu.fsf@gnu.org>
2020-02-23 16:52           ` Drew Adams

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=jwvpne1xsef.fsf-monnier+emacs@gnu.org \
    --to=monnier@iro.umontreal.ca \
    --cc=eggert@cs.ucla.edu \
    --cc=emacs-devel@gnu.org \
    --cc=mattiase@acm.org \
    --cc=stephen_leake@stephe-leake.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).