From: "MON KEY" <monkey@sandpframing.com>
To: "Miles Bader" <miles@gnu.org>
Cc: emacs-devel@gnu.org
Subject: Re: regexp-quote missing escapes in grouping constructs - Bug?
Date: Fri, 13 Jun 2008 13:36:07 -0400
Message-ID: <d2afcfda0806131036n58d584bal495cd7341bcf5c22@mail.gmail.com>
In-Reply-To: <buofxrhn9j5.fsf@dhapc248.dev.necel.com>

Yes, but why are the "?", "[", and "+" getting escaped regardless of the
presence of a preceding \ whereas the alternative "|" inside the
grouping construct isn't?

e.g.

example 1)

(regexp-quote "[0-9]{2,4}(-|/)[0-9]?+(-|/)[0-9]{2,4}")
---> "\\[0-9]{2,4}(-|/)\\[0-9]\\?\\+(-|/)\\[0-9]{2,4}"

as compared to 2);

(regexp-quote "[0-9]{2,4}(-?+/)[0-9]?+(-|/)[0-9]{2,4}")
---> "\\[0-9]{2,4}(-\\?\\+/)\\[0-9]\\?\\+(-|/)\\[0-9]{2,4}"

In the second case the ?+ nested inside the group is getting escaped.

Is the "|" not considered a special operator or emacs regexp
metacharacter in the regexp-quote situation?

And if so, why not?
The issue is that regexp-opt.el calls regexp-quote.
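
For illustration, here is a minimal sketch of how regexp-quote and string-match treat a bare "|" versus a backslashed one. The short strings are made up for the example and are not from the thread; the return values shown follow the documented behavior of the two functions.

```elisp
;; Illustrative inputs only (not from the original message).
(regexp-quote "a|b")        ; => "a|b"      a bare | is already literal
(regexp-quote "a\\|b")      ; => "a\\\\|b"  the backslash itself gets quoted
(regexp-quote "a?+")        ; => "a\\?\\+"  bare ? and + are operators
(string-match "a|b" "a|b")  ; => 0    matches the literal text a|b
(string-match "a|b" "b")    ; => nil  no alternation without the backslash
(string-match "a\\|b" "b")  ; => 0    \| is the alternation operator
```

In other words, "?" and "+" are special when bare, while "|" only becomes special after a backslash, so regexp-quote has nothing to escape in a bare "|".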

If I understand the implications of regexp-opt, it is meant as a
helper function for passing regexps to font-lock-keywords and isn't
intended to accept or 'optimize' existing regexp "words".  So, to feed
a well-formed regexp to font-lock-add-keywords I need to build the
regexp by hand. Likewise, that regexp needs to be passed as a string
with all special characters properly escaped, e.g.

(defconst stupid-mode-keywords
'("^[A-z]\\?\\+\\(some\\|stupid\\|regexp\\)\\{2,4\\}" . my-stupid-mode-face))

Am I to understand that the ? and + should be escaped but the |
shouldn't be in order for the regexp to work with font-lock?

FWIW my experience is otherwise: the previous case doesn't work,
whereas the following does:

(defconst stupid-mode-keywords
'("^[A-z]?+\\(some\\|stupid\\|regexp\\)\\{2,4\\}" . my-stupid-mode-face))
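
A small check of the two regexps above against sample strings may make the difference concrete. This is only a sketch: the names my-escaped-re and my-bare-re and the test strings are hypothetical, chosen for the example.

```elisp
;; Hypothetical names; the regexp strings are the two from the message.
(defconst my-escaped-re
  "^[A-z]\\?\\+\\(some\\|stupid\\|regexp\\)\\{2,4\\}")
(defconst my-bare-re
  "^[A-z]?+\\(some\\|stupid\\|regexp\\)\\{2,4\\}")

;; With \? and \+ the regexp demands a literal "?+" after the first letter.
(string-match my-escaped-re "somestupid")     ; => nil
(string-match my-escaped-re "x?+somestupid")  ; => 0

;; With bare ? and + they act as repetition operators on [A-z].
(string-match my-bare-re "somestupid")        ; => 0
(string-match my-bare-re "xsomestupid")       ; => 0
```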

For my purposes, the larger issue is that I can't find a sensible way
to cons or append a well-formed regexp to an existing one without
running into regexp-quote and regexp-opt confusion, esp. as I am
unclear as to the correctness of the quoting and escaping of regexps
for font-locking by the two respective functions.

The only solution that seems approachable is to make a new defconst,
defvar, and defface for each new regexp I wish to font-lock.  This
approach is not really maintainable over the long term.
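
One possible way around a separate defconst per regexp, sketched under assumptions: keep the hand-written regexps in a list and join them with "\\|", wrapping each in a shy group. The names my-mode-regexps and my-mode-combined-regexp and the date-like pattern are hypothetical; regexp-opt is used only for the literal word list, which is what it expects.

```elisp
;; Hypothetical variable: a list of already well-formed regexp strings.
(defvar my-mode-regexps
  (list "[0-9]\\{2,4\\}\\(?:-\\|/\\)[0-9]+"               ; hand-written
        (regexp-opt '("some" "stupid" "regexp"))))        ; literal words

;; Hypothetical helper: wrap each entry in \(?: ... \) and join with \|.
(defun my-mode-combined-regexp ()
  "Return one regexp matching any entry of `my-mode-regexps'."
  (mapconcat (lambda (re) (concat "\\(?:" re "\\)"))
             my-mode-regexps
             "\\|"))

(string-match (my-mode-combined-regexp) "stupid")   ; => 0
(string-match (my-mode-combined-regexp) "2008/06")  ; => 0
(string-match (my-mode-combined-regexp) "xyz")      ; => nil

;; Possible use with font-lock (not evaluated here):
;; (font-lock-add-keywords
;;  nil (list (cons (my-mode-combined-regexp) 'font-lock-keyword-face)))
```

Adding another pattern is then a matter of pushing a new string onto the list; no quoting function touches the hand-written entries.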

On Fri, Jun 13, 2008 at 2:17 AM, Miles Bader <miles.bader@necel.com> wrote:
> "St/n_P/rm/n" <Stan@SandPframing.com> writes:
>> (regexp-quote "[0-9]\{2,4\}\(-\|/\)[0-9]?+\(-\|/\)[0-9]\{2,4\}")
>>
>> ---> "\\[0-9]{2,4}(-|/)\\[0-9]\\?\\+(-|/)\\[0-9]{2,4}"
>>
>> Am I misunderstanding something?
>
> The backslashes you entered in the original lisp string were eaten by
> the lisp reader, so there are no backslashes in the string.  Since (, ),
> |, etc., are not emacs regexp metacharacters (without a preceding
> backslash), there's no need to quote them.
>
> Here's what you probably meant:
>
> (regexp-quote "[0-9]\\{2,4\\}\\(-\\|/\\)[0-9]?+\\(-\\|/\\)[0-9]\\{2,4\\}")
> => "\\[0-9]\\\\{2,4\\\\}\\\\(-\\\\|/\\\\)\\[0-9]\\?\\+\\\\(-\\\\|/\\\\)\\[0-9]\\\\{2,4\\\\}"
>
> -Miles
>
> --
> Joy, n. An emotion variously excited, but in its highest degree arising from
> the contemplation of grief in another.
>
