From: Thorsten Jolitz <tjolitz@gmail.com>
To: help-gnu-emacs@gnu.org
Subject: Re: regexp question: match anything but not a group?
Date: Wed, 02 Apr 2014 09:21:23 +0200
Message-ID: <87y4zodwng.fsf@gmail.com>
In-Reply-To: 87zjk4louc.fsf@gmail.com
Thorsten Jolitz <tjolitz@gmail.com> writes:
> Hi List,
>
> how can I write a regexp that acts like e.g.
>
> ,------
> | ".*?"
> `------
>
> but does not match a group like e.g.
>
> ,---------------------
> | (regexp-quote "\\)")
> `---------------------
>
> ?
>
> This works more or less but does not seem to be very robust
>
> ,---------
> | "[^)]*?"
> `---------
>
> since ')' could appear in other contexts than the group. How can I
> negate a specific group of characters and not only any occurrence of
> single characters?
I figured that I actually need something even smarter, because what I
really want is a regexp A that matches a given other regexp B if B is a
regexp group, and does not match otherwise.
The best version of that regexp A I can come up with right now is
something like this:
#+begin_src emacs-lisp
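(concat "^"                        ; BOL
        (regexp-quote "\\(")       ; group begins
        "\\(\\?[[:digit:]]*:\\)?"  ; shy or explicitly numbered group?
        ;; A single backslash is needed here: "\000" is the NUL character,
        ;; whereas "[^\\000]" would exclude a literal backslash and "0".
        "[^\000]+?"                ; any char, idea copied from org-mode
        (regexp-quote "\\)")       ; group ends
        "[*+]?[?]?"                ; quantifier
        "$")                       ; EOL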
#+end_src
The problem is that in the content part
,----------------------------------------------------
| "[^\000]+?" ; any char, idea copied from org-mode
`----------------------------------------------------
anything can happen, and any number of opening and/or closing parens
and sub-groups can appear, so I really need to determine if
,-----------------------------------------
| (regexp-quote "\\)") ; group ends
`-----------------------------------------
closes
,-------------------------------------------
| (regexp-quote "\\(") ; group begins
`-------------------------------------------
and that's kind of hard to do with regexp syntax.
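A minimal sketch of the failure, for illustration only: the variable
name `my-group-regexp' is made up here, and the content part is written
with a literal NUL escape ("[^\000]+?") so that it really accepts any
character. The regexp accepts a string made of two sibling groups as if
it were a single group:
#+begin_src emacs-lisp
;; Hypothetical name; same pieces as the regexp above.
(defvar my-group-regexp
  (concat "^"                        ; BOL
          (regexp-quote "\\(")       ; group begins
          "\\(\\?[[:digit:]]*:\\)?"  ; shy or explicitly numbered group?
          "[^\000]+?"                ; any char except NUL
          (regexp-quote "\\)")       ; group ends
          "[*+]?[?]?"                ; quantifier
          "$"))                      ; EOL

(string-match-p my-group-regexp "\\(foo\\|bar\\)")  ; one group: matches
(string-match-p my-group-regexp "\\(a\\)b\\(c\\)")  ; two groups: matches too
(string-match-p my-group-regexp "foo")              ; no group: nil
#+end_src
The second call is the false positive: the first "\\(" and the last
"\\)" do not belong to the same group, but the regexp cannot tell.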
I know now that, for my original problem, I could simulate
*look-ahead assertions*, which aren't implemented in Emacs AFAIK, with
something along the lines of:
#+begin_src emacs-lisp
;; MY-CHAR is a placeholder for the character that must not follow.
(progn
  (and (looking-at ".*")
       (not (eq (char-after) ?\)))
       (not (eq (char-after (+ 1 (point))) MY-CHAR))
       (not (eq (char-after (+ 2 (point))) MY-CHAR))
       [...]
       ))
#+end_src
but counting and bookkeeping of opening and closing parens in regexp B
looks too difficult to me.
The only approach I can imagine is to check the parens with Lisp first
(e.g. by using `forward-sexp' or so) and then use a regexp like the one
above that does not care what is inside the group's enclosing parens.
Then I could drop this part from the regexp too
,----------------------------------------------------------------
| "\\(\\?[[:digit:]]*:\\)?" ; shy or explicitly numbered group?
`----------------------------------------------------------------
because all that counts are the matching parens.
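A rough sketch of that "check the parens with Lisp first" idea, under
stated assumptions: the function name `my-regexp-group-p' is made up,
and instead of `forward-sexp' it uses a plain depth counter over the
regexp string. It skips escaped pairs such as "\\\\", but it does not
understand bracket expressions, so a regexp like "[\\(]" would be
miscounted.
#+begin_src emacs-lisp
(defun my-regexp-group-p (regexp)
  "Return t if REGEXP is one group, optionally followed by a quantifier.
Hypothetical helper; bracket expressions are not handled."
  (let ((len (length regexp))
        (pos 0)
        (depth 0)
        (close nil))   ; index just after the \\) closing the first group
    (when (string-prefix-p "\\(" regexp)
      (while (and (< pos len) (not close))
        (if (and (eq (aref regexp pos) ?\\) (< (1+ pos) len))
            (let ((next (aref regexp (1+ pos))))
              (cond ((eq next ?\() (setq depth (1+ depth)))
                    ((eq next ?\))
                     (setq depth (1- depth))
                     (when (= depth 0)
                       (setq close (+ pos 2)))))
              ;; Skip the whole escaped pair, including "\\\\".
              (setq pos (+ pos 2)))
          (setq pos (1+ pos))))
      ;; Only a quantifier may follow the closing \\).
      (and close
           (string-match-p "\\`[*+]?[?]?\\'" (substring regexp close))
           t))))

(my-regexp-group-p "\\(foo\\(bar\\)*\\)+")  ; nested, one outer group
(my-regexp-group-p "\\(a\\)b\\(c\\)")       ; two sibling groups
#+end_src
The first call returns t and the second returns nil, because the group
opened at the start is already closed before "b".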
Any ideas how to best check if a given regexp is a regexp group or not?
--
cheers,
Thorsten