* Regular expressions and user-escaped characters
@ 2024-12-02 22:04 Christopher Howard
2024-12-02 22:32 ` Joost Kremers
2024-12-03 14:01 ` Stefan Monnier via Users list for the GNU Emacs text editor
0 siblings, 2 replies; 5+ messages in thread
From: Christopher Howard @ 2024-12-02 22:04 UTC (permalink / raw)
To: Help Gnu Emacs Mailing List
Hi, what do you do in a regular expression if you want to match a character, but not the same character when it has been escaped by the user? E.g., if I want my regular expression to look for ?\[ (ASCII 91), matching string "[" and "a[a" but not string "\\[" or "a\\[a", if you follow me. Is this possible with just a regular expression?
If not, what is a good workaround? I was wondering about, say, replacing all the escaped characters first with some uncommon character (like a control code) and then converting back afterwards. But then I suppose I would need to do a check for that uncommon character first.
--
📛 Christopher Howard
🚀 gemini://gem.librehacker.com
🌐 http://gem.librehacker.com
בראשית ברא אלהים את השמים ואת הארץ
* Re: Regular expressions and user-escaped characters
From: Joost Kremers @ 2024-12-02 22:32 UTC (permalink / raw)
To: Christopher Howard; +Cc: Help Gnu Emacs Mailing List
On Mon, Dec 02 2024, Christopher Howard wrote:
> Hi, what do you do in a regular expression if you want to match a
> character, but not the same character when it has been escaped by the user?
> E.g., if I want my regular expression to look for ?\[ (ASCII 91), matching
> string "[" and "a[a" but not string "\\[" or "a\\[a", if you follow me. Is
> this possible with just a regular expression?
You may get away with something like "[^\\][[]", though keep in mind that
that does not match a ?[ not preceded by a backslash, but rather a ?[
preceded by a character that is not a backslash. Depending on your use
case, that might suffice, though, esp. if you use a capturing group:
```
(let ((str "a[a"))
  (when (string-match "[^\\]\\([[]\\)" str)
    (match-string 1 str)))
=> "["
```
vs.:
```
(let ((str "a\\[a"))
  (when (string-match "[^\\]\\([[]\\)" str)
    (match-string 1 str)))
=> nil
```
The "proper" way to do this would be to use negative lookbehind,
`(?<!\\)[[]` in PCRE-style syntax, but Emacs' regexp engine does not support that.
> If not, what is a good workaround? I was wondering about, say, replacing
> all the escaped characters first with some uncommon character (like a
> control code) and then converting back afterwards. But then I suppose I
> would need to do a check for that uncommon character first.
That would probably work.
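A minimal sketch of that placeholder workaround, for illustration only. The function names `my/hide-escaped-brackets` and `my/restore-escaped-brackets` are hypothetical, and the choice of a NUL character as placeholder is an assumption; any character that cannot occur in the input would do.

```elisp
;; Hypothetical helpers, not from the thread.
(defun my/hide-escaped-brackets (str placeholder)
  "Return STR with each backslash-bracket pair replaced by PLACEHOLDER.
PLACEHOLDER is a one-character string.  Signal an error if it
already occurs in STR, since the round trip would then be lossy."
  (when (string-match-p (regexp-quote placeholder) str)
    (error "Placeholder already present in input"))
  ;; The regexp matches a literal backslash followed by a literal `['.
  (replace-regexp-in-string "\\\\\\[" placeholder str t t))

(defun my/restore-escaped-brackets (str placeholder)
  "Undo `my/hide-escaped-brackets' on STR for the same PLACEHOLDER."
  (replace-regexp-in-string (regexp-quote placeholder) "\\[" str t t))

;; Usage: search the hidden string with a plain regexp.
(let ((hidden (my/hide-escaped-brackets "a\\[b[c" "\0")))
  (string-match "\\[" hidden))
;; => 3 (an index into the hidden string, not into the original)
```

Two limitations of this sketch: indices shift because two characters collapse into one, and an escaped backslash followed by a bracket (two backslashes, then `[`) is treated as an escaped bracket even though the bracket itself is not escaped.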
--
Joost Kremers
Life has its moments
* Re: Regular expressions and user-escaped characters
From: Joost Kremers @ 2024-12-02 22:50 UTC (permalink / raw)
To: Christopher Howard; +Cc: Help Gnu Emacs Mailing List
On Mon, Dec 02 2024, Joost Kremers wrote:
> You may get away with something like "[^\\][[]", though keep in mind that
> that does not match a ?[ not preceded by a backslash, but rather a ?[
> preceded by a character that is not a backslash.
Mind you, what I forgot to mention: this means that a ?[ at the start of a
string won't be found. A possible solution to that might be to prepend some
character to the string before matching.
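A small sketch of the prepend idea, assuming a single space as the padding character (any character other than a backslash should work). The name `my/first-unescaped-bracket` is made up for this example.

```elisp
;; Hypothetical helper, not from the thread.
(defun my/first-unescaped-bracket (str)
  "Return the index in STR of the first `[' not directly preceded by a backslash.
Return nil if there is none."
  ;; Pad with one harmless character so a `[' at index 0 has a predecessor.
  (let ((padded (concat " " str)))
    (when (string-match "[^\\]\\([[]\\)" padded)
      ;; Subtract 1 to translate back to an index into STR.
      (1- (match-beginning 1)))))

(my/first-unescaped-bracket "[")      ; => 0
(my/first-unescaped-bracket "a[a")    ; => 1
(my/first-unescaped-bracket "\\[")    ; => nil
(my/first-unescaped-bracket "a\\[a")  ; => nil
```

Because the match consumes the character before the bracket, repeating the search from the end of the previous match can miss a bracket that immediately follows another one.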
--
Joost Kremers
Life has its moments
* Re: Regular expressions and user-escaped characters
From: Joost Kremers @ 2024-12-02 23:09 UTC (permalink / raw)
To: Christopher Howard; +Cc: Help Gnu Emacs Mailing List
On 2 December 2024 23:51:49 Joost Kremers <joostkremers@fastmail.fm> wrote:
> On Mon, Dec 02 2024, Joost Kremers wrote:
>> You may get away with something like "[^\\][[]", though keep in mind that
>> that does not match a ?[ not preceded by a backslash, but rather a ?[
>> preceded by a character that is not a backslash.
>
> Mind you, what I forgot to mention: this means that a ?[ at the start of a
> string won't be found. A possible solution to that might be to prepend some
> character to the string before matching.
Or, try using \\| to match either a ?[ at the start of the string or a ?[
preceded by a character other than a backslash...
\\(?:^[[]\\|[^\\][[]\\)
Whew...
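For illustration, here is that alternation with the common `[[]` factored out and wrapped in a capturing group, so the position of the bracket itself can be read back. The refactoring and the names `my/unescaped-bracket-re` and `my/bracket-index` are assumptions of this sketch, not something posted in the thread.

```elisp
;; Hypothetical names; the regexp is an equivalent factoring of the
;; alternation above, with a capturing group around the bracket.
(defconst my/unescaped-bracket-re "\\(?:^\\|[^\\]\\)\\([[]\\)"
  "Match a `[' at the start of a line or after a non-backslash character.")

(defun my/bracket-index (str)
  "Return the index of the first match of `my/unescaped-bracket-re' in STR.
The index is that of the bracket itself; return nil if nothing matches."
  (when (string-match my/unescaped-bracket-re str)
    (match-beginning 1)))

(my/bracket-index "[")      ; => 0
(my/bracket-index "a[a")    ; => 1
(my/bracket-index "\\[")    ; => nil
(my/bracket-index "a\\[a")  ; => nil
```

Note that `^` also matches after a newline inside the string; if only the very start of the string should count, "\\`" could be used in its place.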
> --
> Joost Kremers
> Life has its moments
* Re: Regular expressions and user-escaped characters
From: Stefan Monnier via Users list for the GNU Emacs text editor @ 2024-12-03 14:01 UTC (permalink / raw)
To: help-gnu-emacs
> Hi, what do you do in a regular expression if you want to match a character,
> but not the same character when it has been escaped by the user? E.g., if
> I want my regular expression to look for ?\[ (ASCII 91), matching string "["
> and "a[a" but not string "\\[" or "a\\[a", if you follow me. Is this
> possible with just a regular expression?
The "usual" way we do that is with the godawful:
"\\(?:^\\|[^\\]\\(?:\\\\\\\\\\)*\\)\\["
This is careful to match the [ if it's preceded by an even number
of backslashes. But beware that it matches more than the actual [, so if
you start the search from a point that's looking at a [, it won't find
it (except if it's at the beginning of the line).
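A quick way to try that regexp on strings, as a sketch. The names `my/even-backslash-re` and `my/unescaped-bracket-pos` are hypothetical; the regexp itself is copied verbatim from the message. Since the match includes the preceding characters, the bracket sits just before the end of the match.

```elisp
;; Hypothetical names; the regexp is the one quoted above, unchanged.
(defconst my/even-backslash-re "\\(?:^\\|[^\\]\\(?:\\\\\\\\\\)*\\)\\["
  "Match `[' preceded by an even number of backslashes.")

(defun my/unescaped-bracket-pos (str)
  "Return the index of the first unescaped `[' in STR, or nil."
  (when (string-match my/even-backslash-re str)
    ;; The `[' is the last character of the match.
    (1- (match-end 0))))

(my/unescaped-bracket-pos "a[a")          ; => 1   (no backslash)
(my/unescaped-bracket-pos "a\\[a")        ; => nil (one backslash)
(my/unescaped-bracket-pos "a\\\\[a")      ; => 3   (two backslashes)
(my/unescaped-bracket-pos "a\\\\\\[a")    ; => nil (three backslashes)
```

One thing worth checking for your use case: as written, the optional backslash pairs belong to the second alternative only, so a [ preceded by an even, non-zero number of backslashes at the very start of a line does not appear to be matched.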
> If not, what is a good workaround?
Just use a regexp which matches all [ (regardless of any previous
backslashes) and then check afterwards, in ELisp, whether it's preceded
by an odd number of backslashes, e.g. with something like
    (save-excursion
      (goto-char <FOO>)
      (zerop (% (skip-chars-backward "\\\\") 2)))
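Put together as a search function, this could look roughly like the sketch below. The name `my/search-forward-unescaped` and its argument list are made up for illustration. The backslash is written as "\\\\" in the `skip-chars-backward` set because a backslash is itself a quoting character in that syntax.

```elisp
;; Hypothetical helper, not from the thread.
(defun my/search-forward-unescaped (char &optional bound)
  "Move point just after the next CHAR not escaped by a backslash.
Search up to BOUND if non-nil.  Return the new point on success;
otherwise return nil and leave point where it was."
  (let ((start (point))
        (re (regexp-quote (char-to-string char)))
        found)
    (while (and (not found)
                (re-search-forward re bound t))
      (when (save-excursion
              (goto-char (match-beginning 0))
              ;; `skip-chars-backward' returns the distance moved (zero
              ;; or negative); an even count means CHAR is not escaped.
              (zerop (% (skip-chars-backward "\\\\") 2)))
        (setq found (point))))
    (unless found (goto-char start))
    found))

;; Usage: the first `[' is escaped, the other two are not.
(with-temp-buffer
  (insert "a\\[b\\\\[c[")
  (goto-char (point-min))
  (list (my/search-forward-unescaped ?\[)
        (my/search-forward-unescaped ?\[)
        (my/search-forward-unescaped ?\[)))
;; => (8 10 nil)
```

Unlike the single-regexp approach, the match here covers only the character itself, so consecutive searches do not skip adjacent brackets.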
- Stefan