From: Thorsten Jolitz <tjolitz@gmail.com>
To: help-gnu-emacs@gnu.org
Subject: Re: Is it valid to use the zero-byte "^@" in regexps?
Date: Wed, 18 Jun 2014 12:22:35 +0200
Message-ID: <87fvj2zfdg.fsf@gmail.com>
In-Reply-To: 8761jysfxw.fsf@geodiff-mac3.ulb.ac.be

Nicolas Richard <theonewiththeevillook@yahoo.fr> writes:
> Thorsten Jolitz <tjolitz@gmail.com> writes:
>> To rule out a fundamental problem - is it valid to have the zero-byte
>> (inserted with C-q C-@) appear in a regexp like this?
>>
>> ,--------------------------------------------------------
>> | "^#\\+begin_src[[:space:]]+emacs-lisp[^^@]*\n#\\+end_src"
>> `--------------------------------------------------------
>
> I don't see why it wouldn't be valid, but I don't know. Whether it is
> desirable is another question: it would be better to search for the
> beginning, then search for the end with another regexp.
That's what I did initially, and it is of course much easier, but it took
twice (?) as long too ...
>> If so, this regexp should reliably match any
>>
>> ,-----------------------
>> | #+begin_src emacs-lisp
>> | [...]
>> | #+end_src
>> `-----------------------
>
> From the first occurrence of
> #+begin_src emacs-lisp
> after point to the last occurrence of
> #+end_src
> in the buffer. If there's more than one, they'll be part of the match
> too, e.g. if there's another block in the same document:
> #+begin_src sh
> echo whatever.
> #+end_src
> it'll be part of the match too. If you don't want that, make the star
> non-greedy by appending a question mark to it:
> (re-search-forward
> "^#\\+begin_src[[:space:]]+emacs-lisp[^^@]*?\n#\\+end_src")
Yes, thanks for the hint. In my real sources I do use the non-greedy *?
(otherwise it would not work), but I forgot about it when writing the
mail.
>> no matter what's inside the block, right?
>
> Except NUL characters of course.
I.e. the zero-byte "^@"?
But Emacs can differentiate between NUL characters and the @ character,
or can't it? NUL chars are shown in a blue face, and message-mode complains
when trying to send them via email, but e.g. this mail has many @ chars that
are just normal text (just like my test file) and they are recognized as
such.
Often, but not always, the source blocks that fail to match contain @
characters (but no NUL chars). The strange thing is that the matching
fails while these blocks are part of a really big test file. When I
isolate them, copy them to a temp buffer and try to match them there, it
just works.
That makes testing/bisecting a bit difficult: whenever I find the
problem and isolate it, it's gone ...
Therefore my question - is this technique with negated zero-bytes in
regexps supposed to work, or maybe problematic from the beginning?
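For testing the technique in isolation, something like the following might help. It is only a minimal sketch: the helper name `my/count-elisp-src-blocks' is made up for illustration, and the NUL character is built with (string 0) instead of being typed with C-q C-@, so the regexp survives copying and mailing. It counts the non-greedy matches in a string by searching in a temp buffer.

```elisp
;; Sketch only: `my/count-elisp-src-blocks' is a hypothetical helper,
;; not part of Emacs or Org.
(defun my/count-elisp-src-blocks (text)
  "Return how many emacs-lisp source blocks the regexp matches in TEXT."
  (let* ((nul (string 0))               ; one-character string holding NUL
         ;; Same regexp as in the mail, with the non-greedy `*?'.
         (regexp (concat "^#\\+begin_src[[:space:]]+emacs-lisp[^"
                         nul
                         "]*?\n#\\+end_src"))
         (count 0))
    (with-temp-buffer
      (insert text)
      (goto-char (point-min))
      ;; Bind here so the binding applies while the temp buffer is current.
      (let ((case-fold-search t))
        (while (re-search-forward regexp nil t)
          (setq count (1+ count))))
      count)))
```

Called on a string with two emacs-lisp blocks (one of them containing an @) and one sh block in between, this should count the two emacs-lisp blocks separately; a block that really contains a NUL character should not be matched at all. That only shows the regexp behaves as intended on small input; it does not explain a failure inside a very large buffer.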
--
cheers,
Thorsten