From: pjb@informatimago.com (Pascal J. Bourguignon)
To: help-gnu-emacs@gnu.org
Subject: Re: Stream-based Scanning of File-Buffers
Date: Fri, 25 Sep 2009 15:16:31 +0200
Message-ID: <7cfxabmafk.fsf@pbourguignon.lefevre.anevia.com>
In-Reply-To: 5f0abd5c-159d-4d59-bd13-62eb178c7633@g6g2000vbr.googlegroups.com
Nordlöw <per.nordlow@gmail.com> writes:
> Is there a way to perform string/regexp scanning (using search-forward
> or re-search-forward) in a file-buffer whose contents is loaded only
> when it's needed, kind of like file streams in C, so that I only do a
> physical read on the blocks that I actually scan and then skip the
> rest of the file as soon as I get a hit.
Emacs Lisp has primitives that let you load a file on demand
(insert-file-contents, for instance, accepts BEG and END byte
offsets). However, the regular expression functions only work on a
string or a buffer (faster on a buffer), so when you reach the end of
the buffer without a match, you will have to load a further chunk and
retry the regexp. In general, for a regexp such as "a.*z", if your
file contains an 'a' in the first position, a lot of 'b's, and a 'z'
in the last position, you will have to load the whole file to match it
with the provided regexp functions.
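The load-a-chunk-and-retry approach described above could be sketched
roughly as follows. This is only an illustration under some
assumptions: the function name my-chunked-re-search and the default
chunk size are made up, the file is read as raw bytes with
insert-file-contents-literally (so a chunk boundary cannot split a
multibyte character, but the regexp should then be ASCII-only), and
the regexp is simply retried from the start of the buffer after each
chunk. With a greedy regexp the match found on the loaded prefix may
be shorter than the match on the whole file.

```elisp
(defun my-chunked-re-search (file regexp &optional chunk-size)
  "Return the first match of REGEXP in FILE, loading CHUNK-SIZE bytes at a time.
Return nil if the whole file has been loaded without finding a match.
The name and the default CHUNK-SIZE are illustrative assumptions."
  (let ((chunk-size (or chunk-size 4096))
        (size (file-attribute-size (file-attributes file)))
        (offset 0)
        (match nil))
    (with-temp-buffer
      (while (and (null match) (< offset size))
        ;; Append the next chunk (bytes OFFSET..END of the file).
        (let ((end (min size (+ offset chunk-size))))
          (goto-char (point-max))
          (insert-file-contents-literally file nil offset end)
          (setq offset end))
        ;; Retry the regexp over everything loaded so far.
        (goto-char (point-min))
        (when (re-search-forward regexp nil t)
          (setq match (match-string 0)))))
    match))
```

For example, (my-chunked-re-search "/some/file" "a.*z" 4096) stops
reading as soon as a loaded prefix of the file contains a match.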
Of course, you may write your own regexp compiler. A regexp such as
"a.*z" would be compiled to something like:
    (lambda (stream)
      (loop
         for ch = (read-char stream nil)
         while (and ch (char/= ch ?a))
         finally (return (if ch
                             ;; got a 'a'
                             (loop for ch = (read-char stream nil)
                                while (and ch (char/= ch ?z))
                                finally (return (if ch
                                                    'found-match
                                                    nil)))
                             ;; got eof
                             nil))))
In such a case, you need only a one-character buffer: ch.
Other regular expressions may need bigger buffers.
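The lambda above is written in a Common Lisp style: in Emacs Lisp,
read-char reads from the keyboard rather than from a stream, and
char/= does not exist. A runnable adaptation might look like the
sketch below. Everything named my-... is hypothetical: the stream is
a small struct that refills itself from the file CHUNK-SIZE bytes at
a time, and characters are compared as raw bytes with /=. As in the
lambda above, the '.' here also matches newlines, which the '.' of
Emacs regexps does not.

```elisp
(require 'cl-lib)

;; Hypothetical stream object: reads FILE lazily, one chunk at a time.
(cl-defstruct my-stream file (chunk-size 4096) (offset 0) (chunk "") (pos 0))

(defun my-stream-read-char (stream)
  "Return the next byte of STREAM as a character, or nil at end of file."
  (when (>= (my-stream-pos stream) (length (my-stream-chunk stream)))
    ;; Current chunk exhausted: load the next one, if any bytes remain.
    (let* ((file (my-stream-file stream))
           (size (file-attribute-size (file-attributes file)))
           (beg (my-stream-offset stream))
           (end (min size (+ beg (my-stream-chunk-size stream)))))
      (when (< beg size)
        (with-temp-buffer
          (insert-file-contents-literally file nil beg end)
          (setf (my-stream-chunk stream) (buffer-string)
                (my-stream-offset stream) end
                (my-stream-pos stream) 0)))))
  (when (< (my-stream-pos stream) (length (my-stream-chunk stream)))
    (prog1 (aref (my-stream-chunk stream) (my-stream-pos stream))
      (cl-incf (my-stream-pos stream)))))

(defun my-match-a-then-z (stream)
  "Return `found-match' if STREAM has an `a' followed later by a `z'."
  (cl-loop for ch = (my-stream-read-char stream)
           while (and ch (/= ch ?a))
           finally return
           (and ch                      ; nil here means eof before any `a'
                (cl-loop for ch = (my-stream-read-char stream)
                         while (and ch (/= ch ?z))
                         finally return (and ch 'found-match)))))
```

Calling (my-match-a-then-z (make-my-stream :file "/some/file")) reads
no further than the chunk containing the first 'z' that follows an
'a'; the rest of the file is never loaded.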
--
__Pascal Bourguignon__