* Stream-based Scanning of File-Buffers
@ 2009-09-25 12:12 Nordlöw
2009-09-25 12:55 ` Juanma Barranquero
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Nordlöw @ 2009-09-25 12:12 UTC (permalink / raw)
To: help-gnu-emacs
Is there a way to perform string/regexp scanning (using search-forward
or re-search-forward) in a file buffer whose contents are loaded only
when needed, much like file streams in C, so that I physically read
only the blocks I actually scan and can skip the rest of the file as
soon as I get a hit?
Thanks in advance,
Nordlöw
* Re: Stream-based Scanning of File-Buffers
From: Juanma Barranquero @ 2009-09-25 12:55 UTC (permalink / raw)
To: Nordlöw; +Cc: help-gnu-emacs
On Fri, Sep 25, 2009 at 14:12, Nordlöw <per.nordlow@gmail.com> wrote:
> Is there a way to perform string/regexp scanning (using search-forward
> or re-search-forward) in a file buffer whose contents are loaded only
> when needed, much like file streams in C, so that I physically read
> only the blocks I actually scan and can skip the rest of the file as
> soon as I get a hit?
You can use `insert-file-contents' to implement something like that;
presumably, the only tricky part is taking care of matches across
block boundaries.
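
A minimal sketch of that chunked approach, for a literal string only.
The name `my-chunked-search' and the default chunk size are
hypothetical, and the bytes are read literally, so the search string
is assumed to be UTF-8 in the file. Each pass keeps the last
(length - 1) bytes of the previous block, which is enough for a
literal match that straddles a block boundary; a regexp with
unbounded matches would need a different strategy.

```elisp
;;; -*- lexical-binding: t -*-

(defun my-chunked-search (file string &optional chunk-size)
  "Search FILE for literal STRING, reading CHUNK-SIZE bytes at a time.
Return the zero-based byte offset of the first match, or nil.
Hypothetical helper: STRING is matched as UTF-8 encoded bytes."
  (let* ((needle (encode-coding-string string 'utf-8))
         (chunk-size (max (or chunk-size 65536) (length needle) 1))
         ;; Bytes to carry over so a straddling match is still seen.
         (overlap (max 0 (1- (length needle))))
         (file-size (file-attribute-size (file-attributes file)))
         (offset 0)                     ; file offset of next unread byte
         (found nil))
    (with-temp-buffer
      (set-buffer-multibyte nil)
      (let ((case-fold-search nil))
        (while (and (not found) (< offset file-size))
          ;; Drop everything except the tail of the previous block.
          (when (> (- (point-max) (point-min)) overlap)
            (delete-region (point-min) (- (point-max) overlap)))
          (let ((kept (- (point-max) (point-min))))
            (goto-char (point-max))
            (let ((read (cadr (insert-file-contents-literally
                               file nil offset
                               (min file-size (+ offset chunk-size))))))
              (goto-char (point-min))
              (when (search-forward needle nil t)
                ;; Buffer position 1 corresponds to file offset
                ;; (- offset kept).
                (setq found (+ (- offset kept)
                               (- (match-beginning 0) (point-min)))))
              ;; Stop if nothing was read (e.g. the file shrank).
              (setq offset (if (zerop read)
                               file-size
                             (+ offset read))))))))
    found))
```

For example, (my-chunked-search "/some/big/file" "needle" 4096) reads
4096-byte blocks and stops reading as soon as a block contains the
match.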
Juanma
* Re: Stream-based Scanning of File-Buffers
From: Pascal J. Bourguignon @ 2009-09-25 13:16 UTC (permalink / raw)
To: help-gnu-emacs
Nordlöw <per.nordlow@gmail.com> writes:
> Is there a way to perform string/regexp scanning (using search-forward
> or re-search-forward) in a file buffer whose contents are loaded only
> when needed, much like file streams in C, so that I physically read
> only the blocks I actually scan and can skip the rest of the file as
> soon as I get a hit?
There are Emacs Lisp primitives that allow you to load a file on
demand. However, the regular expression functions only work on
a string or a buffer (faster on a buffer), so when you reach the end
of the buffer without a match, you will have to load a further chunk
and retry the regexp. In general, for a regexp such as "a.*z", if
your file contains an 'a' in the first position, a lot of 'b's, and a
'z' in the last position, you will have to load the whole file to
match it with the provided regexp functions.
Of course, you may write your own regexp compiler. A regexp such as
"a.*z" would be compiled to something like:
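;; STREAM is assumed to be a function of no arguments that returns
;; the next character, or nil at end of file.  Needs cl-lib.
;; (Simplified: unlike `.', this does not stop at a newline.)
(lambda (stream)
  (cl-loop
   for ch = (funcall stream)
   while (and ch (/= ch ?a))
   finally return
   (if ch
       ;; got an 'a'
       (cl-loop for ch = (funcall stream)
                while (and ch (/= ch ?z))
                finally return (if ch
                                   'found-match
                                 nil))
     ;; got eof
     nil)))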
In such a case, you need only a one-character buffer: ch.
Other regular expressions may need bigger buffers.
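
To make the stream idea concrete, here is a rough sketch of a byte
stream over a file that touches the disk only one block at a time.
The names `my-make-file-byte-stream' and `my-scan-a-then-z', the
block size, and the convention of passing a non-nil argument to ask
how many bytes have been read are all assumptions made for this
illustration. It relies on lexical binding for the closure and yields
raw bytes, not decoded characters.

```elisp
;;; -*- lexical-binding: t -*-

(defun my-make-file-byte-stream (file &optional block-size)
  "Return a closure yielding successive bytes of FILE, or nil at EOF.
Blocks of BLOCK-SIZE bytes are read only when needed.  Called with a
non-nil argument, the closure instead returns the number of bytes
read from disk so far.  Hypothetical helper for illustration."
  (let ((block-size (or block-size 4096))
        (size (file-attribute-size (file-attributes file)))
        (offset 0)       ; file offset of the next block to read
        (block "")       ; current block, as a unibyte string
        (index 0))       ; position of the next byte within BLOCK
    (lambda (&optional report)
      (if report
          offset
        (when (and (= index (length block)) (< offset size))
          ;; Current block exhausted: fetch the next one.
          (with-temp-buffer
            (set-buffer-multibyte nil)
            (let ((n (cadr (insert-file-contents-literally
                            file nil offset
                            (min size (+ offset block-size))))))
              (setq block (buffer-string)
                    index 0
                    ;; If nothing could be read, treat it as EOF.
                    offset (if (zerop n) size (+ offset n))))))
        (when (< index (length block))
          (prog1 (aref block index)
            (setq index (1+ index))))))))

(defun my-scan-a-then-z (next-byte)
  "Return `found-match' if NEXT-BYTE yields an ?a and later a ?z."
  (let (ch)
    ;; Skip until an ?a or end of file.
    (while (and (setq ch (funcall next-byte)) (/= ch ?a)))
    (when ch
      ;; Skip until a ?z or end of file.
      (while (and (setq ch (funcall next-byte)) (/= ch ?z)))
      (and ch 'found-match))))
```

With a stream made by (my-make-file-byte-stream "/some/file" 4096),
calling (my-scan-a-then-z stream) stops pulling blocks as soon as the
'z' is seen, and (funcall stream t) then reports how much of the file
was actually read.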
--
__Pascal Bourguignon__
* Re: Stream-based Scanning of File-Buffers
From: Ted Zlatanov @ 2009-09-25 20:07 UTC (permalink / raw)
To: help-gnu-emacs
On Fri, 25 Sep 2009 14:55:34 +0200 Juanma Barranquero <lekktu@gmail.com> wrote:
JB> On Fri, Sep 25, 2009 at 14:12, Nordlöw <per.nordlow@gmail.com> wrote:
>> Is there a way to perform string/regexp scanning (using search-forward
>> or re-search-forward) in a file buffer whose contents are loaded only
>> when needed, much like file streams in C, so that I physically read
>> only the blocks I actually scan and can skip the rest of the file as
>> soon as I get a hit?
JB> You can use `insert-file-contents' to implement something like that;
JB> presumably, the only tricky part is taking care of matches across
JB> block boundaries.
Unicode character boundaries are also a little nasty, since the offsets
for insert-file-contents are always byte-based. A good stream layer
would abstract all of that away, but I think it has to be at least
partly implemented at the C level to be efficient.
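
As a small illustration of the boundary problem, and assuming the
file is UTF-8 (other encodings need other rules), a byte offset can
be moved back to the start of the character it falls in by skipping
continuation bytes, which all have the bit pattern 10xxxxxx. The
function names below are hypothetical.

```elisp
(defun my-utf8-continuation-byte-p (byte)
  "Return non-nil if BYTE is a UTF-8 continuation byte (binary 10xxxxxx)."
  (= (logand byte #xC0) #x80))

(defun my-utf8-align-start (bytes offset)
  "Return the largest offset <= OFFSET in BYTES that starts a character.
BYTES is a unibyte string assumed to hold valid UTF-8."
  (while (and (> offset 0)
              (< offset (length bytes))
              (my-utf8-continuation-byte-p (aref bytes offset)))
    (setq offset (1- offset)))
  offset)

;; Example: in the UTF-8 encoding of "Nordl\u00f6w" the o-umlaut takes
;; two bytes, at offsets 5 and 6.  A block boundary at offset 6 would
;; split it, so it is moved back to 5.
(my-utf8-align-start (encode-coding-string "Nordl\u00f6w" 'utf-8) 6)
```

A block reader could apply this to the tail of each block and carry
the incomplete trailing bytes over to the next read before decoding.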
Ted