From: Thierry Volpiatto
Newsgroups: gmane.emacs.devel
Subject: Re: map-file-lines
Date: Tue, 03 Feb 2009 11:45:50 +0100
Message-ID: <874ozb4x41.fsf@tux.homenetwork>
References: <86wsc87o3c.fsf@lifelogs.com> <86k5887eg6.fsf@lifelogs.com>
To: emacs-devel@gnu.org

Hi Ted!

Ted Zlatanov writes:

> On Mon, 02 Feb 2009 11:20:07 -0600 Ted Zlatanov wrote:
>
> TZ> Emacs Lisp lacks a good way to iterate over all the lines of a file,
> TZ> especially for a large file.  The following code tries to provide a
> TZ> solution, concentrating on reading a block of data in one shot and then
> TZ> processing it line by line.  It may be more efficient to write this in
> TZ> C.  Also, it does not deal with cases where the first line read is
> TZ> bigger than the buffer size, and may have other bugs, but it works for
> TZ> me so I thought I'd post it for comments and criticism.
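
For comparison, here is a minimal sketch of the block-then-lines idea
that also copes with a line longer than the buffer, by carrying the
unterminated tail of each block over to the next one.  It is only an
illustration, not the code posted in this thread and not what
traverselisp does: the name `my-map-file-lines-chunked' is made up, and
it assumes the file is UTF-8 with Unix line endings.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-map-file-lines-chunked (file func &optional bufsize)
  "Call FUNC with (LINE LINENUM) for each line of FILE.
FILE is read in blocks of BUFSIZE bytes (128K by default) and is
assumed to be UTF-8 with Unix line endings.  Stop early when FUNC
returns nil.  Return the number of lines passed to FUNC."
  (let ((bufsize (or bufsize (* 128 1024)))
        (size (nth 7 (file-attributes file))) ; file size in bytes
        (filepos 0)
        (linenum 0)
        (carry "")      ; bytes of a line not yet terminated by a newline
        (continue t))
    (with-temp-buffer
      ;; Work on raw bytes so a block boundary may fall inside a
      ;; multibyte character without harm; decode whole lines only.
      (set-buffer-multibyte nil)
      (while (and continue (< filepos size))
        (erase-buffer)
        (let ((end (min size (+ filepos bufsize))))
          (insert-file-contents-literally file nil filepos end)
          (setq filepos end))
        (goto-char (point-min))
        (let ((start (point)))
          ;; Hand every newline-terminated line of this block to FUNC.
          (while (and continue (search-forward "\n" nil t))
            (let ((line (concat carry
                                (buffer-substring start (1- (point))))))
              (setq carry ""
                    start (point)
                    continue (funcall func
                                      (decode-coding-string line 'utf-8-unix)
                                      linenum)
                    linenum (1+ linenum))))
          ;; Keep the unterminated tail for the next block.
          (when continue
            (setq carry (concat carry
                                (buffer-substring start (point-max)))))))
      ;; A last line without a trailing newline.
      (when (and continue (> (length carry) 0))
        (funcall func (decode-coding-string carry 'utf-8-unix) linenum)
        (setq linenum (1+ linenum))))
    linenum))

;; Usage, same calling convention as the examples in this thread:
;; (my-map-file-lines-chunked "/tmp/test"
;;   (lambda (line num) (message "%d: %s" num line) t))
```

Note that FUNC has to return non-nil to keep going, which is why the
lambda above ends with t.  Decoding each line separately is a design
choice that keeps the block size a pure byte count.
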
>
> Updated:
>
> - line count 0-based now, logic is cleaner
> - buffer size 128K by default
> - accept start line and count
> - abort when the lambda func returns nil
> - renamed endline to line-end for clarity
>
> Thanks
> Ted

Could you try `tve-flines-iterator'?  It works a little like Python
iterators do.  I plan to use it in future versions of traverselisp if
it is faster than the current code (traverselisp does not use it at the
moment).

,----[ C-h f tve-flines-iterator RET ]
| tve-flines-iterator is a Lisp function in `traverselisp.el'.
|
| (tve-flines-iterator file &optional nlines startpos bufsize)
|
| Return an iterator on `nlines' lines of file.
| `startpos' and `bufsize' are the byte options to give to
| `insert-file-contents'.
|
| [back]
|
| ===*===*===*===*===*===*===*===*===*===*===
| Example:
| ,----
| | ;; create an elisp-iterator object that
| | ;; record the first 1024 bytes of my .emacs
| | (setq A (tve-flines-iterator "~/.emacs.el" nil 0 1024))
| |
| | ;; eval as many times as needed or launch it in a loop
| | (tve-next A)
| `----
`----

You can get it with hg here:

hg clone http://freehg.org/u/thiedlecques/traverselisp/

> (defun map-file-lines (file func &optional startline count bufsize)
>   (let ((filepos 0)
>         (linenum 0)
>         (bufsize (or bufsize (* 128 1024))))
>     (with-temp-buffer
>       (while
>           (let*
>               ((inserted (insert-file-contents
>                           file nil
>                           filepos (+ filepos bufsize)
>                           t))
>                (numlines (count-lines (point-min) (point-max)))
>                (read (nth 1 inserted))
>                (done (< 1 read))
>                result line-end)
>             (dotimes (n (count-lines (point-min) (point-max)))
>               (goto-char (point-min))
>               (setq line-end (line-end-position)
>                     result (if (and startline (< linenum startline))
>                                ()
>                              (if (and
>                                   count
>                                   (>= (- linenum startline) count))
>                                  (return)
>                                (funcall func
>                                         (buffer-substring
>                                          (line-beginning-position)
>                                          line-end)
>                                         linenum)))
>                     done (and done result))
>               (incf filepos line-end)
>               (forward-line)
>               (incf linenum))
>             done)))
>     linenum))
>
> ;;(map-file-lines "/tmp/test" (lambda (line num) (message "%d: %s" num line)))
> ;;(map-file-lines "/tmp/test" (lambda (line num) (message "%d: %s" num line)) 100)
> ;;(map-file-lines "/tmp/test" (lambda (line num) (message "%d: %s" num line)) 100 10)
>

-- 
A + Thierry Volpiatto
Location: Saint-Cyr-Sur-Mer - France