From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jambunathan K
Newsgroups: gmane.emacs.help
Subject: Re: Handling large files with emacs lisp?
Date: Tue, 04 Jun 2013 20:16:51 +0530
Message-ID: <87d2s26ihw.fsf@gmail.com>
To: Klaus-Dieter Bauer
Cc: help-gnu-emacs@gnu.org
In-Reply-To: (Klaus-Dieter Bauer's message of "Tue, 4 Jun 2013 14:52:55 +0200")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux)

Maybe you can steal some stuff from here.

    http://elpa.gnu.org/packages/vlf.html

It is a GNU ELPA package that you can install with M-x list-packages RET.

Klaus-Dieter Bauer writes:

> Hello!
>
> Is there a method in emacs lisp to handle large files (hundreds of MB)
> efficiently? I am looking specifically for a function that allows
> processing file contents either sequentially or (better) with random
> access.
>
> Looking through the code of `find-file' I found that
> `insert-file-contents' and `insert-file-contents-literally' seem to be
> pretty much the most low-level functions available to emacs-lisp. When
> files go towards GB size however, inserting file contents is
> undesirable even assuming 32-bit emacs were able to handle such large
> buffers.
>
> Using the BEG and END parameters of `insert-file-contents' however has
> a linear time-dependence on BEG. So implementing buffered file
> processing for large files by keeping only parts of the file in a
> temporary buffer doesn't seem feasible either.
>
> I'd also be interested in why there is this linear time dependence. Is
> this a limitation of how fseek works or of how `insert-file-contents'
> is implemented? I've read[1] that fseek "just updates pointers", so
> random reads in a large file, especially on an SSD, should be
> constant-time, but I couldn't find further verification.
>
> kind regards, Klaus
>
> PS: I'm well aware that I'm asking for something that likely wasn't
> within the design goals of emacs lisp. It is interesting to push
> the limits though ;)
>
> ------------------------------------------------------------
>
> [1] https://groups.google.com/d/msg/comp.unix.aix/AXInTbcjsKo/qt-XnL12upgJ
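
For reference, the buffered approach described in the quoted message (keeping only one part of the file in a temporary buffer at a time) could look roughly like the sketch below. It is not taken from vlf; `my-map-file-chunks' and its CHUNK-SIZE argument are hypothetical names made up for illustration. It only shows the access pattern using the BEG and END arguments of `insert-file-contents-literally', and does nothing about the linear dependence on BEG reported above. Chunk boundaries fall at arbitrary byte offsets, so a line or a multibyte character may be split across two chunks.

```elisp
;; Sketch only: `my-map-file-chunks' is a hypothetical helper, not part
;; of Emacs or of the vlf package.

(defun my-map-file-chunks (file function &optional chunk-size)
  "Call FUNCTION once for each chunk of FILE.
While FUNCTION runs, the current buffer holds only that chunk as raw
bytes.  FUNCTION receives one argument, the byte offset in FILE at
which the chunk starts.  CHUNK-SIZE is in bytes and defaults to 1 MiB."
  (let* ((chunk-size (or chunk-size (* 1024 1024)))
         ;; Element 7 of `file-attributes' is the file size in bytes.
         (size (nth 7 (file-attributes file)))
         (beg 0))
    (with-temp-buffer
      ;; Unibyte buffer: one character per byte of the file.
      (set-buffer-multibyte nil)
      (while (< beg size)
        (erase-buffer)
        ;; Read only the bytes in [BEG, END) into the buffer.
        (insert-file-contents-literally
         file nil beg (min size (+ beg chunk-size)))
        (funcall function beg)
        (setq beg (+ beg chunk-size))))))
```

For example, (my-map-file-chunks "/some/big/file" (lambda (offset) (message "chunk at %d: %d bytes" offset (buffer-size)))) would report each chunk in turn; the path is a placeholder.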