From: Nordlöw
Newsgroups: gmane.emacs.help
Subject: Re: Efficiently checking the initial contents of a file
Date: Mon, 19 May 2008 00:20:26 -0700 (PDT)
Message-ID: <1d68e58f-5183-49a5-a588-aa7826dee7c0@l64g2000hse.googlegroups.com>
References: <5f8b772a-e2ed-468c-89b3-2d9e40ed132b@m3g2000hsc.googlegroups.com> <36d03408-0373-4e0e-8d59-ea540ce55119@p25g2000hsf.googlegroups.com>
To: help-gnu-emacs@gnu.org

On 16 May, 16:03, "Juanma Barranquero" wrote:
> On Fri, May 16, 2008 at 2:52 PM, Nordlöw wrote:
>
> > (defun file-begin-p (filename beg)
> >   "Determine if FILENAME begins with BEG."
> >   (interactive "fFile to investigate: ")
> >   (if (and (file-exists-p filename)
> >            (file-readable-p filename))
> >       (with-temp-buffer
> >         (let ((width (string-width beg)))
> >           (insert-file-contents-literally filename nil 0 width)
> >           (looking-at beg)
> >           ))))
>
> A few additional comments:
>
> - BEG can be a regular expression, in which case the length of it can
> be a red herring; for example (file-begin-p "[ABC]\\{20\\}") will
> always return nil. Perhaps you could do
>
> (defun file-begin-p (filename beg &optional len)
> ...
> (let ((width (or len (string-width beg))))
> ...
>
> so you can pass a length if needed.
>
> - If you don't want to pass a regexp, it is advisable to remember
> using regexp-quote, otherwise (file-begin-p "A*") is always going to
> return t.
>
> - Mixing `insert-file-contents-literally' and `string-width' does not
> seem like a good idea. Better use `string-bytes', or, if BEG can
> contain non-ASCII chars, use `insert-file-contents' and `length'. I'd
> recommend that second route.
>
> Hope this helps,
>
> Juanma

Hey again!

As I see it, the most general and efficient solution to this problem
would be to make the looking-at logic stream-based, so that it never
requires the whole file to be read into memory, regardless of the
length of BEG. Is there some way of opening a file into a buffer
without reading its whole contents into memory before they are
actually used by, in our case, looking-at?

A less optimal solution could make use of a function, say
regexp-max-match-length (REGEXP), that determines the length of the
longest string a regexp can match, possibly infinity. The return value
of this function could then be used as the length argument to
insert-file-contents-literally.

By the way, I am surprised that the function I am looking for does not
already exist in GNU Emacs. Could it be because it is difficult to
design a solution that satisfies *all* of the needs given above?

/Nordlöw
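
One way to approximate the stream-based idea with the existing
primitives is to read a growing prefix of the file and retry looking-at
after each read. The following is only a rough sketch under stated
assumptions, not a tested library function: the name my-file-begin-p,
the LIMIT argument, the 256-byte starting chunk and the 65536-byte
default cap are all made up for illustration. It also assumes that
REGEXP is plain ASCII and has no end-of-buffer anchors such as \\',
because a match against a prefix is then also a match against the
whole file.

```elisp
;; Hypothetical helper, not part of GNU Emacs.
(defun my-file-begin-p (filename regexp &optional limit)
  "Return non-nil if the beginning of FILENAME matches REGEXP.
Read a prefix of the file that doubles in size on each attempt,
stopping when REGEXP matches, when the whole file has been read,
or when LIMIT bytes (default 65536) have been read."
  (when (and (file-regular-p filename)
             (file-readable-p filename))
    (let* ((size (nth 7 (file-attributes filename))) ; file size in bytes
           (limit (min size (or limit 65536)))       ; never read past EOF
           (chunk 256)                               ; arbitrary first guess
           (done nil)
           (found nil))
      (with-temp-buffer
        (while (not done)
          (let ((end (min chunk limit)))
            ;; Re-read bytes 0..END; only this prefix is ever in memory.
            (erase-buffer)
            (insert-file-contents-literally filename nil 0 end)
            (goto-char (point-min))
            (setq found (looking-at regexp)
                  done (or found (>= end limit))
                  chunk (* 2 chunk)))))
      found)))
```

With this, (my-file-begin-p "some-file" "[ABC]\\{20\\}") no longer
depends on the length of the regexp string, and a literal prefix can be
passed as (regexp-quote "A*"). The cost is that a file which does not
match is read up to LIMIT bytes, so this trades the hypothetical
regexp-max-match-length for a fixed cap.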