From: Sebastian Tennant
Newsgroups: gmane.lisp.guile.user
Subject: Re: Uploading Word documents, PDFs, PNG files etc
Date: Wed, 13 May 2009 19:09:35 +0000
Message-ID: <7i0kzuog.fsf@vps203.linuxvps.org>
References: <87vdo7au56.fsf@ambire.localdomain> <87vdo5qc52.fsf@gnu.org>
To: guile-user@gnu.org

Quoth ludo@gnu.org (Ludovic Courtès):
> Hello,
>
> Sebastian Tennant writes:
>
>> (info "(guile-1.8)Regexp Functions")
>>
>> "Zero bytes (`#\nul') cannot be used in regex patterns or input
>> strings, since the underlying C functions treat that as the end of
>> string.  If there's a zero byte an error is thrown."
>
> I think it makes sense to explicitly restrict regexps to actual text as
> opposed to binary data.

On second thoughts, I'm not so sure... shouldn't users have the option
(either at compile time, or better still, in userland)?

Restricting regexps to actual text is fine... until you need to grep
binary data, or, as in this case, a combination of text and binary data.
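
For the one simple case, locating the blank line that separates a part's
headers from its payload, a literal substring search would avoid
regexp-exec entirely. What follows is only a minimal sketch, assuming the
separator is CR LF CR LF and that the whole part is held in a single
Guile string; `split-part' and `crlf-crlf' are hypothetical names, not
taken from cgi.scm. It relies on string-contains (SRFI-13), which
compares characters literally and so should be indifferent to zero bytes:

```scheme
;; Sketch only: split one raw multipart part into its text headers and
;; its (possibly binary) body without calling regexp-exec.
;; `split-part' and `crlf-crlf' are hypothetical names, not from cgi.scm.
(use-modules (srfi srfi-13))   ; string-contains (already in core on 1.8)

;; Assumed separator: the empty line that ends the part headers.
(define crlf-crlf (string #\return #\newline #\return #\newline))

(define (split-part raw)
  ;; Return (HEADERS . BODY), or #f if RAW contains no separator.
  ;; The search is a plain character comparison, so NUL bytes in the
  ;; body are treated like any other character.
  (let ((idx (string-contains raw crlf-crlf)))
    (and idx
         (cons (substring raw 0 idx)
               (substring raw (+ idx (string-length crlf-crlf)))))))

;; Usage: a header block followed by a body containing a zero byte.
(define example-part
  (string-append "Content-Type: image/png"
                 crlf-crlf
                 (string #\a #\nul #\b)))

(define example-split (split-part example-part))
;; (car example-split) holds the header text,
;; (cdr example-split) holds the three-character body, NUL included.
```

This covers only the literal-delimiter case; it does nothing for the
calls that need real pattern matching.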

I thought it was going to be trivial to replace the call to regexp-exec
in cgi.scm that extracts the uploaded (possibly binary) file, because
the pattern identifying the beginning of the file in the raw data string
is simple ("\r\n\r\n"). But I now realise that many calls to regexp-exec
in cgi.scm will need to be replaced, some with complex matching
patterns, and I can't see how that can be done without using regexps;
hence my changed opinion.

The only thing I can think of doing now is replacing calls to
regexp-exec with system calls to grep (which can accept binary data),
which is clearly sub-optimal and non-trivial.

Does anyone have any other ideas? How easy would it be to build a Guile
with a regex feature that doesn't impose this restriction on binary
data?

Seb
-- 
Emacs' AlsaPlayer - Music Without Jolts
Lightweight, full-featured and mindful of your idyllic happiness.
http://home.gna.org/eap