From: Linas Vepstas
Newsgroups: gmane.lisp.guile.user
Subject: Re: Uploading Word documents, PDFs, PNG files etc
Date: Wed, 13 May 2009 14:23:21 -0500
Message-ID: <3ae3aa420905131223i3c7b83b0tf5a6ec9b200a8704@mail.gmail.com>
In-Reply-To: <7i0kzuog.fsf@vps203.linuxvps.org>
Reply-To: linasvepstas@gmail.com
To: Sebastian Tennant
Cc: guile-user@gnu.org

2009/5/13 Sebastian Tennant :
> Restricting regexps to actual text is fine... until you need to grep
> binary data, or, as in this case, a combination of text and binary data.

Last I looked, the standard C-library POSIX/GNU/Perl/Java regex implementations only worked on strings, not on binary data. You'll have trouble finding a binary-data regex implementation in C (or any other language).

> in cgi.scm that extracted the uploaded (possibly binary) file, because
> the pattern identifying the beginning of the file in the raw data string
> is simple ("\n\r\n\r") -

No, this sounds somehow broken. If I remember correctly, binary MIME parts should have a Content-Length header so you can skip over them. If Content-Length is absent, then the part should be ASCII-encoded (e.g. base64).

Yeah, grepping large blocks of ASCII sucks, which is why Content-Length should be used.

-- linas
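
For illustration only, here is a rough Python sketch of the approach described above: read the part's headers, then take exactly Content-Length bytes instead of searching the payload for a pattern. It assumes the part really does carry a Content-Length header (many clients may not send one per part), and the names `split_part` and `extract_payload` are made up for this example; they are not from cgi.scm or any library.

```python
import base64


def split_part(part: bytes):
    """Split one MIME part into (headers dict, everything after the blank line)."""
    head, sep, rest = part.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("no blank line between headers and body")
    headers = {}
    for line in head.split(b"\r\n"):
        name, _, value = line.partition(b":")
        # Header names are case-insensitive, so normalise them to lower case.
        headers[name.strip().lower().decode("ascii")] = value.strip().decode("ascii")
    return headers, rest


def extract_payload(part: bytes) -> bytes:
    """Return the payload of a part (hypothetical helper, not a real API)."""
    headers, rest = split_part(part)
    if "content-length" in headers:
        # Assumption: the sender supplied a per-part length, so the binary
        # payload can be sliced out without scanning its contents.
        return rest[: int(headers["content-length"])]
    if headers.get("content-transfer-encoding", "").lower() == "base64":
        # No length given: the payload is expected to be ASCII-encoded.
        return base64.b64decode(rest)
    return rest
```

The point of slicing by length is that the payload may itself contain the header/body separator, so a pattern search inside it can stop in the wrong place.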