From: Andy Wingo
To: Brent
Cc: guile-user@gnu.org
Newsgroups: gmane.lisp.guile.user
Subject: Re: Path: (web uri) - split-and-decode-uri-path must preserve plus characters
Date: Mon, 20 Jun 2016 14:49:28 +0200
Message-ID: <87inx4hvon.fsf@pobox.com>
In-Reply-To: <53C75E1E.7020805@zamail.co.za>

Applied, in a slightly reworked form.  This will be in 2.1.4.  Only took
two years :/

Thanks for the patch!

Andy

On Thu 17 Jul 2014 07:24, Brent writes:

> Hi all,
>
> Attached is a patch to correct the behaviour of
> split-and-decode-uri-path in uri.scm from guile 2.0.9.
>
> Fault
> -------
> The faulty behaviour is:
>
> (use-modules (web uri))
> (split-and-decode-uri-path "xxx/abc+def/yyy") → ("xxx" "abc def" "yyy")
>
> As can be seen, the plus has been erroneously converted to a space.
>
> The correct behaviour is:
>
> (split-and-decode-uri-path "xxx/abc+def/yyy") → ("xxx" "abc+def" "yyy")
>
> Analysis
> ------------
> The fault is actually in the uri-decode function invoked by
> split-and-decode-uri-path:
> it has special logic to check for a #\+ character and convert it to a space.
>
> The reason for this is that uri-decode is also used to decode query
> strings for application/x-www-form-urlencoded requests, where spaces are
> encoded by the client on submission to plus characters.
>
> (see http://www.w3.org/TR/html401/interact/forms.html#h-17.13.4.1)
>
> Fix
> ---
> The fix is to extend uri-decode with a new keyword argument #:form?
> that defaults to #t. This provides backward compatibility.
>
> split-and-decode-uri-path is modified to call uri-decode with #:form?
> set to #f.
>
> Motivation
> ---------------
> This patch is motivated by the need to embed ISO 8601 timestamps with
> time zones into URLs:
>
> e.g. GET http://foo.org/aaa/bbb/20140717T072233+0200/xxx/yyy
>
>
> Please consider this for inclusion.
>
>
> Thanks,
>
> Brent
>
>
> --- guile-2.0.9/module/web/uri.scm.orig 2013-03-18 23:30:13.000000000 +0200
> +++ guile-2.0.9/module/web/uri.scm 2014-07-15 16:08:47.677521012 +0200
> @@ -304,7 +304,7 @@
>  (define hex-chars
>    (string->char-set "0123456789abcdefABCDEF"))
>
> -(define* (uri-decode str #:key (encoding "utf-8"))
> +(define* (uri-decode str #:key (encoding "utf-8") (form? #t))
>    "Percent-decode the given STR, according to ENCODING,
>  which should be the name of a character encoding.
>
> @@ -330,7 +330,7 @@
>           (if (< i len)
>               (let ((ch (string-ref str i)))
>                 (cond
> -                ((eqv? ch #\+)
> +                ((and (eqv? ch #\+) form?)
>                   (put-u8 port (char->integer #\space))
>                   (lp (1+ i)))
>                  ((and (< (+ i 2) len) (eqv? ch #\%)
> @@ -412,7 +412,8 @@
>  For example, ‘\"/foo/bar%20baz/\"’ decodes to the two-element list,
>  ‘(\"foo\" \"bar baz\")’."
>    (filter (lambda (x) (not (string-null? x)))
> -          (map uri-decode (string-split path #\/))))
> +          (map (lambda (s) (uri-decode s #:form? #f))
> +               (string-split path #\/))))
>
>  (define (encode-and-join-uri-path parts)
>    "URI-encode each element of PARTS, which should be a list of