unofficial mirror of bug-guile@gnu.org 
 help / color / mirror / Atom feed
* bug#12827: [2.0.6] web client: fails to parse 404 header
@ 2012-11-07 20:40 Ludovic Courtès
  2012-11-08  5:52 ` Daniel Hartwig
  2013-02-23  8:11 ` bug#12827: [PATCH] Tweak web modules, support relative URIs (was: bug#12827: [2.0.6] web client: fails to parse 404 header) Daniel Hartwig
  0 siblings, 2 replies; 18+ messages in thread
From: Ludovic Courtès @ 2012-11-07 20:40 UTC (permalink / raw)
  To: 12827

scheme@(guile-user)> (use-modules (web client) (web uri))
scheme@(guile-user)> (http-get (string->uri "http://www.gnu.org/does-not-exist"))
web/http.scm:191:11: In procedure read-header:
web/http.scm:191:11: Bad uri header component: gnu-404.html

Entering a new prompt.  Type `,bt' for a backtrace or `,q' to continue.
scheme@(guile-user) [1]> ,bt
In web/client.scm:
    127:4  3 (http-get #<<uri> scheme: http userinfo: #f host: "www.gnu.org" port: #f path: "/does-not-exist" query: #f fragment: #f> #:port #<input-output: socket 9> #:version (1 . 1) #:keep-alive? #f #:extra-headers () # #)
In web/response.scm:
    188:6  2 (read-response #<input-output: socket 9>)
In web/http.scm:
   225:33  1 (lp ((server . "Apache/2.2.14") (date . #<date nanosecond: 0 second: 12 minute: 36 hour: 20 day: 7 month: 11 year: 2012 zone-offset: 0>)))
   191:11  0 (read-header #<input-output: socket 9>)
scheme@(guile-user) [1]> ,locals
  Local variables:
  $5 = port = #<input-output: socket 9>
  $6 = line = "Content-Location: gnu-404.html"
  $7 = delim = 16
  $8 = sym = content-location

Ludo’.





^ permalink raw reply	[flat|nested] 18+ messages in thread

* bug#12827: [2.0.6] web client: fails to parse 404 header
  2012-11-07 20:40 bug#12827: [2.0.6] web client: fails to parse 404 header Ludovic Courtès
@ 2012-11-08  5:52 ` Daniel Hartwig
  2012-11-08 20:10   ` Ludovic Courtès
  2013-02-23  8:11 ` bug#12827: [PATCH] Tweak web modules, support relative URIs (was: bug#12827: [2.0.6] web client: fails to parse 404 header) Daniel Hartwig
  1 sibling, 1 reply; 18+ messages in thread
From: Daniel Hartwig @ 2012-11-08  5:52 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 12827

[-- Attachment #1: Type: text/plain, Size: 1333 bytes --]

On 8 November 2012 04:40, Ludovic Courtès <ludo@gnu.org> wrote:
> scheme@(guile-user)> (use-modules (web client) (web uri))
> scheme@(guile-user)> (http-get (string->uri "http://www.gnu.org/does-not-exist"))
> web/http.scm:191:11: In procedure read-header:
> web/http.scm:191:11: Bad uri header component: gnu-404.html

Some headers are supposed to support these “URI-references”.

I have just updated a patch I worked on a while back that adds support
for such things, including “partial-refs”.  It is not yet ready for
inclusion, some points to consider:
* DONE (web uri) support for relative-refs
* DONE (web http) support for relative-refs in headers
* TODO (resolve-ref uri uri-reference) → absolute-uri
* TODO documentation
* TODO maybe use more scheme-ish names instead of RFC 3986
This latest RFC makes a clear distinction between an actual URI
and a reference to such.  I thought it best to reflect that
distinction, but maybe it does pollute the namespace a bit:
- absolute-uri? → uri-absolute?
- relative-ref? → uri-relative?
* TODO build-uri validation is broken/less strict and will now pass
relative-refs, so maybe introduce build-uri-reference instead

Anyway, this may provide a useful base to work from, if not something
suitable for inclusion almost immediately.

Regards

[-- Attachment #2: 0001-uri-reference-support.patch --]
[-- Type: application/octet-stream, Size: 11426 bytes --]

From fc7f4dbe9fbd855af4d47990de31bcfccb55dfe1 Mon Sep 17 00:00:00 2001
From: Daniel Hartwig <mandyke@gmail.com>
Date: Thu, 8 Nov 2012 13:11:52 +0800
Subject: [PATCH] uri-reference support

* module/web/uri.scm (build-uri, validate-uri, string->uri,
  uri->string): Allow uri-scheme to be #f.  Any such object is a
  relative-ref.
  (absolute-uri?, relative-ref?, uri-reference?): New predicates
  to distinguish between kinds of URI reference.
* module/web/http.scm (declare-uri-header!): Use absolute-uri? to
  validate these headers.
  (declare-uri-reference-header!): New type of header accepting any
  URI-reference.
  ("Content-Location", "Referer"): Change to URI-reference headers
  to support relative references.
* test-suite/tests/web-uri.test, test-suite/tests/web-http.test:
  Add tests for the above.
---
 module/web/http.scm            |   13 ++++-
 module/web/uri.scm             |   46 +++++++++++----
 test-suite/tests/web-http.test |    4 ++
 test-suite/tests/web-uri.test  |  125 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 174 insertions(+), 14 deletions(-)

diff --git a/module/web/http.scm b/module/web/http.scm
index cc5dd5a..1a54cc6 100644
--- a/module/web/http.scm
+++ b/module/web/http.scm
@@ -1185,7 +1185,14 @@ treated specially, and is just returned as a plain string."
 (define (declare-uri-header! name)
   (declare-header! name
     (lambda (str) (or (string->uri str) (bad-header-component 'uri str)))
-    uri?
+    absolute-uri?
+    write-uri))
+
+;; emacs: (put 'declare-uri-reference-header! 'scheme-indent-function 1)
+(define (declare-uri-reference-header! name)
+  (declare-header! name
+    (lambda (str) (or (string->uri str) (bad-header-component 'uri-reference str)))
+    uri-reference?
     write-uri))
 
 ;; emacs: (put 'declare-quality-list-header! 'scheme-indent-function 1)
@@ -1440,7 +1447,7 @@ treated specially, and is just returned as a plain string."
 
 ;; Content-Location = ( absoluteURI | relativeURI )
 ;;
-(declare-uri-header! "Content-Location")
+(declare-uri-reference-header! "Content-Location")
 
 ;; Content-MD5 = <base64 of 128 bit MD5 digest as per RFC 1864>
 ;;
@@ -1729,7 +1736,7 @@ treated specially, and is just returned as a plain string."
 
 ;; Referer = ( absoluteURI | relativeURI )
 ;;
-(declare-uri-header! "Referer")
+(declare-uri-reference-header! "Referer")
 
 ;; TE = #( t-codings )
 ;; t-codings = "trailers" | ( transfer-extension [ accept-params ] )
diff --git a/module/web/uri.scm b/module/web/uri.scm
index 78614a5..da55902 100644
--- a/module/web/uri.scm
+++ b/module/web/uri.scm
@@ -34,6 +34,7 @@
   #:export (uri?
             uri-scheme uri-userinfo uri-host uri-port
             uri-path uri-query uri-fragment
+            absolute-uri? relative-ref? uri-reference?
 
             build-uri
             declare-default-port!
@@ -53,6 +54,17 @@
   (query uri-query)
   (fragment uri-fragment))
 
+(define (absolute-uri? uri)
+  (and (uri? uri) (uri-scheme uri) #t))
+
+(define (relative-ref? uri)
+  (and (uri? uri) (not (uri-scheme uri)) #t))
+
+(define (uri-reference? uri)
+  (and (or (absolute-uri? uri)
+           (relative-ref? uri))
+       #t))
+
 (define (uri-error message . args)
   (throw 'uri-error message args))
 
@@ -61,7 +73,7 @@
 
 (define (validate-uri scheme userinfo host port path query fragment)
   (cond
-   ((not (symbol? scheme))
+   ((and scheme (not (symbol? scheme)))
     (uri-error "Expected a symbol for the URI scheme: ~s" scheme))
    ((and (or userinfo port) (not host))
     (uri-error "Expected a host, given userinfo or port"))
@@ -150,6 +162,17 @@ consistency checks to make sure that the constructed URI is valid."
 ;;;               / path-absolute
 ;;;               / path-rootless
 ;;;               / path-empty
+;;;
+;;; RFC 3986, #4.
+;;;
+;;;   URI-reference = URI / relative-ref
+;;;
+;;;   relative-ref  = relative-part [ "?" query ] [ "#" fragment ]
+;;;
+;;;   relative-part = "//" authority path-abempty
+;;;                 / path-absolute
+;;;                 / path-noscheme
+;;;                 / path-empty
 
 (define scheme-pat
   "[a-zA-Z][a-zA-Z0-9+.-]*")
@@ -162,7 +185,7 @@ consistency checks to make sure that the constructed URI is valid."
 (define fragment-pat
   ".*")
 (define uri-pat
-  (format #f "^(~a):(//~a)?(~a)(\\?(~a))?(#(~a))?$"
+  (format #f "^((~a):)?(//~a)?(~a)(\\?(~a))?(#(~a))?$"
           scheme-pat authority-pat path-pat query-pat fragment-pat))
 (define uri-regexp
   (make-regexp uri-pat))
@@ -172,12 +195,12 @@ consistency checks to make sure that the constructed URI is valid."
 could not be parsed."
   (% (let ((m (regexp-exec uri-regexp string)))
        (if (not m) (abort))
-       (let ((scheme (string->symbol
-                      (string-downcase (match:substring m 1))))
-             (authority (match:substring m 2))
-             (path (match:substring m 3))
-             (query (match:substring m 5))
-             (fragment (match:substring m 7)))
+       (let ((scheme (let ((s (match:substring m 2)))
+                       (and s (string->symbol (string-downcase s)))))
+             (authority (match:substring m 3))
+             (path (match:substring m 4))
+             (query (match:substring m 6))
+             (fragment (match:substring m 8)))
          (call-with-values
              (lambda ()
                (if authority
@@ -206,8 +229,7 @@ printed."
 
 (define (uri->string uri)
   "Serialize @var{uri} to a string."
-  (let* ((scheme-str (string-append
-                      (symbol->string (uri-scheme uri)) ":"))
+  (let* ((scheme (uri-scheme uri))
          (userinfo (uri-userinfo uri))
          (host (uri-host uri))
          (port (uri-port uri))
@@ -215,7 +237,9 @@ printed."
          (query (uri-query uri))
          (fragment (uri-fragment uri)))
     (string-append
-     scheme-str
+     (if scheme
+         (string-append (symbol->string scheme) ":")
+         "")
      (if host
          (string-append "//"
                         (if userinfo (string-append userinfo "@")
diff --git a/test-suite/tests/web-http.test b/test-suite/tests/web-http.test
index 97f5559..5d80fde 100644
--- a/test-suite/tests/web-http.test
+++ b/test-suite/tests/web-http.test
@@ -145,6 +145,8 @@
   (pass-if-parse content-length "010" 10)
   (pass-if-parse content-location "http://foo/"
                  (build-uri 'http #:host "foo" #:path "/"))
+  (pass-if-parse content-location "//foo"
+                 (build-uri #f #:host "foo"))
   (pass-if-parse content-range "bytes 10-20/*" '(bytes (10 . 20) *))
   (pass-if-parse content-range "bytes */*" '(bytes * *))
   (pass-if-parse content-range "bytes */30" '(bytes * 30))
@@ -208,6 +210,8 @@
   (pass-if-parse range "bytes=-20,-30" '(bytes (#f . 20) (#f . 30)))
   (pass-if-parse referer "http://foo/bar?baz"
                  (build-uri 'http #:host "foo" #:path "/bar" #:query "baz"))
+  (pass-if-parse referer "//foo/bar?baz"
+                 (build-uri #f #:host "foo" #:path "/bar" #:query "baz"))
   (pass-if-parse te "trailers" '((trailers)))
   (pass-if-parse te "trailers,foo" '((trailers) (foo)))
   (pass-if-parse user-agent "guile" "guile"))
diff --git a/test-suite/tests/web-uri.test b/test-suite/tests/web-uri.test
index 3f6e7e3..d4d44d6 100644
--- a/test-suite/tests/web-uri.test
+++ b/test-suite/tests/web-uri.test
@@ -78,6 +78,24 @@
            #:port 22
            #:path "/baz"))
 
+  (pass-if "foo"
+    (uri=? (build-uri #f #:path "foo")
+           #:path "foo"))
+
+  (pass-if "/foo"
+    (uri=? (build-uri #f #:path "/foo")
+           #:path "/foo"))
+
+  (pass-if "//foo/bar"
+    (uri=? (build-uri #f #:host "foo" #:path "/bar")
+           #:host "foo"
+           #:path "/bar"))
+
+  (pass-if "?foo"
+    (uri=? (build-uri #f #:query "foo")
+           #:path ""
+           #:query "foo"))
+
   (pass-if-uri-exception "non-symbol scheme"
                          "Expected.*symbol"
                          (build-uri "nonsym"))
@@ -123,6 +141,80 @@
                          "Expected.*host"
                          (build-uri 'http #:userinfo "foo")))
 
+(with-test-prefix "absolute-uri?"
+  (pass-if "ftp:"
+    (absolute-uri? (build-uri 'ftp)))
+
+  (pass-if "ftp:foo"
+    (absolute-uri? (build-uri 'ftp #:path "foo")))
+
+  (pass-if "ftp://foo/bar"
+    (absolute-uri? (build-uri 'ftp #:host "foo" #:path "/bar")))
+
+  (pass-if "ftp://foo@bar:22/baz"
+    (absolute-uri? (build-uri 'ftp #:userinfo "foo" #:host "bar" #:port 22 #:path "/baz")))
+
+  (expect-fail "foo"
+    (absolute-uri? (build-uri #f #:path "foo")))
+
+  (expect-fail "/foo"
+    (absolute-uri? (build-uri #f #:path "/foo")))
+
+  (expect-fail "//foo/bar"
+    (absolute-uri? (build-uri #f #:host "foo" #:path "/bar")))
+
+  (expect-fail "?foo"
+    (absolute-uri? (build-uri #f #:query "foo"))))
+
+(with-test-prefix "relative-ref?"
+  (expect-fail "ftp:"
+    (relative-ref? (build-uri 'ftp)))
+
+  (expect-fail "ftp:foo"
+    (relative-ref? (build-uri 'ftp #:path "foo")))
+
+  (expect-fail "ftp://foo/bar"
+    (relative-ref? (build-uri 'ftp #:host "foo" #:path "/bar")))
+
+  (expect-fail "ftp://foo@bar:22/baz"
+    (relative-ref? (build-uri 'ftp #:userinfo "foo" #:host "bar" #:port 22 #:path "/baz")))
+
+  (pass-if "foo"
+    (relative-ref? (build-uri #f #:path "foo")))
+
+  (pass-if "/foo"
+    (relative-ref? (build-uri #f #:path "/foo")))
+
+  (pass-if "//foo/bar"
+    (relative-ref? (build-uri #f #:host "foo" #:path "/bar")))
+
+  (pass-if "?foo"
+    (relative-ref? (build-uri #f #:query "foo"))))
+
+(with-test-prefix "uri-reference?"
+  (pass-if "ftp:"
+    (uri-reference? (build-uri 'ftp)))
+
+  (pass-if "ftp:foo"
+    (uri-reference? (build-uri 'ftp #:path "foo")))
+
+  (pass-if "ftp://foo/bar"
+    (uri-reference? (build-uri 'ftp #:host "foo" #:path "/bar")))
+
+  (pass-if "ftp://foo@bar:22/baz"
+    (uri-reference? (build-uri 'ftp #:userinfo "foo" #:host "bar" #:port 22 #:path "/baz")))
+
+  (pass-if "foo"
+    (uri-reference? (build-uri #f #:path "foo")))
+
+  (pass-if "/foo"
+    (uri-reference? (build-uri #f #:path "/foo")))
+
+  (pass-if "//foo/bar"
+    (uri-reference? (build-uri #f #:host "foo" #:path "/bar")))
+
+  (pass-if "?foo"
+    (uri-reference? (build-uri #f #:query "foo"))))
 
 (with-test-prefix "string->uri"
   (pass-if "ftp:"
@@ -149,6 +241,24 @@
            #:port 22
            #:path "/baz"))
 
+  (pass-if "foo"
+    (uri=? (string->uri "foo")
+           #:path "foo"))
+
+  (pass-if "/foo"
+    (uri=? (string->uri "/foo")
+           #:path "/foo"))
+
+  (pass-if "//foo/bar"
+    (uri=? (string->uri "//foo/bar")
+           #:host "foo"
+           #:path "/bar"))
+
+  (pass-if "?foo"
+    (uri=? (string->uri "?foo")
+           #:path ""
+           #:query "foo"))
+
   (pass-if "http://bad.host.1"
     (not (string->uri "http://bad.host.1")))
 
@@ -229,6 +339,21 @@
     (equal? "ftp://foo@bar:22/baz"
             (uri->string (string->uri "ftp://foo@bar:22/baz"))))
   
+  (pass-if "foo"
+    (equal? "foo"
+            (uri->string (string->uri "foo"))))
+
+  (pass-if "/foo"
+    (equal? "/foo" (uri->string (string->uri "/foo"))))
+
+  (pass-if "//foo/bar"
+    (equal? "//foo/bar"
+            (uri->string (string->uri "//foo/bar"))))
+
+  (pass-if "?foo"
+    (equal? "?foo"
+            (uri->string (string->uri "?foo"))))
+
   (when (memq 'socket *features*)
     (pass-if "http://192.0.2.1"
       (equal? "http://192.0.2.1"
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* bug#12827: [2.0.6] web client: fails to parse 404 header
  2012-11-08  5:52 ` Daniel Hartwig
@ 2012-11-08 20:10   ` Ludovic Courtès
  2012-11-09  0:39     ` Daniel Hartwig
  0 siblings, 1 reply; 18+ messages in thread
From: Ludovic Courtès @ 2012-11-08 20:10 UTC (permalink / raw)
  To: Daniel Hartwig; +Cc: 12827

Hi Daniel,

Thanks for the quick reply and patch!

Daniel Hartwig <mandyke@gmail.com> skribis:

> On 8 November 2012 04:40, Ludovic Courtès <ludo@gnu.org> wrote:
>> scheme@(guile-user)> (use-modules (web client) (web uri))
>> scheme@(guile-user)> (http-get (string->uri "http://www.gnu.org/does-not-exist"))
>> web/http.scm:191:11: In procedure read-header:
>> web/http.scm:191:11: Bad uri header component: gnu-404.html
>
> Some headers are supposed to support these “URI-references”.

Oh, OK.

> * TODO maybe use more scheme-ish names instead of RFC 3986
> This latest RFC makes a clear distinction between an actual URI
> and a reference to such.  I thought it best to reflect that
> distinction, but maybe it does pollute the namespace a bit:
> - absolute-uri? → uri-absolute?
> - relative-ref? → uri-relative?

No, I’d keep the names on the left.

Name space control is best left to ‘use-modules’ forms, anyway.

> * TODO build-uri validation is broken/less strict and will now pass
> relative-refs, so maybe introduce build-uri-reference instead

Yes.  Should uri-reference be a disjoint type, then?

> -(declare-uri-header! "Content-Location")
> +(declare-uri-reference-header! "Content-Location")

OK.

>  ;; Content-MD5 = <base64 of 128 bit MD5 digest as per RFC 1864>
>  ;;
> @@ -1729,7 +1736,7 @@ treated specially, and is just returned as a plain string."
>  
>  ;; Referer = ( absoluteURI | relativeURI )
>  ;;
> -(declare-uri-header! "Referer")
> +(declare-uri-reference-header! "Referer")

Should actually be “Referrer”, no?

> +(define (absolute-uri? uri)
> +  (and (uri? uri) (uri-scheme uri) #t))

Eventually, we’ll need docstrings, and updated documentation.

Thanks,
Ludo’.





^ permalink raw reply	[flat|nested] 18+ messages in thread

* bug#12827: [2.0.6] web client: fails to parse 404 header
  2012-11-08 20:10   ` Ludovic Courtès
@ 2012-11-09  0:39     ` Daniel Hartwig
  2012-11-09 20:52       ` Ludovic Courtès
  2012-11-23 22:19       ` Ludovic Courtès
  0 siblings, 2 replies; 18+ messages in thread
From: Daniel Hartwig @ 2012-11-09  0:39 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 12827

On 9 November 2012 04:10, Ludovic Courtès <ludo@gnu.org> wrote:
>> * TODO build-uri validation is broken/less strict and will now pass
>> relative-refs, so maybe introduce build-uri-reference instead
>
> Yes.  Should uri-reference be a disjoint type, then?

It needn't be, as long as there are predicates to distinguish.
(Actually, since <uri> is internal, maybe we should only expose the
new predicates, and keep “uri?” internal also).  The build-uri
validation works on the values before the <uri> object is constructed,
so I was just thinking of a separate build method with different, less
strict validation.

We just have to think of <uri> and uri? as guile implementation
details, not RFC.  Another option, is to rename <uri> to
<uri-reference>.  Then uri? can mean the same as absolute-uri? (as per
the RFC).

>> @@ -1729,7 +1736,7 @@ treated specially, and is just returned as a plain string."
>>
>>  ;; Referer = ( absoluteURI | relativeURI )
>>  ;;
>> -(declare-uri-header! "Referer")
>> +(declare-uri-reference-header! "Referer")
>
> Should actually be “Referrer”, no?

This is the actual spelling used in the RFC.

> Eventually, we’ll need docstrings, and updated documentation.

Yes.  I lazily left that until the other parts are finalized.  Let me
tackle the remaining items over the next week.

If we had those docs and the naming is ok, this patch is enough to
support reading the HTTP headers.  Users of http-get should be sure to
pass only an absolute-uri.  The missing function to resolve a
relative-ref to an absolute-uri is not required for reading or writing
headers, or using http-get, so that can come later (maybe I get this
week anyway).

Regards





^ permalink raw reply	[flat|nested] 18+ messages in thread

* bug#12827: [2.0.6] web client: fails to parse 404 header
  2012-11-09  0:39     ` Daniel Hartwig
@ 2012-11-09 20:52       ` Ludovic Courtès
  2012-11-10  1:45         ` Daniel Hartwig
  2012-11-23 22:19       ` Ludovic Courtès
  1 sibling, 1 reply; 18+ messages in thread
From: Ludovic Courtès @ 2012-11-09 20:52 UTC (permalink / raw)
  To: Daniel Hartwig; +Cc: 12827

Hi Daniel,

Daniel Hartwig <mandyke@gmail.com> skribis:

> On 9 November 2012 04:10, Ludovic Courtès <ludo@gnu.org> wrote:
>>> * TODO build-uri validation is broken/less strict and will now pass
>>> relative-refs, so maybe introduce build-uri-reference instead
>>
>> Yes.  Should uri-reference be a disjoint type, then?
>
> It needn't be, as long as there are predicates to distinguish.
> (Actually, since <uri> is internal, maybe we should only expose the
> new predicates, and keep “uri?” internal also).

I’m fine with keeping <uri> internal, but ‘uri?’ is public and must
remain so.

Anyway, I think it’s fine if the documentation makes it clear whether
functions expect absolute or relative URIs.  WDYT?

> The build-uri validation works on the values before the <uri> object
> is constructed, so I was just thinking of a separate build method with
> different, less strict validation.
>
> We just have to think of <uri> and uri? as guile implementation
> details, not RFC.  Another option, is to rename <uri> to
> <uri-reference>.  Then uri? can mean the same as absolute-uri? (as per
> the RFC).

Out current URI objects are actually absolute URI references, right?  In
that case, we’ll indeed have to make ‘uri?’ synonymous with
‘absolute-uri?’, for backward compatibility.

>>> @@ -1729,7 +1736,7 @@ treated specially, and is just returned as a plain string."
>>>
>>>  ;; Referer = ( absoluteURI | relativeURI )
>>>  ;;
>>> -(declare-uri-header! "Referer")
>>> +(declare-uri-reference-header! "Referer")
>>
>> Should actually be “Referrer”, no?
>
> This is the actual spelling used in the RFC.

Ouch.

>> Eventually, we’ll need docstrings, and updated documentation.
>
> Yes.  I lazily left that until the other parts are finalized.  Let me
> tackle the remaining items over the next week.

Yes, sure.

Thanks!

Ludo’.





^ permalink raw reply	[flat|nested] 18+ messages in thread

* bug#12827: [2.0.6] web client: fails to parse 404 header
  2012-11-09 20:52       ` Ludovic Courtès
@ 2012-11-10  1:45         ` Daniel Hartwig
  2012-11-10 13:52           ` Ludovic Courtès
  0 siblings, 1 reply; 18+ messages in thread
From: Daniel Hartwig @ 2012-11-10  1:45 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 12827

On 10 November 2012 04:52, Ludovic Courtès <ludo@gnu.org> wrote:
> Anyway, I think it’s fine if the documentation makes it clear whether
> functions expect absolute or relative URIs.  WDYT?

Yes.  With the new predicates, it should be clear enough to use the
(pseudo-)type names in the usual scheme-doc way:

  -- Scheme Procedure: uri-resolve base-uri uri-reference

and not need to repeat too much in the prose.  Of course, doing so
when appropriate.  I'll try to draft something sensible.

>
>> The build-uri validation works on the values before the <uri> object
>> is constructed, so I was just thinking of a separate build method with
>> different, less strict validation.
>>
>> We just have to think of <uri> and uri? as guile implementation
>> details, not RFC.  Another option, is to rename <uri> to
>> <uri-reference>.  Then uri? can mean the same as absolute-uri? (as per
>> the RFC).
>
> Out current URI objects are actually absolute URI references, right?  In
> that case, we’ll indeed have to make ‘uri?’ synonymous with
> ‘absolute-uri?’, for backward compatibility.

More-or-less, the only exception being when validation is disabled:

scheme@(guile-user)> (uri? (build-uri #f #:path "foo" #:validate? #f))
$1 = #t

that object has no scheme, and is not an absolute-uri.  This is a bit
of an edge case.

The current documentation only defines a URI as an absolute-uri and
does not talk about anything else.  Most functions (uri->string, etc.)
will not work when passed something without a scheme.  So I think your
suggestion is ok as any users of the current API will most certainly
be using only absolute-uri's.

Regards





^ permalink raw reply	[flat|nested] 18+ messages in thread

* bug#12827: [2.0.6] web client: fails to parse 404 header
  2012-11-10  1:45         ` Daniel Hartwig
@ 2012-11-10 13:52           ` Ludovic Courtès
  0 siblings, 0 replies; 18+ messages in thread
From: Ludovic Courtès @ 2012-11-10 13:52 UTC (permalink / raw)
  To: Daniel Hartwig; +Cc: 12827

Hi,

Daniel Hartwig <mandyke@gmail.com> skribis:

> On 10 November 2012 04:52, Ludovic Courtès <ludo@gnu.org> wrote:
>> Anyway, I think it’s fine if the documentation makes it clear whether
>> functions expect absolute or relative URIs.  WDYT?
>
> Yes.  With the new predicates, it should be clear enough to use the
> (pseudo-)type names in the usual scheme-doc way:
>
>   -- Scheme Procedure: uri-resolve base-uri uri-reference
>
> and not need to repeat too much in the prose.  Of course, doing so
> when appropriate.  I'll try to draft something sensible.

Yes.

>>> The build-uri validation works on the values before the <uri> object
>>> is constructed, so I was just thinking of a separate build method with
>>> different, less strict validation.
>>>
>>> We just have to think of <uri> and uri? as guile implementation
>>> details, not RFC.  Another option, is to rename <uri> to
>>> <uri-reference>.  Then uri? can mean the same as absolute-uri? (as per
>>> the RFC).
>>
>> Out current URI objects are actually absolute URI references, right?  In
>> that case, we’ll indeed have to make ‘uri?’ synonymous with
>> ‘absolute-uri?’, for backward compatibility.
>
> More-or-less, the only exception being when validation is disabled:
>
> scheme@(guile-user)> (uri? (build-uri #f #:path "foo" #:validate? #f))
> $1 = #t
>
> that object has no scheme, and is not an absolute-uri.  This is a bit
> of an edge case.

Yes, but when the user sets #:validate? to #f, then they take the
responsibility for anything that will happen.  IOW, #:validate? #f
allows users to forge broken URI objects, but that’s part of the
contract anyway.

> The current documentation only defines a URI as an absolute-uri and
> does not talk about anything else.  Most functions (uri->string, etc.)
> will not work when passed something without a scheme.  So I think your
> suggestion is ok as any users of the current API will most certainly
> be using only absolute-uri's.

Good.  So that means that URI refs can be added without introducing any
incompatibility.

Thanks,
Ludo’.





^ permalink raw reply	[flat|nested] 18+ messages in thread

* bug#12827: [2.0.6] web client: fails to parse 404 header
  2012-11-09  0:39     ` Daniel Hartwig
  2012-11-09 20:52       ` Ludovic Courtès
@ 2012-11-23 22:19       ` Ludovic Courtès
  2012-11-24 11:23         ` Daniel Hartwig
  1 sibling, 1 reply; 18+ messages in thread
From: Ludovic Courtès @ 2012-11-23 22:19 UTC (permalink / raw)
  To: Daniel Hartwig; +Cc: 12827

Hi Daniel,

Daniel Hartwig <mandyke@gmail.com> skribis:

>> Eventually, we’ll need docstrings, and updated documentation.
>
> Yes.  I lazily left that until the other parts are finalized.  Let me
> tackle the remaining items over the next week.

Any update on that?  The plan is to release 2.0.7 next week, so it’d be
great if this could be in.

Thanks,
Ludo’.





^ permalink raw reply	[flat|nested] 18+ messages in thread

* bug#12827: [2.0.6] web client: fails to parse 404 header
  2012-11-23 22:19       ` Ludovic Courtès
@ 2012-11-24 11:23         ` Daniel Hartwig
  2012-11-24 15:10           ` Ludovic Courtès
  0 siblings, 1 reply; 18+ messages in thread
From: Daniel Hartwig @ 2012-11-24 11:23 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 12827

[-- Attachment #1: Type: text/plain, Size: 686 bytes --]

On 24 November 2012 06:19, Ludovic Courtès <ludo@gnu.org> wrote:
> Any update on that?  The plan is to release 2.0.7 next week, so it’d be
> great if this could be in.

I have made a first attempt at the doc strings and manual.  This
involved first syncronizing the two, as only the manual had been
receiving updates.

Some more tweaking to the code.

Personally I am not 100% on this, but I attach it for comment anyway.
I will not be able to work on it again for a short while.

A quick solution may be to silently introduce just enough to fix the
current bug, and worry about the extra predicates, uri-record-type vs.
rfc-definition-of-uri, etc. later.

Regards

[-- Attachment #2: 0001-syncronize-web-module-docstrings-with-manual.patch --]
[-- Type: application/octet-stream, Size: 21066 bytes --]

From fcc01b345b93d7a75980d7607684e4a5b3243daa Mon Sep 17 00:00:00 2001
From: Daniel Hartwig <mandyke@gmail.com>
Date: Sat, 24 Nov 2012 14:10:12 +0800
Subject: [PATCH 1/3] syncronize web module docstrings with manual

* doc/ref/web.texi: Fix spacing.  Update with a few missing function
  descriptions.

* module/web/client.scm:
* module/web/http.scm:
* module/web/request.scm:
* module/web/server.scm:
* module/web/uri.scm: Update docstrings from manual.
---
 doc/ref/web.texi       |   35 ++++++++++++++---------------
 module/web/client.scm  |    9 ++++++++
 module/web/http.scm    |   41 ++++++++++++++++------------------
 module/web/request.scm |   14 ++++++++----
 module/web/server.scm  |   10 ++++-----
 module/web/uri.scm     |   57 +++++++++++++++++++++++++++++-------------------
 6 files changed, 96 insertions(+), 70 deletions(-)

diff --git a/doc/ref/web.texi b/doc/ref/web.texi
index 161a28d..e6e594e 100644
--- a/doc/ref/web.texi
+++ b/doc/ref/web.texi
@@ -431,8 +431,8 @@ from @code{header-writer}.
 @end deffn
 
 @deffn {Scheme Procedure} read-headers port
-Read the headers of an HTTP message from @var{port}, returning the
-headers as an ordered alist.
+Read the headers of an HTTP message from @var{port}, returning them
+as an ordered alist.
 @end deffn
 
 @deffn {Scheme Procedure} write-headers headers port
@@ -1368,6 +1368,7 @@ Return the given response header, or @var{default} if none was present.
 the lower-level HTTP, request, and response modules.
 
 @deffn {Scheme Procedure} open-socket-for-uri uri
+Return an open input/output port for a connection to URI.
 @end deffn
 
 @deffn {Scheme Procedure} http-get uri [#:port=(open-socket-for-uri uri)] [#:version='(1 . 1)] [#:keep-alive?=#f] [#:extra-headers='()] [#:decode-body?=#t]
@@ -1470,17 +1471,17 @@ the server socket.
 
 A user may define a server implementation with the following form:
 
-@deffn {Scheme Procedure} define-server-impl name open read write close
+@deffn {Scheme Syntax} define-server-impl name open read write close
 Make a @code{<server-impl>} object with the hooks @var{open},
 @var{read}, @var{write}, and @var{close}, and bind it to the symbol
 @var{name} in the current module.
 @end deffn
 
 @deffn {Scheme Procedure} lookup-server-impl impl
-Look up a server implementation. If @var{impl} is a server
-implementation already, it is returned directly. If it is a symbol, the
+Look up a server implementation.  If @var{impl} is a server
+implementation already, it is returned directly.  If it is a symbol, the
 binding named @var{impl} in the @code{(web server @var{impl})} module is
-looked up. Otherwise an error is signaled.
+looked up.  Otherwise an error is signaled.
 
 Currently a server implementation is a somewhat opaque type, useful only
 for passing to other procedures in this module, like @code{read-client}.
@@ -1494,7 +1495,7 @@ any access to the impl objects.
 
 @deffn {Scheme Procedure} open-server impl open-params
 Open a server for the given implementation.  Return one value, the new
-server object. The implementation's @code{open} procedure is applied to
+server object.  The implementation's @code{open} procedure is applied to
 @var{open-params}, which should be a list.
 @end deffn
 
@@ -1502,7 +1503,7 @@ server object. The implementation's @code{open} procedure is applied to
 Read a new client from @var{server}, by applying the implementation's
 @code{read} procedure to the server.  If successful, return three
 values: an object corresponding to the client, a request object, and the
-request body. If any exception occurs, return @code{#f} for all three
+request body.  If any exception occurs, return @code{#f} for all three
 values.
 @end deffn
 
@@ -1513,9 +1514,9 @@ The response and response body are produced by calling the given
 @var{handler} with @var{request} and @var{body} as arguments.
 
 The elements of @var{state} are also passed to @var{handler} as
-arguments, and may be returned as additional values. The new
+arguments, and may be returned as additional values.  The new
 @var{state}, collected from the @var{handler}'s return values, is then
-returned as a list. The idea is that a server loop receives a handler
+returned as a list.  The idea is that a server loop receives a handler
 from the user, along with whatever state values the user is interested
 in, allowing the user's handler to explicitly manage its state.
 @end deffn
@@ -1526,20 +1527,20 @@ given request.
 
 As a convenience to web handler authors, @var{response} may be given as
 an alist of headers, in which case it is used to construct a default
-response. Ensures that the response version corresponds to the request
-version. If @var{body} is a string, encodes the string to a bytevector,
-in an encoding appropriate for @var{response}. Adds a
+response.  Ensures that the response version corresponds to the request
+version.  If @var{body} is a string, encodes the string to a bytevector,
+in an encoding appropriate for @var{response}.  Adds a
 @code{content-length} and @code{content-type} header, as necessary.
 
 If @var{body} is a procedure, it is called with a port as an argument,
-and the output collected as a bytevector. In the future we might try to
+and the output collected as a bytevector.  In the future we might try to
 instead use a compressing, chunk-encoded port, and call this procedure
-later, in the write-client procedure. Authors are advised not to rely on
+later, in the write-client procedure.  Authors are advised not to rely on
 the procedure being called at any particular time.
 @end deffn
 
 @deffn {Scheme Procedure} write-client impl server client response body
-Write an HTTP response and body to @var{client}. If the server and
+Write an HTTP response and body to @var{client}.  If the server and
 client support persistent connections, it is the implementation's
 responsibility to keep track of the client thereafter, presumably by
 attaching it to the @var{server} argument somehow.
@@ -1572,7 +1573,7 @@ before sending back to the client.
 
 Additional arguments to @var{handler} are taken from @var{state}.
 Additional return values are accumulated into a new @var{state}, which
-will be used for subsequent requests. In this way a handler can
+will be used for subsequent requests.  In this way a handler can
 explicitly manage its state.
 @end deffn
 
diff --git a/module/web/client.scm b/module/web/client.scm
index cf7ea53..0991373 100644
--- a/module/web/client.scm
+++ b/module/web/client.scm
@@ -115,6 +115,15 @@
 (define* (http-get uri #:key (port (open-socket-for-uri uri))
                    (version '(1 . 1)) (keep-alive? #f) (extra-headers '())
                    (decode-body? #t))
+  "Connect to the server corresponding to @var{uri} and ask for the
+resource, using the @code{GET} method.  If you already have a port open,
+pass it as @var{port}.  The port will be closed at the end of the
+request unless @var{keep-alive?} is true.  Any extra headers in the
+alist @var{extra-headers} will be added to the request.
+
+If @var{decode-body?} is true, as is the default, the body of the
+response will be decoded to string, if it is a textual content-type.
+Otherwise it will be returned as a bytevector."
   (let ((req (build-request uri #:version version
                             #:headers (if keep-alive?
                                           extra-headers
diff --git a/module/web/http.scm b/module/web/http.scm
index cc5dd5a..3b78d08 100644
--- a/module/web/http.scm
+++ b/module/web/http.scm
@@ -100,12 +100,7 @@
                           validator
                           writer
                           #:key multiple?)
-  "Define a parser, validator, and writer for the HTTP header, @var{name}.
-
-@var{parser} should be a procedure that takes a string and returns a
-Scheme value.  @var{validator} is a predicate for whether the given
-Scheme value is valid for this header.  @var{writer} takes a value and a
-port, and writes the value to the port."
+  "Declare a parser, validator, and writer for a given header."
   (if (and (string? name) parser validator writer)
       (let ((decl (make-header-decl name parser validator writer multiple?)))
         (hashq-set! *declared-headers* (string->header name) decl)
@@ -120,27 +115,33 @@ port, and writes the value to the port."
         (string-titlecase (symbol->string sym)))))
 
 (define (known-header? sym)
-  "Return @code{#t} if there are parsers and writers registered for this
-header, otherwise @code{#f}."
+  "Return @code{#t} iff @var{sym} is a known header, with associated
+parsers and serialization procedures."
   (and (lookup-header-decl sym) #t))
 
 (define (header-parser sym)
-  "Returns a procedure to parse values for the given header."
+  "Return the value parser for headers named @var{sym}.  The result is a
+procedure that takes one argument, a string, and returns the parsed
+value.  If the header isn't known to Guile, a default parser is returned
+that passes through the string unchanged."
   (let ((decl (lookup-header-decl sym)))
     (if decl
         (header-decl-parser decl)
         (lambda (x) x))))
 
 (define (header-validator sym)
-  "Returns a procedure to validate values for the given header."
+  "Return a predicate which returns @code{#t} if the given value is valid
+for headers named @var{sym}.  The default validator for unknown headers
+is @code{string?}."
   (let ((decl (lookup-header-decl sym)))
     (if decl
         (header-decl-validator decl)
         string?)))
 
 (define (header-writer sym)
-  "Returns a procedure to write values for the given header to a given
-port."
+  "Return a procedure that writes values for headers named @var{sym} to a
+port.  The resulting procedure takes two arguments: a value and a port.
+The default writer is @code{display}."
   (let ((decl (lookup-header-decl sym)))
     (if decl
         (header-decl-writer decl)
@@ -196,10 +197,7 @@ body was reached (i.e., a blank line)."
 
 (define (parse-header sym val)
   "Parse @var{val}, a string, with the parser registered for the header
-named @var{sym}.
-
-Returns the parsed value.  If a parser was not found, the value is
-returned as a string."
+named @var{sym}.  Returns the parsed value."
   ((header-parser sym) val))
 
 (define (valid-header? sym val)
@@ -210,17 +208,16 @@ header with name @var{sym}."
       (error "header name not a symbol" sym)))
 
 (define (write-header sym val port)
-  "Writes the given header name and value to @var{port}.  If @var{sym}
-is a known header, uses the specific writer registered for that header.
-Otherwise the value is written using @code{display}."
+  "Write the given header name and value to @var{port}, using the writer
+from @code{header-writer}."
   (display (header->string sym) port)
   (display ": " port)
   ((header-writer sym) val port)
   (display "\r\n" port))
 
 (define (read-headers port)
-  "Read an HTTP message from @var{port}, returning the headers as an
-ordered alist."
+  "Read the headers of an HTTP message from @var{port}, returning them
+as an ordered alist."
   (let lp ((headers '()))
     (call-with-values (lambda () (read-header port))
       (lambda (k v)
@@ -230,7 +227,7 @@ ordered alist."
 
 (define (write-headers headers port)
   "Write the given header alist to @var{port}.  Doesn't write the final
-\\r\\n, as the user might want to add another header."
+@samp{\\r\\n}, as the user might want to add another header."
   (let lp ((headers headers))
     (if (pair? headers)
         (begin
diff --git a/module/web/request.scm b/module/web/request.scm
index 40d4a66..51ef473 100644
--- a/module/web/request.scm
+++ b/module/web/request.scm
@@ -195,7 +195,11 @@ metadata, @var{meta}.
 As a side effect, sets the encoding on @var{port} to
 ISO-8859-1 (latin-1), so that reading one character reads one byte.  See
 the discussion of character sets in \"HTTP Requests\" in the manual, for
-more information."
+more information.
+
+Note that the body is not part of the request.  Once you have read a
+request, you may read the body separately, and likewise for writing
+requests."
   (set-port-encoding! port "ISO-8859-1")
   (call-with-values (lambda () (read-request-line port))
     (lambda (method uri version)
@@ -205,7 +209,7 @@ more information."
 (define (write-request r port)
   "Write the given HTTP request to @var{port}.
 
-Returns a new request, whose @code{request-port} will continue writing
+Return a new request, whose @code{request-port} will continue writing
 on @var{port}, perhaps using some transfer encoding."
   (write-request-line (request-method r) (request-uri r)
                       (request-version r) port)
@@ -217,8 +221,8 @@ on @var{port}, perhaps using some transfer encoding."
                     (request-headers r) (request-meta r) port)))
 
 (define (read-request-body r)
-  "Reads the request body from @var{r}, as a bytevector.  Returns
-@code{#f} if there was no request body."
+  "Reads the request body from @var{r}, as a bytevector.  Return @code{#f}
+if there was no request body."
   (let ((nbytes (request-content-length r)))
     (and nbytes
          (let ((bv (get-bytevector-n (request-port r) nbytes)))
@@ -297,6 +301,8 @@ request @var{r}."
 
 ;; Misc accessors
 (define* (request-absolute-uri r #:optional default-host default-port)
+  "A helper routine to determine the absolute URI of a request, using the
+@code{host} header and the default host and port."
   (let ((uri (request-uri r)))
     (if (uri-host uri)
         uri
diff --git a/module/web/server.scm b/module/web/server.scm
index 42887e6..ed88329 100644
--- a/module/web/server.scm
+++ b/module/web/server.scm
@@ -143,7 +143,7 @@ for passing to other procedures in this module, like
 
 ;; -> server
 (define (open-server impl open-params)
-  "Open a server for the given implementation.  Returns one value, the
+  "Open a server for the given implementation.  Return one value, the
 new server object.  The implementation's @code{open} procedure is
 applied to @var{open-params}, which should be a list."
   (apply (server-impl-open impl) open-params))
@@ -151,9 +151,9 @@ applied to @var{open-params}, which should be a list."
 ;; -> (client request body | #f #f #f)
 (define (read-client impl server)
   "Read a new client from @var{server}, by applying the implementation's
-@code{read} procedure to the server.  If successful, returns three
+@code{read} procedure to the server.  If successful, return three
 values: an object corresponding to the client, a request object, and the
-request body.  If any exception occurs, returns @code{#f} for all three
+request body.  If any exception occurs, return @code{#f} for all three
 values."
   (call-with-error-handling
    (lambda ()
@@ -364,7 +364,7 @@ attaching it to the @var{server} argument somehow."
 ;; -> new-state
 (define (serve-one-client handler impl server state)
   "Read one request from @var{server}, call @var{handler} on the request
-and body, and write the response to the client.  Returns the new state
+and body, and write the response to the client.  Return the new state
 produced by the handler procedure."
   (debug-elapsed 'serve-again)
   (call-with-values
@@ -404,7 +404,7 @@ The response and body will be run through @code{sanitize-response}
 before sending back to the client.
 
 Additional arguments to @var{handler} are taken from
-@var{state}. Additional return values are accumulated into a new
+@var{state}.  Additional return values are accumulated into a new
 @var{state}, which will be used for subsequent requests.  In this way a
 handler can explicitly manage its state.
 
diff --git a/module/web/uri.scm b/module/web/uri.scm
index 78614a5..ddab7be 100644
--- a/module/web/uri.scm
+++ b/module/web/uri.scm
@@ -79,8 +79,10 @@
 
 (define* (build-uri scheme #:key userinfo host port (path "") query fragment
                     (validate? #t))
-  "Construct a URI object. If @var{validate?} is true, also run some
-consistency checks to make sure that the constructed URI is valid."
+  "Construct a URI object.  @var{scheme} should be a symbol, and the rest
+of the fields are either strings or @code{#f}.  If @var{validate?} is
+true, also run some consistency checks to make sure that the constructed
+URI is valid."
   (if validate?
       (validate-uri scheme userinfo host port path query fragment))
   (make-uri scheme userinfo host port path query fragment))
@@ -168,7 +170,7 @@ consistency checks to make sure that the constructed URI is valid."
   (make-regexp uri-pat))
 
 (define (string->uri string)
-  "Parse @var{string} into a URI object. Returns @code{#f} if the string
+  "Parse @var{string} into a URI object.  Return @code{#f} if the string
 could not be parsed."
   (% (let ((m (regexp-exec uri-regexp string)))
        (if (not m) (abort))
@@ -191,10 +193,7 @@ could not be parsed."
 (define *default-ports* (make-hash-table))
 
 (define (declare-default-port! scheme port)
-  "Declare a default port for the given URI scheme.
-
-Default ports are for printing URI objects: a default port is not
-printed."
+  "Declare a default port for the given URI scheme."
   (hashq-set! *default-ports* scheme port))
 
 (define (default-port? scheme port)
@@ -205,7 +204,9 @@ printed."
 (declare-default-port! 'https 443)
 
 (define (uri->string uri)
-  "Serialize @var{uri} to a string."
+  "Serialize @var{uri} to a string.  If the URI has a port that is the
+default port for its scheme, the port is not included in the
+serialization."
   (let* ((scheme-str (string-append
                       (symbol->string (uri-scheme uri)) ":"))
          (userinfo (uri-userinfo uri))
@@ -293,18 +294,24 @@ printed."
   (string->char-set "0123456789abcdefABCDEF"))
 
 (define* (uri-decode str #:key (encoding "utf-8"))
-  "Percent-decode the given @var{str}, according to @var{encoding}.
+  "Percent-decode the given @var{str}, according to @var{encoding},
+which should be the name of a character encoding.
 
 Note that this function should not generally be applied to a full URI
 string. For paths, use split-and-decode-uri-path instead. For query
 strings, split the query on @code{&} and @code{=} boundaries, and decode
 the components separately.
 
-Note that percent-encoded strings encode @emph{bytes}, not characters.
-There is no guarantee that a given byte sequence is a valid string
-encoding. Therefore this routine may signal an error if the decoded
-bytes are not valid for the given encoding. Pass @code{#f} for
-@var{encoding} if you want decoded bytes as a bytevector directly."
+Note also that percent-encoded strings encode @emph{bytes}, not
+characters.  There is no guarantee that a given byte sequence is a valid
+string encoding. Therefore this routine may signal an error if the
+decoded bytes are not valid for the given encoding. Pass @code{#f} for
+@var{encoding} if you want decoded bytes as a bytevector directly.
+@xref{Ports, @code{set-port-encoding!}}, for more information on
+character encodings.
+
+Returns a string of the decoded characters, or a bytevector if
+@var{encoding} was @code{#f}."
   (let* ((len (string-length str))
          (bv
           (call-with-output-bytevector*
@@ -358,10 +365,13 @@ bytes are not valid for the given encoding. Pass @code{#f} for
 ;;
 (define* (uri-encode str #:key (encoding "utf-8")
                      (unescaped-chars unreserved-chars))
-  "Percent-encode any character not in the character set, @var{unescaped-chars}.
+  "Percent-encode any character not in the character set,
+@var{unescaped-chars}.
 
-Percent-encoding first writes out the given character to a bytevector
-within the given @var{encoding}, then encodes each byte as
+The default character set includes alphanumerics from ASCII, as well as
+the special characters @samp{-}, @samp{.}, @samp{_}, and @samp{~}.  Any
+other character will be percent-encoded, by writing out the character to
+a bytevector within the given @var{encoding}, then encoding each byte as
 @code{%@var{HH}}, where @var{HH} is the hexadecimal representation of
 the byte."
   (define (needs-escaped? ch)
@@ -387,15 +397,18 @@ the byte."
       str))
 
 (define (split-and-decode-uri-path path)
-  "Split @var{path} into its components, and decode each
-component, removing empty components.
+  "Split @var{path} into its components, and decode each component,
+removing empty components.
 
-For example, @code{\"/foo/bar/\"} decodes to the two-element list,
-@code{(\"foo\" \"bar\")}."
+For example, @code{\"/foo/bar%20baz/\"} decodes to the two-element list,
+@code{(\"foo\" \"bar baz\")}."
   (filter (lambda (x) (not (string-null? x)))
           (map uri-decode (string-split path #\/))))
 
 (define (encode-and-join-uri-path parts)
   "URI-encode each element of @var{parts}, which should be a list of
-strings, and join the parts together with @code{/} as a delimiter."
+strings, and join the parts together with @code{/} as a delimiter.
+
+For example, the list @code{(\"scrambled eggs\" \"biscuits&gravy\")}
+encodes as @code{\"scrambled%20eggs/biscuits%26gravy\"}."
   (string-join (map uri-encode parts) "/"))
-- 
1.7.10.4


[-- Attachment #3: 0002-web-uri-document-that-uri-port-is-an-integer.patch --]
[-- Type: application/octet-stream, Size: 2827 bytes --]

From 50c5235ca50e356acfad709ae820c2963a8ff11c Mon Sep 17 00:00:00 2001
From: Daniel Hartwig <mandyke@gmail.com>
Date: Sat, 24 Nov 2012 15:11:21 +0800
Subject: [PATCH 2/3] (web uri): document that uri-port is an integer

* doc/ref/web.texi (URIs):
* module/web/uri.scm (build-uri): Document that uri-port is an integer.
---
 doc/ref/web.texi   |   13 +++++++------
 module/web/uri.scm |    9 +++++----
 2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/doc/ref/web.texi b/doc/ref/web.texi
index e6e594e..3e93bea 100644
--- a/doc/ref/web.texi
+++ b/doc/ref/web.texi
@@ -209,10 +209,11 @@ access to them.
 @deffn {Scheme Procedure} build-uri scheme [#:userinfo=@code{#f}] [#:host=@code{#f}] @
        [#:port=@code{#f}] [#:path=@code{""}] [#:query=@code{#f}] @
        [#:fragment=@code{#f}] [#:validate?=@code{#t}]
-Construct a URI object.  @var{scheme} should be a symbol, and the rest
-of the fields are either strings or @code{#f}.  If @var{validate?} is
-true, also run some consistency checks to make sure that the constructed
-URI is valid.
+Construct a URI object.  @var{scheme} should be a symbol, @var{port}
+either a positive, exact integer or @code{#f}, and the rest of the
+fields are either strings or @code{#f}.  If @var{validate?} is true,
+also run some consistency checks to make sure that the constructed URI
+is valid.
 @end deffn
 
 @deffn {Scheme Procedure} uri? x
@@ -224,8 +225,8 @@ URI is valid.
 @deffnx {Scheme Procedure} uri-query uri
 @deffnx {Scheme Procedure} uri-fragment uri
 A predicate and field accessors for the URI record type.  The URI scheme
-will be a symbol, and the rest either strings or @code{#f} if not
-present.
+will be a symbol, the port either a positive, exact integer or @code{#f},
+and the rest either strings or @code{#f} if not present.
 @end deffn
 
 @deffn {Scheme Procedure} string->uri string
diff --git a/module/web/uri.scm b/module/web/uri.scm
index ddab7be..e84bc03 100644
--- a/module/web/uri.scm
+++ b/module/web/uri.scm
@@ -79,10 +79,11 @@
 
 (define* (build-uri scheme #:key userinfo host port (path "") query fragment
                     (validate? #t))
-  "Construct a URI object.  @var{scheme} should be a symbol, and the rest
-of the fields are either strings or @code{#f}.  If @var{validate?} is
-true, also run some consistency checks to make sure that the constructed
-URI is valid."
+  "Construct a URI object.  @var{scheme} should be a symbol, @var{port}
+either a positive, exact integer or @code{#f}, and the rest of the
+fields are either strings or @code{#f}.  If @var{validator?} is true,
+also run some consistency checks to make sure that the constructed URI
+is valid."
   (if validate?
       (validate-uri scheme userinfo host port path query fragment))
   (make-uri scheme userinfo host port path query fragment))
-- 
1.7.10.4


[-- Attachment #4: 0003-uri-reference-support.patch --]
[-- Type: application/octet-stream, Size: 15823 bytes --]

From ebbcc923b776fd3fbfe28c5c050ee8df7b71529d Mon Sep 17 00:00:00 2001
From: Daniel Hartwig <mandyke@gmail.com>
Date: Thu, 8 Nov 2012 13:11:52 +0800
Subject: [PATCH 3/3] uri-reference support

* module/web/uri.scm (build-uri, validate-uri): Also build
  relative-refs, which have no scheme.
  (string->uri, uri->string): Also support URI objects with no scheme.
  (uri-reference?, relative-ref?, absolute-uri?): New predicates to
  distinguish various URI-like objects.
  (uri?): Redefine so that semantics are unchanged; only return #t for
  objects previously built and validated by build-uri?.  Such objects
  always had a uri-scheme.
* module/web/http.scm (declare-uri-reference-header!): New header type
  accepting any URI-reference.
  ("Content-Location", "Referer"): Change to URI-reference type.
* doc/ref/web.texi (URIs): Document other URI-like syntaxes defined in
  RFC 3986.  Include brief discussion.  Update functions
* test-suite/tests/web-http.test:
* test-suite/tests/web-uri.test: Add relevant tests.
---
 doc/ref/web.texi               |   39 ++++++++++---
 module/web/http.scm            |   13 ++++-
 module/web/uri.scm             |   59 ++++++++++++++-----
 test-suite/tests/web-http.test |    4 ++
 test-suite/tests/web-uri.test  |  125 ++++++++++++++++++++++++++++++++++++++++
 5 files changed, 214 insertions(+), 26 deletions(-)

diff --git a/doc/ref/web.texi b/doc/ref/web.texi
index 3e93bea..d247080 100644
--- a/doc/ref/web.texi
+++ b/doc/ref/web.texi
@@ -173,8 +173,9 @@ Guile provides a standard data type for Universal Resource Identifiers
 The generic URI syntax is as follows:
 
 @example
-URI := scheme ":" ["//" [userinfo "@@"] host [":" port]] path \
-       [ "?" query ] [ "#" fragment ]
+URI := scheme ":" hier-part [ "?" query ] [ "#" fragment ]
+
+hier-part := ["//" [userinfo "@@"] host [":" port]] path
 @end example
 
 For example, in the URI, @indicateurl{http://www.gnu.org/help/}, the
@@ -198,6 +199,25 @@ fragment identifies a part of a resource, not the resource itself.  But
 it is useful to have a fragment field in the URI record itself, so we
 hope you will forgive the inconsistency.
 
+There are some additional URI-like syntaxes:
+
+@example
+URI-reference := URI / relative-ref
+
+relative-ref := hier-part [ "?" query ] [ "#" fragment ]
+
+absolute-URI := scheme ":" hier-part [ "?" query ]
+@end example
+
+These extra forms are useful in various situations.  For example,
+relative-refs are convenient in documents to refer to other parts of the
+same document, or resources located on the same site.
+
+A relative-ref must be considered in relation to a given base URI to
+correctly identify a resource.  The base URI is determined according to
+the context and properties of the document in which the relative-ref is
+located.  See section 5 of RFC 3986 for details.
+
 @example
 (use-modules (web uri))
 @end example
@@ -211,12 +231,14 @@ access to them.
        [#:fragment=@code{#f}] [#:validate?=@code{#t}]
 Construct a URI object.  @var{scheme} should be a symbol, @var{port}
 either a positive, exact integer or @code{#f}, and the rest of the
-fields are either strings or @code{#f}.  If @var{validate?} is true,
-also run some consistency checks to make sure that the constructed URI
-is valid.
+fields are strings.  If @var{validate?} is true, also run some
+consistency checks to make sure that the constructed URI is valid.
 @end deffn
 
 @deffn {Scheme Procedure} uri? x
+@deffnx {Scheme Procedure} uri-reference? x
+@deffnx {Scheme Procedure} relative-ref? x
+@deffnx {Scheme Procedure} absolute-uri? x
 @deffnx {Scheme Procedure} uri-scheme uri
 @deffnx {Scheme Procedure} uri-userinfo uri
 @deffnx {Scheme Procedure} uri-host uri
@@ -224,9 +246,10 @@ is valid.
 @deffnx {Scheme Procedure} uri-path uri
 @deffnx {Scheme Procedure} uri-query uri
 @deffnx {Scheme Procedure} uri-fragment uri
-A predicate and field accessors for the URI record type.  The URI scheme
-will be a symbol, the port either a positive, exact integer or @code{#f},
-and the rest either strings or @code{#f} if not present.
+Predicates and field accessors for the URI record type.  The URI scheme
+will be a symbol, the port a positive, exact integer, and the rest
+strings.  Any field other than @code{uri-path} may also be @code{#f} if
+not present.
 @end deffn
 
 @deffn {Scheme Procedure} string->uri string
diff --git a/module/web/http.scm b/module/web/http.scm
index 3b78d08..389880b 100644
--- a/module/web/http.scm
+++ b/module/web/http.scm
@@ -1179,12 +1179,19 @@ treated specially, and is just returned as a plain string."
     parse-non-negative-integer non-negative-integer? display))
 
 ;; emacs: (put 'declare-uri-header! 'scheme-indent-function 1)
-(define (declare-uri-header! name)
+(define* (declare-uri-header! name #:optional)
   (declare-header! name
     (lambda (str) (or (string->uri str) (bad-header-component 'uri str)))
     uri?
     write-uri))
 
+;; emacs: (put 'declare-uri-reference-header! 'scheme-indent-function 1)
+(define (declare-uri-reference-header! name)
+  (declare-header! name
+    (lambda (str) (or (string->uri str) (bad-header-component 'uri-reference str)))
+    uri-reference?
+    write-uri))
+
 ;; emacs: (put 'declare-quality-list-header! 'scheme-indent-function 1)
 (define (declare-quality-list-header! name)
   (declare-header! name
@@ -1437,7 +1444,7 @@ treated specially, and is just returned as a plain string."
 
 ;; Content-Location = ( absoluteURI | relativeURI )
 ;;
-(declare-uri-header! "Content-Location")
+(declare-uri-reference-header! "Content-Location")
 
 ;; Content-MD5 = <base64 of 128 bit MD5 digest as per RFC 1864>
 ;;
@@ -1726,7 +1733,7 @@ treated specially, and is just returned as a plain string."
 
 ;; Referer = ( absoluteURI | relativeURI )
 ;;
-(declare-uri-header! "Referer")
+(declare-uri-reference-header! "Referer")
 
 ;; TE = #( t-codings )
 ;; t-codings = "trailers" | ( transfer-extension [ accept-params ] )
diff --git a/module/web/uri.scm b/module/web/uri.scm
index e84bc03..e7990ad 100644
--- a/module/web/uri.scm
+++ b/module/web/uri.scm
@@ -31,7 +31,7 @@
   #:use-module (ice-9 control)
   #:use-module (rnrs bytevectors)
   #:use-module (ice-9 binary-ports)
-  #:export (uri?
+  #:export (uri? uri-reference? relative-ref? absolute-uri?
             uri-scheme uri-userinfo uri-host uri-port
             uri-path uri-query uri-fragment
 
@@ -44,7 +44,7 @@
 
 (define-record-type <uri>
   (make-uri scheme userinfo host port path query fragment)
-  uri?
+  uri-reference?
   (scheme uri-scheme)
   (userinfo uri-userinfo)
   (host uri-host)
@@ -53,15 +53,32 @@
   (query uri-query)
   (fragment uri-fragment))
 
+(define (uri? x)
+  (and (uri-reference? x)
+       (uri-scheme x)
+       #t))
+
+(define (relative-ref? x)
+  (and (uri-reference? x)
+       (not (uri-scheme x))
+       #t))
+
+(define (absolute-uri? x)
+  (and (uri-reference? x)
+       (uri-scheme x)
+       (not (uri-fragment x))
+       #t))
+
 (define (uri-error message . args)
   (throw 'uri-error message args))
 
 (define (positive-exact-integer? port)
   (and (number? port) (exact? port) (integer? port) (positive? port)))
 
-(define (validate-uri scheme userinfo host port path query fragment)
+(define (validate-uri scheme userinfo host port path query fragment reference?)
   (cond
-   ((not (symbol? scheme))
+   ((and (not (symbol? scheme))
+         (or (not reference?) scheme))
     (uri-error "Expected a symbol for the URI scheme: ~s" scheme))
    ((and (or userinfo port) (not host))
     (uri-error "Expected a host, given userinfo or port"))
@@ -85,7 +102,7 @@ fields are either strings or @code{#f}.  If @var{validator?} is true,
 also run some consistency checks to make sure that the constructed URI
 is valid."
   (if validate?
-      (validate-uri scheme userinfo host port path query fragment))
+      (validate-uri scheme userinfo host port path query fragment #t))
   (make-uri scheme userinfo host port path query fragment))
 
 ;; See RFC 3986 #3.2.2 for comments on percent-encodings, IDNA (RFC
@@ -153,6 +170,17 @@ is valid."
 ;;;               / path-absolute
 ;;;               / path-rootless
 ;;;               / path-empty
+;;;
+;;; RFC 3986, #4.
+;;;
+;;;   URI-reference = URI / relative-ref
+;;;
+;;;   relative-ref  = relative-part [ "?" query ] [ "#" fragment ]
+;;;
+;;;   relative-part = "//" authority path-abempty
+;;;                 / path-absolute
+;;;                 / path-noscheme
+;;;                 / path-empty
 
 (define scheme-pat
   "[a-zA-Z][a-zA-Z0-9+.-]*")
@@ -165,7 +193,7 @@ is valid."
 (define fragment-pat
   ".*")
 (define uri-pat
-  (format #f "^(~a):(//~a)?(~a)(\\?(~a))?(#(~a))?$"
+  (format #f "^((~a):)?(//~a)?(~a)(\\?(~a))?(#(~a))?$"
           scheme-pat authority-pat path-pat query-pat fragment-pat))
 (define uri-regexp
   (make-regexp uri-pat))
@@ -175,12 +203,12 @@ is valid."
 could not be parsed."
   (% (let ((m (regexp-exec uri-regexp string)))
        (if (not m) (abort))
-       (let ((scheme (string->symbol
-                      (string-downcase (match:substring m 1))))
-             (authority (match:substring m 2))
-             (path (match:substring m 3))
-             (query (match:substring m 5))
-             (fragment (match:substring m 7)))
+       (let ((scheme (let ((s (match:substring m 2)))
+                       (and s (string->symbol (string-downcase s)))))
+             (authority (match:substring m 3))
+             (path (match:substring m 4))
+             (query (match:substring m 6))
+             (fragment (match:substring m 8)))
          (call-with-values
              (lambda ()
                (if authority
@@ -208,8 +236,7 @@ could not be parsed."
   "Serialize @var{uri} to a string.  If the URI has a port that is the
 default port for its scheme, the port is not included in the
 serialization."
-  (let* ((scheme-str (string-append
-                      (symbol->string (uri-scheme uri)) ":"))
+  (let* ((scheme (uri-scheme uri))
          (userinfo (uri-userinfo uri))
          (host (uri-host uri))
          (port (uri-port uri))
@@ -217,7 +244,9 @@ serialization."
          (query (uri-query uri))
          (fragment (uri-fragment uri)))
     (string-append
-     scheme-str
+     (if scheme
+         (string-append (symbol->string scheme) ":")
+         "")
      (if host
          (string-append "//"
                         (if userinfo (string-append userinfo "@")
diff --git a/test-suite/tests/web-http.test b/test-suite/tests/web-http.test
index 97f5559..5d80fde 100644
--- a/test-suite/tests/web-http.test
+++ b/test-suite/tests/web-http.test
@@ -145,6 +145,8 @@
   (pass-if-parse content-length "010" 10)
   (pass-if-parse content-location "http://foo/"
                  (build-uri 'http #:host "foo" #:path "/"))
+  (pass-if-parse content-location "//foo"
+                 (build-uri #f #:host "foo"))
   (pass-if-parse content-range "bytes 10-20/*" '(bytes (10 . 20) *))
   (pass-if-parse content-range "bytes */*" '(bytes * *))
   (pass-if-parse content-range "bytes */30" '(bytes * 30))
@@ -208,6 +210,8 @@
   (pass-if-parse range "bytes=-20,-30" '(bytes (#f . 20) (#f . 30)))
   (pass-if-parse referer "http://foo/bar?baz"
                  (build-uri 'http #:host "foo" #:path "/bar" #:query "baz"))
+  (pass-if-parse referer "//foo/bar?baz"
+                 (build-uri #f #:host "foo" #:path "/bar" #:query "baz"))
   (pass-if-parse te "trailers" '((trailers)))
   (pass-if-parse te "trailers,foo" '((trailers) (foo)))
   (pass-if-parse user-agent "guile" "guile"))
diff --git a/test-suite/tests/web-uri.test b/test-suite/tests/web-uri.test
index 3f6e7e3..d4d44d6 100644
--- a/test-suite/tests/web-uri.test
+++ b/test-suite/tests/web-uri.test
@@ -78,6 +78,24 @@
            #:port 22
            #:path "/baz"))
 
+  (pass-if "foo"
+    (uri=? (build-uri #f #:path "foo")
+           #:path "foo"))
+
+  (pass-if "/foo"
+    (uri=? (build-uri #f #:path "/foo")
+           #:path "/foo"))
+
+  (pass-if "//foo/bar"
+    (uri=? (build-uri #f #:host "foo" #:path "/bar")
+           #:host "foo"
+           #:path "/bar"))
+
+  (pass-if "?foo"
+    (uri=? (build-uri #f #:query "foo")
+           #:path ""
+           #:query "foo"))
+
   (pass-if-uri-exception "non-symbol scheme"
                          "Expected.*symbol"
                          (build-uri "nonsym"))
@@ -123,6 +141,80 @@
                          "Expected.*host"
                          (build-uri 'http #:userinfo "foo")))
 
+(with-test-prefix "absolute-uri?"
+  (pass-if "ftp:"
+    (absolute-uri? (build-uri 'ftp)))
+
+  (pass-if "ftp:foo"
+    (absolute-uri? (build-uri 'ftp #:path "foo")))
+
+  (pass-if "ftp://foo/bar"
+    (absolute-uri? (build-uri 'ftp #:host "foo" #:path "/bar")))
+
+  (pass-if "ftp://foo@bar:22/baz"
+    (absolute-uri? (build-uri 'ftp #:userinfo "foo" #:host "bar" #:port 22 #:path "/baz")))
+
+  (expect-fail "foo"
+    (absolute-uri? (build-uri #f #:path "foo")))
+
+  (expect-fail "/foo"
+    (absolute-uri? (build-uri #f #:path "/foo")))
+
+  (expect-fail "//foo/bar"
+    (absolute-uri? (build-uri #f #:host "foo" #:path "/bar")))
+
+  (expect-fail "?foo"
+    (absolute-uri? (build-uri #f #:query "foo"))))
+
+(with-test-prefix "relative-ref?"
+  (expect-fail "ftp:"
+    (relative-ref? (build-uri 'ftp)))
+
+  (expect-fail "ftp:foo"
+    (relative-ref? (build-uri 'ftp #:path "foo")))
+
+  (expect-fail "ftp://foo/bar"
+    (relative-ref? (build-uri 'ftp #:host "foo" #:path "/bar")))
+
+  (expect-fail "ftp://foo@bar:22/baz"
+    (relative-ref? (build-uri 'ftp #:userinfo "foo" #:host "bar" #:port 22 #:path "/baz")))
+
+  (pass-if "foo"
+    (relative-ref? (build-uri #f #:path "foo")))
+
+  (pass-if "/foo"
+    (relative-ref? (build-uri #f #:path "/foo")))
+
+  (pass-if "//foo/bar"
+    (relative-ref? (build-uri #f #:host "foo" #:path "/bar")))
+
+  (pass-if "?foo"
+    (relative-ref? (build-uri #f #:query "foo"))))
+
+(with-test-prefix "uri-reference?"
+  (pass-if "ftp:"
+    (uri-reference? (build-uri 'ftp)))
+
+  (pass-if "ftp:foo"
+    (uri-reference? (build-uri 'ftp #:path "foo")))
+
+  (pass-if "ftp://foo/bar"
+    (uri-reference? (build-uri 'ftp #:host "foo" #:path "/bar")))
+
+  (pass-if "ftp://foo@bar:22/baz"
+    (uri-reference? (build-uri 'ftp #:userinfo "foo" #:host "bar" #:port 22 #:path "/baz")))
+
+  (pass-if "foo"
+    (uri-reference? (build-uri #f #:path "foo")))
+
+  (pass-if "/foo"
+    (uri-reference? (build-uri #f #:path "/foo")))
+
+  (pass-if "//foo/bar"
+    (uri-reference? (build-uri #f #:host "foo" #:path "/bar")))
+
+  (pass-if "?foo"
+    (uri-reference? (build-uri #f #:query "foo"))))
 
 (with-test-prefix "string->uri"
   (pass-if "ftp:"
@@ -149,6 +241,24 @@
            #:port 22
            #:path "/baz"))
 
+  (pass-if "foo"
+    (uri=? (string->uri "foo")
+           #:path "foo"))
+
+  (pass-if "/foo"
+    (uri=? (string->uri "/foo")
+           #:path "/foo"))
+
+  (pass-if "//foo/bar"
+    (uri=? (string->uri "//foo/bar")
+           #:host "foo"
+           #:path "/bar"))
+
+  (pass-if "?foo"
+    (uri=? (string->uri "?foo")
+           #:path ""
+           #:query "foo"))
+
   (pass-if "http://bad.host.1"
     (not (string->uri "http://bad.host.1")))
 
@@ -229,6 +339,21 @@
     (equal? "ftp://foo@bar:22/baz"
             (uri->string (string->uri "ftp://foo@bar:22/baz"))))
   
+  (pass-if "foo"
+    (equal? "foo"
+            (uri->string (string->uri "foo"))))
+
+  (pass-if "/foo"
+    (equal? "/foo" (uri->string (string->uri "/foo"))))
+
+  (pass-if "//foo/bar"
+    (equal? "//foo/bar"
+            (uri->string (string->uri "//foo/bar"))))
+
+  (pass-if "?foo"
+    (equal? "?foo"
+            (uri->string (string->uri "?foo"))))
+
   (when (memq 'socket *features*)
     (pass-if "http://192.0.2.1"
       (equal? "http://192.0.2.1"
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* bug#12827: [2.0.6] web client: fails to parse 404 header
  2012-11-24 11:23         ` Daniel Hartwig
@ 2012-11-24 15:10           ` Ludovic Courtès
  2012-11-24 15:34             ` Daniel Hartwig
  0 siblings, 1 reply; 18+ messages in thread
From: Ludovic Courtès @ 2012-11-24 15:10 UTC (permalink / raw)
  To: Daniel Hartwig; +Cc: 12827

Hi Daniel,

Daniel Hartwig <mandyke@gmail.com> skribis:

> On 24 November 2012 06:19, Ludovic Courtès <ludo@gnu.org> wrote:
>> Any update on that?  The plan is to release 2.0.7 next week, so it’d be
>> great if this could be in.
>
> I have made a first attempt at the doc strings and manual.  This
> involved first syncronizing the two, as only the manual had been
> receiving updates.
>
> Some more tweaking to the code.

Thanks.  I applied the first two patches, and passed the source files
through:

  sed -e"s/@var{\([a-z0-9?!-]\+\)}/\U\1/g ; s/@code{\([^}]\+\)}/‘\1’/g"

because docstrings should not contain Texinfo markup.

> Personally I am not 100% on this, but I attach it for comment anyway.
> I will not be able to work on it again for a short while.
>
> A quick solution may be to silently introduce just enough to fix the
> current bug, and worry about the extra predicates, uri-record-type vs.
> rfc-definition-of-uri, etc. later.

I could come up with a ‘declare-relative-uri-header!’ that would use
(build-uri xxx #:validate? #f) as a quick fix.

However, it seems to me that your patch is actually fine, and doesn’t
break compatibility, so I’d rather apply it directly.  Did you have
other concerns?

Thanks,
Ludo’.





^ permalink raw reply	[flat|nested] 18+ messages in thread

* bug#12827: [2.0.6] web client: fails to parse 404 header
  2012-11-24 15:10           ` Ludovic Courtès
@ 2012-11-24 15:34             ` Daniel Hartwig
  2012-11-26  0:15               ` Ludovic Courtès
  0 siblings, 1 reply; 18+ messages in thread
From: Daniel Hartwig @ 2012-11-24 15:34 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 12827

On 24 November 2012 23:10, Ludovic Courtès <ludo@gnu.org> wrote:
>> Personally I am not 100% on this, but I attach it for comment anyway.
>> I will not be able to work on it again for a short while.
>>
>> A quick solution may be to silently introduce just enough to fix the
>> current bug, and worry about the extra predicates, uri-record-type vs.
>> rfc-definition-of-uri, etc. later.
>
> I could come up with a ‘declare-relative-uri-header!’ that would use
> (build-uri xxx #:validate? #f) as a quick fix.
>
> However, it seems to me that your patch is actually fine, and doesn’t
> break compatibility, so I’d rather apply it directly.  Did you have
> other concerns?

The API seems less clean, and it is not immediately clear
that uri? is not the top of the URI-like type hierarchy.  The other
functions only indicate “uri” in their name.  I did not
wish to introduce parallel “build-uri-reference”, etc. for each of
these, and did consider adding #:reference? on some to select
weaker validation.

I think I prefer to keep the base type predicate as uri?, and use
relative-ref? and absolute-uri? to distinguish.  This would mean
deviating from the RFC by:
- permitting fragments in absolute-uri?;
- not requiring a scheme in uri?;

which are both forgivable.

Maybe I am just too fussy!





^ permalink raw reply	[flat|nested] 18+ messages in thread

* bug#12827: [2.0.6] web client: fails to parse 404 header
  2012-11-24 15:34             ` Daniel Hartwig
@ 2012-11-26  0:15               ` Ludovic Courtès
  2012-11-26 23:13                 ` Ludovic Courtès
  0 siblings, 1 reply; 18+ messages in thread
From: Ludovic Courtès @ 2012-11-26  0:15 UTC (permalink / raw)
  To: Daniel Hartwig; +Cc: 12827

[-- Attachment #1: Type: text/plain, Size: 489 bytes --]

Hi!

Daniel Hartwig <mandyke@gmail.com> skribis:

> The API seems less clean, and it is not immediately clear
> that uri? is not the top of the URI-like type hierarchy.  The other
> functions only indicate “uri” in their name.  I did not
> wish to introduce parallel “build-uri-reference”, etc. for each of
> these, and did consider adding #:reference? on some to select
> weaker validation.

OK.

So for now, I’d go with this patch, which fixes the bug at hand:


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Type: text/x-patch, Size: 1396 bytes --]

diff --git a/module/web/http.scm b/module/web/http.scm
index 342f435..65ebe7d 100644
--- a/module/web/http.scm
+++ b/module/web/http.scm
@@ -1185,6 +1185,17 @@ treated specially, and is just returned as a plain string."
     uri?
     write-uri))
 
+;; emacs: (put 'declare-relative-uri-header! 'scheme-indent-function 1)
+(define (declare-relative-uri-header! name)
+  (declare-header! name
+    (lambda (str)
+      (or (string->uri str)                       ; absolute URI
+          (build-uri #f                           ; relative URI
+                     #:path str
+                     #:validate? #f)))
+    uri?
+    write-uri))
+
 ;; emacs: (put 'declare-quality-list-header! 'scheme-indent-function 1)
 (define (declare-quality-list-header! name)
   (declare-header! name
@@ -1437,7 +1448,7 @@ treated specially, and is just returned as a plain string."
 
 ;; Content-Location = ( absoluteURI | relativeURI )
 ;;
-(declare-uri-header! "Content-Location")
+(declare-relative-uri-header! "Content-Location")
 
 ;; Content-MD5 = <base64 of 128 bit MD5 digest as per RFC 1864>
 ;;
@@ -1726,7 +1737,7 @@ treated specially, and is just returned as a plain string."
 
 ;; Referer = ( absoluteURI | relativeURI )
 ;;
-(declare-uri-header! "Referer")
+(declare-relative-uri-header! "Referer")
 
 ;; TE = #( t-codings )
 ;; t-codings = "trailers" | ( transfer-extension [ accept-params ] )

[-- Attachment #3: Type: text/plain, Size: 115 bytes --]


Once 2.0.7 is out, when you have more time, we can fix it cleanly.

How does that sound?

Thanks,
Ludo’.

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* bug#12827: [2.0.6] web client: fails to parse 404 header
  2012-11-26  0:15               ` Ludovic Courtès
@ 2012-11-26 23:13                 ` Ludovic Courtès
  2012-11-27  1:06                   ` Daniel Hartwig
  0 siblings, 1 reply; 18+ messages in thread
From: Ludovic Courtès @ 2012-11-26 23:13 UTC (permalink / raw)
  To: Daniel Hartwig; +Cc: 12827

ludo@gnu.org (Ludovic Courtès) skribis:

> So for now, I’d go with this patch, which fixes the bug at hand:

I just applied this patch as 261af76.

You’re welcome to polish support for URI references once 2.0.7 is out!
:-)

Ludo’.





^ permalink raw reply	[flat|nested] 18+ messages in thread

* bug#12827: [2.0.6] web client: fails to parse 404 header
  2012-11-26 23:13                 ` Ludovic Courtès
@ 2012-11-27  1:06                   ` Daniel Hartwig
  2012-11-27 12:50                     ` Ludovic Courtès
  0 siblings, 1 reply; 18+ messages in thread
From: Daniel Hartwig @ 2012-11-27  1:06 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 12827

On 27 November 2012 07:13, Ludovic Courtès <ludo@gnu.org> wrote:
> ludo@gnu.org (Ludovic Courtès) skribis:
>
>> So for now, I’d go with this patch, which fixes the bug at hand:
>
> I just applied this patch as 261af76.
>

+;; emacs: (put 'declare-relative-uri-header! 'scheme-indent-function 1)
+(define (declare-relative-uri-header! name)
+  (declare-header! name
+    (lambda (str)
+      (or (string->uri str)                       ; absolute URI
+          (build-uri #f                           ; relative URI
+                     #:path str
+                     #:validate? #f)))
+    uri?
+    write-uri))
+

Sorry for late response.

Setting uri-path to str will result in wrongly constructed uri
objects.  In practice, the relative uri will often have a query part.
In theory, they may also contain any other part of the uri except
scheme (which would make them absolute).





^ permalink raw reply	[flat|nested] 18+ messages in thread

* bug#12827: [2.0.6] web client: fails to parse 404 header
  2012-11-27  1:06                   ` Daniel Hartwig
@ 2012-11-27 12:50                     ` Ludovic Courtès
  2012-11-27 15:18                       ` Daniel Hartwig
  0 siblings, 1 reply; 18+ messages in thread
From: Ludovic Courtès @ 2012-11-27 12:50 UTC (permalink / raw)
  To: Daniel Hartwig; +Cc: 12827

Daniel Hartwig <mandyke@gmail.com> skribis:

> +;; emacs: (put 'declare-relative-uri-header! 'scheme-indent-function 1)
> +(define (declare-relative-uri-header! name)
> +  (declare-header! name
> +    (lambda (str)
> +      (or (string->uri str)                       ; absolute URI
> +          (build-uri #f                           ; relative URI
> +                     #:path str
> +                     #:validate? #f)))
> +    uri?
> +    write-uri))
> +
>
> Sorry for late response.
>
> Setting uri-path to str will result in wrongly constructed uri
> objects.  In practice, the relative uri will often have a query part.
> In theory, they may also contain any other part of the uri except
> scheme (which would make them absolute).

Sure.  But then again, the goal was just to have a hack that would solve
the problem initially reported here, while waiting for a proper fix.

I’m open to suggestions.  It seems to me that it’s either this or your
patches against (web uri).  WDYT?

Ludo’.





^ permalink raw reply	[flat|nested] 18+ messages in thread

* bug#12827: [2.0.6] web client: fails to parse 404 header
  2012-11-27 12:50                     ` Ludovic Courtès
@ 2012-11-27 15:18                       ` Daniel Hartwig
  2012-11-27 21:43                         ` Ludovic Courtès
  0 siblings, 1 reply; 18+ messages in thread
From: Daniel Hartwig @ 2012-11-27 15:18 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 12827

[-- Attachment #1: Type: text/plain, Size: 1017 bytes --]

On 27 November 2012 20:50, Ludovic Courtès <ludo@gnu.org> wrote:
> Sure.  But then again, the goal was just to have a hack that would solve
> the problem initially reported here, while waiting for a proper fix.

Avoiding an obvious parser error, but introducing subtle problems with
the objects.  The reported bug has been present since always with (web
uri).

>
> I’m open to suggestions.  It seems to me that it’s either this or your
> patches against (web uri).  WDYT?

I still don't like the API in those patches, and don't think it is
tested enough against, e.g., reading and writing all combinations of
headers and idempotent (?) across read-write cycles (a quick check
just showed that it wasn't, due to write-uri in (web http)).

Anyway, if you like to fix this bug, I have isolated the changes to
string->uri, just enough to handle these headers without introducing
any API changes (which can come later, after refinement).

Attached a patch against current stable-2.0.

Regards

[-- Attachment #2: 0001-web-client-correctly-handle-uri-query-etc.-in-relati.patch --]
[-- Type: application/octet-stream, Size: 4336 bytes --]

From 7d9f3002a5bc0b897618759359c91bc94cd1fdec Mon Sep 17 00:00:00 2001
From: Daniel Hartwig <mandyke@gmail.com>
Date: Tue, 27 Nov 2012 16:48:41 +0800
Subject: [PATCH] web client: correctly handle uri-query, etc. in relative URI
 headers

* module/web/uri.scm (string->uri*): New private procedure to also parse
  relative URIs.
* module/web/http.scm (declare-relative-uri-header!): Use that.
---
 module/web/http.scm |   12 +++---------
 module/web/uri.scm  |   30 ++++++++++++++++++++----------
 2 files changed, 23 insertions(+), 19 deletions(-)

diff --git a/module/web/http.scm b/module/web/http.scm
index f8dba30..216fddd 100644
--- a/module/web/http.scm
+++ b/module/web/http.scm
@@ -1182,21 +1182,15 @@ treated specially, and is just returned as a plain string."
 (define (declare-uri-header! name)
   (declare-header! name
     (lambda (str) (or (string->uri str) (bad-header-component 'uri str)))
-    uri?
+    (@@ (web uri) absolute-uri?)
     write-uri))
 
 ;; emacs: (put 'declare-relative-uri-header! 'scheme-indent-function 1)
 (define (declare-relative-uri-header! name)
   (declare-header! name
     (lambda (str)
-      ;; XXX: Attempt to build an absolute URI, and fall back to a URI
-      ;; with no scheme to represent a relative URI.
-      ;; See <http://bugs.gnu.org/12827> for ideas to fully support
-      ;; relative URIs (aka. "URI references").
-      (or (string->uri str)                       ; absolute URI
-          (build-uri #f                           ; relative URI
-                     #:path str
-                     #:validate? #f)))
+      (or ((@@ (web uri) string->uri*) str)
+          (bad-header-component 'uri str)))
     uri?
     write-uri))
 
diff --git a/module/web/uri.scm b/module/web/uri.scm
index 6ff0076..b688ea8 100644
--- a/module/web/uri.scm
+++ b/module/web/uri.scm
@@ -53,6 +53,9 @@
   (query uri-query)
   (fragment uri-fragment))
 
+(define (absolute-uri? x)
+  (and (uri? x) (uri-scheme x) #t))
+
 (define (uri-error message . args)
   (throw 'uri-error message args))
 
@@ -165,21 +168,21 @@ is valid."
 (define fragment-pat
   ".*")
 (define uri-pat
-  (format #f "^(~a):(//~a)?(~a)(\\?(~a))?(#(~a))?$"
+  (format #f "^((~a):)?(//~a)?(~a)(\\?(~a))?(#(~a))?$"
           scheme-pat authority-pat path-pat query-pat fragment-pat))
 (define uri-regexp
   (make-regexp uri-pat))
 
-(define (string->uri string)
+(define (string->uri* string)
   "Parse STRING into a URI object.  Return ‘#f’ if the string
 could not be parsed."
   (% (let ((m (regexp-exec uri-regexp string)))
        (if (not m) (abort))
-       (let ((scheme (string->symbol
-                      (string-downcase (match:substring m 1))))
-             (authority (match:substring m 2))
-             (path (match:substring m 3))
-             (query (match:substring m 5))
+       (let ((scheme (let ((str (match:substring m 2)))
+                       (and str (string->symbol (string-downcase str)))))
+             (authority (match:substring m 3))
+             (path (match:substring m 4))
+             (query (match:substring m 6))
              (fragment (match:substring m 7)))
          (call-with-values
              (lambda ()
@@ -191,6 +194,12 @@ could not be parsed."
      (lambda (k)
        #f)))
 
+(define (string->uri string)
+  "Parse STRING into a URI object.  Return ‘#f’ if the string
+could not be parsed."
+  (let ((uri (string->uri* string)))
+    (and uri (uri-scheme uri) uri)))
+
 (define *default-ports* (make-hash-table))
 
 (define (declare-default-port! scheme port)
@@ -208,8 +217,7 @@ could not be parsed."
   "Serialize URI to a string.  If the URI has a port that is the
 default port for its scheme, the port is not included in the
 serialization."
-  (let* ((scheme-str (string-append
-                      (symbol->string (uri-scheme uri)) ":"))
+  (let* ((scheme (uri-scheme uri))
          (userinfo (uri-userinfo uri))
          (host (uri-host uri))
          (port (uri-port uri))
@@ -217,7 +225,9 @@ serialization."
          (query (uri-query uri))
          (fragment (uri-fragment uri)))
     (string-append
-     scheme-str
+     (if scheme
+         (string-append (symbol->string scheme) ":")
+         "")
      (if host
          (string-append "//"
                         (if userinfo (string-append userinfo "@")
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* bug#12827: [2.0.6] web client: fails to parse 404 header
  2012-11-27 15:18                       ` Daniel Hartwig
@ 2012-11-27 21:43                         ` Ludovic Courtès
  0 siblings, 0 replies; 18+ messages in thread
From: Ludovic Courtès @ 2012-11-27 21:43 UTC (permalink / raw)
  To: Daniel Hartwig; +Cc: 12827

Hi,

Daniel Hartwig <mandyke@gmail.com> skribis:

> I still don't like the API in those patches, and don't think it is
> tested enough against, e.g., reading and writing all combinations of
> headers and idempotent (?) across read-write cycles (a quick check
> just showed that it wasn't, due to write-uri in (web http)).

Fair enough.

> Anyway, if you like to fix this bug, I have isolated the changes to
> string->uri, just enough to handle these headers without introducing
> any API changes (which can come later, after refinement).

Applied, thanks!

Ludo’.





^ permalink raw reply	[flat|nested] 18+ messages in thread

* bug#12827: [PATCH] Tweak web modules, support relative URIs (was: bug#12827: [2.0.6] web client: fails to parse 404 header)
  2012-11-07 20:40 bug#12827: [2.0.6] web client: fails to parse 404 header Ludovic Courtès
  2012-11-08  5:52 ` Daniel Hartwig
@ 2013-02-23  8:11 ` Daniel Hartwig
  1 sibling, 0 replies; 18+ messages in thread
From: Daniel Hartwig @ 2013-02-23  8:11 UTC (permalink / raw)
  To: guile-devel; +Cc: 12827, Ludovic Courtès

[-- Attachment #1: Type: text/plain, Size: 3250 bytes --]

On 24 November 2012 23:34, Daniel Hartwig <mandyke@gmail.com> wrote:
> The API seems less clean, and it is not immediately clear
> that uri? is not the top of the URI-like type hierarchy.  The other
> functions only indicate “uri” in their name.  I did not
> wish to introduce parallel “build-uri-reference”, etc. for each of
> these, and did consider adding #:reference? on some to select
> weaker validation.
>
> I think I prefer to keep the base type predicate as uri?, and use
> relative-ref? and absolute-uri? to distinguish.  This would mean
> deviating from the RFC by:
> - permitting fragments in absolute-uri?;
> - not requiring a scheme in uri?;
>
> which are both forgivable.
>
> Maybe I am just too fussy!

Hello now

Revisting this old thread, I have complete the bulk of the work now.
Attached are a few patches to generally clean up the web modules a
bit, followed by a larger one introducing the promised relative URI
support.

* Terminology

The terminology used in latest URI spec. (RFC 3986) is not widely used
elsewhere.  Not by Guile, not by the HTTP spec., or other sources.
Specifically, it defines these terms:

- URI: scheme rest ... [fragment]
- Absolute-URI: scheme rest ... [fragment]
- Relative-Ref: rest ... [fragment]
- URI-Reference: Absolute-URI | Relative-Ref

where as HTTP and other sources use the terms from an earlier URI
spec. (RFC 2396):

- Absolute-URI: scheme rest ... [fragment]
- Relative-URI: rest ... [fragment]
- URI, URI-Reference: Absolute-URI | Relative-URI

With this patch I have opted to use the later, more common terms.
This has the advantage of not requiring massive renaming or
duplicating of most procedures to include, e.g.
‘uri-reference-scheme’, as we just use the term ‘uri’ to refer to
either type.

If this is undesired, it can easily be reworked to use the terminology
from RFC 3986.

* API compatability

Presently, all APIs work only with absolute URIs.  You can not use
string->uri or build-uri to produce any relative URIs, neither are
other procedures (generally) expected to work correctly if given them.

What we have in this patch is that <uri> grows to encompass both
relative and absolute URIs.  ‘uri?’ is a general type predicate,
‘build-uri’ will produce and validate either type, and there are pairs
of converters and predicates to distinguish between relative and
absolute.

Effectively, a pseudo-type heirarchy, with uri? at the top and
absolute-uri? and relative-uri? beneath it.

* To be done

Barely touched request, response, client, or server modules.  Though
these will continue to work with current usage patterns and I have
added some notes about future work.

Also, I believe it will pay to extend http-get et. al to accept a
relative URI with separate Host header or keyword option.  Also allow
write-request-line to display exactly the URI passed to it, rather
than always chopping off the scheme and host (e.g. the HTTP spec.
allows such lines and they are require to write some types of proxy
software).

Coming along soon is a procedure to resolve a relative URI against a
base, absolute URI.  The same algorithm documented in the RFC.


Regards

[-- Attachment #2: 0001-minor-tweaks-to-web-documentation.patch --]
[-- Type: application/octet-stream, Size: 17969 bytes --]

From a25a817be56821d2656da43bcbae4f2a816a801f Mon Sep 17 00:00:00 2001
From: Daniel Hartwig <mandyke@gmail.com>
Date: Sat, 23 Feb 2013 13:39:14 +0800
Subject: [PATCH 1/4] minor tweaks to web documentation

* doc/ref/web.texi: Say `World Wide Web'; the hyphenated form is almost
  never used (c.f. w3.org).

  General predicate arguments are named `obj'.  Fill in arguments
  omitted from some procedure definitions (e.g. request-method).

  Minor tweaks, such as using en-dash and missing markup as appropriate.
  Wrap very long deffn lines.

  (HTTP): `parse-header' and `write-header' use `sym' to be consistent
  with other procedures and docstrings.

* module/web/*.scm: Expand texinfo markup in docstrings.  Synchronize
  with changes in web.texi.
---
 doc/ref/web.texi        |   89 +++++++++++++++++++++++++++--------------------
 module/web/client.scm   |    6 ++--
 module/web/http.scm     |    6 ++--
 module/web/response.scm |    6 ++--
 module/web/uri.scm      |   28 +++++++--------
 5 files changed, 73 insertions(+), 62 deletions(-)

diff --git a/doc/ref/web.texi b/doc/ref/web.texi
index 6c33f32..82ef31b 100644
--- a/doc/ref/web.texi
+++ b/doc/ref/web.texi
@@ -10,7 +10,7 @@
 @cindex HTTP
 
 It has always been possible to connect computers together and share
-information between them, but the rise of the World-Wide Web over the
+information between them, but the rise of the World Wide Web over the
 last couple of decades has made it much easier to do so.  The result is
 a richly connected network of computation, in which Guile forms a part.
 
@@ -206,9 +206,10 @@ The following procedures can be found in the @code{(web uri)}
 module. Load it into your Guile, using a form like the above, to have
 access to them.
 
-@deffn {Scheme Procedure} build-uri scheme [#:userinfo=@code{#f}] [#:host=@code{#f}] @
-       [#:port=@code{#f}] [#:path=@code{""}] [#:query=@code{#f}] @
-       [#:fragment=@code{#f}] [#:validate?=@code{#t}]
+@deffn {Scheme Procedure} build-uri scheme @
+       [#:userinfo=@code{#f}] [#:host=@code{#f}] [#:port=@code{#f}] @
+       [#:path=@code{""}] [#:query=@code{#f}] [#:fragment=@code{#f}] @
+       [#:validate?=@code{#t}]
 Construct a URI object.  @var{scheme} should be a symbol, @var{port}
 either a positive, exact integer or @code{#f}, and the rest of the
 fields are either strings or @code{#f}.  If @var{validate?} is true,
@@ -216,7 +217,7 @@ also run some consistency checks to make sure that the constructed URI
 is valid.
 @end deffn
 
-@deffn {Scheme Procedure} uri? x
+@deffn {Scheme Procedure} uri? obj
 @deffnx {Scheme Procedure} uri-scheme uri
 @deffnx {Scheme Procedure} uri-userinfo uri
 @deffnx {Scheme Procedure} uri-host uri
@@ -249,9 +250,9 @@ Percent-decode the given @var{str}, according to @var{encoding}, which
 should be the name of a character encoding.
 
 Note that this function should not generally be applied to a full URI
-string. For paths, use split-and-decode-uri-path instead. For query
-strings, split the query on @code{&} and @code{=} boundaries, and decode
-the components separately.
+string. For paths, use @code{split-and-decode-uri-path} instead. For
+query strings, split the query on @code{&} and @code{=} boundaries, and
+decode the components separately.
 
 Note also that percent-encoded strings encode @emph{bytes}, not
 characters.  There is no guarantee that a given byte sequence is a valid
@@ -378,7 +379,8 @@ For more on the set of headers that Guile knows about out of the box,
 @pxref{HTTP Headers}.  To add your own, use the @code{declare-header!}
 procedure:
 
-@deffn {Scheme Procedure} declare-header! name parser validator writer [#:multiple?=@code{#f}]
+@deffn {Scheme Procedure} declare-header! name parser validator writer @
+       [#:multiple?=@code{#f}]
 Declare a parser, validator, and writer for a given header.
 @end deffn
 
@@ -421,12 +423,12 @@ Returns the end-of-file object for both values if the end of the message
 body was reached (i.e., a blank line).
 @end deffn
 
-@deffn {Scheme Procedure} parse-header name val
+@deffn {Scheme Procedure} parse-header sym val
 Parse @var{val}, a string, with the parser for the header named
-@var{name}.  Returns the parsed value.
+@var{sym}.  Returns the parsed value.
 @end deffn
 
-@deffn {Scheme Procedure} write-header name val port
+@deffn {Scheme Procedure} write-header sym val port
 Write the given header name and value to @var{port}, using the writer
 from @code{header-writer}.
 @end deffn
@@ -450,7 +452,7 @@ like @code{GET}.
 @end deffn
 
 @deffn {Scheme Procedure} parse-http-version str [start] [end]
-Parse an HTTP version from @var{str}, returning it as a major-minor
+Parse an HTTP version from @var{str}, returning it as a major--minor
 pair. For example, @code{HTTP/1.1} parses as the pair of integers,
 @code{(1 . 1)}.
 @end deffn
@@ -471,7 +473,7 @@ Write the first line of an HTTP request to @var{port}.
 
 @deffn {Scheme Procedure} read-response-line port
 Read the first line of an HTTP response from @var{port}, returning three
-values: the HTTP version, the response code, and the "reason phrase".
+values: the HTTP version, the response code, and the ``reason phrase''.
 @end deffn
 
 @deffn {Scheme Procedure} write-response-line version code reason-phrase port
@@ -1130,13 +1132,13 @@ any loss of generality.
 
 @subsubsection Request API
 
-@deffn {Scheme Procedure} request? 
-@deffnx {Scheme Procedure} request-method 
-@deffnx {Scheme Procedure} request-uri 
-@deffnx {Scheme Procedure} request-version 
-@deffnx {Scheme Procedure} request-headers 
-@deffnx {Scheme Procedure} request-meta 
-@deffnx {Scheme Procedure} request-port 
+@deffn {Scheme Procedure} request? obj
+@deffnx {Scheme Procedure} request-method request
+@deffnx {Scheme Procedure} request-uri request
+@deffnx {Scheme Procedure} request-version request
+@deffnx {Scheme Procedure} request-headers request
+@deffnx {Scheme Procedure} request-meta request
+@deffnx {Scheme Procedure} request-port request
 A predicate and field accessors for the request type.  The fields are as
 follows:
 @table @code
@@ -1170,7 +1172,9 @@ request, you may read the body separately, and likewise for writing
 requests.
 @end deffn
 
-@deffn {Scheme Procedure} build-request uri [#:method='GET] [#:version='(1 . 1)] [#:headers='()] [#:port=#f] [#:meta='()] [#:validate-headers?=#t]
+@deffn {Scheme Procedure} build-request uri [#:method='GET] @
+       [#:version='(1 . 1)] [#:headers='()] [#:port=#f] [#:meta='()] @
+       [#:validate-headers?=#t]
 Construct an HTTP request object. If @var{validate-headers?} is true,
 the headers are each run through their respective validators.
 @end deffn
@@ -1237,7 +1241,8 @@ more information on the format of parsed headers.
 Return the given request header, or @var{default} if none was present.
 @end deffn
 
-@deffn {Scheme Procedure} request-absolute-uri r [default-host=#f] [default-port=#f]
+@deffn {Scheme Procedure} request-absolute-uri r @
+       [default-host=#f] [default-port=#f]
 A helper routine to determine the absolute URI of a request, using the
 @code{host} header and the default host and port.
 @end deffn
@@ -1253,12 +1258,12 @@ A helper routine to determine the absolute URI of a request, using the
 As with requests (@pxref{Requests}), Guile offers a data type for HTTP
 responses.  Again, the body is represented separately from the request.
 
-@deffn {Scheme Procedure} response? 
-@deffnx {Scheme Procedure} response-version 
-@deffnx {Scheme Procedure} response-code 
+@deffn {Scheme Procedure} response? obj
+@deffnx {Scheme Procedure} response-version response
+@deffnx {Scheme Procedure} response-code response
 @deffnx {Scheme Procedure} response-reason-phrase response
-@deffnx {Scheme Procedure} response-headers 
-@deffnx {Scheme Procedure} response-port 
+@deffnx {Scheme Procedure} response-headers response
+@deffnx {Scheme Procedure} response-port response
 A predicate and field accessors for the response type.  The fields are as
 follows:
 @table @code
@@ -1284,7 +1289,9 @@ As a side effect, sets the encoding on @var{port} to ISO-8859-1
 discussion of character sets in @ref{Responses}, for more information.
 @end deffn
 
-@deffn {Scheme Procedure} build-response [#:version='(1 . 1)] [#:code=200] [#:reason-phrase=#f] [#:headers='()] [#:port=#f] [#:validate-headers?=#t]
+@deffn {Scheme Procedure} build-response [#:version='(1 . 1)] @
+       [#:code=200] [#:reason-phrase=#f] [#:headers='()] [#:port=#f] @
+       [#:validate-headers?=#t]
 Construct an HTTP response object. If @var{validate-headers?} is true,
 the headers are each run through their respective validators.
 @end deffn
@@ -1384,6 +1391,10 @@ Return @code{#t} if @var{type}, a symbol as returned by
 @code{(web client)} provides a simple, synchronous HTTP client, built on
 the lower-level HTTP, request, and response modules.
 
+@example
+(use-modules (web client))
+@end example
+
 @deffn {Scheme Procedure} open-socket-for-uri uri
 Return an open input/output port for a connection to URI.
 @end deffn
@@ -1419,9 +1430,9 @@ If you already have a port open, pass it as @var{port}.  Otherwise, a
 connection will be opened to the server corresponding to @var{uri}.  Any
 extra headers in the alist @var{headers} will be added to the request.
 
-If @var{body} is not #f, a message body will also be sent with the HTTP
-request.  If @var{body} is a string, it is encoded according to the
-content-type in @var{headers}, defaulting to UTF-8.  Otherwise
+If @var{body} is not @code{#f}, a message body will also be sent with
+the HTTP request.  If @var{body} is a string, it is encoded according to
+the content-type in @var{headers}, defaulting to UTF-8.  Otherwise
 @var{body} should be a bytevector, or @code{#f} for no body.  Although a
 message body may be sent with any request, usually only @code{POST} and
 @code{PUT} requests have bodies.
@@ -1480,8 +1491,8 @@ The life cycle of a server goes as follows:
 
 @enumerate
 @item
-The @code{open} hook is called, to open the server. @code{open} takes 0 or
-more arguments, depending on the backend, and returns an opaque
+The @code{open} hook is called, to open the server. @code{open} takes
+zero or more arguments, depending on the backend, and returns an opaque
 server socket object, or signals an error.
 
 @item
@@ -1578,8 +1589,8 @@ in, allowing the user's handler to explicitly manage its state.
 @end deffn
 
 @deffn {Scheme Procedure} sanitize-response request response body
-"Sanitize" the given response and body, making them appropriate for the
-given request.
+``Sanitize'' the given response and body, making them appropriate for
+the given request.
 
 As a convenience to web handler authors, @var{response} may be given as
 an alist of headers, in which case it is used to construct a default
@@ -1615,7 +1626,8 @@ and body, and write the response to the client.  Return the new state
 produced by the handler procedure.
 @end deffn
 
-@deffn {Scheme Procedure} run-server handler [impl='http] [open-params='()] . state
+@deffn {Scheme Procedure} run-server handler [impl='http] @
+       [open-params='()] . state
 Run Guile's built-in web server.
 
 @var{handler} should be a procedure that takes two or more arguments,
@@ -1636,7 +1648,8 @@ explicitly manage its state.
 The default web server implementation is @code{http}, which binds to a
 socket, listening for request on that port.
 
-@deffn {HTTP Implementation} http [#:host=#f] [#:family=AF_INET] [#:addr=INADDR_LOOPBACK] [#:port 8080] [#:socket]
+@deffn {HTTP Implementation} http [#:host=#f] [#:family=AF_INET] @
+       [#:addr=INADDR_LOOPBACK] [#:port 8080] [#:socket]
 The default HTTP implementation.  We document it as a function with
 keyword arguments, because that is precisely the way that it is -- all
 of the @var{open-params} to @code{run-server} get passed to the
diff --git a/module/web/client.scm b/module/web/client.scm
index 9fbb25b..7d5ea49 100644
--- a/module/web/client.scm
+++ b/module/web/client.scm
@@ -248,10 +248,10 @@ pass it as PORT.  The port will be closed at the end of the
 request unless KEEP-ALIVE? is true.  Any extra headers in the
 alist HEADERS will be added to the request.
 
-If BODY is not #f, a message body will also be sent with the HTTP
+If BODY is not ‘#f’, a message body will also be sent with the HTTP
 request.  If BODY is a string, it is encoded according to the
 content-type in HEADERS, defaulting to UTF-8.  Otherwise BODY should be
-a bytevector, or #f for no body.  Although it's allowed to send a
+a bytevector, or ‘#f’ for no body.  Although it's allowed to send a
 message body along with any request, usually only POST and PUT requests
 have bodies.  See ‘http-put’ and ‘http-post’ documentation, for more.
 
@@ -317,7 +317,7 @@ This function is similar to ‘http-get’, except it uses the \"HEAD\"
 method.  See ‘http-get’ for full documentation on the various keyword
 arguments that are accepted by this function.
 
-Returns two values: the resulting response, and #f.  Responses to HEAD
+Returns two values: the resulting response, and ‘#f’.  Responses to HEAD
 requests do not have a body.  The second value is only returned so that
 other procedures can treat all of the http-foo verbs identically.")
 
diff --git a/module/web/http.scm b/module/web/http.scm
index c79d57d..712208b 100644
--- a/module/web/http.scm
+++ b/module/web/http.scm
@@ -167,7 +167,7 @@ The default writer is ‘display’."
 (define *eof* (call-with-input-string "" read))
 
 (define (read-header port)
-  "Reads one HTTP header from PORT. Returns two values: the header
+  "Read one HTTP header from PORT. Return two values: the header
 name and the parsed Scheme value. May raise an exception if the header
 was known but the value was invalid.
 
@@ -220,7 +220,7 @@ as an ordered alist."
 
 (define (write-headers headers port)
   "Write the given header alist to PORT.  Doesn't write the final
-@samp{\\r\\n}, as the user might want to add another header."
+‘\\r\\n’, as the user might want to add another header."
   (let lp ((headers headers))
     (if (pair? headers)
         (begin
@@ -971,7 +971,7 @@ as an ordered alist."
 (define *known-versions* '())
 
 (define* (parse-http-version str #:optional (start 0) (end (string-length str)))
-  "Parse an HTTP version from STR, returning it as a major-minor
+  "Parse an HTTP version from STR, returning it as a major–minor
 pair. For example, ‘HTTP/1.1’ parses as the pair of integers,
 ‘(1 . 1)’."
   (or (let lp ((known *known-versions*))
diff --git a/module/web/response.scm b/module/web/response.scm
index 7e14f4d..de38abc 100644
--- a/module/web/response.scm
+++ b/module/web/response.scm
@@ -267,10 +267,10 @@ closes PORT, unless KEEP-ALIVE? is true."
 (define* (response-body-port r #:key (decode? #t) (keep-alive? #t))
   "Return an input port from which the body of R can be read.  The
 encoding of the returned port is set according to R's ‘content-type’
-header, when it's textual, except if DECODE? is #f.  Return #f when no
-body is available.
+header, when it's textual, except if DECODE? is ‘#f’.  Return ‘#f’ when
+no body is available.
 
-When KEEP-ALIVE? is #f, closing the returned port also closes R's
+When KEEP-ALIVE? is ‘#f’, closing the returned port also closes R's
 response port."
   (define port
     (if (member '(chunked) (response-transfer-encoding r))
diff --git a/module/web/uri.scm b/module/web/uri.scm
index 25406b3..33db1d1 100644
--- a/module/web/uri.scm
+++ b/module/web/uri.scm
@@ -53,8 +53,8 @@
   (query uri-query)
   (fragment uri-fragment))
 
-(define (absolute-uri? x)
-  (and (uri? x) (uri-scheme x) #t))
+(define (absolute-uri? obj)
+  (and (uri? obj) (uri-scheme obj) #t))
 
 (define (uri-error message . args)
   (throw 'uri-error message args))
@@ -309,17 +309,16 @@ serialization."
 which should be the name of a character encoding.
 
 Note that this function should not generally be applied to a full URI
-string. For paths, use split-and-decode-uri-path instead. For query
+string. For paths, use ‘split-and-decode-uri-path’ instead. For query
 strings, split the query on ‘&’ and ‘=’ boundaries, and decode
 the components separately.
 
-Note also that percent-encoded strings encode @emph{bytes}, not
-characters.  There is no guarantee that a given byte sequence is a valid
-string encoding. Therefore this routine may signal an error if the
-decoded bytes are not valid for the given encoding. Pass ‘#f’ for
-ENCODING if you want decoded bytes as a bytevector directly.
-@xref{Ports, ‘set-port-encoding!’}, for more information on
-character encodings.
+Note also that percent-encoded strings encode _bytes_, not characters.
+There is no guarantee that a given byte sequence is a valid string
+encoding. Therefore this routine may signal an error if the decoded
+bytes are not valid for the given encoding. Pass ‘#f’ for ENCODING if
+you want decoded bytes as a bytevector directly.  See
+‘set-port-encoding!’ for more information on character encodings.
 
 Returns a string of the decoded characters, or a bytevector if
 ENCODING was ‘#f’."
@@ -380,11 +379,10 @@ ENCODING was ‘#f’."
 UNESCAPED-CHARS.
 
 The default character set includes alphanumerics from ASCII, as well as
-the special characters @samp{-}, @samp{.}, @samp{_}, and @samp{~}.  Any
-other character will be percent-encoded, by writing out the character to
-a bytevector within the given ENCODING, then encoding each byte as
-‘%HH’, where HH is the hexadecimal representation of
-the byte."
+the special characters ‘-’, ‘.’, ‘_’, and ‘~’.  Any other character will
+be percent-encoded, by writing out the character to a bytevector within
+the given ENCODING, then encoding each byte as ‘%HH’, where HH is the
+hexadecimal representation of the byte."
   (define (needs-escaped? ch)
     (not (char-set-contains? unescaped-chars ch)))
   (if (string-index str needs-escaped?)
-- 
1.7.10.4


[-- Attachment #3: 0002-web-public-access-to-default-port-information.patch --]
[-- Type: application/octet-stream, Size: 3078 bytes --]

From c5b77b2735f688182846c9955eaf451f49d71bef Mon Sep 17 00:00:00 2001
From: Daniel Hartwig <mandyke@gmail.com>
Date: Sat, 23 Feb 2013 13:51:50 +0800
Subject: [PATCH 2/4] web: public access to default-port information

* module/web/uri.scm (default-port): New procedure.
  (default-port?): Export.  Add docstring.

* module/web/http.scm (write-uri): Use `default-port?'.

* doc/ref/web.texi (Universal Resouce Identifiers): Document.
---
 doc/ref/web.texi    |   11 +++++++++++
 module/web/http.scm |    2 +-
 module/web/uri.scm  |   12 ++++++++++--
 3 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/doc/ref/web.texi b/doc/ref/web.texi
index 82ef31b..963ae3f 100644
--- a/doc/ref/web.texi
+++ b/doc/ref/web.texi
@@ -245,6 +245,17 @@ serialization.
 Declare a default port for the given URI scheme.
 @end deffn
 
+@deffn {Scheme Procedure} default-port scheme
+Return the default port for @var{scheme} if one has been declared,
+otherwise @code{#f}.
+@end deffn
+
+@deffn {Scheme Procedure} default-port? scheme port
+Return @code{#t} when @var{port} @emph{matches} the default port for
+@var{scheme}.  Note that @var{port} may be @code{#f}, which implies a
+match and consequent return value of @code{#t}.
+@end deffn
+
 @deffn {Scheme Procedure} uri-decode str [#:encoding=@code{"utf-8"}]
 Percent-decode the given @var{str}, according to @var{encoding}, which
 should be the name of a character encoding.
diff --git a/module/web/http.scm b/module/web/http.scm
index 712208b..5671330 100644
--- a/module/web/http.scm
+++ b/module/web/http.scm
@@ -1064,7 +1064,7 @@ three values: the method, the URI, and the version."
               (display #\@ port)))
         (display (uri-host uri) port)
         (let ((p (uri-port uri)))
-          (if (and p (not (eqv? p 80)))
+          (if (not (default-port? (uri-scheme uri) p))
               (begin
                 (display #\: port)
                 (display p port))))))
diff --git a/module/web/uri.scm b/module/web/uri.scm
index 33db1d1..2e8c4a6 100644
--- a/module/web/uri.scm
+++ b/module/web/uri.scm
@@ -36,7 +36,7 @@
             uri-path uri-query uri-fragment
 
             build-uri
-            declare-default-port!
+            declare-default-port! default-port default-port?
             string->uri uri->string
             uri-decode uri-encode
             split-and-decode-uri-path
@@ -206,9 +206,17 @@ could not be parsed."
   "Declare a default port for the given URI scheme."
   (hashq-set! *default-ports* scheme port))
 
+(define (default-port scheme)
+  "Return the default port for SCHEME if one has been declared,
+otherwise ‘#f’."
+  (hashq-ref *default-ports* scheme))
+
 (define (default-port? scheme port)
+  "Return ‘#t’ when PORT _matches_ the default port for SCHEME.  Note
+that PORT may be ‘#f’, which implies a match and consequent return value
+of ‘#t’."
   (or (not port)
-      (eqv? port (hashq-ref *default-ports* scheme))))
+      (eqv? port (default-port scheme))))
 
 (declare-default-port! 'http 80)
 (declare-default-port! 'https 443)
-- 
1.7.10.4


[-- Attachment #4: 0003-add-tests-for-read-request-line-etc.patch --]
[-- Type: application/octet-stream, Size: 5619 bytes --]

From 2bf9660cfdf7c4f0ffbc10a80433910eb2d23bf6 Mon Sep 17 00:00:00 2001
From: Daniel Hartwig <mandyke@gmail.com>
Date: Sat, 23 Feb 2013 15:15:33 +0800
Subject: [PATCH 3/4] add tests for read-request-line, etc.

* test-suite/web/web-http.test ("read-request-line"):
  ("write-request-line", "read-response-line", "write-response-line"):
  Add.
---
 test-suite/tests/web-http.test |  107 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 107 insertions(+)

diff --git a/test-suite/tests/web-http.test b/test-suite/tests/web-http.test
index 97f5559..6fa16bd 100644
--- a/test-suite/tests/web-http.test
+++ b/test-suite/tests/web-http.test
@@ -85,6 +85,113 @@
                #t
                (error "unexpected exception" component arg))))))))
 
+(define-syntax pass-if-read-request-line
+  (syntax-rules ()
+    ((_ str expected-method expected-uri expected-version)
+     (pass-if str
+       (equal? (call-with-values
+                   (lambda ()
+                     (read-request-line (open-input-string
+                                         (string-append str "\r\n"))))
+                 list)
+               (list 'expected-method
+                     expected-uri
+                     'expected-version))))))
+
+(define-syntax pass-if-write-request-line
+  (syntax-rules ()
+    ((_ expected-str method uri version)
+     (pass-if expected-str
+       (equal? (string-append expected-str "\r\n")
+               (call-with-output-string
+                (lambda (port)
+                  (write-request-line 'method uri 'version port))))))))
+
+(define-syntax pass-if-read-response-line
+  (syntax-rules ()
+    ((_ str expected-version expected-code expected-phrase)
+     (pass-if str
+       (equal? (call-with-values
+                   (lambda ()
+                     (read-response-line (open-input-string
+                                          (string-append str "\r\n"))))
+                 list)
+               (list 'expected-version
+                     expected-code
+                     expected-phrase))))))
+
+(define-syntax pass-if-write-response-line
+  (syntax-rules ()
+    ((_ expected-str version code phrase)
+     (pass-if expected-str
+       (equal? (string-append expected-str "\r\n")
+               (call-with-output-string
+                (lambda (port)
+                  (write-response-line 'version code phrase port))))))))
+
+(with-test-prefix "read-request-line"
+  (pass-if-read-request-line "GET / HTTP/1.1"
+                             GET
+                             (build-uri 'http
+                                        #:path "/")
+                             (1 . 1))
+  (pass-if-read-request-line "GET http://www.w3.org/pub/WWW/TheProject.html HTTP/1.1"
+                             GET
+                             (build-uri 'http
+                                        #:host "www.w3.org"
+                                        #:path "/pub/WWW/TheProject.html")
+                             (1 . 1))
+  (pass-if-read-request-line "GET /pub/WWW/TheProject.html HTTP/1.1"
+                             GET
+                             (build-uri 'http
+                                        #:path "/pub/WWW/TheProject.html")
+                             (1 . 1))
+  (pass-if-read-request-line "HEAD /etc/hosts?foo=bar HTTP/1.1"
+                             HEAD
+                             (build-uri 'http
+                                        #:path "/etc/hosts"
+                                        #:query "foo=bar")
+                             (1 . 1)))
+
+(with-test-prefix "write-request-line"
+  (pass-if-write-request-line "GET / HTTP/1.1"
+                              GET
+                              (build-uri 'http
+                                         #:path "/")
+                              (1 . 1))
+  ;;; FIXME: Test fails due to scheme, host always being removed.
+  ;;; However, it should be supported to request these be present, and
+  ;;; that is possible with absolute/relative URI support.
+  ;; (pass-if-write-request-line "GET http://www.w3.org/pub/WWW/TheProject.html HTTP/1.1"
+  ;;                             GET
+  ;;                             (build-uri 'http
+  ;;                                        #:host "www.w3.org"
+  ;;                                        #:path "/pub/WWW/TheProject.html")
+  ;;                             (1 . 1))
+  (pass-if-write-request-line "GET /pub/WWW/TheProject.html HTTP/1.1"
+                              GET
+                              (build-uri 'http
+                                         #:path "/pub/WWW/TheProject.html")
+                              (1 . 1))
+  (pass-if-write-request-line "HEAD /etc/hosts?foo=bar HTTP/1.1"
+                              HEAD
+                              (build-uri 'http
+                                         #:path "/etc/hosts"
+                                         #:query "foo=bar")
+                              (1 . 1)))
+
+(with-test-prefix "read-response-line"
+  (pass-if-read-response-line "HTTP/1.0 404 Not Found"
+                              (1 . 0) 404 "Not Found")
+  (pass-if-read-response-line "HTTP/1.1 200 OK"
+                              (1 . 1) 200 "OK"))
+
+(with-test-prefix "write-response-line"
+  (pass-if-write-response-line "HTTP/1.0 404 Not Found"
+                               (1 . 0) 404 "Not Found")
+  (pass-if-write-response-line "HTTP/1.1 200 OK"
+                               (1 . 1) 200 "OK"))
+
 (with-test-prefix "general headers"
 
   (pass-if-parse cache-control "no-transform" '(no-transform))
-- 
1.7.10.4


[-- Attachment #5: 0004-extend-support-for-relative-URIs-to-public-interface.patch --]
[-- Type: application/octet-stream, Size: 22649 bytes --]

From b70bcaba50bf208a81841be2550b3f187967114c Mon Sep 17 00:00:00 2001
From: Daniel Hartwig <mandyke@gmail.com>
Date: Sat, 23 Feb 2013 15:49:20 +0800
Subject: [PATCH 4/4] extend support for relative URIs to public interface

* module/web/uri.scm (absolute-uri?, relative-uri?): New predicates.

  (validate-uri): SCHEME can be `#f' as for a relative URI.
  (build-uri): SCHEME is optional.  Validation is relaxed to also
  permit relative URIs.

  (string->uri): Fold in `string->uri*' and parse _any_ URI string,
  absolute or relative.
  (string->absolute-uri, string->relative-uri): New procedures.  Parse
  only strings representing the specific type of URI.

* module/web/http.scm (declare-absolute-uri-header!): Rename from
  `declare-uri-header!' to reflect new terminology.  Use new public uri
  interfaces to parse and validate.
  (declare-uri-header!): Rename from `declare-relative-uri-header!' to
  reflect new terminology.  Use new public uri interfaces to parsing.

  (write-uri): Do not display an absent `uri-scheme', however, do
  display the scheme even when `uri-host' is absent.  Add note to look
  at using `uri->string'.

  (parse-request-uri): Use new `string->absolute-uri' in the else
  clause.  No longer imply that scheme is `http', which was wrong for
  HTTPS.  Add note to later support "authority" form used by the CONNECT
  method.

* test-suite/tests/web-http.test ("read-request-line"):
  ("write-request-line"): Remove scheme from URIs as appropriate.

  ("entity headers", "request headers"): content-location and refer
  should also parse relative URIs.
  ("response headers"): location should not parse relative URIs.

* test-suite/tests/web-request.test ("example-1"): Do not expect any
  uri-scheme.

* test-suite/tests/web-uri.test ("build-uri"): Add tests for relative
  URIs.
  ("string->uri", "uri->string"): Add symmetric tests for handling of
  relative URIs.
  ("string->absolute-uri", "string->relative-uri"): New tests.

* doc/ref/web.texi (URIs): Add introductory note about absolute
  vs. relative URIs.  Document new procedures.

  (HTTP Headers): Note that some headers are absolute URIs.
---
 doc/ref/web.texi                  |   43 ++++++++++++----
 module/web/http.scm               |   40 ++++++++------
 module/web/uri.scm                |   39 +++++++++-----
 test-suite/tests/web-http.test    |   33 +++++++-----
 test-suite/tests/web-request.test |    2 +-
 test-suite/tests/web-uri.test     |  103 +++++++++++++++++++++++++++++++++++--
 6 files changed, 205 insertions(+), 55 deletions(-)

diff --git a/doc/ref/web.texi b/doc/ref/web.texi
index 963ae3f..7cfd9e5 100644
--- a/doc/ref/web.texi
+++ b/doc/ref/web.texi
@@ -198,6 +198,15 @@ fragment identifies a part of a resource, not the resource itself.  But
 it is useful to have a fragment field in the URI record itself, so we
 hope you will forgive the inconsistency.
 
+Finally, there are URI-like objects that omit the scheme part.
+Depending on these reference, these either are or are not considered
+proper URIs.  In Guile there is a single URI record type that holds any
+URI-like object.  This manual uses the term @dfn{absolute URI} to refer
+to a URI object with a scheme, and @dfn{relative URI} to refer to one
+without.  In cases where either type will do, the term @dfn{URI} is
+used.  For example, @indicateurl{/path/to/foo} is a relative URI, where
+as all of the previous examples are absolute URIs.
+
 @example
 (use-modules (web uri))
 @end example
@@ -206,15 +215,15 @@ The following procedures can be found in the @code{(web uri)}
 module. Load it into your Guile, using a form like the above, to have
 access to them.
 
-@deffn {Scheme Procedure} build-uri scheme @
+@deffn {Scheme Procedure} build-uri [scheme] @
        [#:userinfo=@code{#f}] [#:host=@code{#f}] [#:port=@code{#f}] @
        [#:path=@code{""}] [#:query=@code{#f}] [#:fragment=@code{#f}] @
        [#:validate?=@code{#t}]
-Construct a URI object.  @var{scheme} should be a symbol, @var{port}
-either a positive, exact integer or @code{#f}, and the rest of the
-fields are either strings or @code{#f}.  If @var{validate?} is true,
-also run some consistency checks to make sure that the constructed URI
-is valid.
+Construct a URI object.  @var{scheme} should be a symbol, @var{port} a
+positive, exact integer, and the rest of the fields are strings.  Any
+field except @var{port} may also be @code{#f} to indicate it is not
+present.  If @var{validate?} is true, also run some consistency checks
+to make sure that the constructed URI is valid.
 @end deffn
 
 @deffn {Scheme Procedure} uri? obj
@@ -226,8 +235,13 @@ is valid.
 @deffnx {Scheme Procedure} uri-query uri
 @deffnx {Scheme Procedure} uri-fragment uri
 A predicate and field accessors for the URI record type.  The URI scheme
-will be a symbol, the port either a positive, exact integer or @code{#f},
-and the rest either strings or @code{#f} if not present.
+will be a symbol, the port a positive, exact integer, and the rest
+either strings.  Any field except port may be @code{#f} if not present.
+@end deffn
+
+@deffn {Scheme Procedure} absolute-uri? obj
+@deffnx {Scheme Procedure} relative-uri? obj
+Return @code{#t} iff @var{obj} is a URI object of the indicated type.
 @end deffn
 
 @deffn {Scheme Procedure} string->uri string
@@ -235,6 +249,12 @@ Parse @var{string} into a URI object.  Return @code{#f} if the string
 could not be parsed.
 @end deffn
 
+@deffn {Scheme Procedure} string->absolute-uri string
+@deffnx {Scheme Procedure} string->relative-uri string
+Parse @var{string} into a URI object of the indicated type.  Return
+@code{#f} if the string could not be parsed.
+@end deffn
+
 @deffn {Scheme Procedure} uri->string uri
 Serialize @var{uri} to a string.  If the URI has a port that is the
 default port for its scheme, the port is not included in the
@@ -986,9 +1006,10 @@ The entity-tag of the resource.
 @end example
 @end deftypevr
 
-@deftypevr {HTTP Header} URI location
-A URI on which a request may be completed.  Used in combination with a
-redirecting status code to perform client-side redirection.
+@deftypevr {HTTP Header} Absolute-URI location
+An absolute URI on which a request may be completed.  Used in
+combination with a redirecting status code to perform client-side
+redirection.
 @example
 (parse-header 'location "http://example.com/other")
 @result{} #<uri ...>
diff --git a/module/web/http.scm b/module/web/http.scm
index 5671330..2db64d0 100644
--- a/module/web/http.scm
+++ b/module/web/http.scm
@@ -1021,6 +1021,9 @@ symbol, like ‘GET’."
    ((string= str "TRACE" start end) 'TRACE)
    (else (bad-request "Invalid method: ~a" (substring str start end)))))
 
+;; FIXME: At the moment we do not support the CONNECT form (see above)
+;; but we should and this requires to support a request-URI which is
+;; just "authority" (i.e. [userinfo "@"] host [":" port]).
 (define* (parse-request-uri str #:optional (start 0) (end (string-length str)))
   "Parse a URI from an HTTP request line.  Note that URIs in requests do
 not have to have a scheme or host name.  The result is a URI object."
@@ -1033,12 +1036,11 @@ not have to have a scheme or host name.  The result is a URI object."
     (let* ((q (string-index str #\? start end))
            (f (string-index str #\# start end))
            (q (and q (or (not f) (< q f)) q)))
-      (build-uri 'http
-                 #:path (substring str start (or q f end))
+      (build-uri #:path (substring str start (or q f end))
                  #:query (and q (substring str (1+ q) (or f end)))
                  #:fragment (and f (substring str (1+ f) end)))))
    (else
-    (or (string->uri (substring str start end))
+    (or (string->absolute-uri (substring str start end))
         (bad-request "Invalid URI: ~a" (substring str start end))))))
 
 (define (read-request-line port)
@@ -1053,11 +1055,17 @@ three values: the method, the URI, and the version."
                 (parse-http-version line (1+ d1) (string-length line)))
         (bad-request "Bad Request-Line: ~s" line))))
 
+;; FIXME: The validation here should be reconsidered and moved to
+;; individual header validators if they do not already covered.  Then
+;; this procedure should be using uri->string.
 (define (write-uri uri port)
-  (if (uri-host uri)
+  (if (uri-scheme uri)
       (begin
         (display (uri-scheme uri) port)
-        (display "://" port)
+        (display #\:)))
+  (if (uri-host uri)
+      (begin
+        (display "//" port)
         (if (uri-userinfo uri)
             (begin
               (display (uri-userinfo uri) port)
@@ -1171,18 +1179,20 @@ treated specially, and is just returned as a plain string."
   (declare-header! name
     parse-non-negative-integer non-negative-integer? display))
 
-;; emacs: (put 'declare-uri-header! 'scheme-indent-function 1)
-(define (declare-uri-header! name)
+;; emacs: (put 'declare-absolute-uri-header! 'scheme-indent-function 1)
+(define (declare-absolute-uri-header! name)
   (declare-header! name
-    (lambda (str) (or (string->uri str) (bad-header-component 'uri str)))
-    (@@ (web uri) absolute-uri?)
+    (lambda (str)
+      (or (string->absolute-uri str)
+          (bad-header-component 'absolute-uri str)))
+    absolute-uri?
     write-uri))
 
-;; emacs: (put 'declare-relative-uri-header! 'scheme-indent-function 1)
-(define (declare-relative-uri-header! name)
+;; emacs: (put 'declare-uri-header! 'scheme-indent-function 1)
+(define (declare-uri-header! name)
   (declare-header! name
     (lambda (str)
-      (or ((@@ (web uri) string->uri*) str)
+      (or (string->uri str)
           (bad-header-component 'uri str)))
     uri?
     write-uri))
@@ -1449,7 +1459,7 @@ treated specially, and is just returned as a plain string."
 
 ;; Content-Location = ( absoluteURI | relativeURI )
 ;;
-(declare-relative-uri-header! "Content-Location")
+(declare-uri-header! "Content-Location")
 
 ;; Content-MD5 = <base64 of 128 bit MD5 digest as per RFC 1864>
 ;;
@@ -1738,7 +1748,7 @@ treated specially, and is just returned as a plain string."
 
 ;; Referer = ( absoluteURI | relativeURI )
 ;;
-(declare-relative-uri-header! "Referer")
+(declare-uri-header! "Referer")
 
 ;; TE = #( t-codings )
 ;; t-codings = "trailers" | ( transfer-extension [ accept-params ] )
@@ -1775,7 +1785,7 @@ treated specially, and is just returned as a plain string."
 
 ;; Location = absoluteURI
 ;; 
-(declare-uri-header! "Location")
+(declare-absolute-uri-header! "Location")
 
 ;; Proxy-Authenticate = 1#challenge
 ;;
diff --git a/module/web/uri.scm b/module/web/uri.scm
index 2e8c4a6..440e620 100644
--- a/module/web/uri.scm
+++ b/module/web/uri.scm
@@ -31,13 +31,14 @@
   #:use-module (ice-9 control)
   #:use-module (rnrs bytevectors)
   #:use-module (ice-9 binary-ports)
-  #:export (uri?
+  #:export (uri? absolute-uri? relative-uri?
             uri-scheme uri-userinfo uri-host uri-port
             uri-path uri-query uri-fragment
 
             build-uri
             declare-default-port! default-port default-port?
             string->uri uri->string
+            string->absolute-uri string->relative-uri
             uri-decode uri-encode
             split-and-decode-uri-path
             encode-and-join-uri-path))
@@ -54,8 +55,13 @@
   (fragment uri-fragment))
 
 (define (absolute-uri? obj)
+  "Return ‘#t’ iff OBJ is an absolute URI object."
   (and (uri? obj) (uri-scheme obj) #t))
 
+(define (relative-uri? obj)
+  "Return ‘#t’ iff OBJ is a relative URI object."
+  (and (uri? obj) (not (uri-scheme obj))))
+
 (define (uri-error message . args)
   (throw 'uri-error message args))
 
@@ -64,7 +70,7 @@
 
 (define (validate-uri scheme userinfo host port path query fragment)
   (cond
-   ((not (symbol? scheme))
+   ((and scheme (not (symbol? scheme)))
     (uri-error "Expected a symbol for the URI scheme: ~s" scheme))
    ((and (or userinfo port) (not host))
     (uri-error "Expected a host, given userinfo or port"))
@@ -80,13 +86,14 @@
          (not (eqv? (string-ref path 0) #\/)))
     (uri-error "Expected path of absolute URI to start with a /: ~a" path))))
 
-(define* (build-uri scheme #:key userinfo host port (path "") query fragment
+(define* (build-uri #:optional scheme #:key
+                    userinfo host port (path "") query fragment
                     (validate? #t))
-  "Construct a URI object.  SCHEME should be a symbol, PORT
-either a positive, exact integer or ‘#f’, and the rest of the
-fields are either strings or ‘#f’.  If VALIDATE? is true,
-also run some consistency checks to make sure that the constructed URI
-is valid."
+  "Construct a URI object.  SCHEME should be a symbol, PORT a positive,
+exact integer, and the rest of the fields are strings.  Any field except
+PORT may also be ‘#f’ to indicate it is not present.  If VALIDATE? is
+true, also run some consistency checks to make sure that the constructed
+URI is valid."
   (if validate?
       (validate-uri scheme userinfo host port path query fragment))
   (make-uri scheme userinfo host port path query fragment))
@@ -173,7 +180,7 @@ is valid."
 (define uri-regexp
   (make-regexp uri-pat))
 
-(define (string->uri* string)
+(define (string->uri string)
   "Parse STRING into a URI object.  Return ‘#f’ if the string
 could not be parsed."
   (% (let ((m (regexp-exec uri-regexp string)))
@@ -194,11 +201,17 @@ could not be parsed."
      (lambda (k)
        #f)))
 
-(define (string->uri string)
-  "Parse STRING into a URI object.  Return ‘#f’ if the string
+(define (string->absolute-uri string)
+  "Parse STRING into an absolute URI object.  Return ‘#f’ if the string
+could not be parsed."
+  (let ((uri (string->uri string)))
+    (and (absolute-uri? uri) uri)))
+
+(define (string->relative-uri string)
+  "Parse STRING into a relative URI object.  Return ‘#f’ if the string
 could not be parsed."
-  (let ((uri (string->uri* string)))
-    (and uri (uri-scheme uri) uri)))
+  (let ((uri (string->uri string)))
+    (and (relative-uri? uri) uri)))
 
 (define *default-ports* (make-hash-table))
 
diff --git a/test-suite/tests/web-http.test b/test-suite/tests/web-http.test
index 6fa16bd..80dbd55 100644
--- a/test-suite/tests/web-http.test
+++ b/test-suite/tests/web-http.test
@@ -132,8 +132,7 @@
 (with-test-prefix "read-request-line"
   (pass-if-read-request-line "GET / HTTP/1.1"
                              GET
-                             (build-uri 'http
-                                        #:path "/")
+                             (build-uri #:path "/")
                              (1 . 1))
   (pass-if-read-request-line "GET http://www.w3.org/pub/WWW/TheProject.html HTTP/1.1"
                              GET
@@ -143,21 +142,18 @@
                              (1 . 1))
   (pass-if-read-request-line "GET /pub/WWW/TheProject.html HTTP/1.1"
                              GET
-                             (build-uri 'http
-                                        #:path "/pub/WWW/TheProject.html")
+                             (build-uri #:path "/pub/WWW/TheProject.html")
                              (1 . 1))
   (pass-if-read-request-line "HEAD /etc/hosts?foo=bar HTTP/1.1"
                              HEAD
-                             (build-uri 'http
-                                        #:path "/etc/hosts"
+                             (build-uri #:path "/etc/hosts"
                                         #:query "foo=bar")
                              (1 . 1)))
 
 (with-test-prefix "write-request-line"
   (pass-if-write-request-line "GET / HTTP/1.1"
                               GET
-                              (build-uri 'http
-                                         #:path "/")
+                              (build-uri #:path "/")
                               (1 . 1))
   ;;; FIXME: Test fails due to scheme, host always being removed.
   ;;; However, it should be supported to request these be present, and
@@ -170,13 +166,11 @@
   ;;                             (1 . 1))
   (pass-if-write-request-line "GET /pub/WWW/TheProject.html HTTP/1.1"
                               GET
-                              (build-uri 'http
-                                         #:path "/pub/WWW/TheProject.html")
+                              (build-uri #:path "/pub/WWW/TheProject.html")
                               (1 . 1))
   (pass-if-write-request-line "HEAD /etc/hosts?foo=bar HTTP/1.1"
                               HEAD
-                              (build-uri 'http
-                                         #:path "/etc/hosts"
+                              (build-uri #:path "/etc/hosts"
                                          #:query "foo=bar")
                               (1 . 1)))
 
@@ -252,6 +246,12 @@
   (pass-if-parse content-length "010" 10)
   (pass-if-parse content-location "http://foo/"
                  (build-uri 'http #:host "foo" #:path "/"))
+  (pass-if-parse content-location "//foo/"
+                 (build-uri #:host "foo" #:path "/"))
+  (pass-if-parse content-location "/etc/foo"
+                 (build-uri #:path "/etc/foo"))
+  (pass-if-parse content-location "foo"
+                 (build-uri #:path "foo"))
   (pass-if-parse content-range "bytes 10-20/*" '(bytes (10 . 20) *))
   (pass-if-parse content-range "bytes */*" '(bytes * *))
   (pass-if-parse content-range "bytes */30" '(bytes * 30))
@@ -315,6 +315,12 @@
   (pass-if-parse range "bytes=-20,-30" '(bytes (#f . 20) (#f . 30)))
   (pass-if-parse referer "http://foo/bar?baz"
                  (build-uri 'http #:host "foo" #:path "/bar" #:query "baz"))
+  (pass-if-parse referer "//foo/bar?baz"
+                 (build-uri #:host "foo" #:path "/bar" #:query "baz"))
+  (pass-if-parse referer "/etc/foo"
+                 (build-uri #:path "/etc/foo"))
+  (pass-if-parse referer "foo"
+                 (build-uri #:path "foo"))
   (pass-if-parse te "trailers" '((trailers)))
   (pass-if-parse te "trailers,foo" '((trailers) (foo)))
   (pass-if-parse user-agent "guile" "guile"))
@@ -329,6 +335,9 @@
   (pass-if-parse etag "W/\"foo\"" '("foo" . #f))
   (pass-if-parse location "http://other-place"
                  (build-uri 'http #:host "other-place"))
+  (pass-if-any-error location "//other-place")
+  (pass-if-any-error location "/etc/foo")
+  (pass-if-any-error location "foo")
   (pass-if-parse proxy-authenticate "Basic realm=\"guile\""
                  '((basic (realm . "guile"))))
   (pass-if-parse retry-after "Tue, 15 Nov 1994 08:12:31 GMT"
diff --git a/test-suite/tests/web-request.test b/test-suite/tests/web-request.test
index 8cf1c2e..1574d69 100644
--- a/test-suite/tests/web-request.test
+++ b/test-suite/tests/web-request.test
@@ -53,7 +53,7 @@ Accept-Language: en-gb, en;q=0.9\r
     
     (pass-if (equal? (request-method r) 'GET))
     
-    (pass-if (equal? (request-uri r) (build-uri 'http #:path "/qux")))
+    (pass-if (equal? (request-uri r) (build-uri #:path "/qux")))
     
     (pass-if (equal? (read-request-body r) #f))
 
diff --git a/test-suite/tests/web-uri.test b/test-suite/tests/web-uri.test
index 3f6e7e3..5db1442 100644
--- a/test-suite/tests/web-uri.test
+++ b/test-suite/tests/web-uri.test
@@ -121,8 +121,23 @@
 
   (pass-if-uri-exception "http://foo@"
                          "Expected.*host"
-                         (build-uri 'http #:userinfo "foo")))
+                         (build-uri 'http #:userinfo "foo"))
 
+  (pass-if "//host/etc/foo"
+    (uri=? (build-uri #:host "host"
+                      #:path "/etc/foo")
+           #:host "host"
+           #:path "/etc/foo"))
+
+  (pass-if "/path/to/some/foo?query"
+    (uri=? (build-uri #:path "/path/to/some/foo"
+                      #:query "query")
+           #:path "/path/to/some/foo"
+           #:query "query"))
+
+  (pass-if "nextdoc/foo"
+    (uri=? (build-uri #:path "nextdoc/foo")
+           #:path "nextdoc/foo")))
 
 (with-test-prefix "string->uri"
   (pass-if "ftp:"
@@ -210,7 +225,30 @@
   (pass-if "file:///etc/hosts"
     (uri=? (string->uri "file:///etc/hosts")
            #:scheme 'file
-           #:path "/etc/hosts")))
+           #:path "/etc/hosts"))
+
+  (pass-if "/"
+    (uri=? (string->uri "/")
+           #:path "/"))
+
+  (pass-if "/path/to/foo"
+    (uri=? (string->uri "/path/to/foo")
+           #:path "/path/to/foo"))
+
+  (pass-if "//example.org"
+    (uri=? (string->uri "//example.org")
+           #:host "example.org"
+           #:path ""))
+
+  (pass-if "//bar@example.org/path/to/foo"
+    (uri=? (string->uri "//bar@example.org/path/to/foo")
+           #:userinfo "bar"
+           #:host "example.org"
+           #:path "/path/to/foo"))
+
+  (pass-if "nextdoc/foo"
+    (uri=? (string->uri "nextdoc/foo")
+           #:path "nextdoc/foo")))
 
 (with-test-prefix "uri->string"
   (pass-if "ftp:"
@@ -248,7 +286,66 @@
   
   (pass-if "http://foo:/"
     (equal? "http://foo/"
-            (uri->string (string->uri "http://foo:/")))))
+            (uri->string (string->uri "http://foo:/"))))
+
+  (pass-if "/"
+    (equal? "/"
+            (uri->string (string->uri "/"))))
+
+  (pass-if "/path/to/foo"
+    (equal? "/path/to/foo"
+            (uri->string (string->uri "/path/to/foo"))))
+
+  (pass-if "//example.org"
+    (equal? "//example.org"
+            (uri->string (string->uri "//example.org"))))
+
+  (pass-if "//bar@example.org/path/to/foo"
+    (equal? "//bar@example.org/path/to/foo"
+            (uri->string (string->uri "//bar@example.org/path/to/foo"))))
+
+  (pass-if "nextdoc/foo"
+    (equal? "nextdoc/foo"
+            (uri->string (string->uri "nextdoc/foo")))))
+
+(with-test-prefix "string->absolute-uri"
+  (pass-if "ftp:"
+    (uri=? (string->absolute-uri "ftp:")
+           #:scheme 'ftp
+           #:path ""))
+
+  (pass-if "/"
+    (not (string->absolute-uri "/")))
+
+  (pass-if "/path/to/foo"
+    (not (string->absolute-uri "/path/to/foo")))
+
+  (pass-if "//example.org"
+    (not (string->absolute-uri "//example.org")))
+
+  (pass-if "nextdoc/foo"
+    (not (string->absolute-uri "nextdoc/foo"))))
+
+(with-test-prefix "string->relative-uri"
+  (pass-if "ftp:"
+    (not (string->relative-uri "ftp:")))
+
+  (pass-if "/"
+    (uri=? (string->relative-uri "/")
+           #:path "/"))
+
+  (pass-if "/path/to/foo"
+    (uri=? (string->relative-uri "/path/to/foo")
+           #:path "/path/to/foo"))
+
+  (pass-if "//example.org"
+    (uri=? (string->relative-uri "//example.org")
+           #:host "example.org"
+           #:path ""))
+
+  (pass-if "nextdoc/foo"
+    (uri=? (string->relative-uri "nextdoc/foo")
+           #:path "nextdoc/foo")))
 
 (with-test-prefix "decode"
   (pass-if "foo%20bar"
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2013-02-23  8:11 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-11-07 20:40 bug#12827: [2.0.6] web client: fails to parse 404 header Ludovic Courtès
2012-11-08  5:52 ` Daniel Hartwig
2012-11-08 20:10   ` Ludovic Courtès
2012-11-09  0:39     ` Daniel Hartwig
2012-11-09 20:52       ` Ludovic Courtès
2012-11-10  1:45         ` Daniel Hartwig
2012-11-10 13:52           ` Ludovic Courtès
2012-11-23 22:19       ` Ludovic Courtès
2012-11-24 11:23         ` Daniel Hartwig
2012-11-24 15:10           ` Ludovic Courtès
2012-11-24 15:34             ` Daniel Hartwig
2012-11-26  0:15               ` Ludovic Courtès
2012-11-26 23:13                 ` Ludovic Courtès
2012-11-27  1:06                   ` Daniel Hartwig
2012-11-27 12:50                     ` Ludovic Courtès
2012-11-27 15:18                       ` Daniel Hartwig
2012-11-27 21:43                         ` Ludovic Courtès
2013-02-23  8:11 ` bug#12827: [PATCH] Tweak web modules, support relative URIs (was: bug#12827: [2.0.6] web client: fails to parse 404 header) Daniel Hartwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).