unofficial mirror of bug-guile@gnu.org 
 help / color / mirror / Atom feed
From: Daniel Hartwig <mandyke@gmail.com>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: 12827@debbugs.gnu.org
Subject: bug#12827: [2.0.6] web client: fails to parse 404 header
Date: Thu, 8 Nov 2012 13:52:29 +0800	[thread overview]
Message-ID: <CAN3veReUQa=1R1sEqmRW-hJAhodfkyxysavdtzDff537t2nQaA@mail.gmail.com> (raw)
In-Reply-To: <87r4o5kuy8.fsf@gnu.org>

[-- Attachment #1: Type: text/plain, Size: 1333 bytes --]

On 8 November 2012 04:40, Ludovic Courtès <ludo@gnu.org> wrote:
> scheme@(guile-user)> (use-modules (web client) (web uri))
> scheme@(guile-user)> (http-get (string->uri "http://www.gnu.org/does-not-exist"))
> web/http.scm:191:11: In procedure read-header:
> web/http.scm:191:11: Bad uri header component: gnu-404.html

Some headers are supposed to support these “URI-references”.

I have just updated a patch I worked on a while back that adds support
for such things, including “partial-refs”.  It is not yet ready for
inclusion, some points to consider:
* DONE (web uri) support for relative-refs
* DONE (web http) support for relative-refs in headers
* TODO (resolve-ref uri uri-reference) → absolute-uri
* TODO documentation
* TODO maybe use more scheme-ish names instead of RFC 3986
This latest RFC makes a clear distinction between an actual URI
and a reference to such.  I thought it best to reflect that
distinction, but maybe it does pollute the namespace a bit:
- absolute-uri? → uri-absolute?
- relative-ref? → uri-relative?
* TODO build-uri validation is broken/less strict and will now pass
relative-refs, so maybe introduce build-uri-reference instead

Anyway, this may provide a useful base to work from, if not something
suitable for inclusion almost immediately.

Regards

[-- Attachment #2: 0001-uri-reference-support.patch --]
[-- Type: application/octet-stream, Size: 11426 bytes --]

From fc7f4dbe9fbd855af4d47990de31bcfccb55dfe1 Mon Sep 17 00:00:00 2001
From: Daniel Hartwig <mandyke@gmail.com>
Date: Thu, 8 Nov 2012 13:11:52 +0800
Subject: [PATCH] uri-reference support

* module/web/uri.scm (build-uri, validate-uri, string->uri,
  uri->string): Allow uri-scheme to be #f.  Any such object is a
  relative-ref.
  (absolute-uri?, relative-ref?, uri-reference?): New predicates
  to distinguish between kinds of URI reference.
* module/web/http.scm (declare-uri-header!): Use absolute-uri? to
  validate these headers.
  (declare-uri-reference-header!): New type of header accepting any
  URI-reference.
  ("Content-Location", "Referer"): Change to URI-reference headers
  to support relative references.
* test-suite/tests/web-uri.test, test-suite/tests/web-http.test:
  Add tests for the above.
---
 module/web/http.scm            |   13 ++++-
 module/web/uri.scm             |   46 +++++++++++----
 test-suite/tests/web-http.test |    4 ++
 test-suite/tests/web-uri.test  |  125 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 174 insertions(+), 14 deletions(-)

diff --git a/module/web/http.scm b/module/web/http.scm
index cc5dd5a..1a54cc6 100644
--- a/module/web/http.scm
+++ b/module/web/http.scm
@@ -1185,7 +1185,14 @@ treated specially, and is just returned as a plain string."
 (define (declare-uri-header! name)
   (declare-header! name
     (lambda (str) (or (string->uri str) (bad-header-component 'uri str)))
-    uri?
+    absolute-uri?
+    write-uri))
+
+;; emacs: (put 'declare-uri-reference-header! 'scheme-indent-function 1)
+(define (declare-uri-reference-header! name)
+  (declare-header! name
+    (lambda (str) (or (string->uri str) (bad-header-component 'uri-reference str)))
+    uri-reference?
     write-uri))
 
 ;; emacs: (put 'declare-quality-list-header! 'scheme-indent-function 1)
@@ -1440,7 +1447,7 @@ treated specially, and is just returned as a plain string."
 
 ;; Content-Location = ( absoluteURI | relativeURI )
 ;;
-(declare-uri-header! "Content-Location")
+(declare-uri-reference-header! "Content-Location")
 
 ;; Content-MD5 = <base64 of 128 bit MD5 digest as per RFC 1864>
 ;;
@@ -1729,7 +1736,7 @@ treated specially, and is just returned as a plain string."
 
 ;; Referer = ( absoluteURI | relativeURI )
 ;;
-(declare-uri-header! "Referer")
+(declare-uri-reference-header! "Referer")
 
 ;; TE = #( t-codings )
 ;; t-codings = "trailers" | ( transfer-extension [ accept-params ] )
diff --git a/module/web/uri.scm b/module/web/uri.scm
index 78614a5..da55902 100644
--- a/module/web/uri.scm
+++ b/module/web/uri.scm
@@ -34,6 +34,7 @@
   #:export (uri?
             uri-scheme uri-userinfo uri-host uri-port
             uri-path uri-query uri-fragment
+            absolute-uri? relative-ref? uri-reference?
 
             build-uri
             declare-default-port!
@@ -53,6 +54,17 @@
   (query uri-query)
   (fragment uri-fragment))
 
+(define (absolute-uri? uri)
+  (and (uri? uri) (uri-scheme uri) #t))
+
+(define (relative-ref? uri)
+  (and (uri? uri) (not (uri-scheme uri)) #t))
+
+(define (uri-reference? uri)
+  (and (or (absolute-uri? uri)
+           (relative-ref? uri))
+       #t))
+
 (define (uri-error message . args)
   (throw 'uri-error message args))
 
@@ -61,7 +73,7 @@
 
 (define (validate-uri scheme userinfo host port path query fragment)
   (cond
-   ((not (symbol? scheme))
+   ((and scheme (not (symbol? scheme)))
     (uri-error "Expected a symbol for the URI scheme: ~s" scheme))
    ((and (or userinfo port) (not host))
     (uri-error "Expected a host, given userinfo or port"))
@@ -150,6 +162,17 @@ consistency checks to make sure that the constructed URI is valid."
 ;;;               / path-absolute
 ;;;               / path-rootless
 ;;;               / path-empty
+;;;
+;;; RFC 3986, #4.
+;;;
+;;;   URI-reference = URI / relative-ref
+;;;
+;;;   relative-ref  = relative-part [ "?" query ] [ "#" fragment ]
+;;;
+;;;   relative-part = "//" authority path-abempty
+;;;                 / path-absolute
+;;;                 / path-noscheme
+;;;                 / path-empty
 
 (define scheme-pat
   "[a-zA-Z][a-zA-Z0-9+.-]*")
@@ -162,7 +185,7 @@ consistency checks to make sure that the constructed URI is valid."
 (define fragment-pat
   ".*")
 (define uri-pat
-  (format #f "^(~a):(//~a)?(~a)(\\?(~a))?(#(~a))?$"
+  (format #f "^((~a):)?(//~a)?(~a)(\\?(~a))?(#(~a))?$"
           scheme-pat authority-pat path-pat query-pat fragment-pat))
 (define uri-regexp
   (make-regexp uri-pat))
@@ -172,12 +195,12 @@ consistency checks to make sure that the constructed URI is valid."
 could not be parsed."
   (% (let ((m (regexp-exec uri-regexp string)))
        (if (not m) (abort))
-       (let ((scheme (string->symbol
-                      (string-downcase (match:substring m 1))))
-             (authority (match:substring m 2))
-             (path (match:substring m 3))
-             (query (match:substring m 5))
-             (fragment (match:substring m 7)))
+       (let ((scheme (let ((s (match:substring m 2)))
+                       (and s (string->symbol (string-downcase s)))))
+             (authority (match:substring m 3))
+             (path (match:substring m 4))
+             (query (match:substring m 6))
+             (fragment (match:substring m 8)))
          (call-with-values
              (lambda ()
                (if authority
@@ -206,8 +229,7 @@ printed."
 
 (define (uri->string uri)
   "Serialize @var{uri} to a string."
-  (let* ((scheme-str (string-append
-                      (symbol->string (uri-scheme uri)) ":"))
+  (let* ((scheme (uri-scheme uri))
          (userinfo (uri-userinfo uri))
          (host (uri-host uri))
          (port (uri-port uri))
@@ -215,7 +237,9 @@ printed."
          (query (uri-query uri))
          (fragment (uri-fragment uri)))
     (string-append
-     scheme-str
+     (if scheme
+         (string-append (symbol->string scheme) ":")
+         "")
      (if host
          (string-append "//"
                         (if userinfo (string-append userinfo "@")
diff --git a/test-suite/tests/web-http.test b/test-suite/tests/web-http.test
index 97f5559..5d80fde 100644
--- a/test-suite/tests/web-http.test
+++ b/test-suite/tests/web-http.test
@@ -145,6 +145,8 @@
   (pass-if-parse content-length "010" 10)
   (pass-if-parse content-location "http://foo/"
                  (build-uri 'http #:host "foo" #:path "/"))
+  (pass-if-parse content-location "//foo"
+                 (build-uri #f #:host "foo"))
   (pass-if-parse content-range "bytes 10-20/*" '(bytes (10 . 20) *))
   (pass-if-parse content-range "bytes */*" '(bytes * *))
   (pass-if-parse content-range "bytes */30" '(bytes * 30))
@@ -208,6 +210,8 @@
   (pass-if-parse range "bytes=-20,-30" '(bytes (#f . 20) (#f . 30)))
   (pass-if-parse referer "http://foo/bar?baz"
                  (build-uri 'http #:host "foo" #:path "/bar" #:query "baz"))
+  (pass-if-parse referer "//foo/bar?baz"
+                 (build-uri #f #:host "foo" #:path "/bar" #:query "baz"))
   (pass-if-parse te "trailers" '((trailers)))
   (pass-if-parse te "trailers,foo" '((trailers) (foo)))
   (pass-if-parse user-agent "guile" "guile"))
diff --git a/test-suite/tests/web-uri.test b/test-suite/tests/web-uri.test
index 3f6e7e3..d4d44d6 100644
--- a/test-suite/tests/web-uri.test
+++ b/test-suite/tests/web-uri.test
@@ -78,6 +78,24 @@
            #:port 22
            #:path "/baz"))
 
+  (pass-if "foo"
+    (uri=? (build-uri #f #:path "foo")
+           #:path "foo"))
+
+  (pass-if "/foo"
+    (uri=? (build-uri #f #:path "/foo")
+           #:path "/foo"))
+
+  (pass-if "//foo/bar"
+    (uri=? (build-uri #f #:host "foo" #:path "/bar")
+           #:host "foo"
+           #:path "/bar"))
+
+  (pass-if "?foo"
+    (uri=? (build-uri #f #:query "foo")
+           #:path ""
+           #:query "foo"))
+
   (pass-if-uri-exception "non-symbol scheme"
                          "Expected.*symbol"
                          (build-uri "nonsym"))
@@ -123,6 +141,80 @@
                          "Expected.*host"
                          (build-uri 'http #:userinfo "foo")))
 
+(with-test-prefix "absolute-uri?"
+  (pass-if "ftp:"
+    (absolute-uri? (build-uri 'ftp)))
+
+  (pass-if "ftp:foo"
+    (absolute-uri? (build-uri 'ftp #:path "foo")))
+
+  (pass-if "ftp://foo/bar"
+    (absolute-uri? (build-uri 'ftp #:host "foo" #:path "/bar")))
+
+  (pass-if "ftp://foo@bar:22/baz"
+    (absolute-uri? (build-uri 'ftp #:userinfo "foo" #:host "bar" #:port 22 #:path "/baz")))
+
+  (expect-fail "foo"
+    (absolute-uri? (build-uri #f #:path "foo")))
+
+  (expect-fail "/foo"
+    (absolute-uri? (build-uri #f #:path "/foo")))
+
+  (expect-fail "//foo/bar"
+    (absolute-uri? (build-uri #f #:host "foo" #:path "/bar")))
+
+  (expect-fail "?foo"
+    (absolute-uri? (build-uri #f #:query "foo"))))
+
+(with-test-prefix "relative-ref?"
+  (expect-fail "ftp:"
+    (relative-ref? (build-uri 'ftp)))
+
+  (expect-fail "ftp:foo"
+    (relative-ref? (build-uri 'ftp #:path "foo")))
+
+  (expect-fail "ftp://foo/bar"
+    (relative-ref? (build-uri 'ftp #:host "foo" #:path "/bar")))
+
+  (expect-fail "ftp://foo@bar:22/baz"
+    (relative-ref? (build-uri 'ftp #:userinfo "foo" #:host "bar" #:port 22 #:path "/baz")))
+
+  (pass-if "foo"
+    (relative-ref? (build-uri #f #:path "foo")))
+
+  (pass-if "/foo"
+    (relative-ref? (build-uri #f #:path "/foo")))
+
+  (pass-if "//foo/bar"
+    (relative-ref? (build-uri #f #:host "foo" #:path "/bar")))
+
+  (pass-if "?foo"
+    (relative-ref? (build-uri #f #:query "foo"))))
+
+(with-test-prefix "uri-reference?"
+  (pass-if "ftp:"
+    (uri-reference? (build-uri 'ftp)))
+
+  (pass-if "ftp:foo"
+    (uri-reference? (build-uri 'ftp #:path "foo")))
+
+  (pass-if "ftp://foo/bar"
+    (uri-reference? (build-uri 'ftp #:host "foo" #:path "/bar")))
+
+  (pass-if "ftp://foo@bar:22/baz"
+    (uri-reference? (build-uri 'ftp #:userinfo "foo" #:host "bar" #:port 22 #:path "/baz")))
+
+  (pass-if "foo"
+    (uri-reference? (build-uri #f #:path "foo")))
+
+  (pass-if "/foo"
+    (uri-reference? (build-uri #f #:path "/foo")))
+
+  (pass-if "//foo/bar"
+    (uri-reference? (build-uri #f #:host "foo" #:path "/bar")))
+
+  (pass-if "?foo"
+    (uri-reference? (build-uri #f #:query "foo"))))
 
 (with-test-prefix "string->uri"
   (pass-if "ftp:"
@@ -149,6 +241,24 @@
            #:port 22
            #:path "/baz"))
 
+  (pass-if "foo"
+    (uri=? (string->uri "foo")
+           #:path "foo"))
+
+  (pass-if "/foo"
+    (uri=? (string->uri "/foo")
+           #:path "/foo"))
+
+  (pass-if "//foo/bar"
+    (uri=? (string->uri "//foo/bar")
+           #:host "foo"
+           #:path "/bar"))
+
+  (pass-if "?foo"
+    (uri=? (string->uri "?foo")
+           #:path ""
+           #:query "foo"))
+
   (pass-if "http://bad.host.1"
     (not (string->uri "http://bad.host.1")))
 
@@ -229,6 +339,21 @@
     (equal? "ftp://foo@bar:22/baz"
             (uri->string (string->uri "ftp://foo@bar:22/baz"))))
   
+  (pass-if "foo"
+    (equal? "foo"
+            (uri->string (string->uri "foo"))))
+
+  (pass-if "/foo"
+    (equal? "/foo" (uri->string (string->uri "/foo"))))
+
+  (pass-if "//foo/bar"
+    (equal? "//foo/bar"
+            (uri->string (string->uri "//foo/bar"))))
+
+  (pass-if "?foo"
+    (equal? "?foo"
+            (uri->string (string->uri "?foo"))))
+
   (when (memq 'socket *features*)
     (pass-if "http://192.0.2.1"
       (equal? "http://192.0.2.1"
-- 
1.7.10.4


  reply	other threads:[~2012-11-08  5:52 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-11-07 20:40 bug#12827: [2.0.6] web client: fails to parse 404 header Ludovic Courtès
2012-11-08  5:52 ` Daniel Hartwig [this message]
2012-11-08 20:10   ` Ludovic Courtès
2012-11-09  0:39     ` Daniel Hartwig
2012-11-09 20:52       ` Ludovic Courtès
2012-11-10  1:45         ` Daniel Hartwig
2012-11-10 13:52           ` Ludovic Courtès
2012-11-23 22:19       ` Ludovic Courtès
2012-11-24 11:23         ` Daniel Hartwig
2012-11-24 15:10           ` Ludovic Courtès
2012-11-24 15:34             ` Daniel Hartwig
2012-11-26  0:15               ` Ludovic Courtès
2012-11-26 23:13                 ` Ludovic Courtès
2012-11-27  1:06                   ` Daniel Hartwig
2012-11-27 12:50                     ` Ludovic Courtès
2012-11-27 15:18                       ` Daniel Hartwig
2012-11-27 21:43                         ` Ludovic Courtès
2013-02-23  8:11 ` bug#12827: [PATCH] Tweak web modules, support relative URIs (was: bug#12827: [2.0.6] web client: fails to parse 404 header) Daniel Hartwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/guile/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAN3veReUQa=1R1sEqmRW-hJAhodfkyxysavdtzDff537t2nQaA@mail.gmail.com' \
    --to=mandyke@gmail.com \
    --cc=12827@debbugs.gnu.org \
    --cc=ludo@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).