From: Daniel Hartwig <mandyke@gmail.com>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: 12827@debbugs.gnu.org
Subject: bug#12827: [2.0.6] web client: fails to parse 404 header
Date: Thu, 8 Nov 2012 13:52:29 +0800 [thread overview]
Message-ID: <CAN3veReUQa=1R1sEqmRW-hJAhodfkyxysavdtzDff537t2nQaA@mail.gmail.com> (raw)
In-Reply-To: <87r4o5kuy8.fsf@gnu.org>
[-- Attachment #1: Type: text/plain, Size: 1333 bytes --]
On 8 November 2012 04:40, Ludovic Courtès <ludo@gnu.org> wrote:
> scheme@(guile-user)> (use-modules (web client) (web uri))
> scheme@(guile-user)> (http-get (string->uri "http://www.gnu.org/does-not-exist"))
> web/http.scm:191:11: In procedure read-header:
> web/http.scm:191:11: Bad uri header component: gnu-404.html
Some headers are supposed to support these “URI-references”.
I have just updated a patch I worked on a while back that adds support
for such things, including “partial-refs”. It is not yet ready for
inclusion, some points to consider:
* DONE (web uri) support for relative-refs
* DONE (web http) support for relative-refs in headers
* TODO (resolve-ref uri uri-reference) → absolute-uri
* TODO documentation
* TODO maybe use more scheme-ish names instead of RFC 3986
This latest RFC makes a clear distinction between an actual URI
and a reference to such. I thought it best to reflect that
distinction, but maybe it does pollute the namespace a bit:
- absolute-uri? → uri-absolute?
- relative-ref? → uri-relative?
* TODO build-uri validation is broken/less strict and will now pass
relative-refs, so maybe introduce build-uri-reference instead
Anyway, this may provide a useful base to work from, if not something
suitable for inclusion almost immediately.
Regards
[-- Attachment #2: 0001-uri-reference-support.patch --]
[-- Type: application/octet-stream, Size: 11426 bytes --]
From fc7f4dbe9fbd855af4d47990de31bcfccb55dfe1 Mon Sep 17 00:00:00 2001
From: Daniel Hartwig <mandyke@gmail.com>
Date: Thu, 8 Nov 2012 13:11:52 +0800
Subject: [PATCH] uri-reference support
* module/web/uri.scm (build-uri, validate-uri, string->uri,
uri->string): Allow uri-scheme to be #f. Any such object is a
relative-ref.
(absolute-uri?, relative-ref?, uri-reference?): New predicates
to distinguish between kinds of URI reference.
* module/web/http.scm (declare-uri-header!): Use absolute-uri? to
validate these headers.
(declare-uri-reference-header!): New type of header accepting any
URI-reference.
("Content-Location", "Referer"): Change to URI-reference headers
to support relative references.
* test-suite/tests/web-uri.test, test-suite/tests/web-http.test:
Add tests for the above.
---
module/web/http.scm | 13 ++++-
module/web/uri.scm | 46 +++++++++++----
test-suite/tests/web-http.test | 4 ++
test-suite/tests/web-uri.test | 125 ++++++++++++++++++++++++++++++++++++++++
4 files changed, 174 insertions(+), 14 deletions(-)
diff --git a/module/web/http.scm b/module/web/http.scm
index cc5dd5a..1a54cc6 100644
--- a/module/web/http.scm
+++ b/module/web/http.scm
@@ -1185,7 +1185,14 @@ treated specially, and is just returned as a plain string."
(define (declare-uri-header! name)
(declare-header! name
(lambda (str) (or (string->uri str) (bad-header-component 'uri str)))
- uri?
+ absolute-uri?
+ write-uri))
+
+;; emacs: (put 'declare-uri-reference-header! 'scheme-indent-function 1)
+(define (declare-uri-reference-header! name)
+ (declare-header! name
+ (lambda (str) (or (string->uri str) (bad-header-component 'uri-reference str)))
+ uri-reference?
write-uri))
;; emacs: (put 'declare-quality-list-header! 'scheme-indent-function 1)
@@ -1440,7 +1447,7 @@ treated specially, and is just returned as a plain string."
;; Content-Location = ( absoluteURI | relativeURI )
;;
-(declare-uri-header! "Content-Location")
+(declare-uri-reference-header! "Content-Location")
;; Content-MD5 = <base64 of 128 bit MD5 digest as per RFC 1864>
;;
@@ -1729,7 +1736,7 @@ treated specially, and is just returned as a plain string."
;; Referer = ( absoluteURI | relativeURI )
;;
-(declare-uri-header! "Referer")
+(declare-uri-reference-header! "Referer")
;; TE = #( t-codings )
;; t-codings = "trailers" | ( transfer-extension [ accept-params ] )
diff --git a/module/web/uri.scm b/module/web/uri.scm
index 78614a5..da55902 100644
--- a/module/web/uri.scm
+++ b/module/web/uri.scm
@@ -34,6 +34,7 @@
#:export (uri?
uri-scheme uri-userinfo uri-host uri-port
uri-path uri-query uri-fragment
+ absolute-uri? relative-ref? uri-reference?
build-uri
declare-default-port!
@@ -53,6 +54,17 @@
(query uri-query)
(fragment uri-fragment))
+(define (absolute-uri? uri)
+ (and (uri? uri) (uri-scheme uri) #t))
+
+(define (relative-ref? uri)
+ (and (uri? uri) (not (uri-scheme uri)) #t))
+
+(define (uri-reference? uri)
+ (and (or (absolute-uri? uri)
+ (relative-ref? uri))
+ #t))
+
(define (uri-error message . args)
(throw 'uri-error message args))
@@ -61,7 +73,7 @@
(define (validate-uri scheme userinfo host port path query fragment)
(cond
- ((not (symbol? scheme))
+ ((and scheme (not (symbol? scheme)))
(uri-error "Expected a symbol for the URI scheme: ~s" scheme))
((and (or userinfo port) (not host))
(uri-error "Expected a host, given userinfo or port"))
@@ -150,6 +162,17 @@ consistency checks to make sure that the constructed URI is valid."
;;; / path-absolute
;;; / path-rootless
;;; / path-empty
+;;;
+;;; RFC 3986, #4.
+;;;
+;;; URI-reference = URI / relative-ref
+;;;
+;;; relative-ref = relative-part [ "?" query ] [ "#" fragment ]
+;;;
+;;; relative-part = "//" authority path-abempty
+;;; / path-absolute
+;;; / path-noscheme
+;;; / path-empty
(define scheme-pat
"[a-zA-Z][a-zA-Z0-9+.-]*")
@@ -162,7 +185,7 @@ consistency checks to make sure that the constructed URI is valid."
(define fragment-pat
".*")
(define uri-pat
- (format #f "^(~a):(//~a)?(~a)(\\?(~a))?(#(~a))?$"
+ (format #f "^((~a):)?(//~a)?(~a)(\\?(~a))?(#(~a))?$"
scheme-pat authority-pat path-pat query-pat fragment-pat))
(define uri-regexp
(make-regexp uri-pat))
@@ -172,12 +195,12 @@ consistency checks to make sure that the constructed URI is valid."
could not be parsed."
(% (let ((m (regexp-exec uri-regexp string)))
(if (not m) (abort))
- (let ((scheme (string->symbol
- (string-downcase (match:substring m 1))))
- (authority (match:substring m 2))
- (path (match:substring m 3))
- (query (match:substring m 5))
- (fragment (match:substring m 7)))
+ (let ((scheme (let ((s (match:substring m 2)))
+ (and s (string->symbol (string-downcase s)))))
+ (authority (match:substring m 3))
+ (path (match:substring m 4))
+ (query (match:substring m 6))
+ (fragment (match:substring m 8)))
(call-with-values
(lambda ()
(if authority
@@ -206,8 +229,7 @@ printed."
(define (uri->string uri)
"Serialize @var{uri} to a string."
- (let* ((scheme-str (string-append
- (symbol->string (uri-scheme uri)) ":"))
+ (let* ((scheme (uri-scheme uri))
(userinfo (uri-userinfo uri))
(host (uri-host uri))
(port (uri-port uri))
@@ -215,7 +237,9 @@ printed."
(query (uri-query uri))
(fragment (uri-fragment uri)))
(string-append
- scheme-str
+ (if scheme
+ (string-append (symbol->string scheme) ":")
+ "")
(if host
(string-append "//"
(if userinfo (string-append userinfo "@")
diff --git a/test-suite/tests/web-http.test b/test-suite/tests/web-http.test
index 97f5559..5d80fde 100644
--- a/test-suite/tests/web-http.test
+++ b/test-suite/tests/web-http.test
@@ -145,6 +145,8 @@
(pass-if-parse content-length "010" 10)
(pass-if-parse content-location "http://foo/"
(build-uri 'http #:host "foo" #:path "/"))
+ (pass-if-parse content-location "//foo"
+ (build-uri #f #:host "foo"))
(pass-if-parse content-range "bytes 10-20/*" '(bytes (10 . 20) *))
(pass-if-parse content-range "bytes */*" '(bytes * *))
(pass-if-parse content-range "bytes */30" '(bytes * 30))
@@ -208,6 +210,8 @@
(pass-if-parse range "bytes=-20,-30" '(bytes (#f . 20) (#f . 30)))
(pass-if-parse referer "http://foo/bar?baz"
(build-uri 'http #:host "foo" #:path "/bar" #:query "baz"))
+ (pass-if-parse referer "//foo/bar?baz"
+ (build-uri #f #:host "foo" #:path "/bar" #:query "baz"))
(pass-if-parse te "trailers" '((trailers)))
(pass-if-parse te "trailers,foo" '((trailers) (foo)))
(pass-if-parse user-agent "guile" "guile"))
diff --git a/test-suite/tests/web-uri.test b/test-suite/tests/web-uri.test
index 3f6e7e3..d4d44d6 100644
--- a/test-suite/tests/web-uri.test
+++ b/test-suite/tests/web-uri.test
@@ -78,6 +78,24 @@
#:port 22
#:path "/baz"))
+ (pass-if "foo"
+ (uri=? (build-uri #f #:path "foo")
+ #:path "foo"))
+
+ (pass-if "/foo"
+ (uri=? (build-uri #f #:path "/foo")
+ #:path "/foo"))
+
+ (pass-if "//foo/bar"
+ (uri=? (build-uri #f #:host "foo" #:path "/bar")
+ #:host "foo"
+ #:path "/bar"))
+
+ (pass-if "?foo"
+ (uri=? (build-uri #f #:query "foo")
+ #:path ""
+ #:query "foo"))
+
(pass-if-uri-exception "non-symbol scheme"
"Expected.*symbol"
(build-uri "nonsym"))
@@ -123,6 +141,80 @@
"Expected.*host"
(build-uri 'http #:userinfo "foo")))
+(with-test-prefix "absolute-uri?"
+ (pass-if "ftp:"
+ (absolute-uri? (build-uri 'ftp)))
+
+ (pass-if "ftp:foo"
+ (absolute-uri? (build-uri 'ftp #:path "foo")))
+
+ (pass-if "ftp://foo/bar"
+ (absolute-uri? (build-uri 'ftp #:host "foo" #:path "/bar")))
+
+ (pass-if "ftp://foo@bar:22/baz"
+ (absolute-uri? (build-uri 'ftp #:userinfo "foo" #:host "bar" #:port 22 #:path "/baz")))
+
+ (expect-fail "foo"
+ (absolute-uri? (build-uri #f #:path "foo")))
+
+ (expect-fail "/foo"
+ (absolute-uri? (build-uri #f #:path "/foo")))
+
+ (expect-fail "//foo/bar"
+ (absolute-uri? (build-uri #f #:host "foo" #:path "/bar")))
+
+ (expect-fail "?foo"
+ (absolute-uri? (build-uri #f #:query "foo"))))
+
+(with-test-prefix "relative-ref?"
+ (expect-fail "ftp:"
+ (relative-ref? (build-uri 'ftp)))
+
+ (expect-fail "ftp:foo"
+ (relative-ref? (build-uri 'ftp #:path "foo")))
+
+ (expect-fail "ftp://foo/bar"
+ (relative-ref? (build-uri 'ftp #:host "foo" #:path "/bar")))
+
+ (expect-fail "ftp://foo@bar:22/baz"
+ (relative-ref? (build-uri 'ftp #:userinfo "foo" #:host "bar" #:port 22 #:path "/baz")))
+
+ (pass-if "foo"
+ (relative-ref? (build-uri #f #:path "foo")))
+
+ (pass-if "/foo"
+ (relative-ref? (build-uri #f #:path "/foo")))
+
+ (pass-if "//foo/bar"
+ (relative-ref? (build-uri #f #:host "foo" #:path "/bar")))
+
+ (pass-if "?foo"
+ (relative-ref? (build-uri #f #:query "foo"))))
+
+(with-test-prefix "uri-reference?"
+ (pass-if "ftp:"
+ (uri-reference? (build-uri 'ftp)))
+
+ (pass-if "ftp:foo"
+ (uri-reference? (build-uri 'ftp #:path "foo")))
+
+ (pass-if "ftp://foo/bar"
+ (uri-reference? (build-uri 'ftp #:host "foo" #:path "/bar")))
+
+ (pass-if "ftp://foo@bar:22/baz"
+ (uri-reference? (build-uri 'ftp #:userinfo "foo" #:host "bar" #:port 22 #:path "/baz")))
+
+ (pass-if "foo"
+ (uri-reference? (build-uri #f #:path "foo")))
+
+ (pass-if "/foo"
+ (uri-reference? (build-uri #f #:path "/foo")))
+
+ (pass-if "//foo/bar"
+ (uri-reference? (build-uri #f #:host "foo" #:path "/bar")))
+
+ (pass-if "?foo"
+ (uri-reference? (build-uri #f #:query "foo"))))
(with-test-prefix "string->uri"
(pass-if "ftp:"
@@ -149,6 +241,24 @@
#:port 22
#:path "/baz"))
+ (pass-if "foo"
+ (uri=? (string->uri "foo")
+ #:path "foo"))
+
+ (pass-if "/foo"
+ (uri=? (string->uri "/foo")
+ #:path "/foo"))
+
+ (pass-if "//foo/bar"
+ (uri=? (string->uri "//foo/bar")
+ #:host "foo"
+ #:path "/bar"))
+
+ (pass-if "?foo"
+ (uri=? (string->uri "?foo")
+ #:path ""
+ #:query "foo"))
+
(pass-if "http://bad.host.1"
(not (string->uri "http://bad.host.1")))
@@ -229,6 +339,21 @@
(equal? "ftp://foo@bar:22/baz"
(uri->string (string->uri "ftp://foo@bar:22/baz"))))
+ (pass-if "foo"
+ (equal? "foo"
+ (uri->string (string->uri "foo"))))
+
+ (pass-if "/foo"
+ (equal? "/foo" (uri->string (string->uri "/foo"))))
+
+ (pass-if "//foo/bar"
+ (equal? "//foo/bar"
+ (uri->string (string->uri "//foo/bar"))))
+
+ (pass-if "?foo"
+ (equal? "?foo"
+ (uri->string (string->uri "?foo"))))
+
(when (memq 'socket *features*)
(pass-if "http://192.0.2.1"
(equal? "http://192.0.2.1"
--
1.7.10.4
next prev parent reply other threads:[~2012-11-08 5:52 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-11-07 20:40 bug#12827: [2.0.6] web client: fails to parse 404 header Ludovic Courtès
2012-11-08 5:52 ` Daniel Hartwig [this message]
2012-11-08 20:10 ` Ludovic Courtès
2012-11-09 0:39 ` Daniel Hartwig
2012-11-09 20:52 ` Ludovic Courtès
2012-11-10 1:45 ` Daniel Hartwig
2012-11-10 13:52 ` Ludovic Courtès
2012-11-23 22:19 ` Ludovic Courtès
2012-11-24 11:23 ` Daniel Hartwig
2012-11-24 15:10 ` Ludovic Courtès
2012-11-24 15:34 ` Daniel Hartwig
2012-11-26 0:15 ` Ludovic Courtès
2012-11-26 23:13 ` Ludovic Courtès
2012-11-27 1:06 ` Daniel Hartwig
2012-11-27 12:50 ` Ludovic Courtès
2012-11-27 15:18 ` Daniel Hartwig
2012-11-27 21:43 ` Ludovic Courtès
2013-02-23 8:11 ` bug#12827: [PATCH] Tweak web modules, support relative URIs (was: bug#12827: [2.0.6] web client: fails to parse 404 header) Daniel Hartwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://www.gnu.org/software/guile/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAN3veReUQa=1R1sEqmRW-hJAhodfkyxysavdtzDff537t2nQaA@mail.gmail.com' \
--to=mandyke@gmail.com \
--cc=12827@debbugs.gnu.org \
--cc=ludo@gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).