* bug#12827: [2.0.6] web client: fails to parse 404 header @ 2012-11-07 20:40 Ludovic Courtès 2012-11-08 5:52 ` Daniel Hartwig 2013-02-23 8:11 ` bug#12827: [PATCH] Tweak web modules, support relative URIs (was: bug#12827: [2.0.6] web client: fails to parse 404 header) Daniel Hartwig 0 siblings, 2 replies; 18+ messages in thread From: Ludovic Courtès @ 2012-11-07 20:40 UTC (permalink / raw) To: 12827 scheme@(guile-user)> (use-modules (web client) (web uri)) scheme@(guile-user)> (http-get (string->uri "http://www.gnu.org/does-not-exist")) web/http.scm:191:11: In procedure read-header: web/http.scm:191:11: Bad uri header component: gnu-404.html Entering a new prompt. Type `,bt' for a backtrace or `,q' to continue. scheme@(guile-user) [1]> ,bt In web/client.scm: 127:4 3 (http-get #<<uri> scheme: http userinfo: #f host: "www.gnu.org" port: #f path: "/does-not-exist" query: #f fragment: #f> #:port #<input-output: socket 9> #:version (1 . 1) #:keep-alive? #f #:extra-headers () # #) In web/response.scm: 188:6 2 (read-response #<input-output: socket 9>) In web/http.scm: 225:33 1 (lp ((server . "Apache/2.2.14") (date . #<date nanosecond: 0 second: 12 minute: 36 hour: 20 day: 7 month: 11 year: 2012 zone-offset: 0>))) 191:11 0 (read-header #<input-output: socket 9>) scheme@(guile-user) [1]> ,locals Local variables: $5 = port = #<input-output: socket 9> $6 = line = "Content-Location: gnu-404.html" $7 = delim = 16 $8 = sym = content-location Ludo’. ^ permalink raw reply [flat|nested] 18+ messages in thread
* bug#12827: [2.0.6] web client: fails to parse 404 header 2012-11-07 20:40 bug#12827: [2.0.6] web client: fails to parse 404 header Ludovic Courtès @ 2012-11-08 5:52 ` Daniel Hartwig 2012-11-08 20:10 ` Ludovic Courtès 2013-02-23 8:11 ` bug#12827: [PATCH] Tweak web modules, support relative URIs (was: bug#12827: [2.0.6] web client: fails to parse 404 header) Daniel Hartwig 1 sibling, 1 reply; 18+ messages in thread From: Daniel Hartwig @ 2012-11-08 5:52 UTC (permalink / raw) To: Ludovic Courtès; +Cc: 12827 [-- Attachment #1: Type: text/plain, Size: 1333 bytes --] On 8 November 2012 04:40, Ludovic Courtès <ludo@gnu.org> wrote: > scheme@(guile-user)> (use-modules (web client) (web uri)) > scheme@(guile-user)> (http-get (string->uri "http://www.gnu.org/does-not-exist")) > web/http.scm:191:11: In procedure read-header: > web/http.scm:191:11: Bad uri header component: gnu-404.html Some headers are supposed to support these “URI-references”. I have just updated a patch I worked on a while back that adds support for such things, including “partial-refs”. It is not yet ready for inclusion, some points to consider: * DONE (web uri) support for relative-refs * DONE (web http) support for relative-refs in headers * TODO (resolve-ref uri uri-reference) → absolute-uri * TODO documentation * TODO maybe use more scheme-ish names instead of RFC 3986 This latest RFC makes a clear distinction between an actual URI and a reference to such. I thought it best to reflect that distinction, but maybe it does pollute the namespace a bit: - absolute-uri? → uri-absolute? - relative-ref? → uri-relative? * TODO build-uri validation is broken/less strict and will now pass relative-refs, so maybe introduce build-uri-reference instead Anyway, this may provide a useful base to work from, if not something suitable for inclusion almost immediately. Regards [-- Attachment #2: 0001-uri-reference-support.patch --] [-- Type: application/octet-stream, Size: 11426 bytes --] From fc7f4dbe9fbd855af4d47990de31bcfccb55dfe1 Mon Sep 17 00:00:00 2001 From: Daniel Hartwig <mandyke@gmail.com> Date: Thu, 8 Nov 2012 13:11:52 +0800 Subject: [PATCH] uri-reference support * module/web/uri.scm (build-uri, validate-uri, string->uri, uri->string): Allow uri-scheme to be #f. Any such object is a relative-ref. (absolute-uri?, relative-ref?, uri-reference?): New predicates to distinguish between kinds of URI reference. * module/web/http.scm (declare-uri-header!): Use absolute-uri? to validate these headers. (declare-uri-reference-header!): New type of header accepting any URI-reference. ("Content-Location", "Referer"): Change to URI-reference headers to support relative references. * test-suite/tests/web-uri.test, test-suite/tests/web-http.test: Add tests for the above. --- module/web/http.scm | 13 ++++- module/web/uri.scm | 46 +++++++++++---- test-suite/tests/web-http.test | 4 ++ test-suite/tests/web-uri.test | 125 ++++++++++++++++++++++++++++++++++++++++ 4 files changed, 174 insertions(+), 14 deletions(-) diff --git a/module/web/http.scm b/module/web/http.scm index cc5dd5a..1a54cc6 100644 --- a/module/web/http.scm +++ b/module/web/http.scm @@ -1185,7 +1185,14 @@ treated specially, and is just returned as a plain string." (define (declare-uri-header! name) (declare-header! name (lambda (str) (or (string->uri str) (bad-header-component 'uri str))) - uri? + absolute-uri? + write-uri)) + +;; emacs: (put 'declare-uri-reference-header! 'scheme-indent-function 1) +(define (declare-uri-reference-header! name) + (declare-header! name + (lambda (str) (or (string->uri str) (bad-header-component 'uri-reference str))) + uri-reference? write-uri)) ;; emacs: (put 'declare-quality-list-header! 'scheme-indent-function 1) @@ -1440,7 +1447,7 @@ treated specially, and is just returned as a plain string." ;; Content-Location = ( absoluteURI | relativeURI ) ;; -(declare-uri-header! "Content-Location") +(declare-uri-reference-header! "Content-Location") ;; Content-MD5 = <base64 of 128 bit MD5 digest as per RFC 1864> ;; @@ -1729,7 +1736,7 @@ treated specially, and is just returned as a plain string." ;; Referer = ( absoluteURI | relativeURI ) ;; -(declare-uri-header! "Referer") +(declare-uri-reference-header! "Referer") ;; TE = #( t-codings ) ;; t-codings = "trailers" | ( transfer-extension [ accept-params ] ) diff --git a/module/web/uri.scm b/module/web/uri.scm index 78614a5..da55902 100644 --- a/module/web/uri.scm +++ b/module/web/uri.scm @@ -34,6 +34,7 @@ #:export (uri? uri-scheme uri-userinfo uri-host uri-port uri-path uri-query uri-fragment + absolute-uri? relative-ref? uri-reference? build-uri declare-default-port! @@ -53,6 +54,17 @@ (query uri-query) (fragment uri-fragment)) +(define (absolute-uri? uri) + (and (uri? uri) (uri-scheme uri) #t)) + +(define (relative-ref? uri) + (and (uri? uri) (not (uri-scheme uri)) #t)) + +(define (uri-reference? uri) + (and (or (absolute-uri? uri) + (relative-ref? uri)) + #t)) + (define (uri-error message . args) (throw 'uri-error message args)) @@ -61,7 +73,7 @@ (define (validate-uri scheme userinfo host port path query fragment) (cond - ((not (symbol? scheme)) + ((and scheme (not (symbol? scheme))) (uri-error "Expected a symbol for the URI scheme: ~s" scheme)) ((and (or userinfo port) (not host)) (uri-error "Expected a host, given userinfo or port")) @@ -150,6 +162,17 @@ consistency checks to make sure that the constructed URI is valid." ;;; / path-absolute ;;; / path-rootless ;;; / path-empty +;;; +;;; RFC 3986, #4. +;;; +;;; URI-reference = URI / relative-ref +;;; +;;; relative-ref = relative-part [ "?" query ] [ "#" fragment ] +;;; +;;; relative-part = "//" authority path-abempty +;;; / path-absolute +;;; / path-noscheme +;;; / path-empty (define scheme-pat "[a-zA-Z][a-zA-Z0-9+.-]*") @@ -162,7 +185,7 @@ consistency checks to make sure that the constructed URI is valid." (define fragment-pat ".*") (define uri-pat - (format #f "^(~a):(//~a)?(~a)(\\?(~a))?(#(~a))?$" + (format #f "^((~a):)?(//~a)?(~a)(\\?(~a))?(#(~a))?$" scheme-pat authority-pat path-pat query-pat fragment-pat)) (define uri-regexp (make-regexp uri-pat)) @@ -172,12 +195,12 @@ consistency checks to make sure that the constructed URI is valid." could not be parsed." (% (let ((m (regexp-exec uri-regexp string))) (if (not m) (abort)) - (let ((scheme (string->symbol - (string-downcase (match:substring m 1)))) - (authority (match:substring m 2)) - (path (match:substring m 3)) - (query (match:substring m 5)) - (fragment (match:substring m 7))) + (let ((scheme (let ((s (match:substring m 2))) + (and s (string->symbol (string-downcase s))))) + (authority (match:substring m 3)) + (path (match:substring m 4)) + (query (match:substring m 6)) + (fragment (match:substring m 8))) (call-with-values (lambda () (if authority @@ -206,8 +229,7 @@ printed." (define (uri->string uri) "Serialize @var{uri} to a string." - (let* ((scheme-str (string-append - (symbol->string (uri-scheme uri)) ":")) + (let* ((scheme (uri-scheme uri)) (userinfo (uri-userinfo uri)) (host (uri-host uri)) (port (uri-port uri)) @@ -215,7 +237,9 @@ printed." (query (uri-query uri)) (fragment (uri-fragment uri))) (string-append - scheme-str + (if scheme + (string-append (symbol->string scheme) ":") + "") (if host (string-append "//" (if userinfo (string-append userinfo "@") diff --git a/test-suite/tests/web-http.test b/test-suite/tests/web-http.test index 97f5559..5d80fde 100644 --- a/test-suite/tests/web-http.test +++ b/test-suite/tests/web-http.test @@ -145,6 +145,8 @@ (pass-if-parse content-length "010" 10) (pass-if-parse content-location "http://foo/" (build-uri 'http #:host "foo" #:path "/")) + (pass-if-parse content-location "//foo" + (build-uri #f #:host "foo")) (pass-if-parse content-range "bytes 10-20/*" '(bytes (10 . 20) *)) (pass-if-parse content-range "bytes */*" '(bytes * *)) (pass-if-parse content-range "bytes */30" '(bytes * 30)) @@ -208,6 +210,8 @@ (pass-if-parse range "bytes=-20,-30" '(bytes (#f . 20) (#f . 30))) (pass-if-parse referer "http://foo/bar?baz" (build-uri 'http #:host "foo" #:path "/bar" #:query "baz")) + (pass-if-parse referer "//foo/bar?baz" + (build-uri #f #:host "foo" #:path "/bar" #:query "baz")) (pass-if-parse te "trailers" '((trailers))) (pass-if-parse te "trailers,foo" '((trailers) (foo))) (pass-if-parse user-agent "guile" "guile")) diff --git a/test-suite/tests/web-uri.test b/test-suite/tests/web-uri.test index 3f6e7e3..d4d44d6 100644 --- a/test-suite/tests/web-uri.test +++ b/test-suite/tests/web-uri.test @@ -78,6 +78,24 @@ #:port 22 #:path "/baz")) + (pass-if "foo" + (uri=? (build-uri #f #:path "foo") + #:path "foo")) + + (pass-if "/foo" + (uri=? (build-uri #f #:path "/foo") + #:path "/foo")) + + (pass-if "//foo/bar" + (uri=? (build-uri #f #:host "foo" #:path "/bar") + #:host "foo" + #:path "/bar")) + + (pass-if "?foo" + (uri=? (build-uri #f #:query "foo") + #:path "" + #:query "foo")) + (pass-if-uri-exception "non-symbol scheme" "Expected.*symbol" (build-uri "nonsym")) @@ -123,6 +141,80 @@ "Expected.*host" (build-uri 'http #:userinfo "foo"))) +(with-test-prefix "absolute-uri?" + (pass-if "ftp:" + (absolute-uri? (build-uri 'ftp))) + + (pass-if "ftp:foo" + (absolute-uri? (build-uri 'ftp #:path "foo"))) + + (pass-if "ftp://foo/bar" + (absolute-uri? (build-uri 'ftp #:host "foo" #:path "/bar"))) + + (pass-if "ftp://foo@bar:22/baz" + (absolute-uri? (build-uri 'ftp #:userinfo "foo" #:host "bar" #:port 22 #:path "/baz"))) + + (expect-fail "foo" + (absolute-uri? (build-uri #f #:path "foo"))) + + (expect-fail "/foo" + (absolute-uri? (build-uri #f #:path "/foo"))) + + (expect-fail "//foo/bar" + (absolute-uri? (build-uri #f #:host "foo" #:path "/bar"))) + + (expect-fail "?foo" + (absolute-uri? (build-uri #f #:query "foo")))) + +(with-test-prefix "relative-ref?" + (expect-fail "ftp:" + (relative-ref? (build-uri 'ftp))) + + (expect-fail "ftp:foo" + (relative-ref? (build-uri 'ftp #:path "foo"))) + + (expect-fail "ftp://foo/bar" + (relative-ref? (build-uri 'ftp #:host "foo" #:path "/bar"))) + + (expect-fail "ftp://foo@bar:22/baz" + (relative-ref? (build-uri 'ftp #:userinfo "foo" #:host "bar" #:port 22 #:path "/baz"))) + + (pass-if "foo" + (relative-ref? (build-uri #f #:path "foo"))) + + (pass-if "/foo" + (relative-ref? (build-uri #f #:path "/foo"))) + + (pass-if "//foo/bar" + (relative-ref? (build-uri #f #:host "foo" #:path "/bar"))) + + (pass-if "?foo" + (relative-ref? (build-uri #f #:query "foo")))) + +(with-test-prefix "uri-reference?" + (pass-if "ftp:" + (uri-reference? (build-uri 'ftp))) + + (pass-if "ftp:foo" + (uri-reference? (build-uri 'ftp #:path "foo"))) + + (pass-if "ftp://foo/bar" + (uri-reference? (build-uri 'ftp #:host "foo" #:path "/bar"))) + + (pass-if "ftp://foo@bar:22/baz" + (uri-reference? (build-uri 'ftp #:userinfo "foo" #:host "bar" #:port 22 #:path "/baz"))) + + (pass-if "foo" + (uri-reference? (build-uri #f #:path "foo"))) + + (pass-if "/foo" + (uri-reference? (build-uri #f #:path "/foo"))) + + (pass-if "//foo/bar" + (uri-reference? (build-uri #f #:host "foo" #:path "/bar"))) + + (pass-if "?foo" + (uri-reference? (build-uri #f #:query "foo")))) (with-test-prefix "string->uri" (pass-if "ftp:" @@ -149,6 +241,24 @@ #:port 22 #:path "/baz")) + (pass-if "foo" + (uri=? (string->uri "foo") + #:path "foo")) + + (pass-if "/foo" + (uri=? (string->uri "/foo") + #:path "/foo")) + + (pass-if "//foo/bar" + (uri=? (string->uri "//foo/bar") + #:host "foo" + #:path "/bar")) + + (pass-if "?foo" + (uri=? (string->uri "?foo") + #:path "" + #:query "foo")) + (pass-if "http://bad.host.1" (not (string->uri "http://bad.host.1"))) @@ -229,6 +339,21 @@ (equal? "ftp://foo@bar:22/baz" (uri->string (string->uri "ftp://foo@bar:22/baz")))) + (pass-if "foo" + (equal? "foo" + (uri->string (string->uri "foo")))) + + (pass-if "/foo" + (equal? "/foo" (uri->string (string->uri "/foo")))) + + (pass-if "//foo/bar" + (equal? "//foo/bar" + (uri->string (string->uri "//foo/bar")))) + + (pass-if "?foo" + (equal? "?foo" + (uri->string (string->uri "?foo")))) + (when (memq 'socket *features*) (pass-if "http://192.0.2.1" (equal? "http://192.0.2.1" -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* bug#12827: [2.0.6] web client: fails to parse 404 header 2012-11-08 5:52 ` Daniel Hartwig @ 2012-11-08 20:10 ` Ludovic Courtès 2012-11-09 0:39 ` Daniel Hartwig 0 siblings, 1 reply; 18+ messages in thread From: Ludovic Courtès @ 2012-11-08 20:10 UTC (permalink / raw) To: Daniel Hartwig; +Cc: 12827 Hi Daniel, Thanks for the quick reply and patch! Daniel Hartwig <mandyke@gmail.com> skribis: > On 8 November 2012 04:40, Ludovic Courtès <ludo@gnu.org> wrote: >> scheme@(guile-user)> (use-modules (web client) (web uri)) >> scheme@(guile-user)> (http-get (string->uri "http://www.gnu.org/does-not-exist")) >> web/http.scm:191:11: In procedure read-header: >> web/http.scm:191:11: Bad uri header component: gnu-404.html > > Some headers are supposed to support these “URI-references”. Oh, OK. > * TODO maybe use more scheme-ish names instead of RFC 3986 > This latest RFC makes a clear distinction between an actual URI > and a reference to such. I thought it best to reflect that > distinction, but maybe it does pollute the namespace a bit: > - absolute-uri? → uri-absolute? > - relative-ref? → uri-relative? No, I’d keep the names on the left. Name space control is best left to ‘use-modules’ forms, anyway. > * TODO build-uri validation is broken/less strict and will now pass > relative-refs, so maybe introduce build-uri-reference instead Yes. Should uri-reference be a disjoint type, then? > -(declare-uri-header! "Content-Location") > +(declare-uri-reference-header! "Content-Location") OK. > ;; Content-MD5 = <base64 of 128 bit MD5 digest as per RFC 1864> > ;; > @@ -1729,7 +1736,7 @@ treated specially, and is just returned as a plain string." > > ;; Referer = ( absoluteURI | relativeURI ) > ;; > -(declare-uri-header! "Referer") > +(declare-uri-reference-header! "Referer") Should actually be “Referrer”, no? > +(define (absolute-uri? uri) > + (and (uri? uri) (uri-scheme uri) #t)) Eventually, we’ll need docstrings, and updated documentation. Thanks, Ludo’. ^ permalink raw reply [flat|nested] 18+ messages in thread
* bug#12827: [2.0.6] web client: fails to parse 404 header 2012-11-08 20:10 ` Ludovic Courtès @ 2012-11-09 0:39 ` Daniel Hartwig 2012-11-09 20:52 ` Ludovic Courtès 2012-11-23 22:19 ` Ludovic Courtès 0 siblings, 2 replies; 18+ messages in thread From: Daniel Hartwig @ 2012-11-09 0:39 UTC (permalink / raw) To: Ludovic Courtès; +Cc: 12827 On 9 November 2012 04:10, Ludovic Courtès <ludo@gnu.org> wrote: >> * TODO build-uri validation is broken/less strict and will now pass >> relative-refs, so maybe introduce build-uri-reference instead > > Yes. Should uri-reference be a disjoint type, then? It needn't be, as long as there are predicates to distinguish. (Actually, since <uri> is internal, maybe we should only expose the new predicates, and keep “uri?” internal also). The build-uri validation works on the values before the <uri> object is constructed, so I was just thinking of a separate build method with different, less strict validation. We just have to think of <uri> and uri? as guile implementation details, not RFC. Another option, is to rename <uri> to <uri-reference>. Then uri? can mean the same as absolute-uri? (as per the RFC). >> @@ -1729,7 +1736,7 @@ treated specially, and is just returned as a plain string." >> >> ;; Referer = ( absoluteURI | relativeURI ) >> ;; >> -(declare-uri-header! "Referer") >> +(declare-uri-reference-header! "Referer") > > Should actually be “Referrer”, no? This is the actual spelling used in the RFC. > Eventually, we’ll need docstrings, and updated documentation. Yes. I lazily left that until the other parts are finalized. Let me tackle the remaining items over the next week. If we had those docs and the naming is ok, this patch is enough to support reading the HTTP headers. Users of http-get should be sure to pass only an absolute-uri. The missing function to resolve a relative-ref to an absolute-uri is not required for reading or writing headers, or using http-get, so that can come later (maybe I get this week anyway). Regards ^ permalink raw reply [flat|nested] 18+ messages in thread
* bug#12827: [2.0.6] web client: fails to parse 404 header 2012-11-09 0:39 ` Daniel Hartwig @ 2012-11-09 20:52 ` Ludovic Courtès 2012-11-10 1:45 ` Daniel Hartwig 2012-11-23 22:19 ` Ludovic Courtès 1 sibling, 1 reply; 18+ messages in thread From: Ludovic Courtès @ 2012-11-09 20:52 UTC (permalink / raw) To: Daniel Hartwig; +Cc: 12827 Hi Daniel, Daniel Hartwig <mandyke@gmail.com> skribis: > On 9 November 2012 04:10, Ludovic Courtès <ludo@gnu.org> wrote: >>> * TODO build-uri validation is broken/less strict and will now pass >>> relative-refs, so maybe introduce build-uri-reference instead >> >> Yes. Should uri-reference be a disjoint type, then? > > It needn't be, as long as there are predicates to distinguish. > (Actually, since <uri> is internal, maybe we should only expose the > new predicates, and keep “uri?” internal also). I’m fine with keeping <uri> internal, but ‘uri?’ is public and must remain so. Anyway, I think it’s fine if the documentation makes it clear whether functions expect absolute or relative URIs. WDYT? > The build-uri validation works on the values before the <uri> object > is constructed, so I was just thinking of a separate build method with > different, less strict validation. > > We just have to think of <uri> and uri? as guile implementation > details, not RFC. Another option, is to rename <uri> to > <uri-reference>. Then uri? can mean the same as absolute-uri? (as per > the RFC). Out current URI objects are actually absolute URI references, right? In that case, we’ll indeed have to make ‘uri?’ synonymous with ‘absolute-uri?’, for backward compatibility. >>> @@ -1729,7 +1736,7 @@ treated specially, and is just returned as a plain string." >>> >>> ;; Referer = ( absoluteURI | relativeURI ) >>> ;; >>> -(declare-uri-header! "Referer") >>> +(declare-uri-reference-header! "Referer") >> >> Should actually be “Referrer”, no? > > This is the actual spelling used in the RFC. Ouch. >> Eventually, we’ll need docstrings, and updated documentation. > > Yes. I lazily left that until the other parts are finalized. Let me > tackle the remaining items over the next week. Yes, sure. Thanks! Ludo’. ^ permalink raw reply [flat|nested] 18+ messages in thread
* bug#12827: [2.0.6] web client: fails to parse 404 header 2012-11-09 20:52 ` Ludovic Courtès @ 2012-11-10 1:45 ` Daniel Hartwig 2012-11-10 13:52 ` Ludovic Courtès 0 siblings, 1 reply; 18+ messages in thread From: Daniel Hartwig @ 2012-11-10 1:45 UTC (permalink / raw) To: Ludovic Courtès; +Cc: 12827 On 10 November 2012 04:52, Ludovic Courtès <ludo@gnu.org> wrote: > Anyway, I think it’s fine if the documentation makes it clear whether > functions expect absolute or relative URIs. WDYT? Yes. With the new predicates, it should be clear enough to use the (pseudo-)type names in the usual scheme-doc way: -- Scheme Procedure: uri-resolve base-uri uri-reference and not need to repeat too much in the prose. Of course, doing so when appropriate. I'll try to draft something sensible. > >> The build-uri validation works on the values before the <uri> object >> is constructed, so I was just thinking of a separate build method with >> different, less strict validation. >> >> We just have to think of <uri> and uri? as guile implementation >> details, not RFC. Another option, is to rename <uri> to >> <uri-reference>. Then uri? can mean the same as absolute-uri? (as per >> the RFC). > > Out current URI objects are actually absolute URI references, right? In > that case, we’ll indeed have to make ‘uri?’ synonymous with > ‘absolute-uri?’, for backward compatibility. More-or-less, the only exception being when validation is disabled: scheme@(guile-user)> (uri? (build-uri #f #:path "foo" #:validate? #f)) $1 = #t that object has no scheme, and is not an absolute-uri. This is a bit of an edge case. The current documentation only defines a URI as an absolute-uri and does not talk about anything else. Most functions (uri->string, etc.) will not work when passed something without a scheme. So I think your suggestion is ok as any users of the current API will most certainly be using only absolute-uri's. Regards ^ permalink raw reply [flat|nested] 18+ messages in thread
* bug#12827: [2.0.6] web client: fails to parse 404 header 2012-11-10 1:45 ` Daniel Hartwig @ 2012-11-10 13:52 ` Ludovic Courtès 0 siblings, 0 replies; 18+ messages in thread From: Ludovic Courtès @ 2012-11-10 13:52 UTC (permalink / raw) To: Daniel Hartwig; +Cc: 12827 Hi, Daniel Hartwig <mandyke@gmail.com> skribis: > On 10 November 2012 04:52, Ludovic Courtès <ludo@gnu.org> wrote: >> Anyway, I think it’s fine if the documentation makes it clear whether >> functions expect absolute or relative URIs. WDYT? > > Yes. With the new predicates, it should be clear enough to use the > (pseudo-)type names in the usual scheme-doc way: > > -- Scheme Procedure: uri-resolve base-uri uri-reference > > and not need to repeat too much in the prose. Of course, doing so > when appropriate. I'll try to draft something sensible. Yes. >>> The build-uri validation works on the values before the <uri> object >>> is constructed, so I was just thinking of a separate build method with >>> different, less strict validation. >>> >>> We just have to think of <uri> and uri? as guile implementation >>> details, not RFC. Another option, is to rename <uri> to >>> <uri-reference>. Then uri? can mean the same as absolute-uri? (as per >>> the RFC). >> >> Out current URI objects are actually absolute URI references, right? In >> that case, we’ll indeed have to make ‘uri?’ synonymous with >> ‘absolute-uri?’, for backward compatibility. > > More-or-less, the only exception being when validation is disabled: > > scheme@(guile-user)> (uri? (build-uri #f #:path "foo" #:validate? #f)) > $1 = #t > > that object has no scheme, and is not an absolute-uri. This is a bit > of an edge case. Yes, but when the user sets #:validate? to #f, then they take the responsibility for anything that will happen. IOW, #:validate? #f allows users to forge broken URI objects, but that’s part of the contract anyway. > The current documentation only defines a URI as an absolute-uri and > does not talk about anything else. Most functions (uri->string, etc.) > will not work when passed something without a scheme. So I think your > suggestion is ok as any users of the current API will most certainly > be using only absolute-uri's. Good. So that means that URI refs can be added without introducing any incompatibility. Thanks, Ludo’. ^ permalink raw reply [flat|nested] 18+ messages in thread
* bug#12827: [2.0.6] web client: fails to parse 404 header 2012-11-09 0:39 ` Daniel Hartwig 2012-11-09 20:52 ` Ludovic Courtès @ 2012-11-23 22:19 ` Ludovic Courtès 2012-11-24 11:23 ` Daniel Hartwig 1 sibling, 1 reply; 18+ messages in thread From: Ludovic Courtès @ 2012-11-23 22:19 UTC (permalink / raw) To: Daniel Hartwig; +Cc: 12827 Hi Daniel, Daniel Hartwig <mandyke@gmail.com> skribis: >> Eventually, we’ll need docstrings, and updated documentation. > > Yes. I lazily left that until the other parts are finalized. Let me > tackle the remaining items over the next week. Any update on that? The plan is to release 2.0.7 next week, so it’d be great if this could be in. Thanks, Ludo’. ^ permalink raw reply [flat|nested] 18+ messages in thread
* bug#12827: [2.0.6] web client: fails to parse 404 header 2012-11-23 22:19 ` Ludovic Courtès @ 2012-11-24 11:23 ` Daniel Hartwig 2012-11-24 15:10 ` Ludovic Courtès 0 siblings, 1 reply; 18+ messages in thread From: Daniel Hartwig @ 2012-11-24 11:23 UTC (permalink / raw) To: Ludovic Courtès; +Cc: 12827 [-- Attachment #1: Type: text/plain, Size: 686 bytes --] On 24 November 2012 06:19, Ludovic Courtès <ludo@gnu.org> wrote: > Any update on that? The plan is to release 2.0.7 next week, so it’d be > great if this could be in. I have made a first attempt at the doc strings and manual. This involved first syncronizing the two, as only the manual had been receiving updates. Some more tweaking to the code. Personally I am not 100% on this, but I attach it for comment anyway. I will not be able to work on it again for a short while. A quick solution may be to silently introduce just enough to fix the current bug, and worry about the extra predicates, uri-record-type vs. rfc-definition-of-uri, etc. later. Regards [-- Attachment #2: 0001-syncronize-web-module-docstrings-with-manual.patch --] [-- Type: application/octet-stream, Size: 21066 bytes --] From fcc01b345b93d7a75980d7607684e4a5b3243daa Mon Sep 17 00:00:00 2001 From: Daniel Hartwig <mandyke@gmail.com> Date: Sat, 24 Nov 2012 14:10:12 +0800 Subject: [PATCH 1/3] syncronize web module docstrings with manual * doc/ref/web.texi: Fix spacing. Update with a few missing function descriptions. * module/web/client.scm: * module/web/http.scm: * module/web/request.scm: * module/web/server.scm: * module/web/uri.scm: Update docstrings from manual. --- doc/ref/web.texi | 35 ++++++++++++++--------------- module/web/client.scm | 9 ++++++++ module/web/http.scm | 41 ++++++++++++++++------------------ module/web/request.scm | 14 ++++++++---- module/web/server.scm | 10 ++++----- module/web/uri.scm | 57 +++++++++++++++++++++++++++++------------------- 6 files changed, 96 insertions(+), 70 deletions(-) diff --git a/doc/ref/web.texi b/doc/ref/web.texi index 161a28d..e6e594e 100644 --- a/doc/ref/web.texi +++ b/doc/ref/web.texi @@ -431,8 +431,8 @@ from @code{header-writer}. @end deffn @deffn {Scheme Procedure} read-headers port -Read the headers of an HTTP message from @var{port}, returning the -headers as an ordered alist. +Read the headers of an HTTP message from @var{port}, returning them +as an ordered alist. @end deffn @deffn {Scheme Procedure} write-headers headers port @@ -1368,6 +1368,7 @@ Return the given response header, or @var{default} if none was present. the lower-level HTTP, request, and response modules. @deffn {Scheme Procedure} open-socket-for-uri uri +Return an open input/output port for a connection to URI. @end deffn @deffn {Scheme Procedure} http-get uri [#:port=(open-socket-for-uri uri)] [#:version='(1 . 1)] [#:keep-alive?=#f] [#:extra-headers='()] [#:decode-body?=#t] @@ -1470,17 +1471,17 @@ the server socket. A user may define a server implementation with the following form: -@deffn {Scheme Procedure} define-server-impl name open read write close +@deffn {Scheme Syntax} define-server-impl name open read write close Make a @code{<server-impl>} object with the hooks @var{open}, @var{read}, @var{write}, and @var{close}, and bind it to the symbol @var{name} in the current module. @end deffn @deffn {Scheme Procedure} lookup-server-impl impl -Look up a server implementation. If @var{impl} is a server -implementation already, it is returned directly. If it is a symbol, the +Look up a server implementation. If @var{impl} is a server +implementation already, it is returned directly. If it is a symbol, the binding named @var{impl} in the @code{(web server @var{impl})} module is -looked up. Otherwise an error is signaled. +looked up. Otherwise an error is signaled. Currently a server implementation is a somewhat opaque type, useful only for passing to other procedures in this module, like @code{read-client}. @@ -1494,7 +1495,7 @@ any access to the impl objects. @deffn {Scheme Procedure} open-server impl open-params Open a server for the given implementation. Return one value, the new -server object. The implementation's @code{open} procedure is applied to +server object. The implementation's @code{open} procedure is applied to @var{open-params}, which should be a list. @end deffn @@ -1502,7 +1503,7 @@ server object. The implementation's @code{open} procedure is applied to Read a new client from @var{server}, by applying the implementation's @code{read} procedure to the server. If successful, return three values: an object corresponding to the client, a request object, and the -request body. If any exception occurs, return @code{#f} for all three +request body. If any exception occurs, return @code{#f} for all three values. @end deffn @@ -1513,9 +1514,9 @@ The response and response body are produced by calling the given @var{handler} with @var{request} and @var{body} as arguments. The elements of @var{state} are also passed to @var{handler} as -arguments, and may be returned as additional values. The new +arguments, and may be returned as additional values. The new @var{state}, collected from the @var{handler}'s return values, is then -returned as a list. The idea is that a server loop receives a handler +returned as a list. The idea is that a server loop receives a handler from the user, along with whatever state values the user is interested in, allowing the user's handler to explicitly manage its state. @end deffn @@ -1526,20 +1527,20 @@ given request. As a convenience to web handler authors, @var{response} may be given as an alist of headers, in which case it is used to construct a default -response. Ensures that the response version corresponds to the request -version. If @var{body} is a string, encodes the string to a bytevector, -in an encoding appropriate for @var{response}. Adds a +response. Ensures that the response version corresponds to the request +version. If @var{body} is a string, encodes the string to a bytevector, +in an encoding appropriate for @var{response}. Adds a @code{content-length} and @code{content-type} header, as necessary. If @var{body} is a procedure, it is called with a port as an argument, -and the output collected as a bytevector. In the future we might try to +and the output collected as a bytevector. In the future we might try to instead use a compressing, chunk-encoded port, and call this procedure -later, in the write-client procedure. Authors are advised not to rely on +later, in the write-client procedure. Authors are advised not to rely on the procedure being called at any particular time. @end deffn @deffn {Scheme Procedure} write-client impl server client response body -Write an HTTP response and body to @var{client}. If the server and +Write an HTTP response and body to @var{client}. If the server and client support persistent connections, it is the implementation's responsibility to keep track of the client thereafter, presumably by attaching it to the @var{server} argument somehow. @@ -1572,7 +1573,7 @@ before sending back to the client. Additional arguments to @var{handler} are taken from @var{state}. Additional return values are accumulated into a new @var{state}, which -will be used for subsequent requests. In this way a handler can +will be used for subsequent requests. In this way a handler can explicitly manage its state. @end deffn diff --git a/module/web/client.scm b/module/web/client.scm index cf7ea53..0991373 100644 --- a/module/web/client.scm +++ b/module/web/client.scm @@ -115,6 +115,15 @@ (define* (http-get uri #:key (port (open-socket-for-uri uri)) (version '(1 . 1)) (keep-alive? #f) (extra-headers '()) (decode-body? #t)) + "Connect to the server corresponding to @var{uri} and ask for the +resource, using the @code{GET} method. If you already have a port open, +pass it as @var{port}. The port will be closed at the end of the +request unless @var{keep-alive?} is true. Any extra headers in the +alist @var{extra-headers} will be added to the request. + +If @var{decode-body?} is true, as is the default, the body of the +response will be decoded to string, if it is a textual content-type. +Otherwise it will be returned as a bytevector." (let ((req (build-request uri #:version version #:headers (if keep-alive? extra-headers diff --git a/module/web/http.scm b/module/web/http.scm index cc5dd5a..3b78d08 100644 --- a/module/web/http.scm +++ b/module/web/http.scm @@ -100,12 +100,7 @@ validator writer #:key multiple?) - "Define a parser, validator, and writer for the HTTP header, @var{name}. - -@var{parser} should be a procedure that takes a string and returns a -Scheme value. @var{validator} is a predicate for whether the given -Scheme value is valid for this header. @var{writer} takes a value and a -port, and writes the value to the port." + "Declare a parser, validator, and writer for a given header." (if (and (string? name) parser validator writer) (let ((decl (make-header-decl name parser validator writer multiple?))) (hashq-set! *declared-headers* (string->header name) decl) @@ -120,27 +115,33 @@ port, and writes the value to the port." (string-titlecase (symbol->string sym))))) (define (known-header? sym) - "Return @code{#t} if there are parsers and writers registered for this -header, otherwise @code{#f}." + "Return @code{#t} iff @var{sym} is a known header, with associated +parsers and serialization procedures." (and (lookup-header-decl sym) #t)) (define (header-parser sym) - "Returns a procedure to parse values for the given header." + "Return the value parser for headers named @var{sym}. The result is a +procedure that takes one argument, a string, and returns the parsed +value. If the header isn't known to Guile, a default parser is returned +that passes through the string unchanged." (let ((decl (lookup-header-decl sym))) (if decl (header-decl-parser decl) (lambda (x) x)))) (define (header-validator sym) - "Returns a procedure to validate values for the given header." + "Return a predicate which returns @code{#t} if the given value is valid +for headers named @var{sym}. The default validator for unknown headers +is @code{string?}." (let ((decl (lookup-header-decl sym))) (if decl (header-decl-validator decl) string?))) (define (header-writer sym) - "Returns a procedure to write values for the given header to a given -port." + "Return a procedure that writes values for headers named @var{sym} to a +port. The resulting procedure takes two arguments: a value and a port. +The default writer is @code{display}." (let ((decl (lookup-header-decl sym))) (if decl (header-decl-writer decl) @@ -196,10 +197,7 @@ body was reached (i.e., a blank line)." (define (parse-header sym val) "Parse @var{val}, a string, with the parser registered for the header -named @var{sym}. - -Returns the parsed value. If a parser was not found, the value is -returned as a string." +named @var{sym}. Returns the parsed value." ((header-parser sym) val)) (define (valid-header? sym val) @@ -210,17 +208,16 @@ header with name @var{sym}." (error "header name not a symbol" sym))) (define (write-header sym val port) - "Writes the given header name and value to @var{port}. If @var{sym} -is a known header, uses the specific writer registered for that header. -Otherwise the value is written using @code{display}." + "Write the given header name and value to @var{port}, using the writer +from @code{header-writer}." (display (header->string sym) port) (display ": " port) ((header-writer sym) val port) (display "\r\n" port)) (define (read-headers port) - "Read an HTTP message from @var{port}, returning the headers as an -ordered alist." + "Read the headers of an HTTP message from @var{port}, returning them +as an ordered alist." (let lp ((headers '())) (call-with-values (lambda () (read-header port)) (lambda (k v) @@ -230,7 +227,7 @@ ordered alist." (define (write-headers headers port) "Write the given header alist to @var{port}. Doesn't write the final -\\r\\n, as the user might want to add another header." +@samp{\\r\\n}, as the user might want to add another header." (let lp ((headers headers)) (if (pair? headers) (begin diff --git a/module/web/request.scm b/module/web/request.scm index 40d4a66..51ef473 100644 --- a/module/web/request.scm +++ b/module/web/request.scm @@ -195,7 +195,11 @@ metadata, @var{meta}. As a side effect, sets the encoding on @var{port} to ISO-8859-1 (latin-1), so that reading one character reads one byte. See the discussion of character sets in \"HTTP Requests\" in the manual, for -more information." +more information. + +Note that the body is not part of the request. Once you have read a +request, you may read the body separately, and likewise for writing +requests." (set-port-encoding! port "ISO-8859-1") (call-with-values (lambda () (read-request-line port)) (lambda (method uri version) @@ -205,7 +209,7 @@ more information." (define (write-request r port) "Write the given HTTP request to @var{port}. -Returns a new request, whose @code{request-port} will continue writing +Return a new request, whose @code{request-port} will continue writing on @var{port}, perhaps using some transfer encoding." (write-request-line (request-method r) (request-uri r) (request-version r) port) @@ -217,8 +221,8 @@ on @var{port}, perhaps using some transfer encoding." (request-headers r) (request-meta r) port))) (define (read-request-body r) - "Reads the request body from @var{r}, as a bytevector. Returns -@code{#f} if there was no request body." + "Reads the request body from @var{r}, as a bytevector. Return @code{#f} +if there was no request body." (let ((nbytes (request-content-length r))) (and nbytes (let ((bv (get-bytevector-n (request-port r) nbytes))) @@ -297,6 +301,8 @@ request @var{r}." ;; Misc accessors (define* (request-absolute-uri r #:optional default-host default-port) + "A helper routine to determine the absolute URI of a request, using the +@code{host} header and the default host and port." (let ((uri (request-uri r))) (if (uri-host uri) uri diff --git a/module/web/server.scm b/module/web/server.scm index 42887e6..ed88329 100644 --- a/module/web/server.scm +++ b/module/web/server.scm @@ -143,7 +143,7 @@ for passing to other procedures in this module, like ;; -> server (define (open-server impl open-params) - "Open a server for the given implementation. Returns one value, the + "Open a server for the given implementation. Return one value, the new server object. The implementation's @code{open} procedure is applied to @var{open-params}, which should be a list." (apply (server-impl-open impl) open-params)) @@ -151,9 +151,9 @@ applied to @var{open-params}, which should be a list." ;; -> (client request body | #f #f #f) (define (read-client impl server) "Read a new client from @var{server}, by applying the implementation's -@code{read} procedure to the server. If successful, returns three +@code{read} procedure to the server. If successful, return three values: an object corresponding to the client, a request object, and the -request body. If any exception occurs, returns @code{#f} for all three +request body. If any exception occurs, return @code{#f} for all three values." (call-with-error-handling (lambda () @@ -364,7 +364,7 @@ attaching it to the @var{server} argument somehow." ;; -> new-state (define (serve-one-client handler impl server state) "Read one request from @var{server}, call @var{handler} on the request -and body, and write the response to the client. Returns the new state +and body, and write the response to the client. Return the new state produced by the handler procedure." (debug-elapsed 'serve-again) (call-with-values @@ -404,7 +404,7 @@ The response and body will be run through @code{sanitize-response} before sending back to the client. Additional arguments to @var{handler} are taken from -@var{state}. Additional return values are accumulated into a new +@var{state}. Additional return values are accumulated into a new @var{state}, which will be used for subsequent requests. In this way a handler can explicitly manage its state. diff --git a/module/web/uri.scm b/module/web/uri.scm index 78614a5..ddab7be 100644 --- a/module/web/uri.scm +++ b/module/web/uri.scm @@ -79,8 +79,10 @@ (define* (build-uri scheme #:key userinfo host port (path "") query fragment (validate? #t)) - "Construct a URI object. If @var{validate?} is true, also run some -consistency checks to make sure that the constructed URI is valid." + "Construct a URI object. @var{scheme} should be a symbol, and the rest +of the fields are either strings or @code{#f}. If @var{validate?} is +true, also run some consistency checks to make sure that the constructed +URI is valid." (if validate? (validate-uri scheme userinfo host port path query fragment)) (make-uri scheme userinfo host port path query fragment)) @@ -168,7 +170,7 @@ consistency checks to make sure that the constructed URI is valid." (make-regexp uri-pat)) (define (string->uri string) - "Parse @var{string} into a URI object. Returns @code{#f} if the string + "Parse @var{string} into a URI object. Return @code{#f} if the string could not be parsed." (% (let ((m (regexp-exec uri-regexp string))) (if (not m) (abort)) @@ -191,10 +193,7 @@ could not be parsed." (define *default-ports* (make-hash-table)) (define (declare-default-port! scheme port) - "Declare a default port for the given URI scheme. - -Default ports are for printing URI objects: a default port is not -printed." + "Declare a default port for the given URI scheme." (hashq-set! *default-ports* scheme port)) (define (default-port? scheme port) @@ -205,7 +204,9 @@ printed." (declare-default-port! 'https 443) (define (uri->string uri) - "Serialize @var{uri} to a string." + "Serialize @var{uri} to a string. If the URI has a port that is the +default port for its scheme, the port is not included in the +serialization." (let* ((scheme-str (string-append (symbol->string (uri-scheme uri)) ":")) (userinfo (uri-userinfo uri)) @@ -293,18 +294,24 @@ printed." (string->char-set "0123456789abcdefABCDEF")) (define* (uri-decode str #:key (encoding "utf-8")) - "Percent-decode the given @var{str}, according to @var{encoding}. + "Percent-decode the given @var{str}, according to @var{encoding}, +which should be the name of a character encoding. Note that this function should not generally be applied to a full URI string. For paths, use split-and-decode-uri-path instead. For query strings, split the query on @code{&} and @code{=} boundaries, and decode the components separately. -Note that percent-encoded strings encode @emph{bytes}, not characters. -There is no guarantee that a given byte sequence is a valid string -encoding. Therefore this routine may signal an error if the decoded -bytes are not valid for the given encoding. Pass @code{#f} for -@var{encoding} if you want decoded bytes as a bytevector directly." +Note also that percent-encoded strings encode @emph{bytes}, not +characters. There is no guarantee that a given byte sequence is a valid +string encoding. Therefore this routine may signal an error if the +decoded bytes are not valid for the given encoding. Pass @code{#f} for +@var{encoding} if you want decoded bytes as a bytevector directly. +@xref{Ports, @code{set-port-encoding!}}, for more information on +character encodings. + +Returns a string of the decoded characters, or a bytevector if +@var{encoding} was @code{#f}." (let* ((len (string-length str)) (bv (call-with-output-bytevector* @@ -358,10 +365,13 @@ bytes are not valid for the given encoding. Pass @code{#f} for ;; (define* (uri-encode str #:key (encoding "utf-8") (unescaped-chars unreserved-chars)) - "Percent-encode any character not in the character set, @var{unescaped-chars}. + "Percent-encode any character not in the character set, +@var{unescaped-chars}. -Percent-encoding first writes out the given character to a bytevector -within the given @var{encoding}, then encodes each byte as +The default character set includes alphanumerics from ASCII, as well as +the special characters @samp{-}, @samp{.}, @samp{_}, and @samp{~}. Any +other character will be percent-encoded, by writing out the character to +a bytevector within the given @var{encoding}, then encoding each byte as @code{%@var{HH}}, where @var{HH} is the hexadecimal representation of the byte." (define (needs-escaped? ch) @@ -387,15 +397,18 @@ the byte." str)) (define (split-and-decode-uri-path path) - "Split @var{path} into its components, and decode each -component, removing empty components. + "Split @var{path} into its components, and decode each component, +removing empty components. -For example, @code{\"/foo/bar/\"} decodes to the two-element list, -@code{(\"foo\" \"bar\")}." +For example, @code{\"/foo/bar%20baz/\"} decodes to the two-element list, +@code{(\"foo\" \"bar baz\")}." (filter (lambda (x) (not (string-null? x))) (map uri-decode (string-split path #\/)))) (define (encode-and-join-uri-path parts) "URI-encode each element of @var{parts}, which should be a list of -strings, and join the parts together with @code{/} as a delimiter." +strings, and join the parts together with @code{/} as a delimiter. + +For example, the list @code{(\"scrambled eggs\" \"biscuits&gravy\")} +encodes as @code{\"scrambled%20eggs/biscuits%26gravy\"}." (string-join (map uri-encode parts) "/")) -- 1.7.10.4 [-- Attachment #3: 0002-web-uri-document-that-uri-port-is-an-integer.patch --] [-- Type: application/octet-stream, Size: 2827 bytes --] From 50c5235ca50e356acfad709ae820c2963a8ff11c Mon Sep 17 00:00:00 2001 From: Daniel Hartwig <mandyke@gmail.com> Date: Sat, 24 Nov 2012 15:11:21 +0800 Subject: [PATCH 2/3] (web uri): document that uri-port is an integer * doc/ref/web.texi (URIs): * module/web/uri.scm (build-uri): Document that uri-port is an integer. --- doc/ref/web.texi | 13 +++++++------ module/web/uri.scm | 9 +++++---- 2 files changed, 12 insertions(+), 10 deletions(-) diff --git a/doc/ref/web.texi b/doc/ref/web.texi index e6e594e..3e93bea 100644 --- a/doc/ref/web.texi +++ b/doc/ref/web.texi @@ -209,10 +209,11 @@ access to them. @deffn {Scheme Procedure} build-uri scheme [#:userinfo=@code{#f}] [#:host=@code{#f}] @ [#:port=@code{#f}] [#:path=@code{""}] [#:query=@code{#f}] @ [#:fragment=@code{#f}] [#:validate?=@code{#t}] -Construct a URI object. @var{scheme} should be a symbol, and the rest -of the fields are either strings or @code{#f}. If @var{validate?} is -true, also run some consistency checks to make sure that the constructed -URI is valid. +Construct a URI object. @var{scheme} should be a symbol, @var{port} +either a positive, exact integer or @code{#f}, and the rest of the +fields are either strings or @code{#f}. If @var{validate?} is true, +also run some consistency checks to make sure that the constructed URI +is valid. @end deffn @deffn {Scheme Procedure} uri? x @@ -224,8 +225,8 @@ URI is valid. @deffnx {Scheme Procedure} uri-query uri @deffnx {Scheme Procedure} uri-fragment uri A predicate and field accessors for the URI record type. The URI scheme -will be a symbol, and the rest either strings or @code{#f} if not -present. +will be a symbol, the port either a positive, exact integer or @code{#f}, +and the rest either strings or @code{#f} if not present. @end deffn @deffn {Scheme Procedure} string->uri string diff --git a/module/web/uri.scm b/module/web/uri.scm index ddab7be..e84bc03 100644 --- a/module/web/uri.scm +++ b/module/web/uri.scm @@ -79,10 +79,11 @@ (define* (build-uri scheme #:key userinfo host port (path "") query fragment (validate? #t)) - "Construct a URI object. @var{scheme} should be a symbol, and the rest -of the fields are either strings or @code{#f}. If @var{validate?} is -true, also run some consistency checks to make sure that the constructed -URI is valid." + "Construct a URI object. @var{scheme} should be a symbol, @var{port} +either a positive, exact integer or @code{#f}, and the rest of the +fields are either strings or @code{#f}. If @var{validator?} is true, +also run some consistency checks to make sure that the constructed URI +is valid." (if validate? (validate-uri scheme userinfo host port path query fragment)) (make-uri scheme userinfo host port path query fragment)) -- 1.7.10.4 [-- Attachment #4: 0003-uri-reference-support.patch --] [-- Type: application/octet-stream, Size: 15823 bytes --] From ebbcc923b776fd3fbfe28c5c050ee8df7b71529d Mon Sep 17 00:00:00 2001 From: Daniel Hartwig <mandyke@gmail.com> Date: Thu, 8 Nov 2012 13:11:52 +0800 Subject: [PATCH 3/3] uri-reference support * module/web/uri.scm (build-uri, validate-uri): Also build relative-refs, which have no scheme. (string->uri, uri->string): Also support URI objects with no scheme. (uri-reference?, relative-ref?, absolute-uri?): New predicates to distinguish various URI-like objects. (uri?): Redefine so that semantics are unchanged; only return #t for objects previously built and validated by build-uri?. Such objects always had a uri-scheme. * module/web/http.scm (declare-uri-reference-header!): New header type accepting any URI-reference. ("Content-Location", "Referer"): Change to URI-reference type. * doc/ref/web.texi (URIs): Document other URI-like syntaxes defined in RFC 3986. Include brief discussion. Update functions * test-suite/tests/web-http.test: * test-suite/tests/web-uri.test: Add relevant tests. --- doc/ref/web.texi | 39 ++++++++++--- module/web/http.scm | 13 ++++- module/web/uri.scm | 59 ++++++++++++++----- test-suite/tests/web-http.test | 4 ++ test-suite/tests/web-uri.test | 125 ++++++++++++++++++++++++++++++++++++++++ 5 files changed, 214 insertions(+), 26 deletions(-) diff --git a/doc/ref/web.texi b/doc/ref/web.texi index 3e93bea..d247080 100644 --- a/doc/ref/web.texi +++ b/doc/ref/web.texi @@ -173,8 +173,9 @@ Guile provides a standard data type for Universal Resource Identifiers The generic URI syntax is as follows: @example -URI := scheme ":" ["//" [userinfo "@@"] host [":" port]] path \ - [ "?" query ] [ "#" fragment ] +URI := scheme ":" hier-part [ "?" query ] [ "#" fragment ] + +hier-part := ["//" [userinfo "@@"] host [":" port]] path @end example For example, in the URI, @indicateurl{http://www.gnu.org/help/}, the @@ -198,6 +199,25 @@ fragment identifies a part of a resource, not the resource itself. But it is useful to have a fragment field in the URI record itself, so we hope you will forgive the inconsistency. +There are some additional URI-like syntaxes: + +@example +URI-reference := URI / relative-ref + +relative-ref := hier-part [ "?" query ] [ "#" fragment ] + +absolute-URI := scheme ":" hier-part [ "?" query ] +@end example + +These extra forms are useful in various situations. For example, +relative-refs are convenient in documents to refer to other parts of the +same document, or resources located on the same site. + +A relative-ref must be considered in relation to a given base URI to +correctly identify a resource. The base URI is determined according to +the context and properties of the document in which the relative-ref is +located. See section 5 of RFC 3986 for details. + @example (use-modules (web uri)) @end example @@ -211,12 +231,14 @@ access to them. [#:fragment=@code{#f}] [#:validate?=@code{#t}] Construct a URI object. @var{scheme} should be a symbol, @var{port} either a positive, exact integer or @code{#f}, and the rest of the -fields are either strings or @code{#f}. If @var{validate?} is true, -also run some consistency checks to make sure that the constructed URI -is valid. +fields are strings. If @var{validate?} is true, also run some +consistency checks to make sure that the constructed URI is valid. @end deffn @deffn {Scheme Procedure} uri? x +@deffnx {Scheme Procedure} uri-reference? x +@deffnx {Scheme Procedure} relative-ref? x +@deffnx {Scheme Procedure} absolute-uri? x @deffnx {Scheme Procedure} uri-scheme uri @deffnx {Scheme Procedure} uri-userinfo uri @deffnx {Scheme Procedure} uri-host uri @@ -224,9 +246,10 @@ is valid. @deffnx {Scheme Procedure} uri-path uri @deffnx {Scheme Procedure} uri-query uri @deffnx {Scheme Procedure} uri-fragment uri -A predicate and field accessors for the URI record type. The URI scheme -will be a symbol, the port either a positive, exact integer or @code{#f}, -and the rest either strings or @code{#f} if not present. +Predicates and field accessors for the URI record type. The URI scheme +will be a symbol, the port a positive, exact integer, and the rest +strings. Any field other than @code{uri-path} may also be @code{#f} if +not present. @end deffn @deffn {Scheme Procedure} string->uri string diff --git a/module/web/http.scm b/module/web/http.scm index 3b78d08..389880b 100644 --- a/module/web/http.scm +++ b/module/web/http.scm @@ -1179,12 +1179,19 @@ treated specially, and is just returned as a plain string." parse-non-negative-integer non-negative-integer? display)) ;; emacs: (put 'declare-uri-header! 'scheme-indent-function 1) -(define (declare-uri-header! name) +(define* (declare-uri-header! name #:optional) (declare-header! name (lambda (str) (or (string->uri str) (bad-header-component 'uri str))) uri? write-uri)) +;; emacs: (put 'declare-uri-reference-header! 'scheme-indent-function 1) +(define (declare-uri-reference-header! name) + (declare-header! name + (lambda (str) (or (string->uri str) (bad-header-component 'uri-reference str))) + uri-reference? + write-uri)) + ;; emacs: (put 'declare-quality-list-header! 'scheme-indent-function 1) (define (declare-quality-list-header! name) (declare-header! name @@ -1437,7 +1444,7 @@ treated specially, and is just returned as a plain string." ;; Content-Location = ( absoluteURI | relativeURI ) ;; -(declare-uri-header! "Content-Location") +(declare-uri-reference-header! "Content-Location") ;; Content-MD5 = <base64 of 128 bit MD5 digest as per RFC 1864> ;; @@ -1726,7 +1733,7 @@ treated specially, and is just returned as a plain string." ;; Referer = ( absoluteURI | relativeURI ) ;; -(declare-uri-header! "Referer") +(declare-uri-reference-header! "Referer") ;; TE = #( t-codings ) ;; t-codings = "trailers" | ( transfer-extension [ accept-params ] ) diff --git a/module/web/uri.scm b/module/web/uri.scm index e84bc03..e7990ad 100644 --- a/module/web/uri.scm +++ b/module/web/uri.scm @@ -31,7 +31,7 @@ #:use-module (ice-9 control) #:use-module (rnrs bytevectors) #:use-module (ice-9 binary-ports) - #:export (uri? + #:export (uri? uri-reference? relative-ref? absolute-uri? uri-scheme uri-userinfo uri-host uri-port uri-path uri-query uri-fragment @@ -44,7 +44,7 @@ (define-record-type <uri> (make-uri scheme userinfo host port path query fragment) - uri? + uri-reference? (scheme uri-scheme) (userinfo uri-userinfo) (host uri-host) @@ -53,15 +53,32 @@ (query uri-query) (fragment uri-fragment)) +(define (uri? x) + (and (uri-reference? x) + (uri-scheme x) + #t)) + +(define (relative-ref? x) + (and (uri-reference? x) + (not (uri-scheme x)) + #t)) + +(define (absolute-uri? x) + (and (uri-reference? x) + (uri-scheme x) + (not (uri-fragment x)) + #t)) + (define (uri-error message . args) (throw 'uri-error message args)) (define (positive-exact-integer? port) (and (number? port) (exact? port) (integer? port) (positive? port))) -(define (validate-uri scheme userinfo host port path query fragment) +(define (validate-uri scheme userinfo host port path query fragment reference?) (cond - ((not (symbol? scheme)) + ((and (not (symbol? scheme)) + (or (not reference?) scheme)) (uri-error "Expected a symbol for the URI scheme: ~s" scheme)) ((and (or userinfo port) (not host)) (uri-error "Expected a host, given userinfo or port")) @@ -85,7 +102,7 @@ fields are either strings or @code{#f}. If @var{validator?} is true, also run some consistency checks to make sure that the constructed URI is valid." (if validate? - (validate-uri scheme userinfo host port path query fragment)) + (validate-uri scheme userinfo host port path query fragment #t)) (make-uri scheme userinfo host port path query fragment)) ;; See RFC 3986 #3.2.2 for comments on percent-encodings, IDNA (RFC @@ -153,6 +170,17 @@ is valid." ;;; / path-absolute ;;; / path-rootless ;;; / path-empty +;;; +;;; RFC 3986, #4. +;;; +;;; URI-reference = URI / relative-ref +;;; +;;; relative-ref = relative-part [ "?" query ] [ "#" fragment ] +;;; +;;; relative-part = "//" authority path-abempty +;;; / path-absolute +;;; / path-noscheme +;;; / path-empty (define scheme-pat "[a-zA-Z][a-zA-Z0-9+.-]*") @@ -165,7 +193,7 @@ is valid." (define fragment-pat ".*") (define uri-pat - (format #f "^(~a):(//~a)?(~a)(\\?(~a))?(#(~a))?$" + (format #f "^((~a):)?(//~a)?(~a)(\\?(~a))?(#(~a))?$" scheme-pat authority-pat path-pat query-pat fragment-pat)) (define uri-regexp (make-regexp uri-pat)) @@ -175,12 +203,12 @@ is valid." could not be parsed." (% (let ((m (regexp-exec uri-regexp string))) (if (not m) (abort)) - (let ((scheme (string->symbol - (string-downcase (match:substring m 1)))) - (authority (match:substring m 2)) - (path (match:substring m 3)) - (query (match:substring m 5)) - (fragment (match:substring m 7))) + (let ((scheme (let ((s (match:substring m 2))) + (and s (string->symbol (string-downcase s))))) + (authority (match:substring m 3)) + (path (match:substring m 4)) + (query (match:substring m 6)) + (fragment (match:substring m 8))) (call-with-values (lambda () (if authority @@ -208,8 +236,7 @@ could not be parsed." "Serialize @var{uri} to a string. If the URI has a port that is the default port for its scheme, the port is not included in the serialization." - (let* ((scheme-str (string-append - (symbol->string (uri-scheme uri)) ":")) + (let* ((scheme (uri-scheme uri)) (userinfo (uri-userinfo uri)) (host (uri-host uri)) (port (uri-port uri)) @@ -217,7 +244,9 @@ serialization." (query (uri-query uri)) (fragment (uri-fragment uri))) (string-append - scheme-str + (if scheme + (string-append (symbol->string scheme) ":") + "") (if host (string-append "//" (if userinfo (string-append userinfo "@") diff --git a/test-suite/tests/web-http.test b/test-suite/tests/web-http.test index 97f5559..5d80fde 100644 --- a/test-suite/tests/web-http.test +++ b/test-suite/tests/web-http.test @@ -145,6 +145,8 @@ (pass-if-parse content-length "010" 10) (pass-if-parse content-location "http://foo/" (build-uri 'http #:host "foo" #:path "/")) + (pass-if-parse content-location "//foo" + (build-uri #f #:host "foo")) (pass-if-parse content-range "bytes 10-20/*" '(bytes (10 . 20) *)) (pass-if-parse content-range "bytes */*" '(bytes * *)) (pass-if-parse content-range "bytes */30" '(bytes * 30)) @@ -208,6 +210,8 @@ (pass-if-parse range "bytes=-20,-30" '(bytes (#f . 20) (#f . 30))) (pass-if-parse referer "http://foo/bar?baz" (build-uri 'http #:host "foo" #:path "/bar" #:query "baz")) + (pass-if-parse referer "//foo/bar?baz" + (build-uri #f #:host "foo" #:path "/bar" #:query "baz")) (pass-if-parse te "trailers" '((trailers))) (pass-if-parse te "trailers,foo" '((trailers) (foo))) (pass-if-parse user-agent "guile" "guile")) diff --git a/test-suite/tests/web-uri.test b/test-suite/tests/web-uri.test index 3f6e7e3..d4d44d6 100644 --- a/test-suite/tests/web-uri.test +++ b/test-suite/tests/web-uri.test @@ -78,6 +78,24 @@ #:port 22 #:path "/baz")) + (pass-if "foo" + (uri=? (build-uri #f #:path "foo") + #:path "foo")) + + (pass-if "/foo" + (uri=? (build-uri #f #:path "/foo") + #:path "/foo")) + + (pass-if "//foo/bar" + (uri=? (build-uri #f #:host "foo" #:path "/bar") + #:host "foo" + #:path "/bar")) + + (pass-if "?foo" + (uri=? (build-uri #f #:query "foo") + #:path "" + #:query "foo")) + (pass-if-uri-exception "non-symbol scheme" "Expected.*symbol" (build-uri "nonsym")) @@ -123,6 +141,80 @@ "Expected.*host" (build-uri 'http #:userinfo "foo"))) +(with-test-prefix "absolute-uri?" + (pass-if "ftp:" + (absolute-uri? (build-uri 'ftp))) + + (pass-if "ftp:foo" + (absolute-uri? (build-uri 'ftp #:path "foo"))) + + (pass-if "ftp://foo/bar" + (absolute-uri? (build-uri 'ftp #:host "foo" #:path "/bar"))) + + (pass-if "ftp://foo@bar:22/baz" + (absolute-uri? (build-uri 'ftp #:userinfo "foo" #:host "bar" #:port 22 #:path "/baz"))) + + (expect-fail "foo" + (absolute-uri? (build-uri #f #:path "foo"))) + + (expect-fail "/foo" + (absolute-uri? (build-uri #f #:path "/foo"))) + + (expect-fail "//foo/bar" + (absolute-uri? (build-uri #f #:host "foo" #:path "/bar"))) + + (expect-fail "?foo" + (absolute-uri? (build-uri #f #:query "foo")))) + +(with-test-prefix "relative-ref?" + (expect-fail "ftp:" + (relative-ref? (build-uri 'ftp))) + + (expect-fail "ftp:foo" + (relative-ref? (build-uri 'ftp #:path "foo"))) + + (expect-fail "ftp://foo/bar" + (relative-ref? (build-uri 'ftp #:host "foo" #:path "/bar"))) + + (expect-fail "ftp://foo@bar:22/baz" + (relative-ref? (build-uri 'ftp #:userinfo "foo" #:host "bar" #:port 22 #:path "/baz"))) + + (pass-if "foo" + (relative-ref? (build-uri #f #:path "foo"))) + + (pass-if "/foo" + (relative-ref? (build-uri #f #:path "/foo"))) + + (pass-if "//foo/bar" + (relative-ref? (build-uri #f #:host "foo" #:path "/bar"))) + + (pass-if "?foo" + (relative-ref? (build-uri #f #:query "foo")))) + +(with-test-prefix "uri-reference?" + (pass-if "ftp:" + (uri-reference? (build-uri 'ftp))) + + (pass-if "ftp:foo" + (uri-reference? (build-uri 'ftp #:path "foo"))) + + (pass-if "ftp://foo/bar" + (uri-reference? (build-uri 'ftp #:host "foo" #:path "/bar"))) + + (pass-if "ftp://foo@bar:22/baz" + (uri-reference? (build-uri 'ftp #:userinfo "foo" #:host "bar" #:port 22 #:path "/baz"))) + + (pass-if "foo" + (uri-reference? (build-uri #f #:path "foo"))) + + (pass-if "/foo" + (uri-reference? (build-uri #f #:path "/foo"))) + + (pass-if "//foo/bar" + (uri-reference? (build-uri #f #:host "foo" #:path "/bar"))) + + (pass-if "?foo" + (uri-reference? (build-uri #f #:query "foo")))) (with-test-prefix "string->uri" (pass-if "ftp:" @@ -149,6 +241,24 @@ #:port 22 #:path "/baz")) + (pass-if "foo" + (uri=? (string->uri "foo") + #:path "foo")) + + (pass-if "/foo" + (uri=? (string->uri "/foo") + #:path "/foo")) + + (pass-if "//foo/bar" + (uri=? (string->uri "//foo/bar") + #:host "foo" + #:path "/bar")) + + (pass-if "?foo" + (uri=? (string->uri "?foo") + #:path "" + #:query "foo")) + (pass-if "http://bad.host.1" (not (string->uri "http://bad.host.1"))) @@ -229,6 +339,21 @@ (equal? "ftp://foo@bar:22/baz" (uri->string (string->uri "ftp://foo@bar:22/baz")))) + (pass-if "foo" + (equal? "foo" + (uri->string (string->uri "foo")))) + + (pass-if "/foo" + (equal? "/foo" (uri->string (string->uri "/foo")))) + + (pass-if "//foo/bar" + (equal? "//foo/bar" + (uri->string (string->uri "//foo/bar")))) + + (pass-if "?foo" + (equal? "?foo" + (uri->string (string->uri "?foo")))) + (when (memq 'socket *features*) (pass-if "http://192.0.2.1" (equal? "http://192.0.2.1" -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* bug#12827: [2.0.6] web client: fails to parse 404 header 2012-11-24 11:23 ` Daniel Hartwig @ 2012-11-24 15:10 ` Ludovic Courtès 2012-11-24 15:34 ` Daniel Hartwig 0 siblings, 1 reply; 18+ messages in thread From: Ludovic Courtès @ 2012-11-24 15:10 UTC (permalink / raw) To: Daniel Hartwig; +Cc: 12827 Hi Daniel, Daniel Hartwig <mandyke@gmail.com> skribis: > On 24 November 2012 06:19, Ludovic Courtès <ludo@gnu.org> wrote: >> Any update on that? The plan is to release 2.0.7 next week, so it’d be >> great if this could be in. > > I have made a first attempt at the doc strings and manual. This > involved first syncronizing the two, as only the manual had been > receiving updates. > > Some more tweaking to the code. Thanks. I applied the first two patches, and passed the source files through: sed -e"s/@var{\([a-z0-9?!-]\+\)}/\U\1/g ; s/@code{\([^}]\+\)}/‘\1’/g" because docstrings should not contain Texinfo markup. > Personally I am not 100% on this, but I attach it for comment anyway. > I will not be able to work on it again for a short while. > > A quick solution may be to silently introduce just enough to fix the > current bug, and worry about the extra predicates, uri-record-type vs. > rfc-definition-of-uri, etc. later. I could come up with a ‘declare-relative-uri-header!’ that would use (build-uri xxx #:validate? #f) as a quick fix. However, it seems to me that your patch is actually fine, and doesn’t break compatibility, so I’d rather apply it directly. Did you have other concerns? Thanks, Ludo’. ^ permalink raw reply [flat|nested] 18+ messages in thread
* bug#12827: [2.0.6] web client: fails to parse 404 header 2012-11-24 15:10 ` Ludovic Courtès @ 2012-11-24 15:34 ` Daniel Hartwig 2012-11-26 0:15 ` Ludovic Courtès 0 siblings, 1 reply; 18+ messages in thread From: Daniel Hartwig @ 2012-11-24 15:34 UTC (permalink / raw) To: Ludovic Courtès; +Cc: 12827 On 24 November 2012 23:10, Ludovic Courtès <ludo@gnu.org> wrote: >> Personally I am not 100% on this, but I attach it for comment anyway. >> I will not be able to work on it again for a short while. >> >> A quick solution may be to silently introduce just enough to fix the >> current bug, and worry about the extra predicates, uri-record-type vs. >> rfc-definition-of-uri, etc. later. > > I could come up with a ‘declare-relative-uri-header!’ that would use > (build-uri xxx #:validate? #f) as a quick fix. > > However, it seems to me that your patch is actually fine, and doesn’t > break compatibility, so I’d rather apply it directly. Did you have > other concerns? The API seems less clean, and it is not immediately clear that uri? is not the top of the URI-like type hierarchy. The other functions only indicate “uri” in their name. I did not wish to introduce parallel “build-uri-reference”, etc. for each of these, and did consider adding #:reference? on some to select weaker validation. I think I prefer to keep the base type predicate as uri?, and use relative-ref? and absolute-uri? to distinguish. This would mean deviating from the RFC by: - permitting fragments in absolute-uri?; - not requiring a scheme in uri?; which are both forgivable. Maybe I am just too fussy! ^ permalink raw reply [flat|nested] 18+ messages in thread
* bug#12827: [2.0.6] web client: fails to parse 404 header 2012-11-24 15:34 ` Daniel Hartwig @ 2012-11-26 0:15 ` Ludovic Courtès 2012-11-26 23:13 ` Ludovic Courtès 0 siblings, 1 reply; 18+ messages in thread From: Ludovic Courtès @ 2012-11-26 0:15 UTC (permalink / raw) To: Daniel Hartwig; +Cc: 12827 [-- Attachment #1: Type: text/plain, Size: 489 bytes --] Hi! Daniel Hartwig <mandyke@gmail.com> skribis: > The API seems less clean, and it is not immediately clear > that uri? is not the top of the URI-like type hierarchy. The other > functions only indicate “uri” in their name. I did not > wish to introduce parallel “build-uri-reference”, etc. for each of > these, and did consider adding #:reference? on some to select > weaker validation. OK. So for now, I’d go with this patch, which fixes the bug at hand: [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: Type: text/x-patch, Size: 1396 bytes --] diff --git a/module/web/http.scm b/module/web/http.scm index 342f435..65ebe7d 100644 --- a/module/web/http.scm +++ b/module/web/http.scm @@ -1185,6 +1185,17 @@ treated specially, and is just returned as a plain string." uri? write-uri)) +;; emacs: (put 'declare-relative-uri-header! 'scheme-indent-function 1) +(define (declare-relative-uri-header! name) + (declare-header! name + (lambda (str) + (or (string->uri str) ; absolute URI + (build-uri #f ; relative URI + #:path str + #:validate? #f))) + uri? + write-uri)) + ;; emacs: (put 'declare-quality-list-header! 'scheme-indent-function 1) (define (declare-quality-list-header! name) (declare-header! name @@ -1437,7 +1448,7 @@ treated specially, and is just returned as a plain string." ;; Content-Location = ( absoluteURI | relativeURI ) ;; -(declare-uri-header! "Content-Location") +(declare-relative-uri-header! "Content-Location") ;; Content-MD5 = <base64 of 128 bit MD5 digest as per RFC 1864> ;; @@ -1726,7 +1737,7 @@ treated specially, and is just returned as a plain string." ;; Referer = ( absoluteURI | relativeURI ) ;; -(declare-uri-header! "Referer") +(declare-relative-uri-header! "Referer") ;; TE = #( t-codings ) ;; t-codings = "trailers" | ( transfer-extension [ accept-params ] ) [-- Attachment #3: Type: text/plain, Size: 115 bytes --] Once 2.0.7 is out, when you have more time, we can fix it cleanly. How does that sound? Thanks, Ludo’. ^ permalink raw reply related [flat|nested] 18+ messages in thread
* bug#12827: [2.0.6] web client: fails to parse 404 header 2012-11-26 0:15 ` Ludovic Courtès @ 2012-11-26 23:13 ` Ludovic Courtès 2012-11-27 1:06 ` Daniel Hartwig 0 siblings, 1 reply; 18+ messages in thread From: Ludovic Courtès @ 2012-11-26 23:13 UTC (permalink / raw) To: Daniel Hartwig; +Cc: 12827 ludo@gnu.org (Ludovic Courtès) skribis: > So for now, I’d go with this patch, which fixes the bug at hand: I just applied this patch as 261af76. You’re welcome to polish support for URI references once 2.0.7 is out! :-) Ludo’. ^ permalink raw reply [flat|nested] 18+ messages in thread
* bug#12827: [2.0.6] web client: fails to parse 404 header 2012-11-26 23:13 ` Ludovic Courtès @ 2012-11-27 1:06 ` Daniel Hartwig 2012-11-27 12:50 ` Ludovic Courtès 0 siblings, 1 reply; 18+ messages in thread From: Daniel Hartwig @ 2012-11-27 1:06 UTC (permalink / raw) To: Ludovic Courtès; +Cc: 12827 On 27 November 2012 07:13, Ludovic Courtès <ludo@gnu.org> wrote: > ludo@gnu.org (Ludovic Courtès) skribis: > >> So for now, I’d go with this patch, which fixes the bug at hand: > > I just applied this patch as 261af76. > +;; emacs: (put 'declare-relative-uri-header! 'scheme-indent-function 1) +(define (declare-relative-uri-header! name) + (declare-header! name + (lambda (str) + (or (string->uri str) ; absolute URI + (build-uri #f ; relative URI + #:path str + #:validate? #f))) + uri? + write-uri)) + Sorry for late response. Setting uri-path to str will result in wrongly constructed uri objects. In practice, the relative uri will often have a query part. In theory, they may also contain any other part of the uri except scheme (which would make them absolute). ^ permalink raw reply [flat|nested] 18+ messages in thread
* bug#12827: [2.0.6] web client: fails to parse 404 header 2012-11-27 1:06 ` Daniel Hartwig @ 2012-11-27 12:50 ` Ludovic Courtès 2012-11-27 15:18 ` Daniel Hartwig 0 siblings, 1 reply; 18+ messages in thread From: Ludovic Courtès @ 2012-11-27 12:50 UTC (permalink / raw) To: Daniel Hartwig; +Cc: 12827 Daniel Hartwig <mandyke@gmail.com> skribis: > +;; emacs: (put 'declare-relative-uri-header! 'scheme-indent-function 1) > +(define (declare-relative-uri-header! name) > + (declare-header! name > + (lambda (str) > + (or (string->uri str) ; absolute URI > + (build-uri #f ; relative URI > + #:path str > + #:validate? #f))) > + uri? > + write-uri)) > + > > Sorry for late response. > > Setting uri-path to str will result in wrongly constructed uri > objects. In practice, the relative uri will often have a query part. > In theory, they may also contain any other part of the uri except > scheme (which would make them absolute). Sure. But then again, the goal was just to have a hack that would solve the problem initially reported here, while waiting for a proper fix. I’m open to suggestions. It seems to me that it’s either this or your patches against (web uri). WDYT? Ludo’. ^ permalink raw reply [flat|nested] 18+ messages in thread
* bug#12827: [2.0.6] web client: fails to parse 404 header 2012-11-27 12:50 ` Ludovic Courtès @ 2012-11-27 15:18 ` Daniel Hartwig 2012-11-27 21:43 ` Ludovic Courtès 0 siblings, 1 reply; 18+ messages in thread From: Daniel Hartwig @ 2012-11-27 15:18 UTC (permalink / raw) To: Ludovic Courtès; +Cc: 12827 [-- Attachment #1: Type: text/plain, Size: 1017 bytes --] On 27 November 2012 20:50, Ludovic Courtès <ludo@gnu.org> wrote: > Sure. But then again, the goal was just to have a hack that would solve > the problem initially reported here, while waiting for a proper fix. Avoiding an obvious parser error, but introducing subtle problems with the objects. The reported bug has been present since always with (web uri). > > I’m open to suggestions. It seems to me that it’s either this or your > patches against (web uri). WDYT? I still don't like the API in those patches, and don't think it is tested enough against, e.g., reading and writing all combinations of headers and idempotent (?) across read-write cycles (a quick check just showed that it wasn't, due to write-uri in (web http)). Anyway, if you like to fix this bug, I have isolated the changes to string->uri, just enough to handle these headers without introducing any API changes (which can come later, after refinement). Attached a patch against current stable-2.0. Regards [-- Attachment #2: 0001-web-client-correctly-handle-uri-query-etc.-in-relati.patch --] [-- Type: application/octet-stream, Size: 4336 bytes --] From 7d9f3002a5bc0b897618759359c91bc94cd1fdec Mon Sep 17 00:00:00 2001 From: Daniel Hartwig <mandyke@gmail.com> Date: Tue, 27 Nov 2012 16:48:41 +0800 Subject: [PATCH] web client: correctly handle uri-query, etc. in relative URI headers * module/web/uri.scm (string->uri*): New private procedure to also parse relative URIs. * module/web/http.scm (declare-relative-uri-header!): Use that. --- module/web/http.scm | 12 +++--------- module/web/uri.scm | 30 ++++++++++++++++++++---------- 2 files changed, 23 insertions(+), 19 deletions(-) diff --git a/module/web/http.scm b/module/web/http.scm index f8dba30..216fddd 100644 --- a/module/web/http.scm +++ b/module/web/http.scm @@ -1182,21 +1182,15 @@ treated specially, and is just returned as a plain string." (define (declare-uri-header! name) (declare-header! name (lambda (str) (or (string->uri str) (bad-header-component 'uri str))) - uri? + (@@ (web uri) absolute-uri?) write-uri)) ;; emacs: (put 'declare-relative-uri-header! 'scheme-indent-function 1) (define (declare-relative-uri-header! name) (declare-header! name (lambda (str) - ;; XXX: Attempt to build an absolute URI, and fall back to a URI - ;; with no scheme to represent a relative URI. - ;; See <http://bugs.gnu.org/12827> for ideas to fully support - ;; relative URIs (aka. "URI references"). - (or (string->uri str) ; absolute URI - (build-uri #f ; relative URI - #:path str - #:validate? #f))) + (or ((@@ (web uri) string->uri*) str) + (bad-header-component 'uri str))) uri? write-uri)) diff --git a/module/web/uri.scm b/module/web/uri.scm index 6ff0076..b688ea8 100644 --- a/module/web/uri.scm +++ b/module/web/uri.scm @@ -53,6 +53,9 @@ (query uri-query) (fragment uri-fragment)) +(define (absolute-uri? x) + (and (uri? x) (uri-scheme x) #t)) + (define (uri-error message . args) (throw 'uri-error message args)) @@ -165,21 +168,21 @@ is valid." (define fragment-pat ".*") (define uri-pat - (format #f "^(~a):(//~a)?(~a)(\\?(~a))?(#(~a))?$" + (format #f "^((~a):)?(//~a)?(~a)(\\?(~a))?(#(~a))?$" scheme-pat authority-pat path-pat query-pat fragment-pat)) (define uri-regexp (make-regexp uri-pat)) -(define (string->uri string) +(define (string->uri* string) "Parse STRING into a URI object. Return ‘#f’ if the string could not be parsed." (% (let ((m (regexp-exec uri-regexp string))) (if (not m) (abort)) - (let ((scheme (string->symbol - (string-downcase (match:substring m 1)))) - (authority (match:substring m 2)) - (path (match:substring m 3)) - (query (match:substring m 5)) + (let ((scheme (let ((str (match:substring m 2))) + (and str (string->symbol (string-downcase str))))) + (authority (match:substring m 3)) + (path (match:substring m 4)) + (query (match:substring m 6)) (fragment (match:substring m 7))) (call-with-values (lambda () @@ -191,6 +194,12 @@ could not be parsed." (lambda (k) #f))) +(define (string->uri string) + "Parse STRING into a URI object. Return ‘#f’ if the string +could not be parsed." + (let ((uri (string->uri* string))) + (and uri (uri-scheme uri) uri))) + (define *default-ports* (make-hash-table)) (define (declare-default-port! scheme port) @@ -208,8 +217,7 @@ could not be parsed." "Serialize URI to a string. If the URI has a port that is the default port for its scheme, the port is not included in the serialization." - (let* ((scheme-str (string-append - (symbol->string (uri-scheme uri)) ":")) + (let* ((scheme (uri-scheme uri)) (userinfo (uri-userinfo uri)) (host (uri-host uri)) (port (uri-port uri)) @@ -217,7 +225,9 @@ serialization." (query (uri-query uri)) (fragment (uri-fragment uri))) (string-append - scheme-str + (if scheme + (string-append (symbol->string scheme) ":") + "") (if host (string-append "//" (if userinfo (string-append userinfo "@") -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* bug#12827: [2.0.6] web client: fails to parse 404 header 2012-11-27 15:18 ` Daniel Hartwig @ 2012-11-27 21:43 ` Ludovic Courtès 0 siblings, 0 replies; 18+ messages in thread From: Ludovic Courtès @ 2012-11-27 21:43 UTC (permalink / raw) To: Daniel Hartwig; +Cc: 12827 Hi, Daniel Hartwig <mandyke@gmail.com> skribis: > I still don't like the API in those patches, and don't think it is > tested enough against, e.g., reading and writing all combinations of > headers and idempotent (?) across read-write cycles (a quick check > just showed that it wasn't, due to write-uri in (web http)). Fair enough. > Anyway, if you like to fix this bug, I have isolated the changes to > string->uri, just enough to handle these headers without introducing > any API changes (which can come later, after refinement). Applied, thanks! Ludo’. ^ permalink raw reply [flat|nested] 18+ messages in thread
* bug#12827: [PATCH] Tweak web modules, support relative URIs (was: bug#12827: [2.0.6] web client: fails to parse 404 header) 2012-11-07 20:40 bug#12827: [2.0.6] web client: fails to parse 404 header Ludovic Courtès 2012-11-08 5:52 ` Daniel Hartwig @ 2013-02-23 8:11 ` Daniel Hartwig 1 sibling, 0 replies; 18+ messages in thread From: Daniel Hartwig @ 2013-02-23 8:11 UTC (permalink / raw) To: guile-devel; +Cc: 12827, Ludovic Courtès [-- Attachment #1: Type: text/plain, Size: 3250 bytes --] On 24 November 2012 23:34, Daniel Hartwig <mandyke@gmail.com> wrote: > The API seems less clean, and it is not immediately clear > that uri? is not the top of the URI-like type hierarchy. The other > functions only indicate “uri” in their name. I did not > wish to introduce parallel “build-uri-reference”, etc. for each of > these, and did consider adding #:reference? on some to select > weaker validation. > > I think I prefer to keep the base type predicate as uri?, and use > relative-ref? and absolute-uri? to distinguish. This would mean > deviating from the RFC by: > - permitting fragments in absolute-uri?; > - not requiring a scheme in uri?; > > which are both forgivable. > > Maybe I am just too fussy! Hello now Revisting this old thread, I have complete the bulk of the work now. Attached are a few patches to generally clean up the web modules a bit, followed by a larger one introducing the promised relative URI support. * Terminology The terminology used in latest URI spec. (RFC 3986) is not widely used elsewhere. Not by Guile, not by the HTTP spec., or other sources. Specifically, it defines these terms: - URI: scheme rest ... [fragment] - Absolute-URI: scheme rest ... [fragment] - Relative-Ref: rest ... [fragment] - URI-Reference: Absolute-URI | Relative-Ref where as HTTP and other sources use the terms from an earlier URI spec. (RFC 2396): - Absolute-URI: scheme rest ... [fragment] - Relative-URI: rest ... [fragment] - URI, URI-Reference: Absolute-URI | Relative-URI With this patch I have opted to use the later, more common terms. This has the advantage of not requiring massive renaming or duplicating of most procedures to include, e.g. ‘uri-reference-scheme’, as we just use the term ‘uri’ to refer to either type. If this is undesired, it can easily be reworked to use the terminology from RFC 3986. * API compatability Presently, all APIs work only with absolute URIs. You can not use string->uri or build-uri to produce any relative URIs, neither are other procedures (generally) expected to work correctly if given them. What we have in this patch is that <uri> grows to encompass both relative and absolute URIs. ‘uri?’ is a general type predicate, ‘build-uri’ will produce and validate either type, and there are pairs of converters and predicates to distinguish between relative and absolute. Effectively, a pseudo-type heirarchy, with uri? at the top and absolute-uri? and relative-uri? beneath it. * To be done Barely touched request, response, client, or server modules. Though these will continue to work with current usage patterns and I have added some notes about future work. Also, I believe it will pay to extend http-get et. al to accept a relative URI with separate Host header or keyword option. Also allow write-request-line to display exactly the URI passed to it, rather than always chopping off the scheme and host (e.g. the HTTP spec. allows such lines and they are require to write some types of proxy software). Coming along soon is a procedure to resolve a relative URI against a base, absolute URI. The same algorithm documented in the RFC. Regards [-- Attachment #2: 0001-minor-tweaks-to-web-documentation.patch --] [-- Type: application/octet-stream, Size: 17969 bytes --] From a25a817be56821d2656da43bcbae4f2a816a801f Mon Sep 17 00:00:00 2001 From: Daniel Hartwig <mandyke@gmail.com> Date: Sat, 23 Feb 2013 13:39:14 +0800 Subject: [PATCH 1/4] minor tweaks to web documentation * doc/ref/web.texi: Say `World Wide Web'; the hyphenated form is almost never used (c.f. w3.org). General predicate arguments are named `obj'. Fill in arguments omitted from some procedure definitions (e.g. request-method). Minor tweaks, such as using en-dash and missing markup as appropriate. Wrap very long deffn lines. (HTTP): `parse-header' and `write-header' use `sym' to be consistent with other procedures and docstrings. * module/web/*.scm: Expand texinfo markup in docstrings. Synchronize with changes in web.texi. --- doc/ref/web.texi | 89 +++++++++++++++++++++++++++-------------------- module/web/client.scm | 6 ++-- module/web/http.scm | 6 ++-- module/web/response.scm | 6 ++-- module/web/uri.scm | 28 +++++++-------- 5 files changed, 73 insertions(+), 62 deletions(-) diff --git a/doc/ref/web.texi b/doc/ref/web.texi index 6c33f32..82ef31b 100644 --- a/doc/ref/web.texi +++ b/doc/ref/web.texi @@ -10,7 +10,7 @@ @cindex HTTP It has always been possible to connect computers together and share -information between them, but the rise of the World-Wide Web over the +information between them, but the rise of the World Wide Web over the last couple of decades has made it much easier to do so. The result is a richly connected network of computation, in which Guile forms a part. @@ -206,9 +206,10 @@ The following procedures can be found in the @code{(web uri)} module. Load it into your Guile, using a form like the above, to have access to them. -@deffn {Scheme Procedure} build-uri scheme [#:userinfo=@code{#f}] [#:host=@code{#f}] @ - [#:port=@code{#f}] [#:path=@code{""}] [#:query=@code{#f}] @ - [#:fragment=@code{#f}] [#:validate?=@code{#t}] +@deffn {Scheme Procedure} build-uri scheme @ + [#:userinfo=@code{#f}] [#:host=@code{#f}] [#:port=@code{#f}] @ + [#:path=@code{""}] [#:query=@code{#f}] [#:fragment=@code{#f}] @ + [#:validate?=@code{#t}] Construct a URI object. @var{scheme} should be a symbol, @var{port} either a positive, exact integer or @code{#f}, and the rest of the fields are either strings or @code{#f}. If @var{validate?} is true, @@ -216,7 +217,7 @@ also run some consistency checks to make sure that the constructed URI is valid. @end deffn -@deffn {Scheme Procedure} uri? x +@deffn {Scheme Procedure} uri? obj @deffnx {Scheme Procedure} uri-scheme uri @deffnx {Scheme Procedure} uri-userinfo uri @deffnx {Scheme Procedure} uri-host uri @@ -249,9 +250,9 @@ Percent-decode the given @var{str}, according to @var{encoding}, which should be the name of a character encoding. Note that this function should not generally be applied to a full URI -string. For paths, use split-and-decode-uri-path instead. For query -strings, split the query on @code{&} and @code{=} boundaries, and decode -the components separately. +string. For paths, use @code{split-and-decode-uri-path} instead. For +query strings, split the query on @code{&} and @code{=} boundaries, and +decode the components separately. Note also that percent-encoded strings encode @emph{bytes}, not characters. There is no guarantee that a given byte sequence is a valid @@ -378,7 +379,8 @@ For more on the set of headers that Guile knows about out of the box, @pxref{HTTP Headers}. To add your own, use the @code{declare-header!} procedure: -@deffn {Scheme Procedure} declare-header! name parser validator writer [#:multiple?=@code{#f}] +@deffn {Scheme Procedure} declare-header! name parser validator writer @ + [#:multiple?=@code{#f}] Declare a parser, validator, and writer for a given header. @end deffn @@ -421,12 +423,12 @@ Returns the end-of-file object for both values if the end of the message body was reached (i.e., a blank line). @end deffn -@deffn {Scheme Procedure} parse-header name val +@deffn {Scheme Procedure} parse-header sym val Parse @var{val}, a string, with the parser for the header named -@var{name}. Returns the parsed value. +@var{sym}. Returns the parsed value. @end deffn -@deffn {Scheme Procedure} write-header name val port +@deffn {Scheme Procedure} write-header sym val port Write the given header name and value to @var{port}, using the writer from @code{header-writer}. @end deffn @@ -450,7 +452,7 @@ like @code{GET}. @end deffn @deffn {Scheme Procedure} parse-http-version str [start] [end] -Parse an HTTP version from @var{str}, returning it as a major-minor +Parse an HTTP version from @var{str}, returning it as a major--minor pair. For example, @code{HTTP/1.1} parses as the pair of integers, @code{(1 . 1)}. @end deffn @@ -471,7 +473,7 @@ Write the first line of an HTTP request to @var{port}. @deffn {Scheme Procedure} read-response-line port Read the first line of an HTTP response from @var{port}, returning three -values: the HTTP version, the response code, and the "reason phrase". +values: the HTTP version, the response code, and the ``reason phrase''. @end deffn @deffn {Scheme Procedure} write-response-line version code reason-phrase port @@ -1130,13 +1132,13 @@ any loss of generality. @subsubsection Request API -@deffn {Scheme Procedure} request? -@deffnx {Scheme Procedure} request-method -@deffnx {Scheme Procedure} request-uri -@deffnx {Scheme Procedure} request-version -@deffnx {Scheme Procedure} request-headers -@deffnx {Scheme Procedure} request-meta -@deffnx {Scheme Procedure} request-port +@deffn {Scheme Procedure} request? obj +@deffnx {Scheme Procedure} request-method request +@deffnx {Scheme Procedure} request-uri request +@deffnx {Scheme Procedure} request-version request +@deffnx {Scheme Procedure} request-headers request +@deffnx {Scheme Procedure} request-meta request +@deffnx {Scheme Procedure} request-port request A predicate and field accessors for the request type. The fields are as follows: @table @code @@ -1170,7 +1172,9 @@ request, you may read the body separately, and likewise for writing requests. @end deffn -@deffn {Scheme Procedure} build-request uri [#:method='GET] [#:version='(1 . 1)] [#:headers='()] [#:port=#f] [#:meta='()] [#:validate-headers?=#t] +@deffn {Scheme Procedure} build-request uri [#:method='GET] @ + [#:version='(1 . 1)] [#:headers='()] [#:port=#f] [#:meta='()] @ + [#:validate-headers?=#t] Construct an HTTP request object. If @var{validate-headers?} is true, the headers are each run through their respective validators. @end deffn @@ -1237,7 +1241,8 @@ more information on the format of parsed headers. Return the given request header, or @var{default} if none was present. @end deffn -@deffn {Scheme Procedure} request-absolute-uri r [default-host=#f] [default-port=#f] +@deffn {Scheme Procedure} request-absolute-uri r @ + [default-host=#f] [default-port=#f] A helper routine to determine the absolute URI of a request, using the @code{host} header and the default host and port. @end deffn @@ -1253,12 +1258,12 @@ A helper routine to determine the absolute URI of a request, using the As with requests (@pxref{Requests}), Guile offers a data type for HTTP responses. Again, the body is represented separately from the request. -@deffn {Scheme Procedure} response? -@deffnx {Scheme Procedure} response-version -@deffnx {Scheme Procedure} response-code +@deffn {Scheme Procedure} response? obj +@deffnx {Scheme Procedure} response-version response +@deffnx {Scheme Procedure} response-code response @deffnx {Scheme Procedure} response-reason-phrase response -@deffnx {Scheme Procedure} response-headers -@deffnx {Scheme Procedure} response-port +@deffnx {Scheme Procedure} response-headers response +@deffnx {Scheme Procedure} response-port response A predicate and field accessors for the response type. The fields are as follows: @table @code @@ -1284,7 +1289,9 @@ As a side effect, sets the encoding on @var{port} to ISO-8859-1 discussion of character sets in @ref{Responses}, for more information. @end deffn -@deffn {Scheme Procedure} build-response [#:version='(1 . 1)] [#:code=200] [#:reason-phrase=#f] [#:headers='()] [#:port=#f] [#:validate-headers?=#t] +@deffn {Scheme Procedure} build-response [#:version='(1 . 1)] @ + [#:code=200] [#:reason-phrase=#f] [#:headers='()] [#:port=#f] @ + [#:validate-headers?=#t] Construct an HTTP response object. If @var{validate-headers?} is true, the headers are each run through their respective validators. @end deffn @@ -1384,6 +1391,10 @@ Return @code{#t} if @var{type}, a symbol as returned by @code{(web client)} provides a simple, synchronous HTTP client, built on the lower-level HTTP, request, and response modules. +@example +(use-modules (web client)) +@end example + @deffn {Scheme Procedure} open-socket-for-uri uri Return an open input/output port for a connection to URI. @end deffn @@ -1419,9 +1430,9 @@ If you already have a port open, pass it as @var{port}. Otherwise, a connection will be opened to the server corresponding to @var{uri}. Any extra headers in the alist @var{headers} will be added to the request. -If @var{body} is not #f, a message body will also be sent with the HTTP -request. If @var{body} is a string, it is encoded according to the -content-type in @var{headers}, defaulting to UTF-8. Otherwise +If @var{body} is not @code{#f}, a message body will also be sent with +the HTTP request. If @var{body} is a string, it is encoded according to +the content-type in @var{headers}, defaulting to UTF-8. Otherwise @var{body} should be a bytevector, or @code{#f} for no body. Although a message body may be sent with any request, usually only @code{POST} and @code{PUT} requests have bodies. @@ -1480,8 +1491,8 @@ The life cycle of a server goes as follows: @enumerate @item -The @code{open} hook is called, to open the server. @code{open} takes 0 or -more arguments, depending on the backend, and returns an opaque +The @code{open} hook is called, to open the server. @code{open} takes +zero or more arguments, depending on the backend, and returns an opaque server socket object, or signals an error. @item @@ -1578,8 +1589,8 @@ in, allowing the user's handler to explicitly manage its state. @end deffn @deffn {Scheme Procedure} sanitize-response request response body -"Sanitize" the given response and body, making them appropriate for the -given request. +``Sanitize'' the given response and body, making them appropriate for +the given request. As a convenience to web handler authors, @var{response} may be given as an alist of headers, in which case it is used to construct a default @@ -1615,7 +1626,8 @@ and body, and write the response to the client. Return the new state produced by the handler procedure. @end deffn -@deffn {Scheme Procedure} run-server handler [impl='http] [open-params='()] . state +@deffn {Scheme Procedure} run-server handler [impl='http] @ + [open-params='()] . state Run Guile's built-in web server. @var{handler} should be a procedure that takes two or more arguments, @@ -1636,7 +1648,8 @@ explicitly manage its state. The default web server implementation is @code{http}, which binds to a socket, listening for request on that port. -@deffn {HTTP Implementation} http [#:host=#f] [#:family=AF_INET] [#:addr=INADDR_LOOPBACK] [#:port 8080] [#:socket] +@deffn {HTTP Implementation} http [#:host=#f] [#:family=AF_INET] @ + [#:addr=INADDR_LOOPBACK] [#:port 8080] [#:socket] The default HTTP implementation. We document it as a function with keyword arguments, because that is precisely the way that it is -- all of the @var{open-params} to @code{run-server} get passed to the diff --git a/module/web/client.scm b/module/web/client.scm index 9fbb25b..7d5ea49 100644 --- a/module/web/client.scm +++ b/module/web/client.scm @@ -248,10 +248,10 @@ pass it as PORT. The port will be closed at the end of the request unless KEEP-ALIVE? is true. Any extra headers in the alist HEADERS will be added to the request. -If BODY is not #f, a message body will also be sent with the HTTP +If BODY is not ‘#f’, a message body will also be sent with the HTTP request. If BODY is a string, it is encoded according to the content-type in HEADERS, defaulting to UTF-8. Otherwise BODY should be -a bytevector, or #f for no body. Although it's allowed to send a +a bytevector, or ‘#f’ for no body. Although it's allowed to send a message body along with any request, usually only POST and PUT requests have bodies. See ‘http-put’ and ‘http-post’ documentation, for more. @@ -317,7 +317,7 @@ This function is similar to ‘http-get’, except it uses the \"HEAD\" method. See ‘http-get’ for full documentation on the various keyword arguments that are accepted by this function. -Returns two values: the resulting response, and #f. Responses to HEAD +Returns two values: the resulting response, and ‘#f’. Responses to HEAD requests do not have a body. The second value is only returned so that other procedures can treat all of the http-foo verbs identically.") diff --git a/module/web/http.scm b/module/web/http.scm index c79d57d..712208b 100644 --- a/module/web/http.scm +++ b/module/web/http.scm @@ -167,7 +167,7 @@ The default writer is ‘display’." (define *eof* (call-with-input-string "" read)) (define (read-header port) - "Reads one HTTP header from PORT. Returns two values: the header + "Read one HTTP header from PORT. Return two values: the header name and the parsed Scheme value. May raise an exception if the header was known but the value was invalid. @@ -220,7 +220,7 @@ as an ordered alist." (define (write-headers headers port) "Write the given header alist to PORT. Doesn't write the final -@samp{\\r\\n}, as the user might want to add another header." +‘\\r\\n’, as the user might want to add another header." (let lp ((headers headers)) (if (pair? headers) (begin @@ -971,7 +971,7 @@ as an ordered alist." (define *known-versions* '()) (define* (parse-http-version str #:optional (start 0) (end (string-length str))) - "Parse an HTTP version from STR, returning it as a major-minor + "Parse an HTTP version from STR, returning it as a major–minor pair. For example, ‘HTTP/1.1’ parses as the pair of integers, ‘(1 . 1)’." (or (let lp ((known *known-versions*)) diff --git a/module/web/response.scm b/module/web/response.scm index 7e14f4d..de38abc 100644 --- a/module/web/response.scm +++ b/module/web/response.scm @@ -267,10 +267,10 @@ closes PORT, unless KEEP-ALIVE? is true." (define* (response-body-port r #:key (decode? #t) (keep-alive? #t)) "Return an input port from which the body of R can be read. The encoding of the returned port is set according to R's ‘content-type’ -header, when it's textual, except if DECODE? is #f. Return #f when no -body is available. +header, when it's textual, except if DECODE? is ‘#f’. Return ‘#f’ when +no body is available. -When KEEP-ALIVE? is #f, closing the returned port also closes R's +When KEEP-ALIVE? is ‘#f’, closing the returned port also closes R's response port." (define port (if (member '(chunked) (response-transfer-encoding r)) diff --git a/module/web/uri.scm b/module/web/uri.scm index 25406b3..33db1d1 100644 --- a/module/web/uri.scm +++ b/module/web/uri.scm @@ -53,8 +53,8 @@ (query uri-query) (fragment uri-fragment)) -(define (absolute-uri? x) - (and (uri? x) (uri-scheme x) #t)) +(define (absolute-uri? obj) + (and (uri? obj) (uri-scheme obj) #t)) (define (uri-error message . args) (throw 'uri-error message args)) @@ -309,17 +309,16 @@ serialization." which should be the name of a character encoding. Note that this function should not generally be applied to a full URI -string. For paths, use split-and-decode-uri-path instead. For query +string. For paths, use ‘split-and-decode-uri-path’ instead. For query strings, split the query on ‘&’ and ‘=’ boundaries, and decode the components separately. -Note also that percent-encoded strings encode @emph{bytes}, not -characters. There is no guarantee that a given byte sequence is a valid -string encoding. Therefore this routine may signal an error if the -decoded bytes are not valid for the given encoding. Pass ‘#f’ for -ENCODING if you want decoded bytes as a bytevector directly. -@xref{Ports, ‘set-port-encoding!’}, for more information on -character encodings. +Note also that percent-encoded strings encode _bytes_, not characters. +There is no guarantee that a given byte sequence is a valid string +encoding. Therefore this routine may signal an error if the decoded +bytes are not valid for the given encoding. Pass ‘#f’ for ENCODING if +you want decoded bytes as a bytevector directly. See +‘set-port-encoding!’ for more information on character encodings. Returns a string of the decoded characters, or a bytevector if ENCODING was ‘#f’." @@ -380,11 +379,10 @@ ENCODING was ‘#f’." UNESCAPED-CHARS. The default character set includes alphanumerics from ASCII, as well as -the special characters @samp{-}, @samp{.}, @samp{_}, and @samp{~}. Any -other character will be percent-encoded, by writing out the character to -a bytevector within the given ENCODING, then encoding each byte as -‘%HH’, where HH is the hexadecimal representation of -the byte." +the special characters ‘-’, ‘.’, ‘_’, and ‘~’. Any other character will +be percent-encoded, by writing out the character to a bytevector within +the given ENCODING, then encoding each byte as ‘%HH’, where HH is the +hexadecimal representation of the byte." (define (needs-escaped? ch) (not (char-set-contains? unescaped-chars ch))) (if (string-index str needs-escaped?) -- 1.7.10.4 [-- Attachment #3: 0002-web-public-access-to-default-port-information.patch --] [-- Type: application/octet-stream, Size: 3078 bytes --] From c5b77b2735f688182846c9955eaf451f49d71bef Mon Sep 17 00:00:00 2001 From: Daniel Hartwig <mandyke@gmail.com> Date: Sat, 23 Feb 2013 13:51:50 +0800 Subject: [PATCH 2/4] web: public access to default-port information * module/web/uri.scm (default-port): New procedure. (default-port?): Export. Add docstring. * module/web/http.scm (write-uri): Use `default-port?'. * doc/ref/web.texi (Universal Resouce Identifiers): Document. --- doc/ref/web.texi | 11 +++++++++++ module/web/http.scm | 2 +- module/web/uri.scm | 12 ++++++++++-- 3 files changed, 22 insertions(+), 3 deletions(-) diff --git a/doc/ref/web.texi b/doc/ref/web.texi index 82ef31b..963ae3f 100644 --- a/doc/ref/web.texi +++ b/doc/ref/web.texi @@ -245,6 +245,17 @@ serialization. Declare a default port for the given URI scheme. @end deffn +@deffn {Scheme Procedure} default-port scheme +Return the default port for @var{scheme} if one has been declared, +otherwise @code{#f}. +@end deffn + +@deffn {Scheme Procedure} default-port? scheme port +Return @code{#t} when @var{port} @emph{matches} the default port for +@var{scheme}. Note that @var{port} may be @code{#f}, which implies a +match and consequent return value of @code{#t}. +@end deffn + @deffn {Scheme Procedure} uri-decode str [#:encoding=@code{"utf-8"}] Percent-decode the given @var{str}, according to @var{encoding}, which should be the name of a character encoding. diff --git a/module/web/http.scm b/module/web/http.scm index 712208b..5671330 100644 --- a/module/web/http.scm +++ b/module/web/http.scm @@ -1064,7 +1064,7 @@ three values: the method, the URI, and the version." (display #\@ port))) (display (uri-host uri) port) (let ((p (uri-port uri))) - (if (and p (not (eqv? p 80))) + (if (not (default-port? (uri-scheme uri) p)) (begin (display #\: port) (display p port)))))) diff --git a/module/web/uri.scm b/module/web/uri.scm index 33db1d1..2e8c4a6 100644 --- a/module/web/uri.scm +++ b/module/web/uri.scm @@ -36,7 +36,7 @@ uri-path uri-query uri-fragment build-uri - declare-default-port! + declare-default-port! default-port default-port? string->uri uri->string uri-decode uri-encode split-and-decode-uri-path @@ -206,9 +206,17 @@ could not be parsed." "Declare a default port for the given URI scheme." (hashq-set! *default-ports* scheme port)) +(define (default-port scheme) + "Return the default port for SCHEME if one has been declared, +otherwise ‘#f’." + (hashq-ref *default-ports* scheme)) + (define (default-port? scheme port) + "Return ‘#t’ when PORT _matches_ the default port for SCHEME. Note +that PORT may be ‘#f’, which implies a match and consequent return value +of ‘#t’." (or (not port) - (eqv? port (hashq-ref *default-ports* scheme)))) + (eqv? port (default-port scheme)))) (declare-default-port! 'http 80) (declare-default-port! 'https 443) -- 1.7.10.4 [-- Attachment #4: 0003-add-tests-for-read-request-line-etc.patch --] [-- Type: application/octet-stream, Size: 5619 bytes --] From 2bf9660cfdf7c4f0ffbc10a80433910eb2d23bf6 Mon Sep 17 00:00:00 2001 From: Daniel Hartwig <mandyke@gmail.com> Date: Sat, 23 Feb 2013 15:15:33 +0800 Subject: [PATCH 3/4] add tests for read-request-line, etc. * test-suite/web/web-http.test ("read-request-line"): ("write-request-line", "read-response-line", "write-response-line"): Add. --- test-suite/tests/web-http.test | 107 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 107 insertions(+) diff --git a/test-suite/tests/web-http.test b/test-suite/tests/web-http.test index 97f5559..6fa16bd 100644 --- a/test-suite/tests/web-http.test +++ b/test-suite/tests/web-http.test @@ -85,6 +85,113 @@ #t (error "unexpected exception" component arg)))))))) +(define-syntax pass-if-read-request-line + (syntax-rules () + ((_ str expected-method expected-uri expected-version) + (pass-if str + (equal? (call-with-values + (lambda () + (read-request-line (open-input-string + (string-append str "\r\n")))) + list) + (list 'expected-method + expected-uri + 'expected-version)))))) + +(define-syntax pass-if-write-request-line + (syntax-rules () + ((_ expected-str method uri version) + (pass-if expected-str + (equal? (string-append expected-str "\r\n") + (call-with-output-string + (lambda (port) + (write-request-line 'method uri 'version port)))))))) + +(define-syntax pass-if-read-response-line + (syntax-rules () + ((_ str expected-version expected-code expected-phrase) + (pass-if str + (equal? (call-with-values + (lambda () + (read-response-line (open-input-string + (string-append str "\r\n")))) + list) + (list 'expected-version + expected-code + expected-phrase)))))) + +(define-syntax pass-if-write-response-line + (syntax-rules () + ((_ expected-str version code phrase) + (pass-if expected-str + (equal? (string-append expected-str "\r\n") + (call-with-output-string + (lambda (port) + (write-response-line 'version code phrase port)))))))) + +(with-test-prefix "read-request-line" + (pass-if-read-request-line "GET / HTTP/1.1" + GET + (build-uri 'http + #:path "/") + (1 . 1)) + (pass-if-read-request-line "GET http://www.w3.org/pub/WWW/TheProject.html HTTP/1.1" + GET + (build-uri 'http + #:host "www.w3.org" + #:path "/pub/WWW/TheProject.html") + (1 . 1)) + (pass-if-read-request-line "GET /pub/WWW/TheProject.html HTTP/1.1" + GET + (build-uri 'http + #:path "/pub/WWW/TheProject.html") + (1 . 1)) + (pass-if-read-request-line "HEAD /etc/hosts?foo=bar HTTP/1.1" + HEAD + (build-uri 'http + #:path "/etc/hosts" + #:query "foo=bar") + (1 . 1))) + +(with-test-prefix "write-request-line" + (pass-if-write-request-line "GET / HTTP/1.1" + GET + (build-uri 'http + #:path "/") + (1 . 1)) + ;;; FIXME: Test fails due to scheme, host always being removed. + ;;; However, it should be supported to request these be present, and + ;;; that is possible with absolute/relative URI support. + ;; (pass-if-write-request-line "GET http://www.w3.org/pub/WWW/TheProject.html HTTP/1.1" + ;; GET + ;; (build-uri 'http + ;; #:host "www.w3.org" + ;; #:path "/pub/WWW/TheProject.html") + ;; (1 . 1)) + (pass-if-write-request-line "GET /pub/WWW/TheProject.html HTTP/1.1" + GET + (build-uri 'http + #:path "/pub/WWW/TheProject.html") + (1 . 1)) + (pass-if-write-request-line "HEAD /etc/hosts?foo=bar HTTP/1.1" + HEAD + (build-uri 'http + #:path "/etc/hosts" + #:query "foo=bar") + (1 . 1))) + +(with-test-prefix "read-response-line" + (pass-if-read-response-line "HTTP/1.0 404 Not Found" + (1 . 0) 404 "Not Found") + (pass-if-read-response-line "HTTP/1.1 200 OK" + (1 . 1) 200 "OK")) + +(with-test-prefix "write-response-line" + (pass-if-write-response-line "HTTP/1.0 404 Not Found" + (1 . 0) 404 "Not Found") + (pass-if-write-response-line "HTTP/1.1 200 OK" + (1 . 1) 200 "OK")) + (with-test-prefix "general headers" (pass-if-parse cache-control "no-transform" '(no-transform)) -- 1.7.10.4 [-- Attachment #5: 0004-extend-support-for-relative-URIs-to-public-interface.patch --] [-- Type: application/octet-stream, Size: 22649 bytes --] From b70bcaba50bf208a81841be2550b3f187967114c Mon Sep 17 00:00:00 2001 From: Daniel Hartwig <mandyke@gmail.com> Date: Sat, 23 Feb 2013 15:49:20 +0800 Subject: [PATCH 4/4] extend support for relative URIs to public interface * module/web/uri.scm (absolute-uri?, relative-uri?): New predicates. (validate-uri): SCHEME can be `#f' as for a relative URI. (build-uri): SCHEME is optional. Validation is relaxed to also permit relative URIs. (string->uri): Fold in `string->uri*' and parse _any_ URI string, absolute or relative. (string->absolute-uri, string->relative-uri): New procedures. Parse only strings representing the specific type of URI. * module/web/http.scm (declare-absolute-uri-header!): Rename from `declare-uri-header!' to reflect new terminology. Use new public uri interfaces to parse and validate. (declare-uri-header!): Rename from `declare-relative-uri-header!' to reflect new terminology. Use new public uri interfaces to parsing. (write-uri): Do not display an absent `uri-scheme', however, do display the scheme even when `uri-host' is absent. Add note to look at using `uri->string'. (parse-request-uri): Use new `string->absolute-uri' in the else clause. No longer imply that scheme is `http', which was wrong for HTTPS. Add note to later support "authority" form used by the CONNECT method. * test-suite/tests/web-http.test ("read-request-line"): ("write-request-line"): Remove scheme from URIs as appropriate. ("entity headers", "request headers"): content-location and refer should also parse relative URIs. ("response headers"): location should not parse relative URIs. * test-suite/tests/web-request.test ("example-1"): Do not expect any uri-scheme. * test-suite/tests/web-uri.test ("build-uri"): Add tests for relative URIs. ("string->uri", "uri->string"): Add symmetric tests for handling of relative URIs. ("string->absolute-uri", "string->relative-uri"): New tests. * doc/ref/web.texi (URIs): Add introductory note about absolute vs. relative URIs. Document new procedures. (HTTP Headers): Note that some headers are absolute URIs. --- doc/ref/web.texi | 43 ++++++++++++---- module/web/http.scm | 40 ++++++++------ module/web/uri.scm | 39 +++++++++----- test-suite/tests/web-http.test | 33 +++++++----- test-suite/tests/web-request.test | 2 +- test-suite/tests/web-uri.test | 103 +++++++++++++++++++++++++++++++++++-- 6 files changed, 205 insertions(+), 55 deletions(-) diff --git a/doc/ref/web.texi b/doc/ref/web.texi index 963ae3f..7cfd9e5 100644 --- a/doc/ref/web.texi +++ b/doc/ref/web.texi @@ -198,6 +198,15 @@ fragment identifies a part of a resource, not the resource itself. But it is useful to have a fragment field in the URI record itself, so we hope you will forgive the inconsistency. +Finally, there are URI-like objects that omit the scheme part. +Depending on these reference, these either are or are not considered +proper URIs. In Guile there is a single URI record type that holds any +URI-like object. This manual uses the term @dfn{absolute URI} to refer +to a URI object with a scheme, and @dfn{relative URI} to refer to one +without. In cases where either type will do, the term @dfn{URI} is +used. For example, @indicateurl{/path/to/foo} is a relative URI, where +as all of the previous examples are absolute URIs. + @example (use-modules (web uri)) @end example @@ -206,15 +215,15 @@ The following procedures can be found in the @code{(web uri)} module. Load it into your Guile, using a form like the above, to have access to them. -@deffn {Scheme Procedure} build-uri scheme @ +@deffn {Scheme Procedure} build-uri [scheme] @ [#:userinfo=@code{#f}] [#:host=@code{#f}] [#:port=@code{#f}] @ [#:path=@code{""}] [#:query=@code{#f}] [#:fragment=@code{#f}] @ [#:validate?=@code{#t}] -Construct a URI object. @var{scheme} should be a symbol, @var{port} -either a positive, exact integer or @code{#f}, and the rest of the -fields are either strings or @code{#f}. If @var{validate?} is true, -also run some consistency checks to make sure that the constructed URI -is valid. +Construct a URI object. @var{scheme} should be a symbol, @var{port} a +positive, exact integer, and the rest of the fields are strings. Any +field except @var{port} may also be @code{#f} to indicate it is not +present. If @var{validate?} is true, also run some consistency checks +to make sure that the constructed URI is valid. @end deffn @deffn {Scheme Procedure} uri? obj @@ -226,8 +235,13 @@ is valid. @deffnx {Scheme Procedure} uri-query uri @deffnx {Scheme Procedure} uri-fragment uri A predicate and field accessors for the URI record type. The URI scheme -will be a symbol, the port either a positive, exact integer or @code{#f}, -and the rest either strings or @code{#f} if not present. +will be a symbol, the port a positive, exact integer, and the rest +either strings. Any field except port may be @code{#f} if not present. +@end deffn + +@deffn {Scheme Procedure} absolute-uri? obj +@deffnx {Scheme Procedure} relative-uri? obj +Return @code{#t} iff @var{obj} is a URI object of the indicated type. @end deffn @deffn {Scheme Procedure} string->uri string @@ -235,6 +249,12 @@ Parse @var{string} into a URI object. Return @code{#f} if the string could not be parsed. @end deffn +@deffn {Scheme Procedure} string->absolute-uri string +@deffnx {Scheme Procedure} string->relative-uri string +Parse @var{string} into a URI object of the indicated type. Return +@code{#f} if the string could not be parsed. +@end deffn + @deffn {Scheme Procedure} uri->string uri Serialize @var{uri} to a string. If the URI has a port that is the default port for its scheme, the port is not included in the @@ -986,9 +1006,10 @@ The entity-tag of the resource. @end example @end deftypevr -@deftypevr {HTTP Header} URI location -A URI on which a request may be completed. Used in combination with a -redirecting status code to perform client-side redirection. +@deftypevr {HTTP Header} Absolute-URI location +An absolute URI on which a request may be completed. Used in +combination with a redirecting status code to perform client-side +redirection. @example (parse-header 'location "http://example.com/other") @result{} #<uri ...> diff --git a/module/web/http.scm b/module/web/http.scm index 5671330..2db64d0 100644 --- a/module/web/http.scm +++ b/module/web/http.scm @@ -1021,6 +1021,9 @@ symbol, like ‘GET’." ((string= str "TRACE" start end) 'TRACE) (else (bad-request "Invalid method: ~a" (substring str start end))))) +;; FIXME: At the moment we do not support the CONNECT form (see above) +;; but we should and this requires to support a request-URI which is +;; just "authority" (i.e. [userinfo "@"] host [":" port]). (define* (parse-request-uri str #:optional (start 0) (end (string-length str))) "Parse a URI from an HTTP request line. Note that URIs in requests do not have to have a scheme or host name. The result is a URI object." @@ -1033,12 +1036,11 @@ not have to have a scheme or host name. The result is a URI object." (let* ((q (string-index str #\? start end)) (f (string-index str #\# start end)) (q (and q (or (not f) (< q f)) q))) - (build-uri 'http - #:path (substring str start (or q f end)) + (build-uri #:path (substring str start (or q f end)) #:query (and q (substring str (1+ q) (or f end))) #:fragment (and f (substring str (1+ f) end))))) (else - (or (string->uri (substring str start end)) + (or (string->absolute-uri (substring str start end)) (bad-request "Invalid URI: ~a" (substring str start end)))))) (define (read-request-line port) @@ -1053,11 +1055,17 @@ three values: the method, the URI, and the version." (parse-http-version line (1+ d1) (string-length line))) (bad-request "Bad Request-Line: ~s" line)))) +;; FIXME: The validation here should be reconsidered and moved to +;; individual header validators if they do not already covered. Then +;; this procedure should be using uri->string. (define (write-uri uri port) - (if (uri-host uri) + (if (uri-scheme uri) (begin (display (uri-scheme uri) port) - (display "://" port) + (display #\:))) + (if (uri-host uri) + (begin + (display "//" port) (if (uri-userinfo uri) (begin (display (uri-userinfo uri) port) @@ -1171,18 +1179,20 @@ treated specially, and is just returned as a plain string." (declare-header! name parse-non-negative-integer non-negative-integer? display)) -;; emacs: (put 'declare-uri-header! 'scheme-indent-function 1) -(define (declare-uri-header! name) +;; emacs: (put 'declare-absolute-uri-header! 'scheme-indent-function 1) +(define (declare-absolute-uri-header! name) (declare-header! name - (lambda (str) (or (string->uri str) (bad-header-component 'uri str))) - (@@ (web uri) absolute-uri?) + (lambda (str) + (or (string->absolute-uri str) + (bad-header-component 'absolute-uri str))) + absolute-uri? write-uri)) -;; emacs: (put 'declare-relative-uri-header! 'scheme-indent-function 1) -(define (declare-relative-uri-header! name) +;; emacs: (put 'declare-uri-header! 'scheme-indent-function 1) +(define (declare-uri-header! name) (declare-header! name (lambda (str) - (or ((@@ (web uri) string->uri*) str) + (or (string->uri str) (bad-header-component 'uri str))) uri? write-uri)) @@ -1449,7 +1459,7 @@ treated specially, and is just returned as a plain string." ;; Content-Location = ( absoluteURI | relativeURI ) ;; -(declare-relative-uri-header! "Content-Location") +(declare-uri-header! "Content-Location") ;; Content-MD5 = <base64 of 128 bit MD5 digest as per RFC 1864> ;; @@ -1738,7 +1748,7 @@ treated specially, and is just returned as a plain string." ;; Referer = ( absoluteURI | relativeURI ) ;; -(declare-relative-uri-header! "Referer") +(declare-uri-header! "Referer") ;; TE = #( t-codings ) ;; t-codings = "trailers" | ( transfer-extension [ accept-params ] ) @@ -1775,7 +1785,7 @@ treated specially, and is just returned as a plain string." ;; Location = absoluteURI ;; -(declare-uri-header! "Location") +(declare-absolute-uri-header! "Location") ;; Proxy-Authenticate = 1#challenge ;; diff --git a/module/web/uri.scm b/module/web/uri.scm index 2e8c4a6..440e620 100644 --- a/module/web/uri.scm +++ b/module/web/uri.scm @@ -31,13 +31,14 @@ #:use-module (ice-9 control) #:use-module (rnrs bytevectors) #:use-module (ice-9 binary-ports) - #:export (uri? + #:export (uri? absolute-uri? relative-uri? uri-scheme uri-userinfo uri-host uri-port uri-path uri-query uri-fragment build-uri declare-default-port! default-port default-port? string->uri uri->string + string->absolute-uri string->relative-uri uri-decode uri-encode split-and-decode-uri-path encode-and-join-uri-path)) @@ -54,8 +55,13 @@ (fragment uri-fragment)) (define (absolute-uri? obj) + "Return ‘#t’ iff OBJ is an absolute URI object." (and (uri? obj) (uri-scheme obj) #t)) +(define (relative-uri? obj) + "Return ‘#t’ iff OBJ is a relative URI object." + (and (uri? obj) (not (uri-scheme obj)))) + (define (uri-error message . args) (throw 'uri-error message args)) @@ -64,7 +70,7 @@ (define (validate-uri scheme userinfo host port path query fragment) (cond - ((not (symbol? scheme)) + ((and scheme (not (symbol? scheme))) (uri-error "Expected a symbol for the URI scheme: ~s" scheme)) ((and (or userinfo port) (not host)) (uri-error "Expected a host, given userinfo or port")) @@ -80,13 +86,14 @@ (not (eqv? (string-ref path 0) #\/))) (uri-error "Expected path of absolute URI to start with a /: ~a" path)))) -(define* (build-uri scheme #:key userinfo host port (path "") query fragment +(define* (build-uri #:optional scheme #:key + userinfo host port (path "") query fragment (validate? #t)) - "Construct a URI object. SCHEME should be a symbol, PORT -either a positive, exact integer or ‘#f’, and the rest of the -fields are either strings or ‘#f’. If VALIDATE? is true, -also run some consistency checks to make sure that the constructed URI -is valid." + "Construct a URI object. SCHEME should be a symbol, PORT a positive, +exact integer, and the rest of the fields are strings. Any field except +PORT may also be ‘#f’ to indicate it is not present. If VALIDATE? is +true, also run some consistency checks to make sure that the constructed +URI is valid." (if validate? (validate-uri scheme userinfo host port path query fragment)) (make-uri scheme userinfo host port path query fragment)) @@ -173,7 +180,7 @@ is valid." (define uri-regexp (make-regexp uri-pat)) -(define (string->uri* string) +(define (string->uri string) "Parse STRING into a URI object. Return ‘#f’ if the string could not be parsed." (% (let ((m (regexp-exec uri-regexp string))) @@ -194,11 +201,17 @@ could not be parsed." (lambda (k) #f))) -(define (string->uri string) - "Parse STRING into a URI object. Return ‘#f’ if the string +(define (string->absolute-uri string) + "Parse STRING into an absolute URI object. Return ‘#f’ if the string +could not be parsed." + (let ((uri (string->uri string))) + (and (absolute-uri? uri) uri))) + +(define (string->relative-uri string) + "Parse STRING into a relative URI object. Return ‘#f’ if the string could not be parsed." - (let ((uri (string->uri* string))) - (and uri (uri-scheme uri) uri))) + (let ((uri (string->uri string))) + (and (relative-uri? uri) uri))) (define *default-ports* (make-hash-table)) diff --git a/test-suite/tests/web-http.test b/test-suite/tests/web-http.test index 6fa16bd..80dbd55 100644 --- a/test-suite/tests/web-http.test +++ b/test-suite/tests/web-http.test @@ -132,8 +132,7 @@ (with-test-prefix "read-request-line" (pass-if-read-request-line "GET / HTTP/1.1" GET - (build-uri 'http - #:path "/") + (build-uri #:path "/") (1 . 1)) (pass-if-read-request-line "GET http://www.w3.org/pub/WWW/TheProject.html HTTP/1.1" GET @@ -143,21 +142,18 @@ (1 . 1)) (pass-if-read-request-line "GET /pub/WWW/TheProject.html HTTP/1.1" GET - (build-uri 'http - #:path "/pub/WWW/TheProject.html") + (build-uri #:path "/pub/WWW/TheProject.html") (1 . 1)) (pass-if-read-request-line "HEAD /etc/hosts?foo=bar HTTP/1.1" HEAD - (build-uri 'http - #:path "/etc/hosts" + (build-uri #:path "/etc/hosts" #:query "foo=bar") (1 . 1))) (with-test-prefix "write-request-line" (pass-if-write-request-line "GET / HTTP/1.1" GET - (build-uri 'http - #:path "/") + (build-uri #:path "/") (1 . 1)) ;;; FIXME: Test fails due to scheme, host always being removed. ;;; However, it should be supported to request these be present, and @@ -170,13 +166,11 @@ ;; (1 . 1)) (pass-if-write-request-line "GET /pub/WWW/TheProject.html HTTP/1.1" GET - (build-uri 'http - #:path "/pub/WWW/TheProject.html") + (build-uri #:path "/pub/WWW/TheProject.html") (1 . 1)) (pass-if-write-request-line "HEAD /etc/hosts?foo=bar HTTP/1.1" HEAD - (build-uri 'http - #:path "/etc/hosts" + (build-uri #:path "/etc/hosts" #:query "foo=bar") (1 . 1))) @@ -252,6 +246,12 @@ (pass-if-parse content-length "010" 10) (pass-if-parse content-location "http://foo/" (build-uri 'http #:host "foo" #:path "/")) + (pass-if-parse content-location "//foo/" + (build-uri #:host "foo" #:path "/")) + (pass-if-parse content-location "/etc/foo" + (build-uri #:path "/etc/foo")) + (pass-if-parse content-location "foo" + (build-uri #:path "foo")) (pass-if-parse content-range "bytes 10-20/*" '(bytes (10 . 20) *)) (pass-if-parse content-range "bytes */*" '(bytes * *)) (pass-if-parse content-range "bytes */30" '(bytes * 30)) @@ -315,6 +315,12 @@ (pass-if-parse range "bytes=-20,-30" '(bytes (#f . 20) (#f . 30))) (pass-if-parse referer "http://foo/bar?baz" (build-uri 'http #:host "foo" #:path "/bar" #:query "baz")) + (pass-if-parse referer "//foo/bar?baz" + (build-uri #:host "foo" #:path "/bar" #:query "baz")) + (pass-if-parse referer "/etc/foo" + (build-uri #:path "/etc/foo")) + (pass-if-parse referer "foo" + (build-uri #:path "foo")) (pass-if-parse te "trailers" '((trailers))) (pass-if-parse te "trailers,foo" '((trailers) (foo))) (pass-if-parse user-agent "guile" "guile")) @@ -329,6 +335,9 @@ (pass-if-parse etag "W/\"foo\"" '("foo" . #f)) (pass-if-parse location "http://other-place" (build-uri 'http #:host "other-place")) + (pass-if-any-error location "//other-place") + (pass-if-any-error location "/etc/foo") + (pass-if-any-error location "foo") (pass-if-parse proxy-authenticate "Basic realm=\"guile\"" '((basic (realm . "guile")))) (pass-if-parse retry-after "Tue, 15 Nov 1994 08:12:31 GMT" diff --git a/test-suite/tests/web-request.test b/test-suite/tests/web-request.test index 8cf1c2e..1574d69 100644 --- a/test-suite/tests/web-request.test +++ b/test-suite/tests/web-request.test @@ -53,7 +53,7 @@ Accept-Language: en-gb, en;q=0.9\r (pass-if (equal? (request-method r) 'GET)) - (pass-if (equal? (request-uri r) (build-uri 'http #:path "/qux"))) + (pass-if (equal? (request-uri r) (build-uri #:path "/qux"))) (pass-if (equal? (read-request-body r) #f)) diff --git a/test-suite/tests/web-uri.test b/test-suite/tests/web-uri.test index 3f6e7e3..5db1442 100644 --- a/test-suite/tests/web-uri.test +++ b/test-suite/tests/web-uri.test @@ -121,8 +121,23 @@ (pass-if-uri-exception "http://foo@" "Expected.*host" - (build-uri 'http #:userinfo "foo"))) + (build-uri 'http #:userinfo "foo")) + (pass-if "//host/etc/foo" + (uri=? (build-uri #:host "host" + #:path "/etc/foo") + #:host "host" + #:path "/etc/foo")) + + (pass-if "/path/to/some/foo?query" + (uri=? (build-uri #:path "/path/to/some/foo" + #:query "query") + #:path "/path/to/some/foo" + #:query "query")) + + (pass-if "nextdoc/foo" + (uri=? (build-uri #:path "nextdoc/foo") + #:path "nextdoc/foo"))) (with-test-prefix "string->uri" (pass-if "ftp:" @@ -210,7 +225,30 @@ (pass-if "file:///etc/hosts" (uri=? (string->uri "file:///etc/hosts") #:scheme 'file - #:path "/etc/hosts"))) + #:path "/etc/hosts")) + + (pass-if "/" + (uri=? (string->uri "/") + #:path "/")) + + (pass-if "/path/to/foo" + (uri=? (string->uri "/path/to/foo") + #:path "/path/to/foo")) + + (pass-if "//example.org" + (uri=? (string->uri "//example.org") + #:host "example.org" + #:path "")) + + (pass-if "//bar@example.org/path/to/foo" + (uri=? (string->uri "//bar@example.org/path/to/foo") + #:userinfo "bar" + #:host "example.org" + #:path "/path/to/foo")) + + (pass-if "nextdoc/foo" + (uri=? (string->uri "nextdoc/foo") + #:path "nextdoc/foo"))) (with-test-prefix "uri->string" (pass-if "ftp:" @@ -248,7 +286,66 @@ (pass-if "http://foo:/" (equal? "http://foo/" - (uri->string (string->uri "http://foo:/"))))) + (uri->string (string->uri "http://foo:/")))) + + (pass-if "/" + (equal? "/" + (uri->string (string->uri "/")))) + + (pass-if "/path/to/foo" + (equal? "/path/to/foo" + (uri->string (string->uri "/path/to/foo")))) + + (pass-if "//example.org" + (equal? "//example.org" + (uri->string (string->uri "//example.org")))) + + (pass-if "//bar@example.org/path/to/foo" + (equal? "//bar@example.org/path/to/foo" + (uri->string (string->uri "//bar@example.org/path/to/foo")))) + + (pass-if "nextdoc/foo" + (equal? "nextdoc/foo" + (uri->string (string->uri "nextdoc/foo"))))) + +(with-test-prefix "string->absolute-uri" + (pass-if "ftp:" + (uri=? (string->absolute-uri "ftp:") + #:scheme 'ftp + #:path "")) + + (pass-if "/" + (not (string->absolute-uri "/"))) + + (pass-if "/path/to/foo" + (not (string->absolute-uri "/path/to/foo"))) + + (pass-if "//example.org" + (not (string->absolute-uri "//example.org"))) + + (pass-if "nextdoc/foo" + (not (string->absolute-uri "nextdoc/foo")))) + +(with-test-prefix "string->relative-uri" + (pass-if "ftp:" + (not (string->relative-uri "ftp:"))) + + (pass-if "/" + (uri=? (string->relative-uri "/") + #:path "/")) + + (pass-if "/path/to/foo" + (uri=? (string->relative-uri "/path/to/foo") + #:path "/path/to/foo")) + + (pass-if "//example.org" + (uri=? (string->relative-uri "//example.org") + #:host "example.org" + #:path "")) + + (pass-if "nextdoc/foo" + (uri=? (string->relative-uri "nextdoc/foo") + #:path "nextdoc/foo"))) (with-test-prefix "decode" (pass-if "foo%20bar" -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 18+ messages in thread
end of thread, other threads:[~2013-02-23 8:11 UTC | newest] Thread overview: 18+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-11-07 20:40 bug#12827: [2.0.6] web client: fails to parse 404 header Ludovic Courtès 2012-11-08 5:52 ` Daniel Hartwig 2012-11-08 20:10 ` Ludovic Courtès 2012-11-09 0:39 ` Daniel Hartwig 2012-11-09 20:52 ` Ludovic Courtès 2012-11-10 1:45 ` Daniel Hartwig 2012-11-10 13:52 ` Ludovic Courtès 2012-11-23 22:19 ` Ludovic Courtès 2012-11-24 11:23 ` Daniel Hartwig 2012-11-24 15:10 ` Ludovic Courtès 2012-11-24 15:34 ` Daniel Hartwig 2012-11-26 0:15 ` Ludovic Courtès 2012-11-26 23:13 ` Ludovic Courtès 2012-11-27 1:06 ` Daniel Hartwig 2012-11-27 12:50 ` Ludovic Courtès 2012-11-27 15:18 ` Daniel Hartwig 2012-11-27 21:43 ` Ludovic Courtès 2013-02-23 8:11 ` bug#12827: [PATCH] Tweak web modules, support relative URIs (was: bug#12827: [2.0.6] web client: fails to parse 404 header) Daniel Hartwig
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).