From: Daniel Hartwig <mandyke@gmail.com>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: 12827@debbugs.gnu.org
Subject: bug#12827: [2.0.6] web client: fails to parse 404 header
Date: Sat, 24 Nov 2012 19:23:04 +0800 [thread overview]
Message-ID: <CAN3veRfzRovVKAxRcTJnPd2+hc-MpDPz3J1HAgc_L0Kmd3h89w@mail.gmail.com> (raw)
In-Reply-To: <87ip8wezdd.fsf@gnu.org>
[-- Attachment #1: Type: text/plain, Size: 686 bytes --]
On 24 November 2012 06:19, Ludovic Courtès <ludo@gnu.org> wrote:
> Any update on that? The plan is to release 2.0.7 next week, so it’d be
> great if this could be in.
I have made a first attempt at the doc strings and manual. This
involved first syncronizing the two, as only the manual had been
receiving updates.
Some more tweaking to the code.
Personally I am not 100% on this, but I attach it for comment anyway.
I will not be able to work on it again for a short while.
A quick solution may be to silently introduce just enough to fix the
current bug, and worry about the extra predicates, uri-record-type vs.
rfc-definition-of-uri, etc. later.
Regards
[-- Attachment #2: 0001-syncronize-web-module-docstrings-with-manual.patch --]
[-- Type: application/octet-stream, Size: 21066 bytes --]
From fcc01b345b93d7a75980d7607684e4a5b3243daa Mon Sep 17 00:00:00 2001
From: Daniel Hartwig <mandyke@gmail.com>
Date: Sat, 24 Nov 2012 14:10:12 +0800
Subject: [PATCH 1/3] syncronize web module docstrings with manual
* doc/ref/web.texi: Fix spacing. Update with a few missing function
descriptions.
* module/web/client.scm:
* module/web/http.scm:
* module/web/request.scm:
* module/web/server.scm:
* module/web/uri.scm: Update docstrings from manual.
---
doc/ref/web.texi | 35 ++++++++++++++---------------
module/web/client.scm | 9 ++++++++
module/web/http.scm | 41 ++++++++++++++++------------------
module/web/request.scm | 14 ++++++++----
module/web/server.scm | 10 ++++-----
module/web/uri.scm | 57 +++++++++++++++++++++++++++++-------------------
6 files changed, 96 insertions(+), 70 deletions(-)
diff --git a/doc/ref/web.texi b/doc/ref/web.texi
index 161a28d..e6e594e 100644
--- a/doc/ref/web.texi
+++ b/doc/ref/web.texi
@@ -431,8 +431,8 @@ from @code{header-writer}.
@end deffn
@deffn {Scheme Procedure} read-headers port
-Read the headers of an HTTP message from @var{port}, returning the
-headers as an ordered alist.
+Read the headers of an HTTP message from @var{port}, returning them
+as an ordered alist.
@end deffn
@deffn {Scheme Procedure} write-headers headers port
@@ -1368,6 +1368,7 @@ Return the given response header, or @var{default} if none was present.
the lower-level HTTP, request, and response modules.
@deffn {Scheme Procedure} open-socket-for-uri uri
+Return an open input/output port for a connection to URI.
@end deffn
@deffn {Scheme Procedure} http-get uri [#:port=(open-socket-for-uri uri)] [#:version='(1 . 1)] [#:keep-alive?=#f] [#:extra-headers='()] [#:decode-body?=#t]
@@ -1470,17 +1471,17 @@ the server socket.
A user may define a server implementation with the following form:
-@deffn {Scheme Procedure} define-server-impl name open read write close
+@deffn {Scheme Syntax} define-server-impl name open read write close
Make a @code{<server-impl>} object with the hooks @var{open},
@var{read}, @var{write}, and @var{close}, and bind it to the symbol
@var{name} in the current module.
@end deffn
@deffn {Scheme Procedure} lookup-server-impl impl
-Look up a server implementation. If @var{impl} is a server
-implementation already, it is returned directly. If it is a symbol, the
+Look up a server implementation. If @var{impl} is a server
+implementation already, it is returned directly. If it is a symbol, the
binding named @var{impl} in the @code{(web server @var{impl})} module is
-looked up. Otherwise an error is signaled.
+looked up. Otherwise an error is signaled.
Currently a server implementation is a somewhat opaque type, useful only
for passing to other procedures in this module, like @code{read-client}.
@@ -1494,7 +1495,7 @@ any access to the impl objects.
@deffn {Scheme Procedure} open-server impl open-params
Open a server for the given implementation. Return one value, the new
-server object. The implementation's @code{open} procedure is applied to
+server object. The implementation's @code{open} procedure is applied to
@var{open-params}, which should be a list.
@end deffn
@@ -1502,7 +1503,7 @@ server object. The implementation's @code{open} procedure is applied to
Read a new client from @var{server}, by applying the implementation's
@code{read} procedure to the server. If successful, return three
values: an object corresponding to the client, a request object, and the
-request body. If any exception occurs, return @code{#f} for all three
+request body. If any exception occurs, return @code{#f} for all three
values.
@end deffn
@@ -1513,9 +1514,9 @@ The response and response body are produced by calling the given
@var{handler} with @var{request} and @var{body} as arguments.
The elements of @var{state} are also passed to @var{handler} as
-arguments, and may be returned as additional values. The new
+arguments, and may be returned as additional values. The new
@var{state}, collected from the @var{handler}'s return values, is then
-returned as a list. The idea is that a server loop receives a handler
+returned as a list. The idea is that a server loop receives a handler
from the user, along with whatever state values the user is interested
in, allowing the user's handler to explicitly manage its state.
@end deffn
@@ -1526,20 +1527,20 @@ given request.
As a convenience to web handler authors, @var{response} may be given as
an alist of headers, in which case it is used to construct a default
-response. Ensures that the response version corresponds to the request
-version. If @var{body} is a string, encodes the string to a bytevector,
-in an encoding appropriate for @var{response}. Adds a
+response. Ensures that the response version corresponds to the request
+version. If @var{body} is a string, encodes the string to a bytevector,
+in an encoding appropriate for @var{response}. Adds a
@code{content-length} and @code{content-type} header, as necessary.
If @var{body} is a procedure, it is called with a port as an argument,
-and the output collected as a bytevector. In the future we might try to
+and the output collected as a bytevector. In the future we might try to
instead use a compressing, chunk-encoded port, and call this procedure
-later, in the write-client procedure. Authors are advised not to rely on
+later, in the write-client procedure. Authors are advised not to rely on
the procedure being called at any particular time.
@end deffn
@deffn {Scheme Procedure} write-client impl server client response body
-Write an HTTP response and body to @var{client}. If the server and
+Write an HTTP response and body to @var{client}. If the server and
client support persistent connections, it is the implementation's
responsibility to keep track of the client thereafter, presumably by
attaching it to the @var{server} argument somehow.
@@ -1572,7 +1573,7 @@ before sending back to the client.
Additional arguments to @var{handler} are taken from @var{state}.
Additional return values are accumulated into a new @var{state}, which
-will be used for subsequent requests. In this way a handler can
+will be used for subsequent requests. In this way a handler can
explicitly manage its state.
@end deffn
diff --git a/module/web/client.scm b/module/web/client.scm
index cf7ea53..0991373 100644
--- a/module/web/client.scm
+++ b/module/web/client.scm
@@ -115,6 +115,15 @@
(define* (http-get uri #:key (port (open-socket-for-uri uri))
(version '(1 . 1)) (keep-alive? #f) (extra-headers '())
(decode-body? #t))
+ "Connect to the server corresponding to @var{uri} and ask for the
+resource, using the @code{GET} method. If you already have a port open,
+pass it as @var{port}. The port will be closed at the end of the
+request unless @var{keep-alive?} is true. Any extra headers in the
+alist @var{extra-headers} will be added to the request.
+
+If @var{decode-body?} is true, as is the default, the body of the
+response will be decoded to string, if it is a textual content-type.
+Otherwise it will be returned as a bytevector."
(let ((req (build-request uri #:version version
#:headers (if keep-alive?
extra-headers
diff --git a/module/web/http.scm b/module/web/http.scm
index cc5dd5a..3b78d08 100644
--- a/module/web/http.scm
+++ b/module/web/http.scm
@@ -100,12 +100,7 @@
validator
writer
#:key multiple?)
- "Define a parser, validator, and writer for the HTTP header, @var{name}.
-
-@var{parser} should be a procedure that takes a string and returns a
-Scheme value. @var{validator} is a predicate for whether the given
-Scheme value is valid for this header. @var{writer} takes a value and a
-port, and writes the value to the port."
+ "Declare a parser, validator, and writer for a given header."
(if (and (string? name) parser validator writer)
(let ((decl (make-header-decl name parser validator writer multiple?)))
(hashq-set! *declared-headers* (string->header name) decl)
@@ -120,27 +115,33 @@ port, and writes the value to the port."
(string-titlecase (symbol->string sym)))))
(define (known-header? sym)
- "Return @code{#t} if there are parsers and writers registered for this
-header, otherwise @code{#f}."
+ "Return @code{#t} iff @var{sym} is a known header, with associated
+parsers and serialization procedures."
(and (lookup-header-decl sym) #t))
(define (header-parser sym)
- "Returns a procedure to parse values for the given header."
+ "Return the value parser for headers named @var{sym}. The result is a
+procedure that takes one argument, a string, and returns the parsed
+value. If the header isn't known to Guile, a default parser is returned
+that passes through the string unchanged."
(let ((decl (lookup-header-decl sym)))
(if decl
(header-decl-parser decl)
(lambda (x) x))))
(define (header-validator sym)
- "Returns a procedure to validate values for the given header."
+ "Return a predicate which returns @code{#t} if the given value is valid
+for headers named @var{sym}. The default validator for unknown headers
+is @code{string?}."
(let ((decl (lookup-header-decl sym)))
(if decl
(header-decl-validator decl)
string?)))
(define (header-writer sym)
- "Returns a procedure to write values for the given header to a given
-port."
+ "Return a procedure that writes values for headers named @var{sym} to a
+port. The resulting procedure takes two arguments: a value and a port.
+The default writer is @code{display}."
(let ((decl (lookup-header-decl sym)))
(if decl
(header-decl-writer decl)
@@ -196,10 +197,7 @@ body was reached (i.e., a blank line)."
(define (parse-header sym val)
"Parse @var{val}, a string, with the parser registered for the header
-named @var{sym}.
-
-Returns the parsed value. If a parser was not found, the value is
-returned as a string."
+named @var{sym}. Returns the parsed value."
((header-parser sym) val))
(define (valid-header? sym val)
@@ -210,17 +208,16 @@ header with name @var{sym}."
(error "header name not a symbol" sym)))
(define (write-header sym val port)
- "Writes the given header name and value to @var{port}. If @var{sym}
-is a known header, uses the specific writer registered for that header.
-Otherwise the value is written using @code{display}."
+ "Write the given header name and value to @var{port}, using the writer
+from @code{header-writer}."
(display (header->string sym) port)
(display ": " port)
((header-writer sym) val port)
(display "\r\n" port))
(define (read-headers port)
- "Read an HTTP message from @var{port}, returning the headers as an
-ordered alist."
+ "Read the headers of an HTTP message from @var{port}, returning them
+as an ordered alist."
(let lp ((headers '()))
(call-with-values (lambda () (read-header port))
(lambda (k v)
@@ -230,7 +227,7 @@ ordered alist."
(define (write-headers headers port)
"Write the given header alist to @var{port}. Doesn't write the final
-\\r\\n, as the user might want to add another header."
+@samp{\\r\\n}, as the user might want to add another header."
(let lp ((headers headers))
(if (pair? headers)
(begin
diff --git a/module/web/request.scm b/module/web/request.scm
index 40d4a66..51ef473 100644
--- a/module/web/request.scm
+++ b/module/web/request.scm
@@ -195,7 +195,11 @@ metadata, @var{meta}.
As a side effect, sets the encoding on @var{port} to
ISO-8859-1 (latin-1), so that reading one character reads one byte. See
the discussion of character sets in \"HTTP Requests\" in the manual, for
-more information."
+more information.
+
+Note that the body is not part of the request. Once you have read a
+request, you may read the body separately, and likewise for writing
+requests."
(set-port-encoding! port "ISO-8859-1")
(call-with-values (lambda () (read-request-line port))
(lambda (method uri version)
@@ -205,7 +209,7 @@ more information."
(define (write-request r port)
"Write the given HTTP request to @var{port}.
-Returns a new request, whose @code{request-port} will continue writing
+Return a new request, whose @code{request-port} will continue writing
on @var{port}, perhaps using some transfer encoding."
(write-request-line (request-method r) (request-uri r)
(request-version r) port)
@@ -217,8 +221,8 @@ on @var{port}, perhaps using some transfer encoding."
(request-headers r) (request-meta r) port)))
(define (read-request-body r)
- "Reads the request body from @var{r}, as a bytevector. Returns
-@code{#f} if there was no request body."
+ "Reads the request body from @var{r}, as a bytevector. Return @code{#f}
+if there was no request body."
(let ((nbytes (request-content-length r)))
(and nbytes
(let ((bv (get-bytevector-n (request-port r) nbytes)))
@@ -297,6 +301,8 @@ request @var{r}."
;; Misc accessors
(define* (request-absolute-uri r #:optional default-host default-port)
+ "A helper routine to determine the absolute URI of a request, using the
+@code{host} header and the default host and port."
(let ((uri (request-uri r)))
(if (uri-host uri)
uri
diff --git a/module/web/server.scm b/module/web/server.scm
index 42887e6..ed88329 100644
--- a/module/web/server.scm
+++ b/module/web/server.scm
@@ -143,7 +143,7 @@ for passing to other procedures in this module, like
;; -> server
(define (open-server impl open-params)
- "Open a server for the given implementation. Returns one value, the
+ "Open a server for the given implementation. Return one value, the
new server object. The implementation's @code{open} procedure is
applied to @var{open-params}, which should be a list."
(apply (server-impl-open impl) open-params))
@@ -151,9 +151,9 @@ applied to @var{open-params}, which should be a list."
;; -> (client request body | #f #f #f)
(define (read-client impl server)
"Read a new client from @var{server}, by applying the implementation's
-@code{read} procedure to the server. If successful, returns three
+@code{read} procedure to the server. If successful, return three
values: an object corresponding to the client, a request object, and the
-request body. If any exception occurs, returns @code{#f} for all three
+request body. If any exception occurs, return @code{#f} for all three
values."
(call-with-error-handling
(lambda ()
@@ -364,7 +364,7 @@ attaching it to the @var{server} argument somehow."
;; -> new-state
(define (serve-one-client handler impl server state)
"Read one request from @var{server}, call @var{handler} on the request
-and body, and write the response to the client. Returns the new state
+and body, and write the response to the client. Return the new state
produced by the handler procedure."
(debug-elapsed 'serve-again)
(call-with-values
@@ -404,7 +404,7 @@ The response and body will be run through @code{sanitize-response}
before sending back to the client.
Additional arguments to @var{handler} are taken from
-@var{state}. Additional return values are accumulated into a new
+@var{state}. Additional return values are accumulated into a new
@var{state}, which will be used for subsequent requests. In this way a
handler can explicitly manage its state.
diff --git a/module/web/uri.scm b/module/web/uri.scm
index 78614a5..ddab7be 100644
--- a/module/web/uri.scm
+++ b/module/web/uri.scm
@@ -79,8 +79,10 @@
(define* (build-uri scheme #:key userinfo host port (path "") query fragment
(validate? #t))
- "Construct a URI object. If @var{validate?} is true, also run some
-consistency checks to make sure that the constructed URI is valid."
+ "Construct a URI object. @var{scheme} should be a symbol, and the rest
+of the fields are either strings or @code{#f}. If @var{validate?} is
+true, also run some consistency checks to make sure that the constructed
+URI is valid."
(if validate?
(validate-uri scheme userinfo host port path query fragment))
(make-uri scheme userinfo host port path query fragment))
@@ -168,7 +170,7 @@ consistency checks to make sure that the constructed URI is valid."
(make-regexp uri-pat))
(define (string->uri string)
- "Parse @var{string} into a URI object. Returns @code{#f} if the string
+ "Parse @var{string} into a URI object. Return @code{#f} if the string
could not be parsed."
(% (let ((m (regexp-exec uri-regexp string)))
(if (not m) (abort))
@@ -191,10 +193,7 @@ could not be parsed."
(define *default-ports* (make-hash-table))
(define (declare-default-port! scheme port)
- "Declare a default port for the given URI scheme.
-
-Default ports are for printing URI objects: a default port is not
-printed."
+ "Declare a default port for the given URI scheme."
(hashq-set! *default-ports* scheme port))
(define (default-port? scheme port)
@@ -205,7 +204,9 @@ printed."
(declare-default-port! 'https 443)
(define (uri->string uri)
- "Serialize @var{uri} to a string."
+ "Serialize @var{uri} to a string. If the URI has a port that is the
+default port for its scheme, the port is not included in the
+serialization."
(let* ((scheme-str (string-append
(symbol->string (uri-scheme uri)) ":"))
(userinfo (uri-userinfo uri))
@@ -293,18 +294,24 @@ printed."
(string->char-set "0123456789abcdefABCDEF"))
(define* (uri-decode str #:key (encoding "utf-8"))
- "Percent-decode the given @var{str}, according to @var{encoding}.
+ "Percent-decode the given @var{str}, according to @var{encoding},
+which should be the name of a character encoding.
Note that this function should not generally be applied to a full URI
string. For paths, use split-and-decode-uri-path instead. For query
strings, split the query on @code{&} and @code{=} boundaries, and decode
the components separately.
-Note that percent-encoded strings encode @emph{bytes}, not characters.
-There is no guarantee that a given byte sequence is a valid string
-encoding. Therefore this routine may signal an error if the decoded
-bytes are not valid for the given encoding. Pass @code{#f} for
-@var{encoding} if you want decoded bytes as a bytevector directly."
+Note also that percent-encoded strings encode @emph{bytes}, not
+characters. There is no guarantee that a given byte sequence is a valid
+string encoding. Therefore this routine may signal an error if the
+decoded bytes are not valid for the given encoding. Pass @code{#f} for
+@var{encoding} if you want decoded bytes as a bytevector directly.
+@xref{Ports, @code{set-port-encoding!}}, for more information on
+character encodings.
+
+Returns a string of the decoded characters, or a bytevector if
+@var{encoding} was @code{#f}."
(let* ((len (string-length str))
(bv
(call-with-output-bytevector*
@@ -358,10 +365,13 @@ bytes are not valid for the given encoding. Pass @code{#f} for
;;
(define* (uri-encode str #:key (encoding "utf-8")
(unescaped-chars unreserved-chars))
- "Percent-encode any character not in the character set, @var{unescaped-chars}.
+ "Percent-encode any character not in the character set,
+@var{unescaped-chars}.
-Percent-encoding first writes out the given character to a bytevector
-within the given @var{encoding}, then encodes each byte as
+The default character set includes alphanumerics from ASCII, as well as
+the special characters @samp{-}, @samp{.}, @samp{_}, and @samp{~}. Any
+other character will be percent-encoded, by writing out the character to
+a bytevector within the given @var{encoding}, then encoding each byte as
@code{%@var{HH}}, where @var{HH} is the hexadecimal representation of
the byte."
(define (needs-escaped? ch)
@@ -387,15 +397,18 @@ the byte."
str))
(define (split-and-decode-uri-path path)
- "Split @var{path} into its components, and decode each
-component, removing empty components.
+ "Split @var{path} into its components, and decode each component,
+removing empty components.
-For example, @code{\"/foo/bar/\"} decodes to the two-element list,
-@code{(\"foo\" \"bar\")}."
+For example, @code{\"/foo/bar%20baz/\"} decodes to the two-element list,
+@code{(\"foo\" \"bar baz\")}."
(filter (lambda (x) (not (string-null? x)))
(map uri-decode (string-split path #\/))))
(define (encode-and-join-uri-path parts)
"URI-encode each element of @var{parts}, which should be a list of
-strings, and join the parts together with @code{/} as a delimiter."
+strings, and join the parts together with @code{/} as a delimiter.
+
+For example, the list @code{(\"scrambled eggs\" \"biscuits&gravy\")}
+encodes as @code{\"scrambled%20eggs/biscuits%26gravy\"}."
(string-join (map uri-encode parts) "/"))
--
1.7.10.4
[-- Attachment #3: 0002-web-uri-document-that-uri-port-is-an-integer.patch --]
[-- Type: application/octet-stream, Size: 2827 bytes --]
From 50c5235ca50e356acfad709ae820c2963a8ff11c Mon Sep 17 00:00:00 2001
From: Daniel Hartwig <mandyke@gmail.com>
Date: Sat, 24 Nov 2012 15:11:21 +0800
Subject: [PATCH 2/3] (web uri): document that uri-port is an integer
* doc/ref/web.texi (URIs):
* module/web/uri.scm (build-uri): Document that uri-port is an integer.
---
doc/ref/web.texi | 13 +++++++------
module/web/uri.scm | 9 +++++----
2 files changed, 12 insertions(+), 10 deletions(-)
diff --git a/doc/ref/web.texi b/doc/ref/web.texi
index e6e594e..3e93bea 100644
--- a/doc/ref/web.texi
+++ b/doc/ref/web.texi
@@ -209,10 +209,11 @@ access to them.
@deffn {Scheme Procedure} build-uri scheme [#:userinfo=@code{#f}] [#:host=@code{#f}] @
[#:port=@code{#f}] [#:path=@code{""}] [#:query=@code{#f}] @
[#:fragment=@code{#f}] [#:validate?=@code{#t}]
-Construct a URI object. @var{scheme} should be a symbol, and the rest
-of the fields are either strings or @code{#f}. If @var{validate?} is
-true, also run some consistency checks to make sure that the constructed
-URI is valid.
+Construct a URI object. @var{scheme} should be a symbol, @var{port}
+either a positive, exact integer or @code{#f}, and the rest of the
+fields are either strings or @code{#f}. If @var{validate?} is true,
+also run some consistency checks to make sure that the constructed URI
+is valid.
@end deffn
@deffn {Scheme Procedure} uri? x
@@ -224,8 +225,8 @@ URI is valid.
@deffnx {Scheme Procedure} uri-query uri
@deffnx {Scheme Procedure} uri-fragment uri
A predicate and field accessors for the URI record type. The URI scheme
-will be a symbol, and the rest either strings or @code{#f} if not
-present.
+will be a symbol, the port either a positive, exact integer or @code{#f},
+and the rest either strings or @code{#f} if not present.
@end deffn
@deffn {Scheme Procedure} string->uri string
diff --git a/module/web/uri.scm b/module/web/uri.scm
index ddab7be..e84bc03 100644
--- a/module/web/uri.scm
+++ b/module/web/uri.scm
@@ -79,10 +79,11 @@
(define* (build-uri scheme #:key userinfo host port (path "") query fragment
(validate? #t))
- "Construct a URI object. @var{scheme} should be a symbol, and the rest
-of the fields are either strings or @code{#f}. If @var{validate?} is
-true, also run some consistency checks to make sure that the constructed
-URI is valid."
+ "Construct a URI object. @var{scheme} should be a symbol, @var{port}
+either a positive, exact integer or @code{#f}, and the rest of the
+fields are either strings or @code{#f}. If @var{validator?} is true,
+also run some consistency checks to make sure that the constructed URI
+is valid."
(if validate?
(validate-uri scheme userinfo host port path query fragment))
(make-uri scheme userinfo host port path query fragment))
--
1.7.10.4
[-- Attachment #4: 0003-uri-reference-support.patch --]
[-- Type: application/octet-stream, Size: 15823 bytes --]
From ebbcc923b776fd3fbfe28c5c050ee8df7b71529d Mon Sep 17 00:00:00 2001
From: Daniel Hartwig <mandyke@gmail.com>
Date: Thu, 8 Nov 2012 13:11:52 +0800
Subject: [PATCH 3/3] uri-reference support
* module/web/uri.scm (build-uri, validate-uri): Also build
relative-refs, which have no scheme.
(string->uri, uri->string): Also support URI objects with no scheme.
(uri-reference?, relative-ref?, absolute-uri?): New predicates to
distinguish various URI-like objects.
(uri?): Redefine so that semantics are unchanged; only return #t for
objects previously built and validated by build-uri?. Such objects
always had a uri-scheme.
* module/web/http.scm (declare-uri-reference-header!): New header type
accepting any URI-reference.
("Content-Location", "Referer"): Change to URI-reference type.
* doc/ref/web.texi (URIs): Document other URI-like syntaxes defined in
RFC 3986. Include brief discussion. Update functions
* test-suite/tests/web-http.test:
* test-suite/tests/web-uri.test: Add relevant tests.
---
doc/ref/web.texi | 39 ++++++++++---
module/web/http.scm | 13 ++++-
module/web/uri.scm | 59 ++++++++++++++-----
test-suite/tests/web-http.test | 4 ++
test-suite/tests/web-uri.test | 125 ++++++++++++++++++++++++++++++++++++++++
5 files changed, 214 insertions(+), 26 deletions(-)
diff --git a/doc/ref/web.texi b/doc/ref/web.texi
index 3e93bea..d247080 100644
--- a/doc/ref/web.texi
+++ b/doc/ref/web.texi
@@ -173,8 +173,9 @@ Guile provides a standard data type for Universal Resource Identifiers
The generic URI syntax is as follows:
@example
-URI := scheme ":" ["//" [userinfo "@@"] host [":" port]] path \
- [ "?" query ] [ "#" fragment ]
+URI := scheme ":" hier-part [ "?" query ] [ "#" fragment ]
+
+hier-part := ["//" [userinfo "@@"] host [":" port]] path
@end example
For example, in the URI, @indicateurl{http://www.gnu.org/help/}, the
@@ -198,6 +199,25 @@ fragment identifies a part of a resource, not the resource itself. But
it is useful to have a fragment field in the URI record itself, so we
hope you will forgive the inconsistency.
+There are some additional URI-like syntaxes:
+
+@example
+URI-reference := URI / relative-ref
+
+relative-ref := hier-part [ "?" query ] [ "#" fragment ]
+
+absolute-URI := scheme ":" hier-part [ "?" query ]
+@end example
+
+These extra forms are useful in various situations. For example,
+relative-refs are convenient in documents to refer to other parts of the
+same document, or resources located on the same site.
+
+A relative-ref must be considered in relation to a given base URI to
+correctly identify a resource. The base URI is determined according to
+the context and properties of the document in which the relative-ref is
+located. See section 5 of RFC 3986 for details.
+
@example
(use-modules (web uri))
@end example
@@ -211,12 +231,14 @@ access to them.
[#:fragment=@code{#f}] [#:validate?=@code{#t}]
Construct a URI object. @var{scheme} should be a symbol, @var{port}
either a positive, exact integer or @code{#f}, and the rest of the
-fields are either strings or @code{#f}. If @var{validate?} is true,
-also run some consistency checks to make sure that the constructed URI
-is valid.
+fields are strings. If @var{validate?} is true, also run some
+consistency checks to make sure that the constructed URI is valid.
@end deffn
@deffn {Scheme Procedure} uri? x
+@deffnx {Scheme Procedure} uri-reference? x
+@deffnx {Scheme Procedure} relative-ref? x
+@deffnx {Scheme Procedure} absolute-uri? x
@deffnx {Scheme Procedure} uri-scheme uri
@deffnx {Scheme Procedure} uri-userinfo uri
@deffnx {Scheme Procedure} uri-host uri
@@ -224,9 +246,10 @@ is valid.
@deffnx {Scheme Procedure} uri-path uri
@deffnx {Scheme Procedure} uri-query uri
@deffnx {Scheme Procedure} uri-fragment uri
-A predicate and field accessors for the URI record type. The URI scheme
-will be a symbol, the port either a positive, exact integer or @code{#f},
-and the rest either strings or @code{#f} if not present.
+Predicates and field accessors for the URI record type. The URI scheme
+will be a symbol, the port a positive, exact integer, and the rest
+strings. Any field other than @code{uri-path} may also be @code{#f} if
+not present.
@end deffn
@deffn {Scheme Procedure} string->uri string
diff --git a/module/web/http.scm b/module/web/http.scm
index 3b78d08..389880b 100644
--- a/module/web/http.scm
+++ b/module/web/http.scm
@@ -1179,12 +1179,19 @@ treated specially, and is just returned as a plain string."
parse-non-negative-integer non-negative-integer? display))
;; emacs: (put 'declare-uri-header! 'scheme-indent-function 1)
-(define (declare-uri-header! name)
+(define* (declare-uri-header! name #:optional)
(declare-header! name
(lambda (str) (or (string->uri str) (bad-header-component 'uri str)))
uri?
write-uri))
+;; emacs: (put 'declare-uri-reference-header! 'scheme-indent-function 1)
+(define (declare-uri-reference-header! name)
+ (declare-header! name
+ (lambda (str) (or (string->uri str) (bad-header-component 'uri-reference str)))
+ uri-reference?
+ write-uri))
+
;; emacs: (put 'declare-quality-list-header! 'scheme-indent-function 1)
(define (declare-quality-list-header! name)
(declare-header! name
@@ -1437,7 +1444,7 @@ treated specially, and is just returned as a plain string."
;; Content-Location = ( absoluteURI | relativeURI )
;;
-(declare-uri-header! "Content-Location")
+(declare-uri-reference-header! "Content-Location")
;; Content-MD5 = <base64 of 128 bit MD5 digest as per RFC 1864>
;;
@@ -1726,7 +1733,7 @@ treated specially, and is just returned as a plain string."
;; Referer = ( absoluteURI | relativeURI )
;;
-(declare-uri-header! "Referer")
+(declare-uri-reference-header! "Referer")
;; TE = #( t-codings )
;; t-codings = "trailers" | ( transfer-extension [ accept-params ] )
diff --git a/module/web/uri.scm b/module/web/uri.scm
index e84bc03..e7990ad 100644
--- a/module/web/uri.scm
+++ b/module/web/uri.scm
@@ -31,7 +31,7 @@
#:use-module (ice-9 control)
#:use-module (rnrs bytevectors)
#:use-module (ice-9 binary-ports)
- #:export (uri?
+ #:export (uri? uri-reference? relative-ref? absolute-uri?
uri-scheme uri-userinfo uri-host uri-port
uri-path uri-query uri-fragment
@@ -44,7 +44,7 @@
(define-record-type <uri>
(make-uri scheme userinfo host port path query fragment)
- uri?
+ uri-reference?
(scheme uri-scheme)
(userinfo uri-userinfo)
(host uri-host)
@@ -53,15 +53,32 @@
(query uri-query)
(fragment uri-fragment))
+(define (uri? x)
+ (and (uri-reference? x)
+ (uri-scheme x)
+ #t))
+
+(define (relative-ref? x)
+ (and (uri-reference? x)
+ (not (uri-scheme x))
+ #t))
+
+(define (absolute-uri? x)
+ (and (uri-reference? x)
+ (uri-scheme x)
+ (not (uri-fragment x))
+ #t))
+
(define (uri-error message . args)
(throw 'uri-error message args))
(define (positive-exact-integer? port)
(and (number? port) (exact? port) (integer? port) (positive? port)))
-(define (validate-uri scheme userinfo host port path query fragment)
+(define (validate-uri scheme userinfo host port path query fragment reference?)
(cond
- ((not (symbol? scheme))
+ ((and (not (symbol? scheme))
+ (or (not reference?) scheme))
(uri-error "Expected a symbol for the URI scheme: ~s" scheme))
((and (or userinfo port) (not host))
(uri-error "Expected a host, given userinfo or port"))
@@ -85,7 +102,7 @@ fields are either strings or @code{#f}. If @var{validator?} is true,
also run some consistency checks to make sure that the constructed URI
is valid."
(if validate?
- (validate-uri scheme userinfo host port path query fragment))
+ (validate-uri scheme userinfo host port path query fragment #t))
(make-uri scheme userinfo host port path query fragment))
;; See RFC 3986 #3.2.2 for comments on percent-encodings, IDNA (RFC
@@ -153,6 +170,17 @@ is valid."
;;; / path-absolute
;;; / path-rootless
;;; / path-empty
+;;;
+;;; RFC 3986, #4.
+;;;
+;;; URI-reference = URI / relative-ref
+;;;
+;;; relative-ref = relative-part [ "?" query ] [ "#" fragment ]
+;;;
+;;; relative-part = "//" authority path-abempty
+;;; / path-absolute
+;;; / path-noscheme
+;;; / path-empty
(define scheme-pat
"[a-zA-Z][a-zA-Z0-9+.-]*")
@@ -165,7 +193,7 @@ is valid."
(define fragment-pat
".*")
(define uri-pat
- (format #f "^(~a):(//~a)?(~a)(\\?(~a))?(#(~a))?$"
+ (format #f "^((~a):)?(//~a)?(~a)(\\?(~a))?(#(~a))?$"
scheme-pat authority-pat path-pat query-pat fragment-pat))
(define uri-regexp
(make-regexp uri-pat))
@@ -175,12 +203,12 @@ is valid."
could not be parsed."
(% (let ((m (regexp-exec uri-regexp string)))
(if (not m) (abort))
- (let ((scheme (string->symbol
- (string-downcase (match:substring m 1))))
- (authority (match:substring m 2))
- (path (match:substring m 3))
- (query (match:substring m 5))
- (fragment (match:substring m 7)))
+ (let ((scheme (let ((s (match:substring m 2)))
+ (and s (string->symbol (string-downcase s)))))
+ (authority (match:substring m 3))
+ (path (match:substring m 4))
+ (query (match:substring m 6))
+ (fragment (match:substring m 8)))
(call-with-values
(lambda ()
(if authority
@@ -208,8 +236,7 @@ could not be parsed."
"Serialize @var{uri} to a string. If the URI has a port that is the
default port for its scheme, the port is not included in the
serialization."
- (let* ((scheme-str (string-append
- (symbol->string (uri-scheme uri)) ":"))
+ (let* ((scheme (uri-scheme uri))
(userinfo (uri-userinfo uri))
(host (uri-host uri))
(port (uri-port uri))
@@ -217,7 +244,9 @@ serialization."
(query (uri-query uri))
(fragment (uri-fragment uri)))
(string-append
- scheme-str
+ (if scheme
+ (string-append (symbol->string scheme) ":")
+ "")
(if host
(string-append "//"
(if userinfo (string-append userinfo "@")
diff --git a/test-suite/tests/web-http.test b/test-suite/tests/web-http.test
index 97f5559..5d80fde 100644
--- a/test-suite/tests/web-http.test
+++ b/test-suite/tests/web-http.test
@@ -145,6 +145,8 @@
(pass-if-parse content-length "010" 10)
(pass-if-parse content-location "http://foo/"
(build-uri 'http #:host "foo" #:path "/"))
+ (pass-if-parse content-location "//foo"
+ (build-uri #f #:host "foo"))
(pass-if-parse content-range "bytes 10-20/*" '(bytes (10 . 20) *))
(pass-if-parse content-range "bytes */*" '(bytes * *))
(pass-if-parse content-range "bytes */30" '(bytes * 30))
@@ -208,6 +210,8 @@
(pass-if-parse range "bytes=-20,-30" '(bytes (#f . 20) (#f . 30)))
(pass-if-parse referer "http://foo/bar?baz"
(build-uri 'http #:host "foo" #:path "/bar" #:query "baz"))
+ (pass-if-parse referer "//foo/bar?baz"
+ (build-uri #f #:host "foo" #:path "/bar" #:query "baz"))
(pass-if-parse te "trailers" '((trailers)))
(pass-if-parse te "trailers,foo" '((trailers) (foo)))
(pass-if-parse user-agent "guile" "guile"))
diff --git a/test-suite/tests/web-uri.test b/test-suite/tests/web-uri.test
index 3f6e7e3..d4d44d6 100644
--- a/test-suite/tests/web-uri.test
+++ b/test-suite/tests/web-uri.test
@@ -78,6 +78,24 @@
#:port 22
#:path "/baz"))
+ (pass-if "foo"
+ (uri=? (build-uri #f #:path "foo")
+ #:path "foo"))
+
+ (pass-if "/foo"
+ (uri=? (build-uri #f #:path "/foo")
+ #:path "/foo"))
+
+ (pass-if "//foo/bar"
+ (uri=? (build-uri #f #:host "foo" #:path "/bar")
+ #:host "foo"
+ #:path "/bar"))
+
+ (pass-if "?foo"
+ (uri=? (build-uri #f #:query "foo")
+ #:path ""
+ #:query "foo"))
+
(pass-if-uri-exception "non-symbol scheme"
"Expected.*symbol"
(build-uri "nonsym"))
@@ -123,6 +141,80 @@
"Expected.*host"
(build-uri 'http #:userinfo "foo")))
+(with-test-prefix "absolute-uri?"
+ (pass-if "ftp:"
+ (absolute-uri? (build-uri 'ftp)))
+
+ (pass-if "ftp:foo"
+ (absolute-uri? (build-uri 'ftp #:path "foo")))
+
+ (pass-if "ftp://foo/bar"
+ (absolute-uri? (build-uri 'ftp #:host "foo" #:path "/bar")))
+
+ (pass-if "ftp://foo@bar:22/baz"
+ (absolute-uri? (build-uri 'ftp #:userinfo "foo" #:host "bar" #:port 22 #:path "/baz")))
+
+ (expect-fail "foo"
+ (absolute-uri? (build-uri #f #:path "foo")))
+
+ (expect-fail "/foo"
+ (absolute-uri? (build-uri #f #:path "/foo")))
+
+ (expect-fail "//foo/bar"
+ (absolute-uri? (build-uri #f #:host "foo" #:path "/bar")))
+
+ (expect-fail "?foo"
+ (absolute-uri? (build-uri #f #:query "foo"))))
+
+(with-test-prefix "relative-ref?"
+ (expect-fail "ftp:"
+ (relative-ref? (build-uri 'ftp)))
+
+ (expect-fail "ftp:foo"
+ (relative-ref? (build-uri 'ftp #:path "foo")))
+
+ (expect-fail "ftp://foo/bar"
+ (relative-ref? (build-uri 'ftp #:host "foo" #:path "/bar")))
+
+ (expect-fail "ftp://foo@bar:22/baz"
+ (relative-ref? (build-uri 'ftp #:userinfo "foo" #:host "bar" #:port 22 #:path "/baz")))
+
+ (pass-if "foo"
+ (relative-ref? (build-uri #f #:path "foo")))
+
+ (pass-if "/foo"
+ (relative-ref? (build-uri #f #:path "/foo")))
+
+ (pass-if "//foo/bar"
+ (relative-ref? (build-uri #f #:host "foo" #:path "/bar")))
+
+ (pass-if "?foo"
+ (relative-ref? (build-uri #f #:query "foo"))))
+
+(with-test-prefix "uri-reference?"
+ (pass-if "ftp:"
+ (uri-reference? (build-uri 'ftp)))
+
+ (pass-if "ftp:foo"
+ (uri-reference? (build-uri 'ftp #:path "foo")))
+
+ (pass-if "ftp://foo/bar"
+ (uri-reference? (build-uri 'ftp #:host "foo" #:path "/bar")))
+
+ (pass-if "ftp://foo@bar:22/baz"
+ (uri-reference? (build-uri 'ftp #:userinfo "foo" #:host "bar" #:port 22 #:path "/baz")))
+
+ (pass-if "foo"
+ (uri-reference? (build-uri #f #:path "foo")))
+
+ (pass-if "/foo"
+ (uri-reference? (build-uri #f #:path "/foo")))
+
+ (pass-if "//foo/bar"
+ (uri-reference? (build-uri #f #:host "foo" #:path "/bar")))
+
+ (pass-if "?foo"
+ (uri-reference? (build-uri #f #:query "foo"))))
(with-test-prefix "string->uri"
(pass-if "ftp:"
@@ -149,6 +241,24 @@
#:port 22
#:path "/baz"))
+ (pass-if "foo"
+ (uri=? (string->uri "foo")
+ #:path "foo"))
+
+ (pass-if "/foo"
+ (uri=? (string->uri "/foo")
+ #:path "/foo"))
+
+ (pass-if "//foo/bar"
+ (uri=? (string->uri "//foo/bar")
+ #:host "foo"
+ #:path "/bar"))
+
+ (pass-if "?foo"
+ (uri=? (string->uri "?foo")
+ #:path ""
+ #:query "foo"))
+
(pass-if "http://bad.host.1"
(not (string->uri "http://bad.host.1")))
@@ -229,6 +339,21 @@
(equal? "ftp://foo@bar:22/baz"
(uri->string (string->uri "ftp://foo@bar:22/baz"))))
+ (pass-if "foo"
+ (equal? "foo"
+ (uri->string (string->uri "foo"))))
+
+ (pass-if "/foo"
+ (equal? "/foo" (uri->string (string->uri "/foo"))))
+
+ (pass-if "//foo/bar"
+ (equal? "//foo/bar"
+ (uri->string (string->uri "//foo/bar"))))
+
+ (pass-if "?foo"
+ (equal? "?foo"
+ (uri->string (string->uri "?foo"))))
+
(when (memq 'socket *features*)
(pass-if "http://192.0.2.1"
(equal? "http://192.0.2.1"
--
1.7.10.4
next prev parent reply other threads:[~2012-11-24 11:23 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-11-07 20:40 bug#12827: [2.0.6] web client: fails to parse 404 header Ludovic Courtès
2012-11-08 5:52 ` Daniel Hartwig
2012-11-08 20:10 ` Ludovic Courtès
2012-11-09 0:39 ` Daniel Hartwig
2012-11-09 20:52 ` Ludovic Courtès
2012-11-10 1:45 ` Daniel Hartwig
2012-11-10 13:52 ` Ludovic Courtès
2012-11-23 22:19 ` Ludovic Courtès
2012-11-24 11:23 ` Daniel Hartwig [this message]
2012-11-24 15:10 ` Ludovic Courtès
2012-11-24 15:34 ` Daniel Hartwig
2012-11-26 0:15 ` Ludovic Courtès
2012-11-26 23:13 ` Ludovic Courtès
2012-11-27 1:06 ` Daniel Hartwig
2012-11-27 12:50 ` Ludovic Courtès
2012-11-27 15:18 ` Daniel Hartwig
2012-11-27 21:43 ` Ludovic Courtès
2013-02-23 8:11 ` bug#12827: [PATCH] Tweak web modules, support relative URIs (was: bug#12827: [2.0.6] web client: fails to parse 404 header) Daniel Hartwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://www.gnu.org/software/guile/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAN3veRfzRovVKAxRcTJnPd2+hc-MpDPz3J1HAgc_L0Kmd3h89w@mail.gmail.com \
--to=mandyke@gmail.com \
--cc=12827@debbugs.gnu.org \
--cc=ludo@gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).