* Chunked Encoding
@ 2011-09-29 20:09 Ian Price
2012-01-04 20:18 ` Andy Wingo
0 siblings, 1 reply; 9+ messages in thread
From: Ian Price @ 2011-09-29 20:09 UTC (permalink / raw)
To: guile-devel
Hi guilers,
If you've used the (web ...) modules, you may have noticed that guile
does not currently support chunked-encoding. This is expected in a
HTTP/1.1 world, so I wrote an implementation to cover my immediate
need, but I'm not particularly convinced of it, so I wanted to discuss
this before sending a patch.
What I did was introduce two new exported procedures for reading (all I
needed at the moment), namely 'read-chunk' and 'read-chunked-response-body'.
(read-chunk port)
reads one chunk from a port.
(read-chunked-response-body response)
reads the full body for the response and returns it as a bytevector. It
was written to be similar to 'read-response-body'.
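
For readers unfamiliar with the wire format, here is a rough sketch of what
reading a single chunk involves. It is only an illustration and not
necessarily how the implementation described above works: the names
`sketch-read-chunk-size` and `sketch-read-chunk` are made up, chunk
extensions are simply discarded, and error handling (for example a premature
EOF) is left out.

```scheme
(use-modules (ice-9 rdelim)        ; read-line
             (ice-9 binary-ports)  ; get-u8, get-bytevector-n
             (rnrs bytevectors))   ; make-bytevector, string->utf8

;; Hypothetical helper: parse a chunk header line "SIZE[;extensions]\r\n"
;; and return SIZE, which is written in hexadecimal.
(define (sketch-read-chunk-size port)
  (let* ((line (read-line port))   ; drops the trailing LF, keeps the CR
         (end  (or (string-index line
                                 (lambda (c)
                                   (or (char=? c #\;) (char=? c #\return))))
                   (string-length line))))
    (string->number (substring line 0 end) 16)))

;; Hypothetical helper: read one chunk and return its data as a bytevector.
;; An empty bytevector marks the last chunk; any trailers are left unread.
(define (sketch-read-chunk port)
  (let ((size (sketch-read-chunk-size port)))
    (if (zero? size)
        (make-bytevector 0)
        (let ((data (get-bytevector-n port size)))
          (get-u8 port)            ; CR after the chunk data
          (get-u8 port)            ; LF after the chunk data
          data))))

;; Usage on an in-memory port:
(define p (open-bytevector-input-port
           (string->utf8 "5\r\nHello\r\n0\r\n\r\n")))
(utf8->string (sketch-read-chunk p))       ; => "Hello"
(bytevector-length (sketch-read-chunk p))  ; => 0
```

A procedure like 'read-chunked-response-body' would then amount to calling
this in a loop and concatenating the results until the empty chunk shows up.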
For writing chunks then, the obvious thing is to write two procedures
'write-chunk' and 'write-chunked-response-body' which perform the
inverse. However, it seems to me that 'write-chunked-response-body' is a
practically useless procedure, because if you ever had a full body, you
can just use 'write-response-body'.
Another option I've been thinking over would be to go for a sort of
chunking version of R6RS' 'transcoded-port' which would handle it
transparently for users of the returned port.
I'd also suggest extending 'http-get' from (web client) to handle
chunked encoding (and trailers too, I guess) for the user.
Comments kindly requested,
Ian
--
Ian Price
"Programming is like pinball. The reward for doing it well is
the opportunity to do it again" - from "The Wizardy Compiled"
* Re: Chunked Encoding
2011-09-29 20:09 Chunked Encoding Ian Price
@ 2012-01-04 20:18 ` Andy Wingo
2012-05-06 4:52 ` Ian Price
0 siblings, 1 reply; 9+ messages in thread
From: Andy Wingo @ 2012-01-04 20:18 UTC (permalink / raw)
To: Ian Price; +Cc: guile-devel
On Thu 29 Sep 2011 16:09, Ian Price <ianprice90@googlemail.com> writes:
> If you've used the (web ...) modules, you may have noticed that guile
> does not currently support chunked-encoding. This is expected in a
> HTTP/1.1 world, so I wrote an implementation to cover my immediate
> need, but I'm not particularly convinced of it, so I wanted to discuss
> this before sending a patch.
Thanks for the note! We totally need to do this.
> What I did was introduce two new exported procedures for reading (all I
> needed at the moment), namely 'read-chunk' and 'read-chunked-response-body'.
>
> (read-chunk port)
> reads one chunk from a port.
>
> (read-chunked-response-body response)
> reads the full body for the response and returns it as a bytevector. It
> was written to be similar to 'read-response-body'.
Is it possible to use soft ports? That would be nice, and it would
allow client code to be able to read from the port until it gets EOF,
without being concerned about the transfer-encoding. Same thing goes
for gzip/deflate/compress encoding.
> For writing chunks then, the obvious thing is to write two procedures
> 'write-chunk' and 'write-chunked-response-body' which perform the
> inverse. However, it seems to me that 'write-chunked-response-body' is a
> practically useless procedure, because if you ever had a full body, you
> can just use 'write-response-body'.
Again, soft ports?
> Another option I've been thinking over would be to go for a sort of
> chunking version of R6RS' 'transcoded-port' which would handle it
> transparently for users of the returned port.
Is this equivalent? I don't have much experience with these. It seems
that custom binary input/output ports are more appropriate.
> I'd also suggest extending 'http-get' from (web client) to handle
> chunked encoding (and trailers too, I guess) for the user.
For trailers, RFC 2616 says:
A server using chunked transfer-coding in a response MUST NOT use the
trailer for any header fields unless at least one of the following is
true:
a)the request included a TE header field that indicates "trailers" is
acceptable in the transfer-coding of the response, as described in
section 14.39; or,
b)the server is the origin server for the response, the trailer
fields consist entirely of optional metadata, and the recipient
could use the message (in a manner acceptable to the origin server)
without receiving this metadata. In other words, the origin server
is willing to accept the possibility that the trailer fields might
be silently discarded along the path to the client.
This requirement prevents an interoperability failure when the
message is being received by an HTTP/1.1 (or later) proxy and
forwarded to an HTTP/1.0 recipient. It avoids a situation where
compliance with the protocol would have necessitated a possibly
infinite buffer on the proxy.
So, in the default case in which we do not mention "trailers" in the
request, trailer fields should be strictly not required. So the naive
case should just work.
However, I agree that it would be nice to get these fields, so I would
suggest that we store the parsed trailers in a weak map port ->
header-alist, and provide an accessor procedure. That way if the client
knows that they should parse trailers, then they can get at them. I
guess you're right that http-get should just DTRT.
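
A rough sketch of the weak-map idea, to make it concrete. All names here
(`%port-trailers`, `set-port-trailers!`, `port-trailers`) are hypothetical
and nothing like them exists in the (web ...) modules; the only real
procedures used are Guile's weak hash table primitives.

```scheme
;; Weak-key table: an entry goes away once its port is garbage collected,
;; so recording trailers does not keep ports alive.
(define %port-trailers (make-weak-key-hash-table))

;; Hypothetical: called by whatever code parses the trailers.
(define (set-port-trailers! port trailers)
  (hashq-set! %port-trailers port trailers))

;; Hypothetical accessor: the trailer alist recorded for PORT, or '().
(define (port-trailers port)
  (hashq-ref %port-trailers port '()))

;; Usage:
(define p (open-input-string ""))
(set-port-trailers! p '((x-checksum . "abc123")))
(port-trailers p)   ; => ((x-checksum . "abc123"))
```

Clients that never ask for trailers pay nothing, and those that do can look
them up by port once the body has been read.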
> Comments kindly requested,
Again, very sorry for the delay! I hope to be more on top of things
this month.
Cheers,
Andy
--
http://wingolog.org/
* Re: Chunked Encoding
2012-01-04 20:18 ` Andy Wingo
@ 2012-05-06 4:52 ` Ian Price
2012-05-06 16:53 ` Ian Price
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Ian Price @ 2012-05-06 4:52 UTC (permalink / raw)
To: Andy Wingo; +Cc: guile-devel
ccing Daniel Hartwig, since he has been a bigger cheerleader for chunked
encoding than I have so far :).
Andy Wingo <wingo@pobox.com> writes:
> On Thu 29 Sep 2011 16:09, Ian Price <ianprice90@googlemail.com> writes:
>
>> If you've used the (web ...) modules, you may have noticed that guile
>> does not currently support chunked-encoding.
Sheesh, that was a long time ago.
>> What I did was introduce two new exported procedures for reading (all I
>> needed at the moment), namely 'read-chunk' and 'read-chunked-response-body'.
>>
>> (read-chunk port)
>> reads one chunk from a port.
>>
>> (read-chunked-response-body response)
>> reads the full body for the response and returns it as a bytevector. It
>> was written to be similar to 'read-response-body'.
>
> Is it possible to use soft ports? That would be nice, and it would
> allow client code to be able to read from the port until it gets EOF,
> without being concerned about the transfer-encoding. Same thing goes
> for gzip/deflate/compress encoding.
It is possible, and I have some code for this, but I had been
procrastinating for a while because of a soft ports bug with flush, and
once that got fixed, just plain procrastinating.
>> Another option I've been thinking over would be to go for a sort of
>> chunking version of R6RS' 'transcoded-port' which would handle it
>> transparently for users of the returned port.
>
> Is this equivalent? I don't have much experience with these. It seems
> that custom binary input/output ports are more appropriate.
Well, what I meant is a port that would be layered over the top of
another. Soft ports or custom binary ports would be used to implement
it. (Is there a reason (efficiency-wise) to prefer one over the other?)
Basically, my interface right now is
(make-chunked-input-port port) -> input-port
(make-chunked-output-port port) -> output-port
These operate pretty much as you'd expect. The port returned from
'make-chunked-input-port' reads whole chunks from its argument port, and
maintains a buffer, from which it can satisfy smaller reads.
The port returned from 'make-chunked-output-port' buffers up the writes
and writes a whole (properly formatted) chunk on 'force-output'.
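
As an illustration of what "a whole (properly formatted) chunk" means on the
wire, here is a minimal sketch of the encoding side. It is not the port
implementation itself: `sketch-write-chunk` and `sketch-chunked-body` are
made-up names, and the sketch works on bytevectors because the size line
counts octets rather than characters. It also assumes the body ends with the
last chunk followed by an empty trailer section, as laid out in RFC 2616
section 3.6.1.

```scheme
(use-modules (ice-9 binary-ports)  ; put-bytevector, open-bytevector-output-port
             (rnrs bytevectors))   ; string->utf8, utf8->string

;; Hypothetical helper: write BV to PORT as one chunk.  Empty bytevectors
;; are skipped, since a zero-length chunk would end the body early.
(define (sketch-write-chunk bv port)
  (unless (zero? (bytevector-length bv))
    (put-bytevector port
                    (string->utf8
                     (string-append
                      (number->string (bytevector-length bv) 16)
                      "\r\n")))
    (put-bytevector port bv)
    (put-bytevector port (string->utf8 "\r\n"))))

;; Hypothetical helper: encode a list of bytevectors as a chunked body and
;; return it as a string, for easy inspection.
(define (sketch-chunked-body chunks)
  (call-with-values open-bytevector-output-port
    (lambda (port get-bytes)
      (for-each (lambda (bv) (sketch-write-chunk bv port)) chunks)
      ;; Last chunk, then the blank line that closes an empty trailer section.
      (put-bytevector port (string->utf8 "0\r\n\r\n"))
      (utf8->string (get-bytes)))))

(sketch-chunked-body (list (string->utf8 "First chunk")
                           (string->utf8 "Second chunk")))
;; => "b\r\nFirst chunk\r\nc\r\nSecond chunk\r\n0\r\n\r\n"
```

In the port-based interface, each 'force-output' would correspond to one call
of the chunk writer on whatever has been buffered so far.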
The only behaviour I'm not entirely sure of is what happens on
close. Let's be more concrete:
(define a <port>)
(define b (make-chunked-input-port a))
(close-port b)
What is the state of a?
(define c <port>)
(define d (make-chunked-output-port c))
(close-port d)
Likewise for c.
I think common practice in things like Java's BufferedReader would be to
have a be closed when b is. This may be undesirable since we could wish
to continue using the socket. On the other hand, once you start layering
ports, it is convenient to have the higher layers close the lower
layers. I think it might make sense to have a keyword argument,
#:dont-close? (or something), that specifies this behaviour, defaulting
to close.
c seems less clear to me. Again, once you have multiple layers, it would
be convenient, and it would properly handle any inner state saved up.
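
To make the keyword-argument idea concrete, here is a rough sketch using a
pass-through wrapper instead of a real chunked port. The name
`sketch-wrap-input-port` is made up, and `#:dont-close?` is just the
placeholder name from the paragraph above; a chunked port would do its
decoding inside `read!`, but the close handling would look the same.

```scheme
(use-modules (ice-9 binary-ports))  ; make-custom-binary-input-port,
                                    ; get-bytevector-n!

;; Hypothetical wrapper: forwards reads to PORT unchanged.  Closing the
;; wrapper also closes PORT unless #:dont-close? is true.
(define* (sketch-wrap-input-port port #:key (dont-close? #f))
  (define (read! bv start count)
    (let ((n (get-bytevector-n! port bv start count)))
      (if (eof-object? n) 0 n)))    ; returning 0 signals end of file
  (define (close)
    (unless dont-close?
      (close-port port)))
  (make-custom-binary-input-port "sketch wrapper" read! #f #f close))

;; Usage: keep the underlying port open for further use.
(define a (open-bytevector-input-port #vu8(1 2 3)))
(define b (sketch-wrap-input-port a #:dont-close? #t))
(close-port b)
(port-closed? a)   ; => #f
```

The output side could take the same keyword, with the difference that its
close procedure would first have to flush any buffered data and write the
final chunk.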
Any thoughts on this?
>
>> I'd also suggest extending 'http-get' from (web client) to handle
>> chunked encoding (and trailers too, I guess) for the user.
>
<snip>
>
> So, in the default case in which we do not mention "trailers" in the
> request, trailer fields should be strictly not required. So the naive
> case should just work.
I think we can leave trailers until I have some actual data on how much
these are actually used in practice, and/or someone complains about it
being missing. WDYT?
FWIW I didn't see any code for handling them in Ruby's net/http.rb
>> Comments kindly requested,
>
> Again, very sorry for the delay! I hope to be more on top of things
> this month.
I'm sorry for the delay too. Let's see if we can't finish it sometime in
the next week and end this once and for all :)
--
Ian Price
"Programming is like pinball. The reward for doing it well is
the opportunity to do it again" - from "The Wizardy Compiled"
* Re: Chunked Encoding
2012-05-06 4:52 ` Ian Price
@ 2012-05-06 16:53 ` Ian Price
2012-05-08 0:33 ` Ian Price
2012-05-06 19:49 ` Thien-Thi Nguyen
2012-05-08 3:26 ` Daniel Hartwig
2 siblings, 1 reply; 9+ messages in thread
From: Ian Price @ 2012-05-06 16:53 UTC (permalink / raw)
To: guile-devel
[-- Attachment #1: Type: text/plain, Size: 603 bytes --]
I've attached the patch. It's not ready for inclusion yet, as it needs
more tests, and documentation, but this is the rough idea of what it
will be like. It doesn't close the wrapped port, but I think a keyword
argument might be the right approach.
Thoughts on this, and on the patch in general, are welcome.
I'd just like to make a note that, instead of modifying the client, I
modified read-response-body directly. This feels like probably the right
thing to do.
--
Ian Price
"Programming is like pinball. The reward for doing it well is
the opportunity to do it again" - from "The Wizardy Compiled"
[-- Attachment #2: first draft patch of chunked encoding --]
[-- Type: text/x-patch, Size: 8530 bytes --]
From cf7835882a586bcb3d9077563acbb1aecf09cf97 Mon Sep 17 00:00:00 2001
From: Ian Price <ianprice90@googlemail.com>
Date: Sun, 6 May 2012 17:40:09 +0100
Subject: [PATCH] transfer encoding draft
---
module/web/request.scm | 32 +++++++++++++-
module/web/response.scm | 82 ++++++++++++++++++++++++++++++++---
test-suite/tests/web-request.test | 13 ++++++
test-suite/tests/web-response.test | 27 ++++++++++++
4 files changed, 146 insertions(+), 8 deletions(-)
diff --git a/module/web/request.scm b/module/web/request.scm
index 40d4a66..90ca326 100644
--- a/module/web/request.scm
+++ b/module/web/request.scm
@@ -26,6 +26,7 @@
#:use-module (srfi srfi-9)
#:use-module (web uri)
#:use-module (web http)
+ #:use-module (ice-9 q)
#:export (request?
request-method
request-uri
@@ -89,7 +90,9 @@
request-user-agent
;; Misc
- request-absolute-uri))
+ request-absolute-uri
+
+ make-chunked-output-port))
;;; {Character Encodings, Strings, and Bytevectors}
@@ -313,3 +316,30 @@ request @var{r}."
#:path (uri-path uri)
#:query (uri-query uri)
#:fragment (uri-fragment uri))))))
+
+;; Chunked Requests
+(define (make-chunked-output-port port)
+ (define (q-for-each f q)
+ (while (not (q-empty? q))
+ (f (deq! q))))
+ (define queue (make-q))
+ (define (put-char c)
+ (enq! queue c))
+ (define (put-string s)
+ (string-for-each (lambda (c) (enq! queue c))
+ s))
+ (define (flush)
+ (let ((len (q-length queue)))
+ (display (number->string len 16) port)
+ (display "\r\n" port)
+ (q-for-each (lambda (elem) (write-char elem port))
+ queue)
+ (display "\r\n" port)))
+ (define (close)
+ (flush)
+ (display "0\r\n" port)
+ (force-output port)
+ ;;(flush) ;; prints the final zero chunk
+ ;;(close-port port)
+ )
+ (make-soft-port (vector put-char put-string flush #f close) "w"))
diff --git a/module/web/response.scm b/module/web/response.scm
index 07e1245..9d04064 100644
--- a/module/web/response.scm
+++ b/module/web/response.scm
@@ -40,6 +40,8 @@
read-response-body
write-response-body
+ make-chunked-input-port
+
;; General headers
;;
response-cache-control
@@ -227,13 +229,16 @@ This is true for some response types, like those with code 304."
(define (read-response-body r)
"Reads the response body from @var{r}, as a bytevector. Returns
@code{#f} if there was no response body."
- (let ((nbytes (response-content-length r)))
- (and nbytes
- (let ((bv (get-bytevector-n (response-port r) nbytes)))
- (if (= (bytevector-length bv) nbytes)
- bv
- (bad-response "EOF while reading response body: ~a bytes of ~a"
- (bytevector-length bv) nbytes))))))
+ (if (member '(chunked) (response-transfer-encoding r))
+ (let ((chunk-port (make-chunked-input-port (response-port r))))
+ (get-bytevector-all chunk-port))
+ (let ((nbytes (response-content-length r)))
+ (and nbytes
+ (let ((bv (get-bytevector-n (response-port r) nbytes)))
+ (if (= (bytevector-length bv) nbytes)
+ bv
+ (bad-response "EOF while reading response body: ~a bytes of ~a"
+ (bytevector-length bv) nbytes)))))))
(define (write-response-body r bv)
"Write @var{bv}, a bytevector, to the port corresponding to the HTTP
@@ -291,3 +296,66 @@ response @var{r}."
(define-response-accessor server #f)
(define-response-accessor vary '())
(define-response-accessor www-authenticate #f)
+
+
+;; Chunked Responses
+(define (read-chunk-header port)
+ (let* ((str (read-line port))
+ (extension-start (string-index str (lambda (c) (or (char=? c #\;)
+ (char=? c #\return)))))
+ (size (string->number (if extension-start ; unnecessary?
+ (substring str 0 extension-start)
+ str)
+ 16)))
+ size))
+
+(define (read-chunk port)
+ (let ((size (read-chunk-header port)))
+ (read-chunk-body port size)))
+
+(define (read-chunk-body port size)
+ (let ((bv (get-bytevector-n port size)))
+ (get-u8 port) ; CR
+ (get-u8 port) ; LF
+ bv))
+
+(define (make-chunked-input-port port)
+ (define (next-chunk)
+ (read-chunk port))
+ (define finished? #f)
+ (define (close)
+ #f
+ ;(close-port port)
+ )
+ (define buffer #vu8())
+ (define buffer-size 0)
+ (define buffer-pointer 0)
+ (define (read! bv idx to-read)
+ (define (loop to-read num-read)
+ (cond ((or finished? (zero? to-read))
+ num-read)
+ ((<= to-read (- buffer-size buffer-pointer))
+ (bytevector-copy! buffer buffer-pointer
+ bv (+ idx num-read)
+ to-read)
+ (set! buffer-pointer (+ buffer-pointer num-read))
+ (loop (- to-read buffer-size)
+ (+ num-read buffer-size)))
+ (else
+ (let ((n (- buffer-size buffer-pointer)))
+ (bytevector-copy! buffer buffer-pointer
+ bv (+ idx num-read)
+ n)
+ (set! buffer (next-chunk))
+ (set! buffer-pointer 0)
+ (set! buffer-size (bytevector-length buffer))
+ (set! finished? (= buffer-size 0))
+ (loop (- to-read n)
+ (+ num-read n))))))
+ (loop to-read 0))
+ (make-custom-binary-input-port
+ "chunked input port"
+ read!
+ #f ;; get-position
+ #f ;; set-position!
+ close))
diff --git a/test-suite/tests/web-request.test b/test-suite/tests/web-request.test
index 8cf1c2e..9c64d61 100644
--- a/test-suite/tests/web-request.test
+++ b/test-suite/tests/web-request.test
@@ -83,3 +83,16 @@ Accept-Language: en-gb, en;q=0.9\r
(pass-if "by accessor"
(equal? (request-accept-encoding r) '((1000 . "gzip"))))))
+
+(with-test-prefix "chunked encoding"
+ (pass-if
+ (equal? (call-with-output-string
+ (lambda (out-raw)
+ (let ((out-chunked (make-chunked-output-port out-raw)))
+ (display "First chunk" out-chunked)
+ (force-output out-chunked)
+ (display "Second chunk" out-chunked)
+ (force-output out-chunked)
+ (display "Third chunk" out-chunked)
+ (close-port out-chunked))))
+ "b\r\nFirst chunk\r\nc\r\nSecond chunk\r\nb\r\nThird chunk\r\n0\r\n")))
diff --git a/test-suite/tests/web-response.test b/test-suite/tests/web-response.test
index a21a702..d621e87 100644
--- a/test-suite/tests/web-response.test
+++ b/test-suite/tests/web-response.test
@@ -40,6 +40,19 @@ Content-Type: text/html; charset=utf-8\r
\r
abcdefghijklmnopqrstuvwxyz0123456789")
+(define example-2
+ "HTTP/1.1 200 OK\r
+Transfer-Encoding: chunked\r
+Content-Type: text/plain
+\r
+1c\r
+Lorem ipsum dolor sit amet, \r
+1d\r
+consectetur adipisicing elit,\r
+43\r
+ sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.\r
+0\r\n")
+
(define (responses-equal? r1 body1 r2 body2)
(and (equal? (response-version r1) (response-version r2))
(equal? (response-code r1) (response-code r2))
@@ -100,3 +113,17 @@ abcdefghijklmnopqrstuvwxyz0123456789")
(pass-if "by accessor"
(equal? (response-content-encoding r) '(gzip)))))
+
+(with-test-prefix "example-2"
+ (let* ((r (read-response (open-input-string example-2)))
+ (b (read-response-body r)))
+ (pass-if (equal? '((chunked))
+ (response-transfer-encoding r)))
+ (pass-if (equal? b
+ (string->utf8
+ (string-append
+ "Lorem ipsum dolor sit amet, consectetur adipisicing elit,"
+ " sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.")))))
+ (let* ((r (build-response #:port (open-input-string "0\r\n") #:headers '((transfer-encoding (chunked)))))
+ (b (read-response-body r))) ;; should be eof or #vu8()???
+ (pass-if (eof-object? b))))
--
1.7.7.6
* Re: Chunked Encoding
2012-05-06 4:52 ` Ian Price
2012-05-06 16:53 ` Ian Price
@ 2012-05-06 19:49 ` Thien-Thi Nguyen
2012-05-08 2:27 ` Ian Price
2012-05-08 3:26 ` Daniel Hartwig
2 siblings, 1 reply; 9+ messages in thread
From: Thien-Thi Nguyen @ 2012-05-06 19:49 UTC (permalink / raw)
To: guile-devel
() Ian Price <ianprice90@googlemail.com>
() Sun, 06 May 2012 05:52:00 +0100
I think we can leave trailers until I have some actual data on
how much these are actually used in practice, and/or someone
complains about it being missing. WDYT?
It's not so hard to conform. Trailers are just headers in the
tail position; you can use code that reads headers to read
trailers.
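
A small sketch of that point, assuming the port is positioned just after the
final "0\r\n" chunk line. `sketch-read-trailers` is a made-up name; the only
real procedure it relies on is `read-headers` from (web http), which reads
header lines up to the blank line and returns them as an alist.

```scheme
(use-modules (web http))   ; read-headers

;; Hypothetical helper: trailers use the same syntax as headers, so the
;; ordinary header reader can parse them.
(define (sketch-read-trailers port)
  (read-headers port))

;; Usage: one trailer field followed by the terminating blank line.
(sketch-read-trailers
 (open-input-string "X-Checksum: abc123\r\n\r\n"))
;; => ((x-checksum . "abc123"))
```

Unknown fields such as the one above come back as plain strings keyed by a
lower-cased symbol, the same as unknown headers do.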
FWIW I didn't see any code for handling them in Ruby's net/http.rb
See also:
http://git.savannah.gnu.org/cgit/guile-www.git/tree/source/crlf.scm#n303
* Re: Chunked Encoding
2012-05-06 16:53 ` Ian Price
@ 2012-05-08 0:33 ` Ian Price
0 siblings, 0 replies; 9+ messages in thread
From: Ian Price @ 2012-05-08 0:33 UTC (permalink / raw)
To: guile-devel
[-- Attachment #1: Type: text/plain, Size: 293 bytes --]
Hello guilers,
Here is a more complete patch. I've also attached a patch to export
declare-opaque-header!, which I've occasionally found to be useful.
--
Ian Price
"Programming is like pinball. The reward for doing it well is
the opportunity to do it again" - from "The Wizardy Compiled"
[-- Attachment #2: Chunked encoding support --]
[-- Type: text/x-patch, Size: 13147 bytes --]
From 56a51160a3cba02a4a10375a73106359a1c4a722 Mon Sep 17 00:00:00 2001
From: Ian Price <ianprice90@googlemail.com>
Date: Tue, 8 May 2012 00:06:01 +0100
Subject: [PATCH 1/2] Add HTTP Chunked Encoding support to web modules.
* doc/ref/web.texi (Transfer Codings): New subsection for transfer codings.
* module/web/http.scm (make-chunked-input-port,
make-chunked-output-port): New procedures.
* module/web/response.scm (read-response-body): Handle chunked responses.
* test-suite/tests/web-response.test: Add test.
* test-suite/tests/web-http.test: Add tests.
---
doc/ref/web.texi | 60 +++++++++++++++++++++
module/web/http.scm | 104 +++++++++++++++++++++++++++++++++++-
module/web/response.scm | 18 ++++---
test-suite/tests/web-http.test | 20 +++++++
test-suite/tests/web-response.test | 24 ++++++++
5 files changed, 218 insertions(+), 8 deletions(-)
diff --git a/doc/ref/web.texi b/doc/ref/web.texi
index 8bb99e2..e38aee2 100644
--- a/doc/ref/web.texi
+++ b/doc/ref/web.texi
@@ -37,6 +37,7 @@ back.
* URIs:: Universal Resource Identifiers.
* HTTP:: The Hyper-Text Transfer Protocol.
* HTTP Headers:: How Guile represents specific header values.
+* Transfer Codings:: HTTP Transfer Codings.
* Requests:: HTTP requests.
* Responses:: HTTP responses.
* Web Client:: Accessing web resources over HTTP.
@@ -1020,6 +1021,65 @@ A list of challenges to a user, indicating the need for authentication.
@end example
@end deftypevr
+@node Transfer Codings
+@subsection Transfer Codings
+
+HTTP 1.1 allows for various transfer codings to be applied to message
+bodies. These include various types of compression, and HTTP chunked
+encoding. Currently, only chunked encoding is supported by guile.
+
+Chunked coding is an optional coding that may be applied to message
+bodies, to allow messages whose length is not known beforehand to be
+returned. Such messages can be split into chunks, terminated by a final
+zero length chunk.
+
+In order to make dealing with encodings simpler, guile provides
+procedures to create ports that ``wrap'' existing ports, applying
+transformations transparently under the hood.
+
+@deffn {Scheme Procedure} make-chunked-input-port port [#:keep-alive?=#f]
+Returns a new port, that transparently reads and decodes chunk-encoded
+data from @var{port}. If no more chunk-encoded data is available, it
+returns the end-of-file object. When the port is closed, @var{port} will
+also be closed, unless @var{keep-alive?} is true.
+@end deffn
+
+@example
+(use-modules (ice-9 rdelim))
+
+(define s "5\r\nFirst\r\nA\r\n line\n Sec\r\n8\r\nond line\r\n0\r\n")
+(define p (make-chunked-input-port (open-input-string s)))
+(read-line p)
+@result{} "First line"
+(read-line p)
+@result{} " Second line"
+@end example
+
+@deffn {Scheme Procedure} make-chunked-output-port port [#:keep-alive?=#f]
+Returns a new port, which transparently encodes data as chunk-encoded
+before writing it to @var{port}. Whenever a write occurs on this port,
+it buffers it, until the port is flushed, at which point it writes a
+chunk containing all the data written so far. When the port is closed,
+the data remaining is written to @var{port}, as is the terminating zero
+chunk. It also causes @var{port} to be closed, unless @var{keep-alive?}
+is true.
+
+Note. Forcing a chunked output port when there is no data buffered
+does not write a zero chunk, as this would cause the data to be
+interpreted incorrectly by the client.
+@end deffn
+
+@example
+(call-with-output-string
+ (lambda (out)
+ (define out* (make-chunked-output-port out #:keep-alive? #t))
+ (display "first chunk" out*)
+ (force-output out*)
+ (force-output out*) ; note this does not write a zero chunk
+ (display "second chunk" out*)
+ (close-port out*)))
+@result{} "b\r\nfirst chunk\r\nc\r\nsecond chunk\r\n0\r\n"
+@end example
@node Requests
@subsection HTTP Requests
diff --git a/module/web/http.scm b/module/web/http.scm
index d579c52..9232b28 100644
--- a/module/web/http.scm
+++ b/module/web/http.scm
@@ -34,6 +34,9 @@
#:use-module (srfi srfi-9)
#:use-module (srfi srfi-19)
#:use-module (ice-9 rdelim)
+ #:use-module (ice-9 q)
+ #:use-module (ice-9 binary-ports)
+ #:use-module (rnrs bytevectors)
#:use-module (web uri)
#:export (string->header
header->string
@@ -59,7 +62,10 @@
read-request-line
write-request-line
read-response-line
- write-response-line))
+ write-response-line
+
+ make-chunked-input-port
+ make-chunked-output-port))
;;; TODO
@@ -1799,3 +1805,99 @@ phrase\"."
;; WWW-Authenticate = 1#challenge
;;
(declare-challenge-list-header! "WWW-Authenticate")
+
+
+;; Chunked Responses
+(define (read-chunk-header port)
+ (let* ((str (read-line port))
+ (extension-start (string-index str (lambda (c) (or (char=? c #\;)
+ (char=? c #\return)))))
+ (size (string->number (if extension-start ; unnecessary?
+ (substring str 0 extension-start)
+ str)
+ 16)))
+ size))
+
+(define (read-chunk port)
+ (let ((size (read-chunk-header port)))
+ (read-chunk-body port size)))
+
+(define (read-chunk-body port size)
+ (let ((bv (get-bytevector-n port size)))
+ (get-u8 port) ; CR
+ (get-u8 port) ; LF
+ bv))
+
+(define* (make-chunked-input-port port #:key (keep-alive? #f))
+ "Returns a new port which translates HTTP chunked transfer encoded
+data from @var{port} into a non-encoded format. Returns eof when it has
+read the final chunk from @var{port}. This does not necessarily mean
+that there is no more data on @var{port}. When the returned port is
+closed it will also close @var{port}, unless KEEP-ALIVE? is true."
+ (define (next-chunk)
+ (read-chunk port))
+ (define finished? #f)
+ (define (close)
+ (unless keep-alive?
+ (close-port port)))
+ (define buffer #vu8())
+ (define buffer-size 0)
+ (define buffer-pointer 0)
+ (define (read! bv idx to-read)
+ (define (loop to-read num-read)
+ (cond ((or finished? (zero? to-read))
+ num-read)
+ ((<= to-read (- buffer-size buffer-pointer))
+ (bytevector-copy! buffer buffer-pointer
+ bv (+ idx num-read)
+ to-read)
+ (set! buffer-pointer (+ buffer-pointer to-read))
+ (loop 0 (+ num-read to-read)))
+ (else
+ (let ((n (- buffer-size buffer-pointer)))
+ (bytevector-copy! buffer buffer-pointer
+ bv (+ idx num-read)
+ n)
+ (set! buffer (next-chunk))
+ (set! buffer-pointer 0)
+ (set! buffer-size (bytevector-length buffer))
+ (set! finished? (= buffer-size 0))
+ (loop (- to-read n)
+ (+ num-read n))))))
+ (loop to-read 0))
+ (make-custom-binary-input-port "chunked input port" read! #f #f close))
+
+(define* (make-chunked-output-port port #:key (keep-alive? #f))
+ "Returns a new port which translates non-encoded data into a HTTP
+chunked transfer encoded data and writes this to @var{port}. Data
+written to this port is buffered until the port is flushed, at which
+point it is all sent as one chunk. Take care to close the port when
+done, as it will output the remaining data, and encode the final zero
+chunk. When the port is closed it will also close @var{port}, unless
+KEEP-ALIVE? is true."
+ (define (q-for-each f q)
+ (while (not (q-empty? q))
+ (f (deq! q))))
+ (define queue (make-q))
+ (define (put-char c)
+ (enq! queue c))
+ (define (put-string s)
+ (string-for-each (lambda (c) (enq! queue c))
+ s))
+ (define (flush)
+ ;; It is important that we do _not_ write a chunk if the queue is
+ ;; empty, since it will be treated as the final chunk.
+ (unless (q-empty? queue)
+ (let ((len (q-length queue)))
+ (display (number->string len 16) port)
+ (display "\r\n" port)
+ (q-for-each (lambda (elem) (write-char elem port))
+ queue)
+ (display "\r\n" port))))
+ (define (close)
+ (flush)
+ (display "0\r\n" port)
+ (force-output port)
+ (unless keep-alive?
+ (close-port port)))
+ (make-soft-port (vector put-char put-string flush #f close) "w"))
diff --git a/module/web/response.scm b/module/web/response.scm
index 07e1245..6eba69d 100644
--- a/module/web/response.scm
+++ b/module/web/response.scm
@@ -227,13 +227,17 @@ This is true for some response types, like those with code 304."
(define (read-response-body r)
"Reads the response body from @var{r}, as a bytevector. Returns
@code{#f} if there was no response body."
- (let ((nbytes (response-content-length r)))
- (and nbytes
- (let ((bv (get-bytevector-n (response-port r) nbytes)))
- (if (= (bytevector-length bv) nbytes)
- bv
- (bad-response "EOF while reading response body: ~a bytes of ~a"
- (bytevector-length bv) nbytes))))))
+ (if (member '(chunked) (response-transfer-encoding r))
+ (let ((chunk-port (make-chunked-input-port (response-port r)
+ #:keep-alive? #t)))
+ (get-bytevector-all chunk-port))
+ (let ((nbytes (response-content-length r)))
+ (and nbytes
+ (let ((bv (get-bytevector-n (response-port r) nbytes)))
+ (if (= (bytevector-length bv) nbytes)
+ bv
+ (bad-response "EOF while reading response body: ~a bytes of ~a"
+ (bytevector-length bv) nbytes)))))))
(define (write-response-body r bv)
"Write @var{bv}, a bytevector, to the port corresponding to the HTTP
diff --git a/test-suite/tests/web-http.test b/test-suite/tests/web-http.test
index 7984565..97f5559 100644
--- a/test-suite/tests/web-http.test
+++ b/test-suite/tests/web-http.test
@@ -20,6 +20,7 @@
(define-module (test-suite web-http)
#:use-module (web uri)
#:use-module (web http)
+ #:use-module (rnrs io ports)
#:use-module (ice-9 regex)
#:use-module (ice-9 control)
#:use-module (srfi srfi-19)
@@ -232,3 +233,22 @@
(pass-if-parse vary "foo, bar" '(foo bar))
(pass-if-parse www-authenticate "Basic realm=\"guile\""
'((basic (realm . "guile")))))
+
+(with-test-prefix "chunked encoding"
+ (let* ((s "5\r\nFirst\r\nA\r\n line\n Sec\r\n8\r\nond line\r\n0\r\n")
+ (p (make-chunked-input-port (open-input-string s))))
+ (pass-if (equal? "First line\n Second line"
+ (get-string-all p)))
+ (pass-if (port-eof? (make-chunked-input-port (open-input-string "0\r\n")))))
+ (pass-if
+ (equal? (call-with-output-string
+ (lambda (out-raw)
+ (let ((out-chunked (make-chunked-output-port out-raw
+ #:keep-alive? #t)))
+ (display "First chunk" out-chunked)
+ (force-output out-chunked)
+ (display "Second chunk" out-chunked)
+ (force-output out-chunked)
+ (display "Third chunk" out-chunked)
+ (close-port out-chunked))))
+ "b\r\nFirst chunk\r\nc\r\nSecond chunk\r\nb\r\nThird chunk\r\n0\r\n")))
diff --git a/test-suite/tests/web-response.test b/test-suite/tests/web-response.test
index a21a702..ddd55a7 100644
--- a/test-suite/tests/web-response.test
+++ b/test-suite/tests/web-response.test
@@ -40,6 +40,19 @@ Content-Type: text/html; charset=utf-8\r
\r
abcdefghijklmnopqrstuvwxyz0123456789")
+(define example-2
+ "HTTP/1.1 200 OK\r
+Transfer-Encoding: chunked\r
+Content-Type: text/plain
+\r
+1c\r
+Lorem ipsum dolor sit amet, \r
+1d\r
+consectetur adipisicing elit,\r
+43\r
+ sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.\r
+0\r\n")
+
(define (responses-equal? r1 body1 r2 body2)
(and (equal? (response-version r1) (response-version r2))
(equal? (response-code r1) (response-code r2))
@@ -100,3 +113,14 @@ abcdefghijklmnopqrstuvwxyz0123456789")
(pass-if "by accessor"
(equal? (response-content-encoding r) '(gzip)))))
+
+(with-test-prefix "example-2"
+ (let* ((r (read-response (open-input-string example-2)))
+ (b (read-response-body r)))
+ (pass-if (equal? '((chunked))
+ (response-transfer-encoding r)))
+ (pass-if (equal? b
+ (string->utf8
+ (string-append
+ "Lorem ipsum dolor sit amet, consectetur adipisicing elit,"
+ " sed do eiusmod tempor incididunt ut labore et dolore magna aliqua."))))))
--
1.7.7.6
[-- Attachment #3: export declare-opaque-header! --]
[-- Type: text/x-patch, Size: 1805 bytes --]
From e47ef2722d90998256e226b2e2ec2bcec68b07b5 Mon Sep 17 00:00:00 2001
From: Ian Price <ianprice90@googlemail.com>
Date: Tue, 8 May 2012 00:18:59 +0100
Subject: [PATCH 2/2] Document and export `declare-opaque-header!'
* module/web/http.scm (declare-opaque-header!): Add docstring. New export.
* doc/ref/web.texi (HTTP): Add documentation.
---
doc/ref/web.texi | 5 +++++
module/web/http.scm | 3 +++
2 files changed, 8 insertions(+), 0 deletions(-)
diff --git a/doc/ref/web.texi b/doc/ref/web.texi
index e38aee2..7f86748 100644
--- a/doc/ref/web.texi
+++ b/doc/ref/web.texi
@@ -398,6 +398,11 @@ HTTP stack like this:
(display (inet-ntoa ip) port)))
@end example
+@deffn {Scheme Procedure} declare-opaque-header! name
+A specialised version of @code{declare-header!} for the case in which
+you want a header's value to be returned/written ``as-is''.
+@end deffn
+
@deffn {Scheme Procedure} valid-header? sym val
Return a true value iff @var{val} is a valid Scheme value for the header
with name @var{sym}.
diff --git a/module/web/http.scm b/module/web/http.scm
index 9232b28..cc5dd5a 100644
--- a/module/web/http.scm
+++ b/module/web/http.scm
@@ -42,6 +42,7 @@
header->string
declare-header!
+ declare-opaque-header!
known-header?
header-parser
header-validator
@@ -1145,6 +1146,8 @@ phrase\"."
;; emacs: (put 'declare-header! 'scheme-indent-function 1)
;; emacs: (put 'declare-opaque!-header 'scheme-indent-function 1)
(define (declare-opaque-header! name)
+ "Declares a given header as \"opaque\", meaning that its value is not
+treated specially, and is just returned as a plain string."
(declare-header! name
parse-opaque-string validate-opaque-string write-opaque-string))
--
1.7.7.6
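A minimal usage sketch for the newly exported procedure, assuming a Guile
in which this patch is applied. The header name "X-Request-Id" is made up
purely for illustration; `known-header?' and `parse-header' are the
existing (web http) exports.

```scheme
(use-modules (web http))

;; "X-Request-Id" is a hypothetical header name, used only as an example.
(declare-opaque-header! "X-Request-Id")

;; Once declared, the header is known to the HTTP module ...
(known-header? 'x-request-id)

;; ... and its value is handed back as a plain string, with no attempt
;; to split or interpret it.
(parse-header 'x-request-id "abc, def")
```

The point of declaring such a header explicitly, rather than relying on
whatever is done for undeclared headers, is that it then goes through the
same parse/validate/write machinery as every other declared header.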
* Re: Chunked Encoding
2012-05-06 19:49 ` Thien-Thi Nguyen
@ 2012-05-08 2:27 ` Ian Price
2012-05-08 6:26 ` Thien-Thi Nguyen
0 siblings, 1 reply; 9+ messages in thread
From: Ian Price @ 2012-05-08 2:27 UTC (permalink / raw)
To: Thien-Thi Nguyen; +Cc: guile-devel
Thien-Thi Nguyen <ttn@gnuvola.org> writes:
> () Ian Price <ianprice90@googlemail.com>
> () Sun, 06 May 2012 05:52:00 +0100
>
> I think we can leave trailers until I have some actual data on
> how much these are actually used in practice, and/or someone
> complains about it being missing. WDYT?
>
> It's not so hard to conform. Trailers are just headers in the
> tail position; you can use code that reads headers to read
> trailers.
It's not the technical challenge I'm concerned about; I know it's pretty
easy. I'm concerned about the interface. If all we have is 'http-get',
then this is straightforward: you modify the header alist once you have
finished reading the body, then present it to the user.
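To make the 'http-get' case concrete, here is a rough sketch of that
merge step. It assumes the port is positioned just after the final
"0\r\n" chunk, and it reuses `read-headers' from (web http) to read the
trailer section. The helper name `headers+trailers' and the "X-Checksum"
trailer are invented for the example; nothing like this is in the patch.

```scheme
(use-modules (web http))

;; Hypothetical helper, not part of (web ...): read the trailer section
;; that follows the last chunk and append it to the header alist.
(define (headers+trailers headers port)
  (append headers (read-headers port)))

;; Usage: a port positioned right after the terminating "0\r\n" chunk,
;; carrying one made-up trailer followed by the closing blank line.
(define trailer-port
  (open-input-string "X-Checksum: abc123\r\n\r\n"))

(define merged
  (headers+trailers '((transfer-encoding . ((chunked)))) trailer-port))

;; The trailer is now reachable like any other header.
(assq-ref merged 'x-checksum)
```

This only works because the caller sees the headers after the whole body
has been read; with a separate response object and a streaming body, the
alist would already have been handed out.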
But Guile exposes more of its HTTP infrastructure than this, and it is
not obvious to me how best to integrate this feature.
More importantly, I don't want to add a feature no-one will use.
Anecdotally, trailers are used rarely, if at all, and other languages
feel comfortable leaving them out of their standard libraries.
Thanks for your thoughts,
--
Ian Price
"Programming is like pinball. The reward for doing it well is
the opportunity to do it again" - from "The Wizardy Compiled"
* Re: Chunked Encoding
2012-05-06 4:52 ` Ian Price
2012-05-06 16:53 ` Ian Price
2012-05-06 19:49 ` Thien-Thi Nguyen
@ 2012-05-08 3:26 ` Daniel Hartwig
2 siblings, 0 replies; 9+ messages in thread
From: Daniel Hartwig @ 2012-05-08 3:26 UTC (permalink / raw)
To: guile-devel
On 6 May 2012 12:52, Ian Price <ianprice90@googlemail.com> wrote:
> Well, what I meant is a port that would be layered over the top of
> another. Soft ports or custom binary ports would be used to implement
> it. (Is there a reason (efficiency-wise) to prefer one over the other?)
>
Intuitively I would think custom binary ports are more efficient, but
I haven't tested this. However, back when I was experimenting with
this I ran into trouble getting the binary port to close its base
port -- the soft port interface was much nicer for this.
>
> Basically, my interface right now is
>
> (make-chunked-input-port port) -> input-port
> (make-chunked-output-port port) -> output-port
>
> These operate pretty much as you'd expect. The port returned from
> 'make-chunked-input-port' reads whole chunks from its argument port, and
> maintains a buffer, from which it can satisfy smaller reads.
> The port returned from 'make-chunked-output-port' buffers up the writes
> and writes a whole (properly formatted) chunk on 'force-output'.
>
> The only behaviour I'm not entirely sure of is what happens on
> close. Let's be more concrete:
>
> (define a <port>)
> (define b (make-chunked-input-port a))
> (close-port b)
>
> What is the state of a?
>
> (define c <port>)
> (define d (make-chunked-output-port c))
> (close-port d)
>
> Likewise for c.
>
>
> I think common practice in things like Java's BufferedReader would be to
> have a be closed when b is. This may be undesirable since we could wish
> to continue using the socket. On the other hand, once you start layering
> ports, it is convenient to have the higher layers close the lower
> layers. I think it might make sense to have a keyword argument,
> #:dont-close? (or something), that specifies this behaviour, defaulting
> to close.
>
> c seems less clear to me. Again, once you have multiple layers, it would
> be convenient, and it would properly handle any inner state saved up.
>
> Any thoughts on this?
>
At a quick glance, your latest patch looks pretty good. I'll
give it a bit of a workout during the week.
Having #:keep-alive? is great. At some layers you would want it,
others not. With a chunked-gzip transfer in a keep-alive session, for
example, it is convenient for the lowest layer (gzip) to close the
next layer (chunked) but not the socket.
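To illustrate the close behaviour being discussed, here is a rough sketch
of a chunk-decoding input port built on an R6RS custom binary port. It is
not the code from the patch: the names `read-chunk-size' and
`make-chunked-input-port*' are invented for this example, trailers are
left unread, and only the #:keep-alive? idea is taken from the thread.

```scheme
(use-modules (rnrs bytevectors)
             (rnrs io ports)
             (ice-9 rdelim))

;; Read a chunk-size line such as "1c\r\n", ignoring any ";extension",
;; and return the size as an integer.  EOF counts as the final chunk.
(define (read-chunk-size port)
  (let ((line (read-line port)))
    (if (eof-object? line)
        0
        (let* ((end (or (string-index line (char-set #\; #\return))
                        (string-length line)))
               (size (string->number (substring line 0 end) 16)))
          (or size (error "bad chunk size line" line))))))

;; Hypothetical constructor; the trailing * marks it as a sketch.
(define* (make-chunked-input-port* port #:key (keep-alive? #f))
  (define remaining 0)    ; bytes left in the current chunk
  (define finished? #f)   ; set once the zero-length chunk is seen
  (define (read! bv start count)
    (cond
     (finished? 0)
     ((zero? remaining)
      (set! remaining (read-chunk-size port))
      (if (zero? remaining)
          (begin (set! finished? #t) 0)
          (read! bv start count)))
     (else
      (let ((got (get-bytevector-n! port bv start (min count remaining))))
        (if (eof-object? got)
            (begin (set! finished? #t) 0)
            (begin
              (set! remaining (- remaining got))
              ;; Each chunk's data is followed by CRLF; skip it.
              (when (zero? remaining) (read-line port))
              got))))))
  (define (close!)
    ;; Only close the underlying port when the caller did not ask to
    ;; keep the connection alive.
    (unless keep-alive? (close-port port)))
  (make-custom-binary-input-port "chunked input (sketch)"
                                 read! #f #f close!))

;; Usage: decode two chunks, then close the wrapper but not the base.
(define base
  (open-bytevector-input-port
   (string->utf8 "5\r\nhello\r\n7\r\n, world\r\n0\r\n\r\n")))
(define in (make-chunked-input-port* base #:keep-alive? #t))

(define body (utf8->string (get-bytevector-all in)))
(close-port in)
(port-closed? base)   ; the base port is still usable afterwards
```

With the flag left at its default, closing the wrapper closes the base
port as well, which is what a gzip layer sitting on top of the chunked
layer would want.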
Regards
* Re: Chunked Encoding
2012-05-08 2:27 ` Ian Price
@ 2012-05-08 6:26 ` Thien-Thi Nguyen
0 siblings, 0 replies; 9+ messages in thread
From: Thien-Thi Nguyen @ 2012-05-08 6:26 UTC (permalink / raw)
To: Ian Price; +Cc: guile-devel
() Ian Price <ianprice90@googlemail.com>
() Tue, 08 May 2012 03:27:11 +0100
More importantly, I don't want to add a feature no-one will
use, anecdotally, trailers are used rarely if at all, and other
languages feel comfortable leaving it out of their standard
libraries.
Seems a chicken and egg situation; no one can use an absent feature.
Thread overview: 9+ messages
2011-09-29 20:09 Chunked Encoding Ian Price
2012-01-04 20:18 ` Andy Wingo
2012-05-06 4:52 ` Ian Price
2012-05-06 16:53 ` Ian Price
2012-05-08 0:33 ` Ian Price
2012-05-06 19:49 ` Thien-Thi Nguyen
2012-05-08 2:27 ` Ian Price
2012-05-08 6:26 ` Thien-Thi Nguyen
2012-05-08 3:26 ` Daniel Hartwig