* new module: (web client)
@ 2011-07-15 11:14 Andy Wingo
2011-07-18 12:59 ` Ludovic Courtès
0 siblings, 1 reply; 6+ messages in thread
From: Andy Wingo @ 2011-07-15 11:14 UTC
To: guile-devel
[-- Attachment #1: Type: text/plain, Size: 165 bytes --]
Hi,
I wrote a simple HTTP client and dropped it in (web client). It's
synchronous, so it's a bit lame. I'm attaching it here for review.
Feedback welcome.
Andy
[-- Attachment #2: (web client) --]
[-- Type: text/plain, Size: 4143 bytes --]
;;; Web client
;; Copyright (C) 2011 Free Software Foundation, Inc.
;; This library is free software; you can redistribute it and/or
;; modify it under the terms of the GNU Lesser General Public
;; License as published by the Free Software Foundation; either
;; version 3 of the License, or (at your option) any later version.
;;
;; This library is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
;; Lesser General Public License for more details.
;;
;; You should have received a copy of the GNU Lesser General Public
;; License along with this library; if not, write to the Free Software
;; Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
;; 02110-1301 USA
;;; Commentary:
;;;
;;; (web client) is a simple HTTP URL fetcher for Guile.
;;;
;;; In its current incarnation, (web client) is synchronous. If you
;;; want to fetch a number of URLs at once, probably the best thing to
;;; do is to write an event-driven URL fetcher, similar in structure to
;;; the web server.
;;;
;;; Another option, good but not as performant, would be to use threads,
;;; possibly via par-map or futures.
;;;
;;; Code:
(define-module (web client)
  #:use-module (rnrs bytevectors)
  #:use-module (ice-9 binary-ports)
  #:use-module (ice-9 rdelim)
  #:use-module (web request)
  #:use-module (web response)
  #:use-module (web uri)
  #:export (open-socket-for-uri
            http-get))
(define (open-socket-for-uri uri)
  (let* ((ai (car (getaddrinfo (uri-host uri)
                               (cond
                                ((uri-port uri) => number->string)
                                (else (symbol->string (uri-scheme uri)))))))
         (s (socket (addrinfo:fam ai) (addrinfo:socktype ai)
                    (addrinfo:protocol ai))))
    (connect s (addrinfo:addr ai))
    ;; Buffer input and output on this port.
    (setvbuf s _IOFBF)
    ;; Enlarge the receive buffer.
    (setsockopt s SOL_SOCKET SO_RCVBUF (* 12 1024))
    s))
(define (decode-string bv encoding)
  (if (string-ci=? encoding "utf-8")
      (utf8->string bv)
      (let ((p (open-bytevector-input-port bv)))
        (set-port-encoding! p encoding)
        (let ((res (read-delimited "" p)))
          (close-port p)
          res))))
(define (text-type? type)
  (let ((type (symbol->string type)))
    (or (string-prefix? "text/" type)
        (string-suffix? "/xml" type)
        (string-suffix? "+xml" type))))
;; Logically the inverse of (web server)'s `sanitize-response'.
;;
(define (decode-response-body response body)
  ;; `body' is either #f or a bytevector.
  (cond
   ((not body) body)
   ((bytevector? body)
    (let ((rlen (response-content-length response))
          (blen (bytevector-length body)))
      (cond
       ((and rlen (not (= rlen blen)))
        (error "bad content-length" rlen blen))
       ((response-content-type response)
        => (lambda (type)
             (cond
              ((text-type? (car type))
               (decode-string body (or (assq-ref (cdr type) 'charset)
                                       "iso-8859-1")))
              (else body))))
       (else body))))
   (else
    (error "unexpected body type" body))))
(define* (http-get uri #:key (port (open-socket-for-uri uri))
                   (version '(1 . 1)) (keep-alive? #f) (extra-headers '())
                   (decode-body? #t))
  (let ((req (build-request uri #:version version
                            #:headers (if keep-alive?
                                          extra-headers
                                          (cons '(connection close)
                                                extra-headers)))))
    (write-request req port)
    (force-output port)
    (if (not keep-alive?)
        (shutdown port 1))
    (let* ((res (read-response port))
           (body (read-response-body res)))
      (if (not keep-alive?)
          (close-port port))
      (values res
              ;; Decode only when requested via the #:decode-body? keyword.
              (if decode-body?
                  (decode-response-body res body)
                  body)))))
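A minimal usage sketch for the module as posted, assuming it is installed as (web client) and that `http-get` returns the response and the body as two values, as the code above does. The helper name `fetch-status-and-body` and the example.org URL are placeholders for illustration and are not part of the posted module.

```scheme
(use-modules (web client)
             (web response)
             (web uri))

;; Hypothetical helper: fetch URL (a string) and return a pair of the
;; HTTP status code and the body.  The body is a string for text types,
;; a bytevector otherwise, or #f when there is none.
(define (fetch-status-and-body url)
  (call-with-values
      (lambda () (http-get (string->uri url)))
    (lambda (response body)
      (cons (response-code response) body))))

;; Example call (needs network access, so it is left commented out):
;; (fetch-status-and-body "http://example.org/")
```

Passing `#:decode-body? #f` to `http-get` would leave the body as a raw bytevector regardless of the content type.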
* Re: new module: (web client)
2011-07-15 11:14 new module: (web client) Andy Wingo
@ 2011-07-18 12:59 ` Ludovic Courtès
2011-07-19 7:51 ` Peter Brett
2011-12-06 10:51 ` Andy Wingo
0 siblings, 2 replies; 6+ messages in thread
From: Ludovic Courtès @ 2011-07-18 12:59 UTC
To: guile-devel
Hello!
Andy Wingo <wingo@pobox.com> writes:
> I wrote a simple HTTP client and dropped it in (web client). It's
> synchronous, so it's a bit lame. I'm attaching it here for review.
> Feedback welcome.
This looks great!
> ;;; (web client) is a simple HTTP URL fetcher for Guile.
> ;;;
> ;;; In its current incarnation, (web client) is synchronous. If you
> ;;; want to fetch a number of URLs at once, probably the best thing to
> ;;; do is to write an event-driven URL fetcher, similar in structure to
> ;;; the web server.
> ;;;
> ;;; Another option, good but not as performant, would be to use threads,
> ;;; possibly via par-map or futures.
Futures are for computational tasks, so I think you’d want something
akin to futures but where the number of concurrent tasks isn’t related
to the number of CPU cores.
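A minimal sketch of that idea, assuming plain threads from (ice-9 threads) are acceptable: one thread per item, so the degree of concurrency follows the number of fetches and not the number of cores. `thread-map` is a hypothetical helper, not something provided by Guile or by (web client), and it does nothing to cap the number of threads.

```scheme
(use-modules (ice-9 threads))

;; Hypothetical helper: apply PROC to each element of ITEMS in its own
;; thread and return the results in the same order as ITEMS.
(define (thread-map proc items)
  (let ((threads (map (lambda (item)
                        (call-with-new-thread (lambda () (proc item))))
                      items)))
    ;; join-thread blocks until the thread exits and returns its value.
    (map join-thread threads)))

;; With the posted module, PROC could be a one-value wrapper around
;; http-get; here a pure procedure stands in for the fetch.
(thread-map (lambda (x) (* x x)) '(1 2 3))
```

For a large list of URLs, a fixed pool of worker threads pulling from a shared queue would probably be a better fit than one thread per URL.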
> (define (open-socket-for-uri uri)
>   (let* ((ai (car (getaddrinfo (uri-host uri)
What if URI is file://foo?
> (define (text-type? type)
>   (let ((type (symbol->string type)))
>     (or (string-prefix? "text/" type)
>         (string-suffix? "/xml" type)
>         (string-suffix? "+xml" type))))
We need a MIME lib. :-)
Besides, it would be nice to have docstrings at least for the public
procedures.
Thanks!
Ludo’.
* Re: new module: (web client)
2011-07-18 12:59 ` Ludovic Courtès
@ 2011-07-19 7:51 ` Peter Brett
2011-12-06 10:47 ` Andy Wingo
2011-12-06 10:51 ` Andy Wingo
1 sibling, 1 reply; 6+ messages in thread
From: Peter Brett @ 2011-07-19 7:51 UTC
To: guile-devel
ludo@gnu.org (Ludovic Courtès) writes:
> What if URI is file://foo?
Per RFC 1630 & RFC 1738, a file URL takes the form:
file://host/path
For local files, the HOST part is elided:
file:///path
So IMHO the posted code isn't *wrong* per se. ;-)
Regards,
Peter
--
Peter Brett <peter@peter-b.co.uk>
Remote Sensing Research Group
Surrey Space Centre
* Re: new module: (web client)
2011-07-19 7:51 ` Peter Brett
@ 2011-12-06 10:47 ` Andy Wingo
0 siblings, 0 replies; 6+ messages in thread
From: Andy Wingo @ 2011-12-06 10:47 UTC
To: Peter Brett; +Cc: guile-devel
On Tue 19 Jul 2011 09:51, Peter Brett <peter@peter-b.co.uk> writes:
> ludo@gnu.org (Ludovic Courtès) writes:
>
>> What if URI is file://foo?
>
> Per RFC 1630 & RFC 1738, a file URL takes the form:
>
> file://host/path
>
> For local files, the HOST part is elided:
>
> file:///path
>
> So IMHO the posted code isn't *wrong* per se. ;-)
It isn't wrong, in that file://foo specifies "foo" as the host.
file:///path appears to be invalid according to RFC3986, as the `host'
part is not optional if // follows the scheme. However,
file:///etc/hosts is used as an example in the RFC. I'm not sure what
to think here.
What is clear is that file:///foo is definitely a "normal" URI, in
practice, so we should probably interpret ://[/?#] as indicating no
authority, instead of being invalid. I have done this in stable-2.0.
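As a small sketch of the resulting behaviour, assuming a Guile whose (web uri) includes the change described above: a file URI with an explicit authority parses with that host, while the empty-authority form parses with no host at all. The variable names are illustrative only.

```scheme
(use-modules (web uri))

;; Explicit authority: "foo" is the host, "/bar" the path.
(define with-host (string->uri "file://foo/bar"))

;; Empty authority: no host, and the path is kept whole.
(define local (string->uri "file:///etc/hosts"))

(uri-host with-host)   ; "foo"
(uri-path with-host)   ; "/bar"
(uri-host local)       ; #f
(uri-path local)       ; "/etc/hosts"
```

Since `open-socket-for-uri` passes `(uri-host uri)` straight to `getaddrinfo`, the second form reaches it with #f as the host name.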
Andy
--
http://wingolog.org/
* Re: new module: (web client)
2011-07-18 12:59 ` Ludovic Courtès
2011-07-19 7:51 ` Peter Brett
@ 2011-12-06 10:51 ` Andy Wingo
2011-12-07 13:38 ` Ludovic Courtès
1 sibling, 1 reply; 6+ messages in thread
From: Andy Wingo @ 2011-12-06 10:51 UTC
To: Ludovic Courtès; +Cc: guile-devel
On Mon 18 Jul 2011 14:59, ludo@gnu.org (Ludovic Courtès) writes:
> Andy Wingo <wingo@pobox.com> writes:
>
>> (define (open-socket-for-uri uri)
>>   (let* ((ai (car (getaddrinfo (uri-host uri)
>
> What if URI is file://foo?
It will look up the addrinfo for the "file" service of "foo".
scheme@(guile-user)> (getaddrinfo "foo" "file")
ERROR: In procedure getaddrinfo:
ERROR: Throw to key `getaddrinfo-error' with args `(-8)'.
If you use file:///foo:
scheme@(guile-user)> (getaddrinfo #f "file")
ERROR: In procedure getaddrinfo:
ERROR: Throw to key `getaddrinfo-error' with args `(-8)'.
I guess we need to add exception printers for these errors. Want to do
that? :-)
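Independently of any exception printer, a caller can already turn the numeric code into text. The sketch below assumes only the core `catch`, `getaddrinfo` and `gai-strerror` procedures; `call-with-lookup-message` is a hypothetical helper name, not part of Guile or of (web client).

```scheme
;; Hypothetical helper: run THUNK and return its value; if it throws
;; `getaddrinfo-error', return a readable message string instead.
(define (call-with-lookup-message thunk)
  (catch 'getaddrinfo-error
    thunk
    (lambda (key errcode)
      ;; ERRCODE is the EAI_* integer carried by the exception.
      (string-append "name lookup failed: " (gai-strerror errcode)))))

;; Example call (depends on the local resolver, so it is commented out):
;; (call-with-lookup-message (lambda () (getaddrinfo "foo" "file")))
```

The exact message text comes from the C library, so it varies between systems.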
Andy
--
http://wingolog.org/
* Re: new module: (web client)
2011-12-06 10:51 ` Andy Wingo
@ 2011-12-07 13:38 ` Ludovic Courtès
0 siblings, 0 replies; 6+ messages in thread
From: Ludovic Courtès @ 2011-12-07 13:38 UTC
To: Andy Wingo; +Cc: guile-devel
Hi,
Andy Wingo <wingo@pobox.com> writes:
> scheme@(guile-user)> (getaddrinfo #f "file")
> ERROR: In procedure getaddrinfo:
> ERROR: Throw to key `getaddrinfo-error' with args `(-8)'.
>
> I guess we need to add exception printers for these errors. Want to do
> that? :-)
Yes! Done.
Ludo’.
Thread overview: 6+ messages
2011-07-15 11:14 new module: (web client) Andy Wingo
2011-07-18 12:59 ` Ludovic Courtès
2011-07-19 7:51 ` Peter Brett
2011-12-06 10:47 ` Andy Wingo
2011-12-06 10:51 ` Andy Wingo
2011-12-07 13:38 ` Ludovic Courtès