* [PATCH] add some useful procedures for http client module
@ 2012-02-23 10:28 Nala Ginrut
0 siblings, 0 replies; only message in thread
From: Nala Ginrut @ 2012-02-23 10:28 UTC (permalink / raw)
To: guile-devel
[-- Attachment #1.1: Type: text/plain, Size: 3961 bytes --]
hi guys!
I try to write a module for http download tools. When I finished it, I
realized some of the procedures could be the part of module/web/client.scm.
So I format a patch.
The function of these procedures listed below:
http-get-uri-head ==> get the response struct under http method 'HEAD',
this head is useful to get information of the remote file.
http-client-get-block-from-uri ==> get response body whose size is 'block'
which can be specified by user. If you don't specify the 'block',
it will be the
length of the target file.
http-client-get-ready-to-continue ==> returns (values pos fd)
If the
download target has a part in local space, pos will point to it's broken
pointer.
fd is the
file port of the local target.
The rest procedures maybe useful. To avoid download a file incorrectly, I
think a checksum or md5 is useful. So I just used the ETag which
contains in HTTP protocol. Each target file would generate a
"filename.etag" contains the last time ETag. When continue to download
file, one may use these procedures to checkout the ETag.
http-client-get-check-string ==> get the ETag string from "filename.etag"
http-client-checkout-etag ==> checkout if the ETag of HEAD and ETag of
"filename.etag" are equal.
http-client-remove-check-file ==> when you get a different ETag, you may
delete the "filename.etag"
http-client-etag-stamp ==> when you first download a file, this procedure
could generate "filename.etag"
and after you finished your
downloading, you need to delete "filename.etag"
Here's an simple 'continue-to-download' example to show how to use these
procedures:
----------------------------------code
begin---------------------------------------
(define* (http-client-retrive-file-continue uri #:key (path (uri-path uri))
(try 5))
(let ([head (http-get-uri-head uri)])
(call-with-values
(lambda ()
(http-client-get-ready-to-continue uri #:path path #:head head))
(lambda (pos port)
(catch #t
(lambda ()
(if (zero? pos)
(begin
(display "download from beginning")
(http-client-etag-stamp uri #:path path))
(format #t "continue from position ~a~%" pos))
(let lp ([data (http-client-get-block-from-uri
uri #:start pos #:head head #:block 4096)]
[pos pos])
(if data
(let* ([dl (bytevector-length data)]
[new-pos (+ pos dl)]
)
(put-bytevector port data)
(force-output port)
(format #t "~a-~a~%" pos new-pos)
(lp (http-client-get-block-from-uri
uri #:start new-pos #:head head #:block 4096)
new-pos))
(format #t "~a has already been done!~%" path))))
(lambda e
(case (car e)
((system-error)
(let ([E (system-error-errno e)])
(if (or (= E ECONNABORTED)
(= E ECONNREFUSED)
(= E ECONNRESET))
(begin
(format #t "~a, try again!~%left ~a times to try~%"
(car (cadddr e)) try)
(close port)
(http-client-retrive-file-continue uri #:path path
#:try (1- try))))))
(else
(display "some error occured!\n")(newline)
(format #t "~a : ~a~%" (car e) (cdr e)))))
)))))
---------------------code end-----------------------------
And you may try this:
(http-client-retrive-file-continue (string->uri "
http://mirrors.kernel.org/gnu/gcc/gcc-4.6.2/gcc-4.6.2.tar.bz2")
#:path "mmr.tar.bz2"
#:try 10)
#:path could be used to specify the local target. If you ignore it, it
would be the original file name.
#:try is times to try. When try decrease to 0 but downloading has
unfinished, it'll quit anyway.
Besides, one may use these procedures to build his/her own threads based
downloading tools, say, split the remote file into blocks and use 10
threads to down them separably.
Any comments?
[-- Attachment #1.2: Type: text/html, Size: 7922 bytes --]
[-- Attachment #2: 0001-add-some-useful-procedures-for-http-client.patch --]
[-- Type: text/x-patch, Size: 3470 bytes --]
From ab03251a6c9a476753c5498ba3a75009c37db272 Mon Sep 17 00:00:00 2001
From: NalaGinrut <NalaGinrut@gmail.com>
Date: Thu, 23 Feb 2012 17:46:56 +0800
Subject: [PATCH] add some useful procedures for http client
---
module/web/client.scm | 72 ++++++++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 71 insertions(+), 1 deletions(-)
diff --git a/module/web/client.scm b/module/web/client.scm
index b035668..fc96284 100644
--- a/module/web/client.scm
+++ b/module/web/client.scm
@@ -37,9 +37,18 @@
#:use-module (ice-9 rdelim)
#:use-module (web request)
#:use-module (web response)
+ #:use-module (web http)
#:use-module (web uri)
+ #:autoload (rnrs io ports) (get-string-all)
#:export (open-socket-for-uri
- http-get))
+ http-get
+ http-get-uri-head
+ http-client-get-block-from-uri
+ http-client-get-check-string
+ http-client-checkout-etag
+ http-client-remove-check-file
+ http-client-etag-stamp
+ http-client-get-ready-to-continue))
(define (open-socket-for-uri uri)
(let* ((ai (car (getaddrinfo (uri-host uri)
@@ -114,3 +123,64 @@
(if decode-body?
(decode-response-body res body)
body)))))
+
+(define* (http-get-uri-head uri #:key (sock (open-socket-for-uri uri)))
+ (let ((rq (build-request uri #:method 'HEAD)))
+ (write-request rq sock)
+ (force-output sock)
+ (let ((head (read-response sock)))
+ (close sock)
+ head)))
+
+(define* (http-client-get-block-from-uri uri #:key (block #f) (start 0)
+ (head (http-get-uri-head uri)))
+ (let* ((s (open-socket-for-uri uri))
+ (end (if block (+ start block) (response-content-length head)))
+ (range-str (format #f "bytes=~a-~a" start end))
+ (range (parse-header 'range range-str))
+ (rq (build-request uri #:headers `((range ,@range)))))
+ (write-request rq s)
+ (force-output s)
+ (read-response-body (read-response s))))
+
+(define (http-client-get-check-string path)
+ (let ((target (string-append path ".etag")))
+ (if (file-exists? target)
+ (call-with-input-file target
+ (lambda (port)
+ (get-string-all port)))
+ "")))
+
+(define* (http-client-checkout-etag uri #:key (path (uri-path uri))
+ (head (http-get-uri-head uri)))
+ (let* ((etag (car (response-etag head)))
+ (chk-str (http-client-get-check-string path))
+ )
+ (string=? etag chk-str))) ;; checkout ETag
+
+(define* (http-client-remove-check-file path #:key (ext ".etag"))
+ (let ((chk-file (string-append path ext)))
+ (and (file-exists? chk-file) (delete-file chk-file))))
+
+(define* (http-client-etag-stamp uri #:key (head (http-get-uri-head uri))
+ (path (uri-path uri))
+ (ext ".etag"))
+ (let ((chk-file (string-append path ext))
+ (etag (car (response-etag head))))
+ (and (file-exists? chk-file) (delete-file chk-file))
+ (call-with-output-file chk-file
+ (lambda (port)
+ (format port "~a" etag)
+ (close port)))))
+
+(define* (http-client-get-ready-to-continue uri #:key (path (uri-path uri))
+ (head (http-get-uri-head uri)))
+ (if (http-client-checkout-etag uri #:path path #:head head) ;; checkout ETag
+ (let* ((fp (open-file path "a"))
+ (pos (stat:size (stat path))))
+ (seek fp pos SEEK_SET)
+ (http-client-remove-check-file path)
+ (values pos fp))
+ (let* ((fp (open-file path "w"))
+ (pos 0))
+ (values pos fp))))
--
1.7.0.4
^ permalink raw reply related [flat|nested] only message in thread
only message in thread, other threads:[~2012-02-23 10:28 UTC | newest]
Thread overview: (only message) (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-02-23 10:28 [PATCH] add some useful procedures for http client module Nala Ginrut
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).