unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed
* [PATCH] add some useful procedures for http client module
@ 2012-02-23 10:28 Nala Ginrut
  0 siblings, 0 replies; only message in thread
From: Nala Ginrut @ 2012-02-23 10:28 UTC (permalink / raw)
  To: guile-devel


[-- Attachment #1.1: Type: text/plain, Size: 3961 bytes --]

hi guys!
I try to write a module for http download tools. When I finished it, I
realized some of the procedures could be the part of module/web/client.scm.
So I format a patch.
The function of these procedures listed below:
http-get-uri-head ==> get the response struct under http method 'HEAD',
this head is useful to get information of the remote file.
http-client-get-block-from-uri ==> get response body whose size is 'block'
which can be specified by user. If you don't specify the 'block',
                                                         it will be the
length of the target file.
http-client-get-ready-to-continue ==> returns (values pos fd)
                                                               If the
download target has a part in local space, pos will point to it's broken
pointer.
                                                               fd is the
file port of the local target.
The rest procedures maybe useful. To avoid download a file incorrectly, I
think a checksum or md5 is useful. So I just used the ETag which
contains in HTTP protocol. Each target file would generate a
"filename.etag" contains the last time ETag. When continue to download
file, one may use these procedures to checkout the ETag.
http-client-get-check-string  ==> get the ETag string from "filename.etag"
http-client-checkout-etag ==> checkout if the ETag of HEAD and ETag of
"filename.etag" are equal.
http-client-remove-check-file ==> when you get a different ETag, you may
delete the "filename.etag"
http-client-etag-stamp ==> when you first download a file, this procedure
could generate "filename.etag"
                                              and after you finished your
downloading, you need to delete "filename.etag"

Here's an simple 'continue-to-download' example to show how to use these
procedures:
----------------------------------code
begin---------------------------------------
(define* (http-client-retrive-file-continue uri #:key (path (uri-path uri))
    (try 5))
  (let ([head (http-get-uri-head uri)])
    (call-with-values
(lambda ()
  (http-client-get-ready-to-continue uri #:path path #:head head))
      (lambda (pos port)
(catch #t
       (lambda ()
 (if (zero? pos)
     (begin
       (display "download from beginning")
       (http-client-etag-stamp uri #:path path))
     (format #t "continue from position ~a~%" pos))
 (let lp ([data (http-client-get-block-from-uri
      uri #:start pos #:head head #:block 4096)]
  [pos pos])
   (if data
       (let* ([dl (bytevector-length data)]
      [new-pos (+ pos dl)]
      )
 (put-bytevector port data)
 (force-output port)
 (format #t "~a-~a~%" pos new-pos)
 (lp (http-client-get-block-from-uri
      uri #:start new-pos #:head head #:block 4096)
     new-pos))
       (format #t "~a has already been done!~%" path))))
       (lambda e
 (case (car e)
   ((system-error)
    (let ([E (system-error-errno e)])
      (if (or (= E ECONNABORTED)
      (= E ECONNREFUSED)
      (= E ECONNRESET))
  (begin
    (format #t "~a, try again!~%left ~a times to try~%"
    (car (cadddr e)) try)
    (close port)
    (http-client-retrive-file-continue uri #:path path
       #:try (1- try))))))
   (else
    (display "some error occured!\n")(newline)
    (format #t "~a : ~a~%" (car e) (cdr e)))))
       )))))
---------------------code end-----------------------------

And you may try this:
 (http-client-retrive-file-continue (string->uri "
http://mirrors.kernel.org/gnu/gcc/gcc-4.6.2/gcc-4.6.2.tar.bz2")
                                                     #:path "mmr.tar.bz2"
#:try 10)
#:path could be used to specify the local target. If you ignore it, it
would be the original file name.
#:try is times to try. When try decrease to 0 but downloading has
unfinished, it'll quit anyway.

Besides, one may use these procedures to build his/her own threads based
downloading tools, say, split the remote file into blocks and use 10
threads to down them separably.

Any comments?

[-- Attachment #1.2: Type: text/html, Size: 7922 bytes --]

[-- Attachment #2: 0001-add-some-useful-procedures-for-http-client.patch --]
[-- Type: text/x-patch, Size: 3470 bytes --]

From ab03251a6c9a476753c5498ba3a75009c37db272 Mon Sep 17 00:00:00 2001
From: NalaGinrut <NalaGinrut@gmail.com>
Date: Thu, 23 Feb 2012 17:46:56 +0800
Subject: [PATCH] add some useful procedures for http client

---
 module/web/client.scm |   72 ++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 71 insertions(+), 1 deletions(-)

diff --git a/module/web/client.scm b/module/web/client.scm
index b035668..fc96284 100644
--- a/module/web/client.scm
+++ b/module/web/client.scm
@@ -37,9 +37,18 @@
   #:use-module (ice-9 rdelim)
   #:use-module (web request)
   #:use-module (web response)
+  #:use-module (web http)
   #:use-module (web uri)
+  #:autoload (rnrs io ports) (get-string-all)
   #:export (open-socket-for-uri
-            http-get))
+            http-get
+            http-get-uri-head
+            http-client-get-block-from-uri
+            http-client-get-check-string
+            http-client-checkout-etag
+            http-client-remove-check-file
+            http-client-etag-stamp
+            http-client-get-ready-to-continue))
 
 (define (open-socket-for-uri uri)
   (let* ((ai (car (getaddrinfo (uri-host uri)
@@ -114,3 +123,64 @@
               (if decode-body?
                   (decode-response-body res body)
                   body)))))
+
+(define* (http-get-uri-head uri #:key (sock (open-socket-for-uri uri)))
+  (let ((rq (build-request uri #:method 'HEAD)))
+    (write-request rq sock)
+    (force-output sock)
+    (let ((head (read-response sock)))
+      (close sock)
+      head)))
+
+(define* (http-client-get-block-from-uri uri #:key (block #f) (start 0) 
+					 (head (http-get-uri-head uri)))
+  (let* ((s (open-socket-for-uri uri))
+	 (end (if block (+ start block) (response-content-length head)))
+	 (range-str (format #f "bytes=~a-~a" start end)) 
+	 (range (parse-header 'range range-str))
+	 (rq (build-request uri #:headers `((range ,@range)))))
+    (write-request rq s)
+    (force-output s)
+    (read-response-body (read-response s))))
+
+(define (http-client-get-check-string path)
+  (let ((target (string-append path ".etag")))
+    (if (file-exists? target)
+	(call-with-input-file target
+	  (lambda (port)
+	    (get-string-all port)))
+	"")))
+
+(define* (http-client-checkout-etag uri #:key (path (uri-path uri)) 
+			(head (http-get-uri-head uri)))
+  (let* ((etag (car (response-etag head)))
+	 (chk-str (http-client-get-check-string path))
+	 )
+    (string=? etag chk-str))) ;; checkout ETag
+
+(define* (http-client-remove-check-file path #:key (ext ".etag"))
+  (let ((chk-file (string-append path ext)))
+    (and (file-exists? chk-file) (delete-file chk-file))))
+
+(define* (http-client-etag-stamp uri #:key (head (http-get-uri-head uri))
+				 (path (uri-path uri))
+				 (ext ".etag"))
+  (let ((chk-file (string-append path ext))
+	(etag (car (response-etag head))))
+    (and (file-exists? chk-file) (delete-file chk-file))
+    (call-with-output-file chk-file
+      (lambda (port)
+	(format port "~a" etag)
+	(close port)))))
+
+(define* (http-client-get-ready-to-continue uri #:key (path (uri-path uri))
+				     (head (http-get-uri-head uri)))
+  (if (http-client-checkout-etag uri #:path path #:head head) ;; checkout ETag
+      (let* ((fp (open-file path "a"))
+	     (pos (stat:size (stat path))))
+	(seek fp pos SEEK_SET)
+	(http-client-remove-check-file path)
+	(values pos fp))
+      (let* ((fp (open-file path "w"))
+	     (pos 0))
+	(values pos fp))))
-- 
1.7.0.4


^ permalink raw reply related	[flat|nested] only message in thread

only message in thread, other threads:[~2012-02-23 10:28 UTC | newest]

Thread overview: (only message) (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-02-23 10:28 [PATCH] add some useful procedures for http client module Nala Ginrut

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).