unofficial mirror of guile-user@gnu.org 
* Some help needed to use curl lib to download binary file
From: Sébastien Rey-Coyrehourcq @ 2022-07-29  8:41 UTC
  To: guile-user



Hi Guile community,

I'm a new user, getting into Guix and Guile at the same time. I'd like to build a module that downloads files stored on the Zenodo platform, using curl to query their REST API.

This is useful for reproducibility: I need to fetch some data from Zenodo before running a Guix package that compiles a simulation that uses this data.

I built a first part using what I could find about the curl library for Guile, but I'm now stuck on the last part: writing the binary body retrieved by curl to a file on disk.

The full script is on Debian paste: <https://paste.debian.net/1248702/>

My problem is probably here. I don't know how to write the bytes of the body (from (get-file get-file-link)) to a file. I tried with and without the "string->bytevector" conversion of the body, without success:

;; report body content from url string
(define (get-file url)
  ((receive (response body) (http-get url #true) (string->bytevector body))))

;; write content into file
(call-with-output-file "download.zip" (lambda (current-output-port)
                                        (get-file get-file-link)
                                        (put-bytevector (current-output-port) body)))

Thanks for your help
Best regards,
S.RC.


* Re: Some help needed to use curl lib to download binary file
From: Vivien Kraus @ 2022-07-29 10:25 UTC
  To: Sébastien Rey-Coyrehourcq, guile-user

Hello,

I see in the paste:

> ;; function taken from https://gist.github.com/amirouche/138a27bdbef5a672a0135f90ca26ec41
> ;; then adapted to use cookie jar
> (define-public (http-get url cookie-exist)
>   ;; Create a Curl handle
>   (let ((handle (curl-easy-init)))
>     ;; Set the URL from which to get the data
>     (curl-easy-setopt handle 'url url)
>     (if cookie-exist
>         (curl-easy-setopt handle 'cookie "cookie.txt")
>         (curl-easy-setopt handle 'cookiejar "cookie.txt"))
> 
>     ;; Request that the HTTP headers be included in the response
>     (curl-easy-setopt handle 'header #t)
>     ;; Get the result as a Latin-1 string
>     (let* ((response-string (curl-easy-perform handle))
>            ;; Create a string port from the response
>            (response-port (open-input-string response-string))
>            ;; Have the (web response) module to parse the response
>            (response (read-response response-port))
>            (body (utf8->string (read-response-body response))))
>       (close response-port)
>       ;; Have the (web response) module extract the body from the
>       ;; response
>       (values response body))))

So here the call expects the response body to be UTF-8 text. If you are
downloading a binary file, the function will raise an exception. Guile
has that Python 3 feeling where you are supposed to know in advance
whether you are handling text or binary data, which is what is hurting
you here.

However, you can avoid the problem in one of two ways. Either use
bytevectors everywhere, by removing the call to utf8->string and binding
the "body" variable directly to (read-response-body response); or
pretend that you know better than Guile and that the data is
Latin-1-encoded, so you lose no information and Guile won't complain. In
that case, load (ice-9 iconv) and replace (utf8->string …) with
(bytevector->string … "ISO-8859-1").

If you go the first route, you get a bytevector back. If you go the
second one, you get a string, but you must remember that it contains
raw bytes and not text (unless the response body was indeed text).
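
As a minimal sketch of why the Latin-1 route loses no information: every
byte value maps to exactly one Latin-1 character, so decoding and
re-encoding with the same charset gives back the original bytes. The
sample bytes below are an assumption chosen for illustration (a ZIP magic
number followed by bytes that are not valid UTF-8); they do not come from
the paste.

```scheme
(use-modules (ice-9 iconv)        ; bytevector->string, string->bytevector
             (rnrs bytevectors))  ; u8-list->bytevector

;; Hypothetical sample data: "PK\x03\x04" followed by bytes that would
;; not decode as UTF-8.
(define raw (u8-list->bytevector '(80 75 3 4 255 254 0 128)))

;; Second route: decode as Latin-1, one character per byte.
(define as-string (bytevector->string raw "ISO-8859-1"))

;; Encoding back with the same charset restores the original bytes.
(define round-trip (string->bytevector as-string "ISO-8859-1"))

(display (= (string-length as-string) (bytevector-length raw)))
(newline)
(display (equal? raw round-trip))
(newline)
```

Both checks should print #t. The string is only a container for bytes
here, so it should not be treated as readable text.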

There are some cases where you might want such
strings-that-contain-binary-or-text, for instance if you want to use the
string API on them because the bytevector API does not provide what you
need, or if you want to interface with NUL-terminated strings. In other
cases you might prefer bytevectors. Either way, converting between
strings-that-contain-binary-or-text and bytevectors has no real encoding
cost: it is as simple as a copy.

> ;; write content into file
> (call-with-output-file "download.zip" (lambda (current-output-port)
>                                         (get-file get-file-link)
>                                         (put-bytevector (current-output-port) body)))

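call-with-output-file calls its function argument with a port, so inside
the lambda the argument is already the port itself and must not be
called as (current-output-port). Also, you discard the result of
get-file and then use "body", which is not bound in that scope as far as
the snippet shows, so you should pass the result of get-file to the
writing procedure instead.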
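
For the bytevector route, get-file itself probably needs a small change
too: the version in the question wraps receive in an extra pair of
parentheses (which tries to call its result as a procedure) and converts
the body with string->bytevector. A sketch of a simpler shape follows.
stub-http-get is a hypothetical stand-in so the example runs without curl
or the network; the real http-get from the paste would take its place,
assuming it has been changed to return the body as a bytevector.

```scheme
(use-modules (ice-9 receive)      ; receive
             (rnrs bytevectors))  ; u8-list->bytevector, bytevector?

;; Hypothetical stand-in for http-get: returns a fake response object
;; and a bytevector body as two values.
(define (stub-http-get url cookie-exist)
  (values 'stub-response (u8-list->bytevector '(80 75 3 4))))

;; Bind both return values and hand back only the body, unchanged.
(define (get-file url)
  (receive (response body) (stub-http-get url #t)
    body))

;; The URL is a placeholder; the stub never contacts it.
(define data (get-file "https://example.invalid/file.zip"))
(display (bytevector? data))
(newline)
```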

If you choose to have get-file return a bytevector, you can do:

(use-modules (ice-9 binary-ports))   ; provides put-bytevector

(call-with-output-file "download.zip"
  (lambda (port)
    (put-bytevector port (get-file get-file-link)))
  #:binary #t)

If get-file returns a string, you can do:

(use-modules (ice-9 textual-ports))  ; provides put-string

(call-with-output-file "download.zip"
  (lambda (port)
    (put-string port (get-file get-file-link)))
  #:encoding "ISO-8859-1")
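
To check either variant without touching the network, one option is to
write a known payload and read the file back in binary mode. This is only
a sketch: the payload, the temporary file names and the use of TMPDIR are
assumptions made for the example, not part of the original script.

```scheme
(use-modules (ice-9 binary-ports)   ; put-bytevector, get-bytevector-all
             (ice-9 textual-ports)  ; put-string
             (ice-9 iconv)          ; bytevector->string
             (rnrs bytevectors))    ; u8-list->bytevector

;; Hypothetical payload standing in for a downloaded body.
(define payload (u8-list->bytevector '(80 75 3 4 0 255 13 10)))

(define tmp-dir (or (getenv "TMPDIR") "/tmp"))
(define bin-path (string-append tmp-dir "/guile-download-check-bin"))
(define str-path (string-append tmp-dir "/guile-download-check-str"))

;; Variant 1: the body is a bytevector, written through a binary port.
(call-with-output-file bin-path
  (lambda (port) (put-bytevector port payload))
  #:binary #t)

;; Variant 2: the body is a Latin-1 string, written through a port that
;; encodes characters as ISO-8859-1.
(call-with-output-file str-path
  (lambda (port)
    (put-string port (bytevector->string payload "ISO-8859-1")))
  #:encoding "ISO-8859-1")

;; Read a whole file back as raw bytes.
(define (file-bytes path)
  (call-with-input-file path get-bytevector-all #:binary #t))

(define bin-bytes (file-bytes bin-path))
(define str-bytes (file-bytes str-path))
(delete-file bin-path)
(delete-file str-path)

(display (equal? payload bin-bytes))
(newline)
(display (equal? payload str-bytes))
(newline)
```

If both lines print #t, the two variants wrote the same bytes to disk.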

Best regards,

Vivien





* Re: Some help needed to use curl lib to download binary file
From: Sébastien Rey-Coyrehourcq @ 2022-08-01  7:48 UTC
  To: Vivien Kraus; +Cc: guile-user



Thanks a lot for this very detailed explanation, Vivien. It helps me a lot to understand what is happening here!

I'm continuing to learn Lisp/Scheme-like languages; this is a fascinating new world to explore :)

Best regards,

