unofficial mirror of guix-patches@gnu.org 
 help / color / mirror / code / Atom feed
From: Mathieu Othacehe <m.othacehe@gmail.com>
To: Danny Milosavljevic <dannym@scratchpost.org>
Cc: ludo@gnu.org, 40993@debbugs.gnu.org
Subject: [bug#40993] cuirass: Add build products download support.
Date: Fri, 01 May 2020 15:35:50 +0200	[thread overview]
Message-ID: <874ksz4w21.fsf@gmail.com> (raw)
In-Reply-To: <20200501120914.606ffe02@scratchpost.org> (Danny Milosavljevic's message of "Fri, 1 May 2020 12:09:14 +0200")

[-- Attachment #1: Type: text/plain, Size: 1289 bytes --]


Hey Danny,

> very cool!

Thanks :)

> Though I agree using sendfile would be much better, especially since the user
> can download 800 MB image files there.
>
> The guile (web server) module allows passing a procedure as the #:body, but
> then it makes a bytevector out of the result and hard-codes the content-type :P.
>
> Eventually (web server http) http-write is reached, which only supports encoding
> bytevectors and #f, that's it.  No files.
>
> So we'd have to overwrite http-write.
>
> But we are using our own (web server fiberized) impl already.
>
> So our impl chould be extended to be able to get and process FDs.
>
> client-loop there has
>
>               (lambda (response body)
>                 (write-response response client)
>                 (when body
>                   (put-bytevector client body))
>
> which means the "when body" part should be extended to also handle files, not just bytevectors.

The problem is that even with our fiberized implementation, what we pass
as "body" is checked in "sanitize-response" procedure of Guile's (web
server) module.

With the (very) hacky patch attached, I fool sanitize-response, by
sending the file name as a bytevector. This allows me to save gigabytes
of RAM when downloading disk images.

WDYT?

Thanks,

Mathieu

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-cuirass-Use-sendfiles-instead-of-raw-copies.patch --]
[-- Type: text/x-diff, Size: 5427 bytes --]

From 0c5e91c170639d50d1cc339fa0b0e68ea4fba68c Mon Sep 17 00:00:00 2001
From: Mathieu Othacehe <m.othacehe@gmail.com>
Date: Fri, 1 May 2020 15:03:12 +0200
Subject: [PATCH] cuirass: Use sendfiles instead of raw copies.

* src/cuirass/http.scm (respond-file): Send the file name as an UTF8
bytevector, instead of the raw file content,
(respond-gzipped-file): ditto. Also set 'content-disposition header.
* src/web/server/fiberized.scm (client-loop): Check if 'content-disposition is
set. If it's the case, assume that the bytevector is the file name, and use
sendfiles to send it. Otherwise, keep the existing behaviour and send directly
the received bytevector.
---
 src/cuirass/http.scm         | 25 ++++++++++---------------
 src/web/server/fiberized.scm | 21 +++++++++++++++++++--
 2 files changed, 29 insertions(+), 17 deletions(-)

diff --git a/src/cuirass/http.scm b/src/cuirass/http.scm
index 79fa246..bdc780c 100644
--- a/src/cuirass/http.scm
+++ b/src/cuirass/http.scm
@@ -40,7 +40,8 @@
   #:use-module (web uri)
   #:use-module (fibers)
   #:use-module (fibers channels)
-  #:use-module ((rnrs bytevectors) #:select (utf8->string))
+  #:use-module ((rnrs bytevectors) #:select (utf8->string
+                                             string->utf8))
   #:use-module (sxml simple)
   #:use-module (cuirass templates)
   #:use-module (guix utils)
@@ -246,19 +247,14 @@ Hydra format."
         "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd")
        (sxml->xml body port))))
 
-  (define* (respond-file file
-                         #:key name)
+  (define* (respond-file file)
     (let ((content-type (or (assoc-ref %file-mime-types
                                        (file-extension file))
                             '(application/octet-stream))))
       (respond `((content-type . ,content-type)
-                 ,@(if name
-                       `((content-disposition
-                          . (form-data (filename . ,name))))
-                       '()))
-               ;; FIXME: FILE is potentially big so it'd be better to not load
-               ;; it in memory and instead 'sendfile' it.
-               #:body (call-with-input-file file get-bytevector-all))))
+                 (content-disposition
+                  . (form-data (filename . ,(basename file)))))
+               #:body (string->utf8 file))))
 
   (define (respond-static-file path)
     ;; PATH is a list of path components
@@ -273,10 +269,9 @@ Hydra format."
   (define (respond-gzipped-file file)
     ;; Return FILE with 'gzip' content-encoding.
     (respond `((content-type . (text/plain (charset . "UTF-8")))
-               (content-encoding . (gzip)))
-             ;; FIXME: FILE is potentially big so it'd be better to not load
-             ;; it in memory and instead 'sendfile' it.
-             #:body (call-with-input-file file get-bytevector-all)))
+               (content-encoding . (gzip))
+               (content-disposition . (form-data (filename . ,file))))
+             #:body (string->utf8 file)))
 
   (define (respond-build-not-found build-id)
     (respond-json-with-error
@@ -521,7 +516,7 @@ Hydra format."
 
     (('GET "download" id)
      (let ((path (db-get-build-product-path id)))
-       (respond-file path #:name (basename path))))
+       (respond-file path)))
 
     (('GET "static" path ...)
      (respond-static-file path))
diff --git a/src/web/server/fiberized.scm b/src/web/server/fiberized.scm
index 308b642..68ae132 100644
--- a/src/web/server/fiberized.scm
+++ b/src/web/server/fiberized.scm
@@ -37,6 +37,7 @@
   #:use-module (web request)
   #:use-module (web response)
   #:use-module (web server)
+  #:use-module ((rnrs bytevectors) #:select (utf8->string))
   #:use-module (ice-9 binary-ports)
   #:use-module (ice-9 match)
   #:use-module (fibers)
@@ -92,6 +93,8 @@
               ((0) (memq 'keep-alive (response-connection response)))))
            (else #f)))))
 
+(define extend-response (@@ (web server) extend-response))
+
 (define (client-loop client have-request)
   ;; Always disable Nagle's algorithm, as we handle buffering
   ;; ourselves.
@@ -119,9 +122,23 @@
                                               #:headers '((content-length . 0)))
                               #vu8()))))
               (lambda (response body)
-                (write-response response client)
                 (when body
-                  (put-bytevector client body))
+                  (let* ((headers (response-headers response))
+                         (file? (assq-ref headers 'content-disposition))
+                         (file (and file? (utf8->string body)))
+                         (file-size (and file? (stat:size (stat file)))))
+                    (cond
+                     (file?
+                      (call-with-input-file file
+                        (lambda (port)
+                          (write-response
+                           (extend-response response 'content-length
+                                            file-size)
+                           client)
+                          (sendfile client port file-size))))
+                     (else
+                      (write-response response client)
+                      (put-bytevector client body)))))
                 (force-output client)
                 (if (and (keep-alive? response)
                          (not (eof-object? (peek-char client))))
-- 
2.26.0


  reply	other threads:[~2020-05-01 13:37 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-01  8:54 [bug#40993] cuirass: Add build products download support Mathieu Othacehe
2020-05-01 10:09 ` Danny Milosavljevic
2020-05-01 13:35   ` Mathieu Othacehe [this message]
2020-05-01 21:17     ` Ludovic Courtès
2020-06-03 11:54       ` Mathieu Othacehe
2020-06-03 20:14         ` Ludovic Courtès
2020-05-01 21:11   ` Ludovic Courtès
2020-06-03 20:26 ` Ludovic Courtès
2020-06-10 15:44   ` Mathieu Othacehe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://guix.gnu.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=874ksz4w21.fsf@gmail.com \
    --to=m.othacehe@gmail.com \
    --cc=40993@debbugs.gnu.org \
    --cc=dannym@scratchpost.org \
    --cc=ludo@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).