From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mp0 ([2001:41d0:2:4a6f::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by ms11 with LMTPS id O2DJAqtVzV8YeAAA0tVLHw (envelope-from ) for ; Sun, 06 Dec 2020 22:05:31 +0000 Received: from aspmx1.migadu.com ([2001:41d0:2:4a6f::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by mp0 with LMTPS id CCilOapVzV+IPgAA1q6Kng (envelope-from ) for ; Sun, 06 Dec 2020 22:05:30 +0000 Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by aspmx1.migadu.com (Postfix) with ESMTPS id 8D8089403AA for ; Sun, 6 Dec 2020 22:05:30 +0000 (UTC) Received: from localhost ([::1]:33904 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1km29p-0001QO-EO for larch@yhetil.org; Sun, 06 Dec 2020 17:05:29 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:46478) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1km29R-0001Lw-Ea for guix-patches@gnu.org; Sun, 06 Dec 2020 17:05:05 -0500 Received: from debbugs.gnu.org ([209.51.188.43]:40045) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1km29R-0003L8-6C for guix-patches@gnu.org; Sun, 06 Dec 2020 17:05:05 -0500 Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1km29R-0006ST-0o for guix-patches@gnu.org; Sun, 06 Dec 2020 17:05:05 -0500 X-Loop: help-debbugs@gnu.org Subject: [bug#45018] [PATCH v2 5/6] substitute: Cache and reuse connections while substituting. Resent-From: Ludovic =?UTF-8?Q?Court=C3=A8s?= Original-Sender: "Debbugs-submit" Resent-CC: guix-patches@gnu.org Resent-Date: Sun, 06 Dec 2020 22:05:04 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 45018 X-GNU-PR-Package: guix-patches X-GNU-PR-Keywords: patch To: 45018@debbugs.gnu.org Cc: Ludovic =?UTF-8?Q?Court=C3=A8s?= Received: via spool by 45018-submit@debbugs.gnu.org id=B45018.160729229724764 (code B ref 45018); Sun, 06 Dec 2020 22:05:04 +0000 Received: (at 45018) by debbugs.gnu.org; 6 Dec 2020 22:04:57 +0000 Received: from localhost ([127.0.0.1]:51585 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1km29I-0006RH-Ie for submit@debbugs.gnu.org; Sun, 06 Dec 2020 17:04:57 -0500 Received: from eggs.gnu.org ([209.51.188.92]:56376) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1km29B-0006Py-QH for 45018@debbugs.gnu.org; Sun, 06 Dec 2020 17:04:53 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:44346) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1km295-0003Dt-Jn; Sun, 06 Dec 2020 17:04:44 -0500 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=45440 helo=gnu.org) by fencepost.gnu.org with esmtpsa (TLS1.2:DHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1km295-00020D-6b; Sun, 06 Dec 2020 17:04:43 -0500 From: Ludovic =?UTF-8?Q?Court=C3=A8s?= Date: Sun, 6 Dec 2020 23:04:14 +0100 Message-Id: <20201206220415.23279-6-ludo@gnu.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201206220415.23279-1-ludo@gnu.org> References: <87o8janf50.fsf@gnu.org> <20201206220415.23279-1-ludo@gnu.org> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-BeenThere: guix-patches@gnu.org List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guix-patches-bounces+larch=yhetil.org@gnu.org Sender: "Guix-patches" X-Migadu-Flow: FLOW_IN X-Migadu-Spam-Score: -1.80 Authentication-Results: aspmx1.migadu.com; dkim=none; dmarc=pass (policy=none) header.from=gnu.org; spf=pass (aspmx1.migadu.com: domain of guix-patches-bounces@gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=guix-patches-bounces@gnu.org X-Migadu-Queue-Id: 8D8089403AA X-Spam-Score: -1.80 X-Migadu-Scanner: ns3122888.ip-94-23-21.eu X-TUID: oxkPewES3QqX That way, when fetching a series of substitutes from the same server(s), the connection is reused instead of being closed/opened for each substitutes, which saves on network round trips and TLS handshakes. * guix/http-client.scm (http-fetch): Add #:keep-alive? and honor it. * guix/progress.scm (progress-report-port): Add #:close? parameter and honor it. * guix/scripts/substitute.scm (at-most): Return the tail as a second value. (fetch): Add #:port and #:keep-alive? and honor them. (%max-cached-connections): New variable. (open-connection-for-uri/cached, call-with-cached-connection): New procedures. (with-cached-connection): New macro. (process-substitution): Wrap 'fetch' call in 'with-cached-connection'. Pass #:close? to 'progress-report-port'. --- guix/http-client.scm | 12 ++--- guix/progress.scm | 8 +-- guix/scripts/substitute.scm | 103 ++++++++++++++++++++++++++++++------ nix/libstore/build.cc | 27 ++++++---- 4 files changed, 116 insertions(+), 34 deletions(-) diff --git a/guix/http-client.scm b/guix/http-client.scm index a767175d67..553640fe9e 100644 --- a/guix/http-client.scm +++ b/guix/http-client.scm @@ -1,5 +1,5 @@ ;;; GNU Guix --- Functional package management for GNU -;;; Copyright © 2012, 2013, 2014, 2015, 2016, 2017, 2018 Ludovic Courtès +;;; Copyright © 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2020 Ludovic Courtès ;;; Copyright © 2015 Mark H Weaver ;;; Copyright © 2012, 2015 Free Software Foundation, Inc. ;;; Copyright © 2017 Tobias Geerinckx-Rice @@ -70,6 +70,7 @@ (define* (http-fetch uri #:key port (text? #f) (buffered? #t) + (keep-alive? #f) (verify-certificate? #t) (headers '((user-agent . "GNU Guile"))) timeout) @@ -79,6 +80,9 @@ textual. Follow any HTTP redirection. When BUFFERED? is #f, return an unbuffered port, suitable for use in `filtered-port'. HEADERS is an alist of extra HTTP headers. +When KEEP-ALIVE? is true, the connection is marked as 'keep-alive' and PORT is +not closed upon completion. + When VERIFY-CERTIFICATE? is true, verify HTTPS server certificates. TIMEOUT specifies the timeout in seconds for connection establishment; when @@ -104,11 +108,7 @@ Raise an '&http-get-error' condition if downloading fails." (setvbuf port 'none)) (let*-values (((resp data) (http-get uri #:streaming? #t #:port port - ;; XXX: When #:keep-alive? is true, if DATA is - ;; a chunked-encoding port, closing DATA won't - ;; close PORT, leading to a file descriptor - ;; leak. - #:keep-alive? #f + #:keep-alive? keep-alive? #:headers headers)) ((code) (response-code resp))) diff --git a/guix/progress.scm b/guix/progress.scm index fec65b424c..cd80ae620a 100644 --- a/guix/progress.scm +++ b/guix/progress.scm @@ -337,9 +337,10 @@ should be a object." (report total) (loop total (get-bytevector-n! in buffer 0 buffer-size)))))))) -(define (progress-report-port reporter port) +(define* (progress-report-port reporter port #:key (close? #t)) "Return a port that continuously reports the bytes read from PORT using -REPORTER, which should be a object." +REPORTER, which should be a object. When CLOSE? is true, +PORT is closed when the returned port is closed." (match reporter (($ start report stop) (let* ((total 0) @@ -364,5 +365,6 @@ REPORTER, which should be a object." ;; trace. (unless (zero? total) (stop)) - (close-port port))))))) + (when close? + (close-port port)))))))) diff --git a/guix/scripts/substitute.scm b/guix/scripts/substitute.scm index 4bf496f1bc..732bf073e8 100755 --- a/guix/scripts/substitute.scm +++ b/guix/scripts/substitute.scm @@ -188,9 +188,14 @@ again." (sigaction SIGALRM SIG_DFL) (apply values result))))) -(define* (fetch uri #:key (buffered? #t) (timeout? #t)) +(define* (fetch uri #:key (buffered? #t) (timeout? #t) + (keep-alive? #f) (port #f)) "Return a binary input port to URI and the number of bytes it's expected to -provide." +provide. + +When PORT is true, use it as the underlying I/O port for HTTP transfers; when +PORT is false, open a new connection for URI. When KEEP-ALIVE? is true, the +connection (typically PORT) is kept open once data has been fetched from URI." (case (uri-scheme uri) ((file) (let ((port (open-file (uri-path uri) @@ -206,7 +211,7 @@ provide." ;; sudo tc qdisc add dev eth0 root netem delay 1500ms ;; and then cancel with: ;; sudo tc qdisc del dev eth0 root - (let ((port #f)) + (let ((port port)) (with-timeout (if timeout? %fetch-timeout 0) @@ -217,10 +222,11 @@ provide." (begin (when (or (not port) (port-closed? port)) (set! port (guix:open-connection-for-uri - uri #:verify-certificate? #f)) - (unless (or buffered? (not (file-port? port))) - (setvbuf port 'none))) + uri #:verify-certificate? #f))) + (unless (or buffered? (not (file-port? port))) + (setvbuf port 'none)) (http-fetch uri #:text? #f #:port port + #:keep-alive? keep-alive? #:verify-certificate? #f)))))) (else (leave (G_ "unsupported substitute URI scheme: ~a~%") @@ -478,17 +484,17 @@ indicates that PATH is unavailable at CACHE-URL." (build-request (string->uri url) #:method 'GET #:headers headers))) (define (at-most max-length lst) - "If LST is shorter than MAX-LENGTH, return it; otherwise return its -MAX-LENGTH first elements." + "If LST is shorter than MAX-LENGTH, return it and the empty list; otherwise +return its MAX-LENGTH first elements and its tail." (let loop ((len 0) (lst lst) (result '())) (match lst (() - (reverse result)) + (values (reverse result) '())) ((head . tail) (if (>= len max-length) - (reverse result) + (values (reverse result) lst) (loop (+ 1 len) tail (cons head result))))))) (define* (http-multiple-get base-uri proc seed requests @@ -962,6 +968,68 @@ the URI, its compression method (a string), and the compressed file size." (((uri compression file-size) _ ...) (values uri compression file-size)))) +(define %max-cached-connections + ;; Maximum number of connections kept in cache by + ;; 'open-connection-for-uri/cached'. + 16) + +(define open-connection-for-uri/cached + (let ((cache '())) + (lambda* (uri #:key fresh?) + "Return a connection for URI, possibly reusing a cached connection. +When FRESH? is true, delete any cached connections for URI and open a new +one. Return #f if URI's scheme is 'file' or #f." + (define host (uri-host uri)) + (define scheme (uri-scheme uri)) + (define key (list host scheme (uri-port uri))) + + (and (not (memq scheme '(file #f))) + (match (assoc-ref cache key) + (#f + ;; Open a new connection to URI and evict old entries from + ;; CACHE, if any. + (let-values (((socket) + (guix:open-connection-for-uri + uri #:verify-certificate? #f)) + ((new-cache evicted) + (at-most (- %max-cached-connections 1) cache))) + (for-each (match-lambda + ((_ . port) + (false-if-exception (close-port port)))) + evicted) + (set! cache (alist-cons key socket new-cache)) + socket)) + (socket + (if (or fresh? (port-closed? socket)) + (begin + (false-if-exception (close-port socket)) + (set! cache (alist-delete key cache)) + (open-connection-for-uri/cached uri)) + (begin + ;; Drain input left from the previous use. + (drain-input socket) + socket)))))))) + +(define (call-with-cached-connection uri proc) + (let ((port (open-connection-for-uri/cached uri))) + (catch #t + (lambda () + (proc port)) + (lambda (key . args) + ;; If PORT was cached and the server closed the connection in the + ;; meantime, we get EPIPE. In that case, open a fresh connection and + ;; retry. We might also get 'bad-response or a similar exception from + ;; (web response) later on, once we've sent the request. + (if (or (and (eq? key 'system-error) + (= EPIPE (system-error-errno `(,key ,@args)))) + (memq key '(bad-response bad-header bad-header-component))) + (proc (open-connection-for-uri/cached uri #:fresh? #t)) + (apply throw key args)))))) + +(define-syntax-rule (with-cached-connection uri port exp ...) + "Bind PORT with EXP... to a socket connected to URI." + (call-with-cached-connection uri (lambda (port) exp ...))) + (define* (process-substitution store-item destination #:key cache-urls acl print-build-trace?) "Substitute STORE-ITEM (a store file name) from CACHE-URLS, and write it to @@ -984,10 +1052,12 @@ DESTINATION as a nar file. Verify the substitute against ACL." (G_ "Downloading ~a...~%") (uri->string uri))) (let*-values (((raw download-size) - ;; Note that Hydra currently generates Nars on the fly - ;; and doesn't specify a Content-Length, so - ;; DOWNLOAD-SIZE is #f in practice. - (fetch uri #:buffered? #f #:timeout? #f)) + ;; 'guix publish' without '--cache' doesn't specify a + ;; Content-Length, so DOWNLOAD-SIZE is #f in this case. + (with-cached-connection uri port + (fetch uri #:buffered? #f #:timeout? #f + #:port port + #:keep-alive? #t))) ((progress) (let* ((dl-size (or download-size (and (equal? compression "none") @@ -1001,7 +1071,9 @@ DESTINATION as a nar file. Verify the substitute against ACL." (uri->string uri) dl-size (current-error-port) #:abbreviation nar-uri-abbreviation)))) - (progress-report-port reporter raw))) + ;; Keep RAW open upon completion so we can later reuse + ;; the underlying connection. + (progress-report-port reporter raw #:close? #f))) ((input pids) ;; NOTE: This 'progress' port of current process will be ;; closed here, while the child process doing the @@ -1216,6 +1288,7 @@ default value." ;;; Local Variables: ;;; eval: (put 'with-timeout 'scheme-indent-function 1) +;;; eval: (put 'with-cached-connection 'scheme-indent-function 2) ;;; End: ;;; substitute.scm ends here diff --git a/nix/libstore/build.cc b/nix/libstore/build.cc index 50d300253d..6cfe7aba7e 100644 --- a/nix/libstore/build.cc +++ b/nix/libstore/build.cc @@ -3114,17 +3114,24 @@ void SubstitutionGoal::handleChildOutput(int fd, const string & data) } if (fd == substituter->fromAgent.readSide) { - /* Trim whitespace to the right. */ - size_t end = data.find_last_not_of(" \t\n"); - string trimmed = (end != string::npos) ? data.substr(0, end + 1) : data; + /* DATA may consist of several lines. Process them one by one. */ + string input = data; + while (!input.empty()) { + /* Process up to the first newline. */ + size_t end = input.find_first_of("\n"); + string trimmed = (end != string::npos) ? input.substr(0, end) : input; - if (expectedHashStr == "") { - expectedHashStr = trimmed; - } else if (status == "") { - status = trimmed; - worker.wakeUp(shared_from_this()); - } else { - printMsg(lvlError, format("unexpected substituter message '%1%'") % data); + /* Update the goal's state accordingly. */ + if (expectedHashStr == "") { + expectedHashStr = trimmed; + } else if (status == "") { + status = trimmed; + worker.wakeUp(shared_from_this()); + } else { + printMsg(lvlError, format("unexpected substituter message '%1%'") % input); + } + + input = (end != string::npos) ? input.substr(end + 1) : ""; } } } -- 2.29.2