all messages for Guix-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: Mathieu Othacehe <othacehe@gnu.org>
To: 45006@debbugs.gnu.org
Subject: [bug#45006] cuirass: Add remote build support.
Date: Wed, 02 Dec 2020 12:04:56 +0100	[thread overview]
Message-ID: <87czzso4dj.fsf@gnu.org> (raw)

[-- Attachment #1: Type: text/plain, Size: 1407 bytes --]


Hello,

Here's a patch adding remote build support to Cuirass, as presented
during Guix Days[1]. The concept is the following:

* Cuirass talks to a "remote server" instead of the "guix-daemon" to
build derivations when the "--build-remote" option is passed.

* The "remote server" is advertised using Avahi. It queues the received
build requests. It also starts a "publish" server.

* The "remote workers" discover the "remote server" using Avahi, connect
to it and request some builds. The "remote server" publish server is
added to the workers "guix-daemon" substitute urls list.

* On build completion, the "remote server" downloads the build outputs
as nar and narinfo files from the worker "publish" server and store them
in a cache directory. It can also add them to the store if the
"--add-to-store" option is passed.

* Cuirass is notified by the "remote server" when a build starts, fails
or completes and can update its database accordingly.

* The communication between Cuirass, the "remote server" and the "remote
workers" is done by sending SEXP over ZMQ.

This is still a bit rough on the edges, but I have tested it on berlin
spawning ~30 workers and building ~10K derivations, it seems to work
fine.

The corresponding patch and an architecture overview diagram are attached.

Thanks,

Mathieu

[1]:
https://xana.lepiller.eu/guix-days-2020/guix-days-2020-mathieu-otacehe-fixing-the-ci.mp4

[-- Attachment #2: 0001-Add-remote-build-support.patch --]
[-- Type: text/x-diff, Size: 61684 bytes --]

From 94898f67e1dca6152c434ff50e860691ce813018 Mon Sep 17 00:00:00 2001
From: Mathieu Othacehe <othacehe@gnu.org>
Date: Wed, 2 Dec 2020 11:13:33 +0100
Subject: [PATCH] Add remote build support.

* src/cuirass/remote.scm: New file.
* src/cuirass/remote-server.scm: New file.
* src/cuirass/remote-worker.scm: New file.
* bin/remote-server.in: New file.
* bin/remote-worker.in: New file.
* Makefile.am (bin_SCRIPTS): Add new binaries,
(dist_pkgmodule_DATA): add new files,
(EXTRA_DIST): add new binaries,
(bin/remote-server, bin/remote-worker): new targets.
* .gitignore: Add new binaries.
* bin/cuirass.in (%options): Add "--build-remote" option,
(show-help): document it,
(main): honor it.
* src/cuirass/base.scm (with-build-offload-thread): New macro,
(%build-remote?, %build-offload-channel): new parameters,
(make-build-offload-thread): new procedure,
(build-derivations/offload): new procedure,
(restart-builds): use it to offload builds when %build-remote? is set,
(build-packages): ditto.
---
 .gitignore                    |   2 +
 Makefile.am                   |  16 +-
 bin/cuirass.in                | 162 ++++++-----
 bin/remote-server.in          |  29 ++
 bin/remote-worker.in          |  29 ++
 src/cuirass/base.scm          |  65 ++++-
 src/cuirass/remote-server.scm | 518 ++++++++++++++++++++++++++++++++++
 src/cuirass/remote-worker.scm | 286 +++++++++++++++++++
 src/cuirass/remote.scm        | 292 +++++++++++++++++++
 9 files changed, 1318 insertions(+), 81 deletions(-)
 create mode 100644 bin/remote-server.in
 create mode 100644 bin/remote-worker.in
 create mode 100644 src/cuirass/remote-server.scm
 create mode 100644 src/cuirass/remote-worker.scm
 create mode 100644 src/cuirass/remote.scm

diff --git a/.gitignore b/.gitignore
index beabf29..7cd0e1f 100644
--- a/.gitignore
+++ b/.gitignore
@@ -12,6 +12,8 @@
 /bin/cuirass
 /bin/cuirass-send-events
 /bin/evaluate
+/bin/remote-server
+/bin/remote-worker
 /build-aux/config.guess
 /build-aux/config.sub
 /build-aux/install-sh
diff --git a/Makefile.am b/Makefile.am
index 17a73f0..270c0ed 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -22,7 +22,13 @@
 # You should have received a copy of the GNU General Public License
 # along with Cuirass.  If not, see <http://www.gnu.org/licenses/>.
 
-bin_SCRIPTS = bin/cuirass bin/cuirass-send-events bin/evaluate
+bin_SCRIPTS =                    \
+  bin/cuirass      	  	 \
+  bin/cuirass-send-events 	 \
+  bin/evaluate      		 \
+  bin/remote-server		 \
+  bin/remote-worker
+
 noinst_SCRIPTS = pre-inst-env
 
 guilesitedir = $(datarootdir)/guile/site/@GUILE_EFFECTIVE_VERSION@
@@ -48,6 +54,9 @@ dist_pkgmodule_DATA =				\
   src/cuirass/http.scm				\
   src/cuirass/logging.scm			\
   src/cuirass/metrics.scm			\
+  src/cuirass/remote.scm			\
+  src/cuirass/remote-server.scm			\
+  src/cuirass/remote-worker.scm			\
   src/cuirass/send-events.scm			\
   src/cuirass/ui.scm				\
   src/cuirass/utils.scm		                \
@@ -166,6 +175,8 @@ EXTRA_DIST = \
   bin/cuirass.in \
   bin/cuirass-send-events.in \
   bin/evaluate.in \
+  bin/remote-server.in \
+  bin/remote-worker.in \
   bootstrap \
   build-aux/guix.scm \
   src/cuirass/config.scm.in \
@@ -226,6 +237,9 @@ generate_file = \
 bin/cuirass: $(srcdir)/bin/cuirass.in
 bin/cuirass-send-events: $(srcdir)/bin/cuirass-send-events.in
 bin/evaluate: $(srcdir)/bin/evaluate.in
+bin/remote-server: $(srcdir)/bin/remote-server.in
+bin/remote-worker: $(srcdir)/bin/remote-worker.in
+
 $(bin_SCRIPTS): Makefile
 	$(generate_file); chmod +x $@
 src/cuirass/config.scm: $(srcdir)/src/cuirass/config.scm.in Makefile
diff --git a/bin/cuirass.in b/bin/cuirass.in
index aef4a65..ac9811c 100644
--- a/bin/cuirass.in
+++ b/bin/cuirass.in
@@ -57,6 +57,7 @@ exec ${GUILE:-@GUILE@} --no-auto-compile -e main -s "$0" "$@"
       --listen=HOST         Listen on the network interface for HOST
   -I, --interval=N          Wait N seconds between each poll
       --log-queries=FILE    Log SQL queries in FILE.
+      --build-remote        Use the remote build mechanism
       --use-substitutes     Allow usage of pre-built substitutes
       --record-events       Record events for distribution
       --threads=N           Use up to N kernel threads
@@ -74,6 +75,7 @@ exec ${GUILE:-@GUILE@} --no-auto-compile -e main -s "$0" "$@"
     (port           (single-char #\p) (value #t))
     (listen                           (value #t))
     (interval       (single-char #\I) (value #t))
+    (build-remote                     (value #f))
     (use-substitutes                  (value #f))
     (threads                          (value #t))
     (fallback                         (value #f))
@@ -100,6 +102,7 @@ exec ${GUILE:-@GUILE@} --no-auto-compile -e main -s "$0" "$@"
          (%package-database (option-ref opts 'database (%package-database)))
          (%package-cachedir
           (option-ref opts 'cache-directory (%package-cachedir)))
+         (%build-remote? (option-ref opts 'build-remote #f))
          (%use-substitutes? (option-ref opts 'use-substitutes #f))
          (%fallback? (option-ref opts 'fallback #f))
          (%record-events? (option-ref opts 'record-events #f))
@@ -141,84 +144,87 @@ exec ${GUILE:-@GUILE@} --no-auto-compile -e main -s "$0" "$@"
            (lambda ()
              (with-database
                (with-queue-writer-worker
-                (and specfile
-                     (let ((new-specs (save-module-excursion
-                                       (lambda ()
-                                         (set-current-module (make-user-module '()))
-                                         (primitive-load specfile)))))
-                       (for-each db-add-specification new-specs)))
-
-                (when queries-file
-                  (log-message "Enable SQL query logging.")
-                  (db-log-queries queries-file))
-
-                (if one-shot?
-                    (process-specs (db-get-specifications))
-                    (let ((exit-channel (make-channel)))
-                      (start-watchdog)
-                      (if (option-ref opts 'web #f)
-                          (begin
-                            (spawn-fiber
-                             (essential-task
-                              'web exit-channel
-                              (lambda ()
-                                (run-cuirass-server #:host host #:port port)))
-                             #:parallel? #t)
-
-                            (spawn-fiber
-                             (essential-task
-                              'monitor exit-channel
-                              (lambda ()
-                                (while #t
-                                  (log-monitoring-stats)
-                                  (sleep 600))))))
-
-                          (begin
-                            (clear-build-queue)
-
-                            ;; If Cuirass was stopped during an evaluation,
-                            ;; abort it. Builds that were not registered
-                            ;; during this evaluation will be registered
-                            ;; during the next evaluation.
-                            (db-abort-pending-evaluations)
-
-                            ;; First off, restart builds that had not
-                            ;; completed or were not even started on a
-                            ;; previous run.
-                            (spawn-fiber
-                             (essential-task
-                              'restart-builds exit-channel
-                              (lambda ()
-                                (restart-builds))))
-
-                            (spawn-fiber
-                             (essential-task
-                              'build exit-channel
-                              (lambda ()
-                                (while #t
-                                  (process-specs (db-get-specifications))
-                                  (log-message
-                                   "next evaluation in ~a seconds" interval)
-                                  (sleep interval)))))
-
-                            (spawn-fiber
-                             (essential-task
-                              'metrics exit-channel
-                              (lambda ()
-                                (while #t
-                                  (with-time-logging
-                                   "Metrics update"
-                                   (db-update-metrics))
-                                  (sleep 3600)))))
-
-                            (spawn-fiber
-                             (essential-task
-                              'monitor exit-channel
-                              (lambda ()
-                                (while #t
-                                  (log-monitoring-stats)
-                                  (sleep 600)))))))
-                      (primitive-exit (get-message exit-channel)))))))
+                 (with-build-offload-thread
+                  (and specfile
+                       (let ((new-specs (save-module-excursion
+                                         (lambda ()
+                                           (set-current-module
+                                            (make-user-module '()))
+                                           (primitive-load specfile)))))
+                         (for-each db-add-specification new-specs)))
+
+                  (when queries-file
+                    (log-message "Enable SQL query logging.")
+                    (db-log-queries queries-file))
+
+                  (if one-shot?
+                      (process-specs (db-get-specifications))
+                      (let ((exit-channel (make-channel)))
+                        (start-watchdog)
+                        (if (option-ref opts 'web #f)
+                            (begin
+                              (spawn-fiber
+                               (essential-task
+                                'web exit-channel
+                                (lambda ()
+                                  (run-cuirass-server #:host host
+                                                      #:port port)))
+                               #:parallel? #t)
+
+                              (spawn-fiber
+                               (essential-task
+                                'monitor exit-channel
+                                (lambda ()
+                                  (while #t
+                                    (log-monitoring-stats)
+                                    (sleep 600))))))
+
+                            (begin
+                              (clear-build-queue)
+
+                              ;; If Cuirass was stopped during an evaluation,
+                              ;; abort it. Builds that were not registered
+                              ;; during this evaluation will be registered
+                              ;; during the next evaluation.
+                              (db-abort-pending-evaluations)
+
+                              ;; First off, restart builds that had not
+                              ;; completed or were not even started on a
+                              ;; previous run.
+                              (spawn-fiber
+                               (essential-task
+                                'restart-builds exit-channel
+                                (lambda ()
+                                  (restart-builds))))
+
+                              (spawn-fiber
+                               (essential-task
+                                'build exit-channel
+                                (lambda ()
+                                  (while #t
+                                    (process-specs (db-get-specifications))
+                                    (log-message
+                                     "next evaluation in ~a seconds" interval)
+                                    (sleep interval)))))
+
+                              (spawn-fiber
+                               (essential-task
+                                'metrics exit-channel
+                                (lambda ()
+                                  (while #t
+                                    (with-time-logging
+                                     "Metrics update"
+                                     (db-update-metrics))
+                                    (sleep 3600)))))
+
+                              (spawn-fiber
+                               (essential-task
+                                'monitor exit-channel
+                                (lambda ()
+                                  (while #t
+                                    (log-monitoring-stats)
+                                    (sleep 600)))))))
+                        (primitive-exit (get-message exit-channel))))))))
 
            ;; Most of our code is I/O so preemption doesn't matter much (it
            ;; could help while we're doing SQL requests, for instance, but it
diff --git a/bin/remote-server.in b/bin/remote-server.in
new file mode 100644
index 0000000..6425d51
--- /dev/null
+++ b/bin/remote-server.in
@@ -0,0 +1,29 @@
+#!/bin/sh
+# -*- scheme -*-
+# @configure_input@
+#GUILE_LOAD_PATH="@PACKAGE_LOAD_PATH@${GUILE_LOAD_PATH:+:}$GUILE_LOAD_PATH"
+#GUILE_LOAD_COMPILED_PATH="@PACKAGE_LOAD_COMPILED_PATH@${GUILE_LOAD_COMPILED_PATH:+:}$GUILE_LOAD_COMPILED_PATH"
+exec ${GUILE:-@GUILE@} --no-auto-compile -e main -s "$0" "$@"
+!#
+;;; remote-server.in -- Remote build server.
+;;; Copyright © 2020 Mathieu Othacehe <othacehe@gnu.org>
+;;;
+;;; This file is part of Cuirass.
+;;;
+;;; Cuirass is free software: you can redistribute it and/or modify
+;;; it under the terms of the GNU General Public License as published by
+;;; the Free Software Foundation, either version 3 of the License, or
+;;; (at your option) any later version.
+;;;
+;;; Cuirass is distributed in the hope that it will be useful,
+;;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;;; GNU General Public License for more details.
+;;;
+;;; You should have received a copy of the GNU General Public License
+;;; along with Cuirass.  If not, see <http://www.gnu.org/licenses/>.
+
+(use-modules (cuirass remote-server))
+
+(define* (main #:optional (args (command-line)))
+  (remote-server (cdr args)))
diff --git a/bin/remote-worker.in b/bin/remote-worker.in
new file mode 100644
index 0000000..8a3830c
--- /dev/null
+++ b/bin/remote-worker.in
@@ -0,0 +1,29 @@
+#!/bin/sh
+# -*- scheme -*-
+# @configure_input@
+#GUILE_LOAD_PATH="@PACKAGE_LOAD_PATH@${GUILE_LOAD_PATH:+:}$GUILE_LOAD_PATH"
+#GUILE_LOAD_COMPILED_PATH="@PACKAGE_LOAD_COMPILED_PATH@${GUILE_LOAD_COMPILED_PATH:+:}$GUILE_LOAD_COMPILED_PATH"
+exec ${GUILE:-@GUILE@} --no-auto-compile -e main -s "$0" "$@"
+!#
+;;; remote-worker.in -- Remote build worker.
+;;; Copyright © 2020 Mathieu Othacehe <othacehe@gnu.org>
+;;;
+;;; This file is part of Cuirass.
+;;;
+;;; Cuirass is free software: you can redistribute it and/or modify
+;;; it under the terms of the GNU General Public License as published by
+;;; the Free Software Foundation, either version 3 of the License, or
+;;; (at your option) any later version.
+;;;
+;;; Cuirass is distributed in the hope that it will be useful,
+;;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;;; GNU General Public License for more details.
+;;;
+;;; You should have received a copy of the GNU General Public License
+;;; along with Cuirass.  If not, see <http://www.gnu.org/licenses/>.
+
+(use-modules (cuirass remote-worker))
+
+(define* (main #:optional (args (command-line)))
+  (remote-worker (cdr args)))
diff --git a/src/cuirass/base.scm b/src/cuirass/base.scm
index c3ce900..c526f5c 100644
--- a/src/cuirass/base.scm
+++ b/src/cuirass/base.scm
@@ -22,8 +22,10 @@
 
 (define-module (cuirass base)
   #:use-module (fibers)
+  #:use-module (fibers channels)
   #:use-module (cuirass logging)
   #:use-module (cuirass database)
+  #:use-module (cuirass remote)
   #:use-module (cuirass utils)
   #:use-module ((cuirass config) #:select (%localstatedir))
   #:use-module (gnu packages)
@@ -36,9 +38,13 @@
   #:use-module ((guix config) #:select (%state-directory))
   #:use-module (git)
   #:use-module (ice-9 binary-ports)
+  #:use-module ((ice-9 suspendable-ports)
+                #:select (current-read-waiter
+                          current-write-waiter))
   #:use-module (ice-9 format)
   #:use-module (ice-9 match)
   #:use-module (ice-9 popen)
+  #:use-module (ice-9 ports internal)
   #:use-module (ice-9 rdelim)
   #:use-module (ice-9 receive)
   #:use-module (ice-9 regex)
@@ -65,11 +71,13 @@
             prepare-git
             process-specs
             evaluation-log-file
+            with-build-offload-thread
 
             ;; Parameters.
             %package-cachedir
             %gc-root-directory
             %gc-root-ttl
+            %build-remote?
             %use-substitutes?
             %fallback?))
 
@@ -102,6 +110,10 @@
    (define time-monotonic time-tai))
   (else #t))
 
+(define %build-remote?
+  ;; Define whether to use the remote build mechanism.
+  (make-parameter #f))
+
 (define %use-substitutes?
   ;; Define whether to use substitutes
   (make-parameter #f))
@@ -110,6 +122,10 @@
   ;; Define whether to fall back to building when the substituter fails.
   (make-parameter #f))
 
+(define %build-offload-channel
+  ;; Channel to communicate with the remote build server.
+  (make-parameter #f))
+
 (define %package-cachedir
   ;; Define to location of cache directory of this package.
   (make-parameter (or (getenv "CUIRASS_CACHEDIR")
@@ -436,6 +452,39 @@ Essentially this procedure inverts the inversion-of-control that
                   (raise c))
                  (x x)))))))
 
+(define (make-build-offload-thread)
+  "Return a channel used to offload builds by communicating with the remote
+build server in a separate thread.  The spawned thread also polls for build
+events sent by the remote server and calls HANDLE-BUILD-EVENT to register them
+in the database."
+  (let ((channel (make-channel)))
+    (call-with-new-thread
+     (lambda ()
+       (parameterize (((@@ (fibers internal) current-fiber) #f)
+                      (current-read-waiter (lambda (port)
+                                             (port-poll port "r")))
+                      (current-write-waiter (lambda (port)
+                                              (port-poll port "w"))))
+         (let ((socket (remote-build-socket)))
+           (let loop ()
+             (remote-build-poll socket handle-build-event)
+             (match (get-message-with-timeout channel
+                                              #:seconds 1
+                                              #:retry? #f)
+               ((drvs . systems)
+                (remote-build socket drvs systems))
+               ('timeout #f))
+             (loop))))))
+    channel))
+
+(define-syntax-rule (with-build-offload-thread body ...)
+  (parameterize ((%build-offload-channel
+                  (make-build-offload-thread)))
+    body ...))
+
+(define (build-derivations/offload drvs systems)
+  (put-message (%build-offload-channel) (cons drvs systems)))
+
 \f
 ;;;
 ;;; Building packages.
@@ -641,7 +690,14 @@ started)."
       ;; Those in VALID can be restarted.  If some of them were built in the
       ;; meantime behind our back, that's fine: 'spawn-builds' will DTRT.
       (log-message "restarting ~a pending builds" (length valid))
-      (spawn-builds store valid)
+      (if (%build-remote?)
+          (let* ((builds (map db-get-build valid))
+                 (systems (map (cut assq-ref <> #:system) builds)))
+            ;; The system could by read from the store by the remote build
+            ;; server using the derivation name, but it is far less expensive
+            ;; to read it from the database.
+            (build-derivations/offload valid systems))
+          (spawn-builds store valid))
       (log-message "done with restarted builds"))))
 
 (define (create-build-outputs build product-specs)
@@ -690,7 +746,12 @@ by PRODUCT-SPECS."
   (db-set-evaluation-status eval-id
                             (evaluation-status succeeded))
 
-  (spawn-builds store derivations)
+  (if (%build-remote?)
+      (let* ((builds (map db-get-build derivations))
+             (systems (map (cut assq-ref <> #:system) builds)))
+        ;; See the comment above regarding system read.
+        (build-derivations/offload derivations systems))
+      (spawn-builds store derivations))
 
   (let* ((results (filter-map (cut db-get-build <>) derivations))
          (status (map (cut assq-ref <> #:status) results))
diff --git a/src/cuirass/remote-server.scm b/src/cuirass/remote-server.scm
new file mode 100644
index 0000000..6217918
--- /dev/null
+++ b/src/cuirass/remote-server.scm
@@ -0,0 +1,518 @@
+;;; remote-server.scm -- Remote build server.
+;;; Copyright © 2020 Mathieu Othacehe <othacehe@gnu.org>
+;;;
+;;; This file is part of Cuirass.
+;;;
+;;; GNU Guix is free software; you can redistribute it and/or modify it
+;;; under the terms of the GNU General Public License as published by
+;;; the Free Software Foundation; either version 3 of the License, or (at
+;;; your option) any later version.
+;;;
+;;; GNU Guix is distributed in the hope that it will be useful, but
+;;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;;; GNU General Public License for more details.
+;;;
+;;; You should have received a copy of the GNU General Public License
+;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.
+
+(define-module (cuirass remote-server)
+  #:use-module (cuirass base)
+  #:use-module (cuirass remote)
+  #:use-module (gcrypt pk-crypto)
+  #:use-module (guix avahi)
+  #:use-module (guix base32)
+  #:use-module (guix base64)
+  #:use-module (guix config)
+  #:use-module (guix derivations)
+  #:use-module (guix records)
+  #:use-module (guix packages)
+  #:use-module (guix pki)
+  #:use-module (guix scripts)
+  #:use-module (guix store)
+  #:use-module (guix ui)
+  #:use-module (guix workers)
+  #:use-module (guix build download)
+  #:use-module (guix build syscalls)
+  #:use-module (gcrypt hash)
+  #:use-module (gcrypt pk-crypto)
+  #:use-module (simple-zmq)
+  #:use-module (rnrs bytevectors)
+  #:use-module (srfi srfi-1)
+  #:use-module (srfi srfi-26)
+  #:use-module (srfi srfi-37)
+  #:use-module (ice-9 atomic)
+  #:use-module (ice-9 match)
+  #:use-module (ice-9 q)
+  #:use-module (ice-9 rdelim)
+  #:use-module (ice-9 regex)
+  #:use-module (ice-9 threads)
+
+  #:export (remote-server))
+
+;; Indicate if the process has to be stopped.
+(define %stop-process?
+  (make-atomic-box #f))
+
+;; Whether to add build items to the store.
+(define %add-to-store?
+  (make-parameter #f))
+
+(define %cache-directory
+  (make-parameter #f))
+
+(define %private-key
+  (make-parameter #f))
+
+(define %public-key
+  (make-parameter #f))
+
+(define service-name
+  "Cuirass remote server")
+
+(define (show-help)
+  (format #t (G_ "Usage: remote-server [OPTION]...
+Start a remote build server.\n"))
+  (display (G_ "
+  -a, --add-to-store        register built items to the store"))
+  (display (G_ "
+  -b, --backend-port=PORT   listen worker connections on PORT"))
+  (display (G_ "
+  -p, --publish-port=PORT   publish substitutes on PORT"))
+  (display (G_ "
+  -c, --cache=DIRECTORY     cache built items to DIRECTORY"))
+  (display (G_ "
+      --public-key=FILE     use FILE as the public key for signatures"))
+  (display (G_ "
+      --private-key=FILE    use FILE as the private key for signatures"))
+  (newline)
+  (display (G_ "
+  -h, --help                display this help and exit"))
+  (display (G_ "
+  -V, --version             display version information and exit"))
+  (newline)
+  (show-bug-report-information))
+
+(define %options
+  (list (option '(#\h "help") #f #f
+                (lambda _
+                  (show-help)
+                  (exit 0)))
+        (option '(#\V "version") #f #f
+                (lambda _
+                  (show-version-and-exit "guix publish")))
+        (option '(#\a "add-to-store") #t #f
+                (lambda (opt name arg result)
+                  (alist-cons 'add-to-store? arg result)))
+        (option '(#\b "backend-port") #t #f
+                (lambda (opt name arg result)
+                  (alist-cons 'backend-port (string->number* arg) result)))
+        (option '(#\p "publish-port") #t #f
+                (lambda (opt name arg result)
+                  (alist-cons 'publish-port (string->number* arg) result)))
+        (option '(#\c "cache") #t #f
+                (lambda (opt name arg result)
+                  (alist-cons 'cache arg result)))
+        (option '("public-key") #t #f
+                (lambda (opt name arg result)
+                  (alist-cons 'public-key-file arg result)))
+        (option '("private-key") #t #f
+                (lambda (opt name arg result)
+                  (alist-cons 'private-key-file arg result)))))
+
+(define %default-options
+  `((backend-port . 5555)
+    (publish-port . 5556)
+    (public-key-file . ,%public-key-file)
+    (private-key-file . ,%private-key-file)))
+
+\f
+;;;
+;;; Build workers.
+;;;
+
+(define %workers
+  ;; Set of connected workers.
+  (make-hash-table))
+
+(define %build-queues
+  ;; Builds request queue.
+  (map (lambda (system)
+         (cons system (make-q)))
+       %supported-systems))
+
+(define (find-system-queues systems)
+  "Return the list of build queues for SYSTEMS that are not empty."
+  (filter-map (match-lambda
+                ((system . queue)
+                 (and (member system systems)
+                      (not (q-empty? queue))
+                      queue)))
+              %build-queues))
+
+(define (build-available? name)
+  "Return #t if there is some available work for the worker with the given
+NAME and #f otherwise."
+  (let* ((worker (hash-ref %workers name))
+         (systems (worker-systems worker))
+         (queues (find-system-queues systems)))
+    (not (null? queues))))
+
+(define (pop-random-build name)
+  "Pop randomly and return a build from all the build queues with available
+work for the worker with the given NAME."
+  (define (random-queue queues)
+    (list-ref queues (random (length queues))))
+
+  (let* ((worker (hash-ref %workers name))
+         (systems (worker-systems worker))
+         (queues (find-system-queues systems)))
+    (q-pop! (random-queue queues))))
+
+(define* (read-client-exp client exp)
+  "Read the given EXP sent by CLIENT."
+  (catch 'system-error
+    (lambda ()
+      (match (zmq-read-message exp)
+        (('build ('drv drv) ('system system))
+         (let ((system (or system
+                           (derivation-system
+                            (read-derivation-from-file drv)))))
+           ;; Push the derivation to the matching queue according to the
+           ;; targeted system. Also save the client ID in the queue to be able
+           ;; to send it build events later on.
+           (q-push! (assoc-ref %build-queues system)
+                    (list client drv))))))
+    (const #f)))
+
+(define* (read-worker-exp exp #:key reply-worker)
+  "Read the given EXP sent by a worker.  REPLY-WORKER is a procedure that can
+be used to reply to the worker."
+  (match (zmq-read-message exp)
+    (('worker-ready worker)
+     (let* ((worker* (sexp->worker worker))
+            (name (worker-name worker*)))
+       (info (G_ "Worker `~a' is ready.~%") name)
+       (hash-set! %workers name worker*)))
+    (('worker-request-work name)
+     (if (build-available? name)
+         (match (pop-random-build name)
+           ((client drv)
+            (reply-worker client (zmq-build-request-message drv))))
+         (reply-worker
+          (zmq-empty-delimiter)
+          (zmq-no-build-message))))))
+
+\f
+;;;
+;;; Fetch workers.
+;;;
+
+(define (zmq-fetch-workers-endpoint)
+  "inproc://fetch-workers")
+
+(define (zmq-fetch-worker-socket)
+  "Return a socket used to communicate with the fetch workers."
+  (let ((socket (zmq-create-socket %zmq-context ZMQ_DEALER))
+        (endpoint (zmq-fetch-workers-endpoint)))
+    (zmq-connect socket endpoint)
+    socket))
+
+(define (strip-store-prefix file)
+  ; Given a file name like "/gnu/store/…-foo-1.2/bin/foo", return
+  ;; "/bin/foo".
+  (let* ((len  (string-length %store-directory))
+         (base (string-drop file (+ 1 len))))
+    (match (string-index base #\/)
+      (#f    base)
+      (index (string-drop base index)))))
+
+(define (publish-nar-url publish-url store-hash)
+  "Return the URL of STORE-HASH nar substitute on PUBLISH-URL."
+  (format #f "~a/nar/gzip/~a" publish-url store-hash))
+
+(define (publish-narinfo-url publish-url store-hash)
+  "Return the URL of STORE-HASH narinfo file on PUBLISH-URL."
+  (let ((hash (and=> (string-index store-hash #\-)
+                     (cut string-take store-hash <>))))
+    (format #f "~a/~a.narinfo" publish-url hash)))
+
+(define (nar-path cache-directory output)
+  "Return the path of the NAR file for OUTPUT in CACHE-DIRECTORY."
+  (string-append cache-directory "/" (basename output) ".nar"))
+
+(define (narinfo-path cache-directory output)
+  "Return the path of the NARINFO file for OUTPUT in CACHE-DIRECTORY."
+  (string-append cache-directory "/" (basename output) ".narinfo"))
+
+(define* (sign-narinfo! narinfo)
+  "Edit the given NARINFO file to replace the worker signature by the remote
+build server signature."
+  (define (signed-string s)
+    (let* ((hash (bytevector->hash-data (sha256 (string->utf8 s))
+                                        #:key-type (key-type (%public-key)))))
+      (signature-sexp hash (%private-key) (%public-key))))
+
+  (define base64-encode-string
+    (compose base64-encode string->utf8))
+
+  (define lines
+    (call-with-input-file narinfo
+      (lambda (port)
+        (let loop ((line (read-line port))
+                   (lines '()))
+          (if (eof-object? line)
+              (reverse lines)
+              (loop (read-line port)
+                    (cons line lines)))))))
+  (let* ((lines
+          (filter (lambda (line)
+                    (not (string-match "^Signature:" line)))
+                  lines))
+         (info (format #f "~a~%" (string-join lines "\n")))
+         (signature (base64-encode-string
+                     (canonical-sexp->string (signed-string info)))))
+    (call-with-output-file narinfo
+      (lambda (port)
+        (format port "~aSignature: 1;~a;~a~%"
+                info (gethostname) signature)))))
+
+(define (download-nar cache-directory outputs url)
+  "Download in CACHE-DIRECTORY the OUTPUTS from the substitute server at URL."
+  (for-each
+   (lambda (output)
+     (let* ((path (derivation-output-path output))
+            (store-hash (strip-store-prefix path))
+            (nar-file (nar-path cache-directory store-hash))
+            (narinfo-file (narinfo-path cache-directory store-hash))
+            (nar-url (publish-nar-url url store-hash))
+            (narinfo-url (publish-narinfo-url url store-hash)))
+       (unless (file-exists? nar-file)
+         (url-fetch nar-url nar-file))
+
+       (unless (file-exists? narinfo-file)
+         (url-fetch narinfo-url narinfo-file)
+         (sign-narinfo! narinfo-file))))
+   outputs))
+
+(define (add-to-store outputs url)
+  "Add the OUTPUTS that are available from the substitute server at URL to the
+store."
+  (with-store store
+    (for-each (lambda (output)
+                (add-substitute-url store url)
+                (ensure-path store output))
+              (map derivation-output-path outputs))))
+
+(define (need-fetching? message)
+  "Return #t if the received MESSAGE implies that some output fetching is
+required and #f otherwise."
+  (match (zmq-read-message message)
+    (('build-succeeded ('drv drv) ('url url))
+     #t)
+    (else #f)))
+
+(define* (run-fetch message #:key reply)
+  "Read MESSAGE and download the corresponding build outputs.  If
+%CACHE-DIRECTORY is set, download the matching NAR and NARINFO files in this
+directory.  If %ADD-TO-STORE? is set, add the build outputs to the store.
+
+REPLY is procedure used to forward MESSAGE to the client once the build
+outputs are downloaded."
+  (define (build-outputs drv)
+    (catch 'system-error
+      (lambda ()
+        (map (match-lambda
+               ((output-name . output)
+                output))
+             (derivation-outputs
+              (read-derivation-from-file drv))))
+      (const '())))
+
+  (match (zmq-read-message message)
+    (('build-succeeded ('drv drv) ('url url))
+     (let ((outputs (build-outputs drv)))
+       (when (%add-to-store?)
+         (add-to-store outputs url))
+       (when (%cache-directory)
+         (download-nar (%cache-directory) outputs url))
+       (reply message)))))
+
+(define (start-fetch-worker name)
+  "Start a fetch worker thread with the given NAME.  This worker takes care of
+downloading build outputs.  It communicates with the remote server using a ZMQ
+socket."
+  (define (reply socket client)
+    (lambda (message)
+      (zmq-send-msg-parts-bytevector
+       socket
+       (list client (zmq-empty-delimiter) (string->bv message)))))
+
+  (call-with-new-thread
+   (lambda ()
+     (set-thread-name name)
+     (let ((socket (zmq-fetch-worker-socket)))
+       (let loop ()
+         (match (zmq-get-msg-parts-bytevector socket '())
+           ((client empty rest)
+            (let ((message (bv->string rest)))
+              (run-fetch (bv->string rest)
+                         #:reply (reply socket client)))))
+         (loop))))))
+
+\f
+;;;
+;;; ZMQ connection.
+;;;
+
+(define %zmq-context
+  (zmq-create-context))
+
+(define (zmq-backend-endpoint backend-port)
+  "Return a ZMQ endpoint string allowing TCP connections on BACKEND-PORT from
+all network interfaces."
+  (string-append "tcp://*:" (number->string backend-port)))
+
+(define (zmq-start-proxy backend-port)
+  "This procedure starts a proxy between client connections from the IPC
+frontend to the workers connected through the TCP backend."
+  (define (socket-ready? items socket)
+    (find (lambda (item)
+            (eq? (poll-item-socket item) socket))
+          items))
+
+  (let* ((client-socket
+          (zmq-create-socket %zmq-context ZMQ_ROUTER))
+         (build-socket
+          (zmq-create-socket %zmq-context ZMQ_ROUTER))
+         (fetch-socket
+          (zmq-create-socket %zmq-context ZMQ_DEALER))
+         (poll-items (list
+                      (poll-item client-socket ZMQ_POLLIN)
+                      (poll-item build-socket ZMQ_POLLIN)
+                      (poll-item fetch-socket ZMQ_POLLIN))))
+
+    (zmq-bind-socket client-socket (zmq-frontend-endpoint))
+    (zmq-bind-socket build-socket (zmq-backend-endpoint backend-port))
+    (zmq-bind-socket fetch-socket (zmq-fetch-workers-endpoint))
+
+    ;; Change frontend socket permissions.
+    (chmod (zmq-frontend-socket-name) #o666)
+
+    ;; Do not use the built-in zmq-proxy as we want to edit the envelope of
+    ;; frontend messages before forwarding them to the backend.
+    (let loop ()
+      (let ((items (zmq-poll* poll-items)))
+        ;; CLIENT -> REMOTE-SERVER.
+        (when (zmq-socket-ready? items client-socket)
+          (match (zmq-get-msg-parts-bytevector client-socket)
+            ((client empty rest)
+             (read-client-exp client (bv->string rest)))))
+        ;; BUILD-WORKER -> REMOTE-SERVER.
+        (when (zmq-socket-ready? items build-socket)
+          (match (zmq-get-msg-parts-bytevector build-socket)
+            ((worker empty rest)
+             (let ((reply-worker
+                    (lambda (client message)
+                      (zmq-send-msg-parts-bytevector
+                       build-socket
+                       (list worker
+                             (zmq-empty-delimiter)
+                             client
+                             (zmq-empty-delimiter)
+                             (string->bv message))))))
+               (read-worker-exp (bv->string rest)
+                                #:reply-worker reply-worker)))
+            ((worker empty client empty rest)
+             (let ((message (list client (zmq-empty-delimiter) rest)))
+               (if (need-fetching? (bv->string rest))
+                   (zmq-send-msg-parts-bytevector fetch-socket message)
+                   (zmq-send-msg-parts-bytevector client-socket message))))))
+        ;; FETCH-WORKER -> REMOTE-SERVER.
+        (when (zmq-socket-ready? items fetch-socket)
+          (let ((msg (zmq-get-msg-parts-bytevector fetch-socket)))
+            (zmq-send-msg-parts-bytevector client-socket msg)))
+
+        (loop)))))
+
+\f
+;;;
+;;; Entry point.
+;;;
+
+;; The PID of the publish process.
+(define %publish-pid
+  (make-atomic-box #f))
+
+;; The thread running the Avahi publish service.
+(define %avahi-thread
+  (make-atomic-box #f))
+
+(define (signal-handler)
+  "Catch SIGINT to stop the Avahi event loop and the publish process before
+exiting."
+  (sigaction SIGINT
+    (lambda (signum)
+      (let ((publish-pid (atomic-box-ref %publish-pid))
+            (avahi-thread (atomic-box-ref %avahi-thread)))
+        (atomic-box-set! %stop-process? #t)
+
+        (and publish-pid
+             (begin
+               (kill publish-pid SIGHUP)
+               (waitpid publish-pid)))
+
+        (and avahi-thread
+             (join-thread avahi-thread))
+
+        (exit 1)))))
+
+(define (remote-server args)
+  (signal-handler)
+
+  (with-error-handling
+    (let* ((opts (args-fold* args %options
+                             (lambda (opt name arg result)
+                               (leave (G_ "~A: unrecognized option~%") name))
+                             (lambda (arg result)
+                               (leave (G_ "~A: extraneous argument~%") arg))
+                             %default-options))
+           (add-to-store? (assoc-ref opts 'add-to-store?))
+           (backend-port (assoc-ref opts 'backend-port))
+           (publish-port (assoc-ref opts 'publish-port))
+           (cache (assoc-ref opts 'cache))
+           (public-key
+            (read-file-sexp
+             (assoc-ref opts 'public-key-file)))
+           (private-key
+            (read-file-sexp
+             (assoc-ref opts 'private-key-file))))
+
+      (parameterize ((%add-to-store? add-to-store?)
+                     (%cache-directory cache)
+                     (%public-key public-key)
+                     (%private-key private-key))
+
+        (atomic-box-set!
+         %publish-pid
+         (publish-server publish-port
+                         #:public-key public-key
+                         #:private-key private-key))
+
+        (atomic-box-set!
+         %avahi-thread
+         (avahi-publish-service-thread
+          service-name
+          #:type remote-server-service-type
+          #:port backend-port
+          #:stop-loop? (lambda ()
+                         (atomic-box-ref %stop-process?))
+          #:txt (list (string-append "publish="
+                                     (number->string publish-port)))))
+
+        (for-each (lambda (number)
+                    (start-fetch-worker
+                     (string-append "fetch-worker-" (number->string number))))
+                  (iota 4))
+
+        (zmq-start-proxy backend-port)))))
diff --git a/src/cuirass/remote-worker.scm b/src/cuirass/remote-worker.scm
new file mode 100644
index 0000000..c253ee5
--- /dev/null
+++ b/src/cuirass/remote-worker.scm
@@ -0,0 +1,286 @@
+;;; remote-worker.scm -- Remote build worker.
+;;; Copyright © 2020 Mathieu Othacehe <othacehe@gnu.org>
+;;;
+;;; This file is part of Cuirass.
+;;;
+;;; GNU Guix is free software; you can redistribute it and/or modify it
+;;; under the terms of the GNU General Public License as published by
+;;; the Free Software Foundation; either version 3 of the License, or (at
+;;; your option) any later version.
+;;;
+;;; GNU Guix is distributed in the hope that it will be useful, but
+;;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;;; GNU General Public License for more details.
+;;;
+;;; You should have received a copy of the GNU General Public License
+;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.
+
+(define-module (cuirass remote-worker)
+  #:use-module (cuirass remote)
+  #:use-module (gcrypt pk-crypto)
+  #:use-module (guix)
+  #:use-module (guix avahi)
+  #:use-module (guix config)
+  #:use-module (guix diagnostics)
+  #:use-module (guix pki)
+  #:use-module (guix records)
+  #:use-module (guix scripts)
+  #:use-module (guix ui)
+  #:use-module (guix build syscalls)
+  #:use-module (guix scripts publish)
+  #:use-module (simple-zmq)
+  #:use-module (rnrs bytevectors)
+  #:use-module (srfi srfi-1)
+  #:use-module (srfi srfi-26)
+  #:use-module (srfi srfi-34)
+  #:use-module (srfi srfi-37)
+  #:use-module (ice-9 atomic)
+  #:use-module (ice-9 match)
+  #:use-module (ice-9 threads)
+
+  #:export (remote-worker))
+
+;; Indicate if the process has to be stopped.
+(define %stop-process?
+  (make-atomic-box #f))
+
+(define (show-help)
+  (format #t (G_ "Usage: remote-worker [OPTION]...
+Start a remote build worker.\n"))
+  (display (G_ "
+  -w, --workers=COUNT       start COUNT parallel workers"))
+  (display (G_ "
+  -p, --publish-port=PORT   publish substitutes on PORT"))
+  (display (G_ "
+      --public-key=FILE     use FILE as the public key for signatures"))
+  (display (G_ "
+      --private-key=FILE    use FILE as the private key for signatures"))
+  (newline)
+  (display (G_ "
+  -h, --help                display this help and exit"))
+  (display (G_ "
+  -V, --version             display version information and exit"))
+  (newline)
+  (show-bug-report-information))
+
+(define %options
+  (list (option '(#\h "help") #f #f
+                (lambda _
+                  (show-help)
+                  (exit 0)))
+        (option '(#\V "version") #f #f
+                (lambda _
+                  (show-version-and-exit "guix publish")))
+        (option '(#\w "workers") #t #f
+                (lambda (opt name arg result)
+                  (alist-cons 'workers (string->number* arg) result)))
+        (option '(#\p "publish-port") #t #f
+                (lambda (opt name arg result)
+                  (alist-cons 'publish-port (string->number* arg) result)))
+        (option '("public-key") #t #f
+                (lambda (opt name arg result)
+                  (alist-cons 'public-key-file arg result)))
+        (option '("private-key") #t #f
+                (lambda (opt name arg result)
+                  (alist-cons 'private-key-file arg result)))))
+
+(define %default-options
+  `((workers . 1)
+    (publish-port . 5558)
+    (public-key-file . ,%public-key-file)
+    (private-key-file . ,%private-key-file)))
+
+\f
+;;;
+;;; ZMQ connection.
+;;;
+
+(define %zmq-context
+  (zmq-create-context))
+
+(define (zmq-backend-endpoint address port)
+  "Return a ZMQ endpoint identifying the build server available by TCP at
+ADDRESS and PORT."
+  (string-append "tcp://" address ":" (number->string port)))
+
+(define (zmq-dealer-socket)
+  "The ZMQ socket to communicate with the worker threads."
+  (zmq-create-socket %zmq-context ZMQ_DEALER))
+
+\f
+;;;
+;;; Worker.
+;;;
+
+;; The port of the local publish server.
+(define %local-publish-port
+  (make-atomic-box #f))
+
+(define (server-publish-url address port)
+  "Return the server publish url at ADDRESS and PORT."
+  (string-append "http://" address ":" (number->string port)))
+
+(define (service-txt->publish-port txt)
+  "Parse the service TXT record and return the server publish port."
+  (define (parse-txt)
+    (fold (lambda (param params)
+            (match (string-split param #\=)
+              ((key value)
+               (cons (cons (string->symbol key) value)
+                     params))))
+          '()
+          txt))
+
+  (let ((params (parse-txt)))
+    (string->number (assq-ref params 'publish))))
+
+(define (service->publish-url service)
+  "Return the URL of the publish server corresponding to the service with the
+given NAME."
+  (let* ((address (avahi-service-address service))
+         (txt (avahi-service-txt service))
+         (publish-port
+          (service-txt->publish-port txt)))
+    (server-publish-url address publish-port)))
+
+(define (service->local-publish-url service)
+  "Return the URL of the local publish server."
+  (let* ((local-address (avahi-service-local-address service))
+         (port (atomic-box-ref %local-publish-port)))
+    (server-publish-url local-address port)))
+
+(define* (run-build drv service #:key reply)
+  "Build DRV and send messages upon build start, failure or completion to the
+build server identified by SERVICE-NAME using the REPLY procedure.
+
+The publish server of the build server is added to the list of the store
+substitutes-urls.  This way derivations that are not present on the worker can
+still be substituted."
+  (with-store store
+    (let ((publish-url (service->publish-url service))
+          (local-publish-url (service->local-publish-url service)))
+      (add-substitute-url store publish-url)
+      (reply (zmq-build-started-message drv))
+      (guard (c ((store-protocol-error? c)
+                 (info (G_ "Derivation `~a' build failed: ~a~%")
+                       drv (store-protocol-error-message c))
+                 (reply (zmq-build-failed-message drv))))
+        (if (build-derivations store (list drv))
+            (reply (zmq-build-succeeded-message drv local-publish-url))
+            (reply (zmq-build-failed-message drv)))))))
+
+(define* (run-command command service #:key reply)
+  "Run COMMAND.  SERVICE-NAME is the name of the build server that sent the
+command.  REPLY is a procedure that can be used to reply to this server."
+  (match (zmq-read-message command)
+    (('build ('drv drv) ('system system))
+     (info (G_ "Building `~a' derivation.~%") drv)
+     (run-build drv service #:reply reply))
+    (('no-build)
+     #t)))
+
+(define (start-worker worker service)
+  "Start a worker thread named NAME, reading commands from the DEALER socket
+and executing them.  The worker can reply on the same socket."
+  (define (reply socket client)
+    (lambda (message)
+      (zmq-send-msg-parts-bytevector
+       socket
+       (list (zmq-empty-delimiter) client
+             (zmq-empty-delimiter) (string->bv message)))))
+
+  (define (ready socket)
+    (zmq-send-msg-parts-bytevector
+     socket
+     (list (make-bytevector 0)
+           (string->bv
+            (zmq-worker-ready-message (worker->sexp worker))))))
+
+  (define (request-work socket)
+    (let ((name (worker-name worker)))
+      (zmq-send-msg-parts-bytevector
+       socket
+       (list (make-bytevector 0)
+             (string->bv (zmq-worker-request-work-message name))))))
+
+  (call-with-new-thread
+   (lambda ()
+     (set-thread-name (worker-name worker))
+     (let* ((socket (zmq-dealer-socket))
+            (address (avahi-service-address service))
+            (port (avahi-service-port service))
+            (endpoint (zmq-backend-endpoint address port)))
+       (zmq-connect socket endpoint)
+       (ready socket)
+       (let loop ()
+         (request-work socket)
+         (match (zmq-get-msg-parts-bytevector socket '())
+           ((empty client empty command)
+            (run-command (bv->string command) service
+                         #:reply (reply socket client))))
+         (sleep 1)
+         (loop))))))
+
+\f
+;;;
+;;; Entry point.
+;;;
+
+;; The PID of the publish process.
+(define %publish-pid
+  (make-atomic-box #f))
+
+(define (signal-handler)
+  "Catch SIGINT to stop the Avahi event loop and the publish process before
+exiting."
+  (sigaction SIGINT
+    (lambda (signum)
+      (let ((publish-pid (atomic-box-ref %publish-pid)))
+        (atomic-box-set! %stop-process? #t)
+
+        (and publish-pid
+             (begin
+               (kill publish-pid SIGHUP)
+               (waitpid publish-pid)))
+
+        (exit 1)))))
+
+(define (remote-worker args)
+  (with-error-handling
+    (let* ((opts (args-fold* args %options
+                             (lambda (opt name arg result)
+                               (leave (G_ "~A: unrecognized option~%") name))
+                             (lambda (arg result)
+                               (leave (G_ "~A: extraneous argument~%") arg))
+                             %default-options))
+           (workers (assoc-ref opts 'workers))
+           (publish-port (assoc-ref opts 'publish-port))
+           (public-key
+            (read-file-sexp
+             (assoc-ref opts 'public-key-file)))
+           (private-key
+            (read-file-sexp
+             (assoc-ref opts 'private-key-file))))
+
+      (atomic-box-set! %local-publish-port publish-port)
+
+      (atomic-box-set!
+       %publish-pid
+       (publish-server publish-port
+                       #:public-key public-key
+                       #:private-key private-key))
+
+      (avahi-browse-service-thread
+       (lambda (action service)
+         (case action
+           ((new-service)
+            (for-each (lambda (n)
+                        (start-worker (worker
+                                       (name (generate-worker-name))
+                                       (systems '("x86_64-linux")))
+                                      service))
+                      (iota workers)))))
+       #:types (list remote-server-service-type)
+       #:stop-loop? (lambda ()
+                      (atomic-box-ref %stop-process?))))))
diff --git a/src/cuirass/remote.scm b/src/cuirass/remote.scm
new file mode 100644
index 0000000..7a71391
--- /dev/null
+++ b/src/cuirass/remote.scm
@@ -0,0 +1,292 @@
+;;; remote.scm -- Build on remote machines.
+;;; Copyright © 2020 Mathieu Othacehe <othacehe@gnu.org>
+;;;
+;;; This file is part of Cuirass.
+;;;
+;;; GNU Guix is free software; you can redistribute it and/or modify it
+;;; under the terms of the GNU General Public License as published by
+;;; the Free Software Foundation; either version 3 of the License, or (at
+;;; your option) any later version.
+;;;
+;;; GNU Guix is distributed in the hope that it will be useful, but
+;;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;;; GNU General Public License for more details.
+;;;
+;;; You should have received a copy of the GNU General Public License
+;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.
+
+(define-module (cuirass remote)
+  #:use-module (guix config)
+  #:use-module (guix derivations)
+  #:use-module (guix records)
+  #:use-module (guix store)
+  #:use-module (guix ui)
+  #:use-module (guix build download)
+  #:use-module ((guix build utils) #:select (mkdir-p))
+  #:use-module (guix scripts publish)
+  #:use-module (simple-zmq)
+  #:use-module (rnrs bytevectors)
+  #:use-module (srfi srfi-1)
+  #:use-module (srfi srfi-26)
+  #:use-module (ice-9 match)
+  #:use-module (ice-9 rdelim)
+  #:export (worker
+            worker?
+            worker-name
+            worker-systems
+            worker->sexp
+            sexp->worker
+            generate-worker-name
+
+            publish-server
+            add-substitute-url
+
+            zmq-frontend-socket-name
+            zmq-frontend-endpoint
+            zmq-poll*
+            zmq-socket-ready?
+            zmq-empty-delimiter
+
+            zmq-build-request-message
+            zmq-no-build-message
+            zmq-build-started-message
+            zmq-build-failed-message
+            zmq-build-succeeded-message
+            zmq-worker-ready-message
+            zmq-worker-request-work-message
+            zmq-read-message
+
+            remote-server-service-type
+            remote-build-socket
+            remote-build
+            remote-build-poll))
+
+\f
+;;;
+;;; Workers.
+;;;
+
+(define-record-type* <worker>
+  worker make-worker
+  worker?
+  (name           worker-name)
+  (systems        worker-systems))
+
+(define (worker->sexp worker)
+  "Return an sexp describing WORKER."
+  (let ((name (worker-name worker))
+        (systems (worker-systems worker)))
+    `(worker
+      (name ,name)
+      (systems ,systems))))
+
+(define (sexp->worker sexp)
+  "Turn SEXP, an sexp as returned by 'worker->sexp', into a <worker> record."
+  (match sexp
+    (('worker ('name name) ('systems systems))
+     (worker
+      (name name)
+      (systems systems)))))
+
+
+(define %seed
+  (seed->random-state
+   (logxor (getpid) (car (gettimeofday)))))
+
+(define (integer->alphanumeric-char n)
+  "Map N, an integer in the [0..62] range, to an alphanumeric character."
+  (cond ((< n 10)
+         (integer->char (+ (char->integer #\0) n)))
+        ((< n 36)
+         (integer->char (+ (char->integer #\A) (- n 10))))
+        ((< n 62)
+         (integer->char (+ (char->integer #\a) (- n 36))))
+        (else
+         (error "integer out of bounds" n))))
+
+(define (random-string len)
+  "Compute a random string of size LEN where each character is alphanumeric."
+  (let loop ((chars '())
+             (len len))
+    (if (zero? len)
+        (list->string chars)
+        (let ((n (random 62 %seed)))
+          (loop (cons (integer->alphanumeric-char n) chars)
+                (- len 1))))))
+
+(define (generate-worker-name)
+  "Return the service name of the server."
+  (string-append (gethostname) "-" (random-string 4)))
+
+\f
+;;;
+;;; Store publishing.
+;;;
+
+(define (add-substitute-url store url)
+  "Add URL to the list of STORE substitutes-urls."
+  (set-build-options store
+                     #:use-substitutes? #t
+                     #:fallback? #f
+                     #:keep-going? #t
+                     #:print-build-trace #t
+                     #:build-verbosity 1
+                     #:substitute-urls
+                     (cons url %default-substitute-urls)))
+
+(define* (publish-server port
+                         #:key
+                         public-key
+                         private-key)
+  "This procedure starts a publishing server listening on PORT in a new
+process and returns the pid of the forked process.  Use PUBLIC-KEY and
+PRIVATE-KEY to sign narinfos."
+  (match (primitive-fork)
+    (0
+     (parameterize ((%public-key public-key)
+                    (%private-key private-key))
+       (with-store store
+         (let* ((address (make-socket-address AF_INET INADDR_ANY 0))
+                (socket-address
+                 (make-socket-address (sockaddr:fam address)
+                                      (sockaddr:addr address)
+                                      port))
+                (socket (open-server-socket socket-address)))
+           (run-publish-server socket store
+                               #:compressions
+                               (list %default-gzip-compression))))))
+    (pid pid)))
+
+\f
+;;;
+;;; ZMQ.
+;;;
+
+(define %zmq-context
+  (zmq-create-context))
+
+(define (zmq-frontend-socket-name)
+  "Return the name of the ZMQ frontend socket."
+  (string-append %state-directory "/remote-build-socket"))
+
+(define (zmq-frontend-endpoint)
+  "Return a ZMQ endpoint allowing client connections using the IPC transport."
+  (string-append "ipc://" (zmq-frontend-socket-name)))
+
+(define (EINTR-safe proc)
+  "Return a variant of PROC that catches EINTR 'zmq-error' exceptions and
+retries a call to PROC."
+  (define (safe . args)
+    (catch 'zmq-error
+      (lambda ()
+        (apply proc args))
+      (lambda (key errno . rest)
+        (if (= errno EINTR)
+            (apply safe args)
+            (apply throw key errno rest)))))
+
+  safe)
+
+(define zmq-poll*
+  ;; Return a variant of ZMQ-POLL that catches EINTR errors.
+  (EINTR-safe zmq-poll))
+
+(define (zmq-socket-ready? items socket)
+  "Return #t if the given SOCKET is part of ITEMS, a list returned by a
+'zmq-poll' call, return #f otherwise."
+  (find (lambda (item)
+          (eq? (poll-item-socket item) socket))
+        items))
+
+(define (zmq-read-message msg)
+  (call-with-input-string msg read))
+
+(define (zmq-empty-delimiter)
+  "Return an empty ZMQ delimiter used to format message envelopes."
+  (make-bytevector 0))
+
+;; ZMQ Messages.
+(define* (zmq-build-request-message drv #:optional system)
+  "Return a message requesting the build of DRV for SYSTEM."
+  (format #f "~s" `(build (drv ,drv) (system ,system))))
+
+(define (zmq-no-build-message)
+  "Return a message that indicates that no builds are available."
+  (format #f "~s" `(no-build)))
+
+(define (zmq-build-started-message drv)
+  "Return a message that indicates that the build of DRV has started."
+  (format #f "~s" `(build-started (drv ,drv))))
+
+(define (zmq-build-failed-message drv)
+  "Return a message that indicates that the build of DRV has failed."
+  (format #f "~s" `(build-failed (drv ,drv))))
+
+(define (zmq-build-succeeded-message drv url)
+  "Return a message that indicates that the build of DRV is done."
+  (format #f "~s" `(build-succeeded (drv ,drv) (url ,url))))
+
+(define (zmq-worker-ready-message worker)
+  "Return a message that indicates that WORKER is ready."
+  (format #f "~s" `(worker-ready ,worker)))
+
+(define (zmq-worker-request-work-message name)
+  "Return a message that indicates that WORKER is requesting work."
+  (format #f "~s" `(worker-request-work ,name)))
+
+\f
+;;;
+;;; Remote builds.
+;;;
+
+(define remote-server-service-type
+  "_remote-server._tcp")
+
+(define (remote-build-socket)
+  "Return a socket used to communicate with the remote build server."
+  (let ((socket (zmq-create-socket %zmq-context ZMQ_DEALER))
+        (endpoint (zmq-frontend-endpoint)))
+    (zmq-connect socket endpoint)
+    socket))
+
+(define* (remote-build socket drvs systems)
+  "Builds DRVS using the remote build mechanism.  A build command is sent on
+SOCKET to the build server for each derivation.
+
+SYSTEMS is a list describing the systems of each derivations in the DRVS list.
+It is used for performance reasons, so that the remote server doesn't need to
+call 'read-derivation-from-file' for each derivation, which can be an
+expensive operation."
+  (for-each
+   (lambda (drv system)
+     ;; We need to prefix the command with an empty delimiter
+     ;; because the DEALER socket is connected to a ROUTER
+     ;; socket. See "zmq-start-proxy" procedure.
+     (zmq-send-msg-parts-bytevector
+      socket
+      (list (make-bytevector 0)
+            (string->bv (zmq-build-request-message drv system)))))
+   drvs systems))
+
+(define* (remote-build-poll socket event-proc
+                            #:key
+                            (timeout 1000))
+  "Poll SOCKET for messages and call EVENT-PROC each time a build event is
+received, return if no event occured for TIMEOUT milliseconds."
+  (define (parse-result result)
+    (match (zmq-read-message result)
+      (('build-started ('drv drv))
+       (event-proc (list 'build-started drv)))
+      (('build-succeeded ('drv drv) ('url url))
+       (event-proc (list 'build-succeeded drv)))
+      (('build-failed ('drv drv))
+       (event-proc (list 'build-failed drv)))))
+
+  (let* ((poll-items (list
+                      (poll-item socket ZMQ_POLLIN)))
+         (items (zmq-poll* poll-items timeout)))
+    (when (zmq-socket-ready? items socket)
+      (match (zmq-get-msg-parts-bytevector socket '())
+        ((empty result)
+         (parse-result (bv->string result)))))))
-- 
2.29.2


[-- Attachment #3: remote.png --]
[-- Type: image/png, Size: 33572 bytes --]

             reply	other threads:[~2020-12-02 11:06 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-02 11:04 Mathieu Othacehe [this message]
2020-12-02 11:25 ` [bug#45006] cuirass: Add remote build support zimoun
2020-12-21 13:40   ` Mathieu Othacehe
2020-12-21 14:13     ` zimoun
2020-12-21 15:08       ` Mathieu Othacehe
2020-12-21 16:41         ` zimoun

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87czzso4dj.fsf@gnu.org \
    --to=othacehe@gnu.org \
    --cc=45006@debbugs.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/guix.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.