unofficial mirror of guix-devel@gnu.org 
 help / color / mirror / code / Atom feed
* Experiment in generating multi-layer Docker images with guix pack
@ 2020-03-21 23:24 Christopher Baines
  2020-03-21 23:24 ` [PATCH 1/3] Rename (guix docker) to (guix build docker) Christopher Baines
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Christopher Baines @ 2020-03-21 23:24 UTC (permalink / raw)
  To: guix-devel

These patches are very rough, and not ready, but do at least work in some
limited capacity. I've been testing with the following commands:

  guix pack --format=docker guile@2.2.6
  guix pack --format=docker guile@2.2.7

With the previous Docker image generation implementation, two different ~130MB
images would be generated. These patches mean that each .tar.gz file generated
by guix pack contains a ~53MB layer which contains the profile and directly
referenced store items, and then a ~77MB layer with all the other store items
which is identical for both the 2.2.6 and 2.2.7 pack file.

I think it could be useful to support multiple different strategies for
generating layers for Docker images, with different trade-offs. This approach
using two layers should make the resulting images more efficient to use in the
case where like the guile example above, where the packages you run guix pack
with have exactly matching inputs.

This could often be the case if you're developing an application, packaging it
with Guix, then using guix pack to generate a Docker image which you
deploy. With the single layer approach, if you change the application code,
you'll get an entirely different image. I haven't tried this out, but my hope
is that by generating a common base layer, if you change the application code
only the top layer of the Docker image will change, meaning you'll only have
to deploy that, rather than having to deploy the entire image. If you're
deploying the images across a network, having less data to send around can
save time, and reduce the amount of space required to store the images.

As well as these behaviour changes, these patches also modify the
implementation. Rather than having some build side code that's used in the
pack and vm module gexpressions, these patches introduce two new record types:
<docker-image-layer> and <docker-image>. This at least structures the
derivations so that each layer is represented by a derivation, and then
there's a derivation for the image itself, which is a little more efficient in
terms of computation.

What do people think about generating multi-layer images, and using record
types to represent the layers and image?

Thanks,

Chris

[PATCH 1/3] Rename (guix docker) to (guix build docker)
[PATCH 2/3] Make guix pack work with the new docker image
[PATCH 3/3] Generate two layers for docker images in guix pack

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/3] Rename (guix docker) to (guix build docker)
  2020-03-21 23:24 Experiment in generating multi-layer Docker images with guix pack Christopher Baines
@ 2020-03-21 23:24 ` Christopher Baines
  2020-03-21 23:24 ` [PATCH 2/3] Make guix pack work with the new docker image gexpressions Christopher Baines
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Christopher Baines @ 2020-03-21 23:24 UTC (permalink / raw)
  To: guix-devel

---
 Makefile.am           |   2 +-
 gnu/system/vm.scm     |   6 +-
 guix/docker.scm       | 274 ------------------------------------------
 guix/scripts/pack.scm |   6 +-
 4 files changed, 7 insertions(+), 281 deletions(-)
 delete mode 100644 guix/docker.scm

diff --git a/Makefile.am b/Makefile.am
index d5829f3633..bce2a31184 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -67,7 +67,6 @@ MODULES =					\
   guix/ci.scm					\
   guix/cpio.scm					\
   guix/deprecation.scm				\
-  guix/docker.scm	   			\
   guix/json.scm					\
   guix/records.scm				\
   guix/pki.scm					\
@@ -156,6 +155,7 @@ MODULES =					\
   guix/status.scm				\
   guix/build/android-ndk-build-system.scm	\
   guix/build/ant-build-system.scm		\
+  guix/build/docker.scm	   		\
   guix/build/download.scm			\
   guix/build/download-nar.scm			\
   guix/build/cargo-build-system.scm		\
diff --git a/gnu/system/vm.scm b/gnu/system/vm.scm
index d1c131ecb4..0b0e3d10b8 100644
--- a/gnu/system/vm.scm
+++ b/gnu/system/vm.scm
@@ -525,10 +525,10 @@ system."
         (name  (string-append name ".tar.gz"))
         (graph "system-graph"))
     (define build
-      (with-extensions (cons guile-json-3         ;for (guix docker)
+      (with-extensions (cons guile-json-3         ;for (guix build docker)
                              gcrypt-sqlite3&co)   ;for (guix store database)
         (with-imported-modules `(,@(source-module-closure
-                                    '((guix docker)
+                                    '((guix build docker)
                                       (guix store database)
                                       (guix build utils)
                                       (guix build store-copy)
@@ -536,7 +536,7 @@ system."
                                     #:select? not-config?)
                                  ((guix config) => ,(make-config.scm)))
           #~(begin
-              (use-modules (guix docker)
+              (use-modules (guix build docker)
                            (guix build utils)
                            (gnu build vm)
                            (srfi srfi-19)
diff --git a/guix/docker.scm b/guix/docker.scm
deleted file mode 100644
index 97ac6d982b..0000000000
--- a/guix/docker.scm
+++ /dev/null
@@ -1,274 +0,0 @@
-;;; GNU Guix --- Functional package management for GNU
-;;; Copyright © 2017 Ricardo Wurmus <rekado@elephly.net>
-;;; Copyright © 2017, 2018, 2019 Ludovic Courtès <ludo@gnu.org>
-;;; Copyright © 2018 Chris Marusich <cmmarusich@gmail.com>
-;;;
-;;; This file is part of GNU Guix.
-;;;
-;;; GNU Guix is free software; you can redistribute it and/or modify it
-;;; under the terms of the GNU General Public License as published by
-;;; the Free Software Foundation; either version 3 of the License, or (at
-;;; your option) any later version.
-;;;
-;;; GNU Guix is distributed in the hope that it will be useful, but
-;;; WITHOUT ANY WARRANTY; without even the implied warranty of
-;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-;;; GNU General Public License for more details.
-;;;
-;;; You should have received a copy of the GNU General Public License
-;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.
-
-(define-module (guix docker)
-  #:use-module (gcrypt hash)
-  #:use-module (guix base16)
-  #:use-module ((guix build utils)
-                #:select (mkdir-p
-                          delete-file-recursively
-                          with-directory-excursion
-                          invoke))
-  #:use-module (gnu build install)
-  #:use-module (json)                             ;guile-json
-  #:use-module (srfi srfi-1)
-  #:use-module (srfi srfi-19)
-  #:use-module (srfi srfi-26)
-  #:use-module ((texinfo string-utils)
-                #:select (escape-special-chars))
-  #:use-module (rnrs bytevectors)
-  #:use-module (ice-9 ftw)
-  #:use-module (ice-9 match)
-  #:export (build-docker-image))
-
-;; Generate a 256-bit identifier in hexadecimal encoding for the Docker image.
-(define docker-id
-  (compose bytevector->base16-string sha256 string->utf8))
-
-(define (layer-diff-id layer)
-  "Generate a layer DiffID for the given LAYER archive."
-  (string-append "sha256:" (bytevector->base16-string (file-sha256 layer))))
-
-;; This is the semantic version of the JSON metadata schema according to
-;; https://github.com/docker/docker/blob/master/image/spec/v1.2.md
-;; It is NOT the version of the image specification.
-(define schema-version "1.0")
-
-(define (image-description id time)
-  "Generate a simple image description."
-  `((id . ,id)
-    (created . ,time)
-    (container_config . #nil)))
-
-(define (canonicalize-repository-name name)
-  "\"Repository\" names are restricted to roughtl [a-z0-9_.-].
-Return a version of TAG that follows these rules."
-  (define ascii-letters
-    (string->char-set "abcdefghijklmnopqrstuvwxyz"))
-
-  (define separators
-    (string->char-set "_-."))
-
-  (define repo-char-set
-    (char-set-union char-set:digit ascii-letters separators))
-
-  (string-map (lambda (chr)
-                (if (char-set-contains? repo-char-set chr)
-                    chr
-                    #\.))
-              (string-trim (string-downcase name) separators)))
-
-(define* (manifest path id #:optional (tag "guix"))
-  "Generate a simple image manifest."
-  (let ((tag (canonicalize-repository-name tag)))
-    `#(((Config . "config.json")
-        (RepoTags . #(,(string-append tag ":latest")))
-        (Layers . #(,(string-append id "/layer.tar")))))))
-
-;; According to the specifications this is required for backwards
-;; compatibility.  It duplicates information provided by the manifest.
-(define* (repositories path id #:optional (tag "guix"))
-  "Generate a repositories file referencing PATH and the image ID."
-  `((,(canonicalize-repository-name tag) . ((latest . ,id)))))
-
-;; See https://github.com/opencontainers/image-spec/blob/master/config.md
-(define* (config layer time arch #:key entry-point (environment '()))
-  "Generate a minimal image configuration for the given LAYER file."
-  ;; "architecture" must be values matching "platform.arch" in the
-  ;; runtime-spec at
-  ;; https://github.com/opencontainers/runtime-spec/blob/v1.0.0-rc2/config.md#platform
-  `((architecture . ,arch)
-    (comment . "Generated by GNU Guix")
-    (created . ,time)
-    (config . ,`((env . ,(list->vector
-                          (map (match-lambda
-                                 ((name . value)
-                                  (string-append name "=" value)))
-                               environment)))
-                 ,@(if entry-point
-                       `((entrypoint . ,(list->vector entry-point)))
-                       '())))
-    (container_config . #nil)
-    (os . "linux")
-    (rootfs . ((type . "layers")
-               (diff_ids . #(,(layer-diff-id layer)))))))
-
-(define %tar-determinism-options
-  ;; GNU tar options to produce archives deterministically.
-  '("--sort=name" "--mtime=@1"
-    "--owner=root:0" "--group=root:0"))
-
-(define directive-file
-  ;; Return the file or directory created by a 'evaluate-populate-directive'
-  ;; directive.
-  (match-lambda
-    ((source '-> target)
-     (string-trim source #\/))
-    (('directory name _ ...)
-     (string-trim name #\/))))
-
-(define* (build-docker-image image paths prefix
-                             #:key
-                             (repository "guix")
-                             (extra-files '())
-                             (transformations '())
-                             (system (utsname:machine (uname)))
-                             database
-                             entry-point
-                             (environment '())
-                             compressor
-                             (creation-time (current-time time-utc)))
-  "Write to IMAGE a Docker image archive containing the given PATHS.  PREFIX
-must be a store path that is a prefix of any store paths in PATHS.  REPOSITORY
-is a descriptive name that will show up in \"REPOSITORY\" column of the output
-of \"docker images\".
-
-When DATABASE is true, copy it to /var/guix/db in the image and create
-/var/guix/gcroots and friends.
-
-When ENTRY-POINT is true, it must be a list of strings; it is stored as the
-entry point in the Docker image JSON structure.
-
-ENVIRONMENT must be a list of name/value pairs.  It specifies the environment
-variables that must be defined in the resulting image.
-
-EXTRA-FILES must be a list of directives for 'evaluate-populate-directive'
-describing non-store files that must be created in the image.
-
-TRANSFORMATIONS must be a list of (OLD -> NEW) tuples describing how to
-transform the PATHS.  Any path in PATHS that begins with OLD will be rewritten
-in the Docker image so that it begins with NEW instead.  If a path is a
-non-empty directory, then its contents will be recursively added, as well.
-
-SYSTEM is a GNU triplet (or prefix thereof) of the system the binaries in
-PATHS are for; it is used to produce metadata in the image.  Use COMPRESSOR, a
-command such as '(\"gzip\" \"-9n\"), to compress IMAGE.  Use CREATION-TIME, a
-SRFI-19 time-utc object, as the creation time in metadata."
-  (define (sanitize path-fragment)
-    (escape-special-chars
-     ;; GNU tar strips the leading slash off of absolute paths before applying
-     ;; the transformations, so we need to do the same, or else our
-     ;; replacements won't match any paths.
-     (string-trim path-fragment #\/)
-     ;; Escape the basic regexp special characters (see: "(sed) BRE syntax").
-     ;; We also need to escape "/" because we use it as a delimiter.
-     "/*.^$[]\\"
-     #\\))
-  (define transformation->replacement
-    (match-lambda
-      ((old '-> new)
-       ;; See "(tar) transform" for details on the expression syntax.
-       (string-append "s/^" (sanitize old) "/" (sanitize new) "/"))))
-  (define (transformations->expression transformations)
-    (let ((replacements (map transformation->replacement transformations)))
-      (string-append
-       ;; Avoid transforming link targets, since that would break some links
-       ;; (e.g., symlinks that point to an absolute store path).
-       "flags=rSH;"
-       (string-join replacements ";")
-       ;; Some paths might still have a leading path delimiter even after tar
-       ;; transforms them (e.g., "/a/b" might be transformed into "/b"), so
-       ;; strip any leading path delimiters that remain.
-       ";s,^//*,,")))
-  (define transformation-options
-    (if (eq? '() transformations)
-        '()
-        `("--transform" ,(transformations->expression transformations))))
-  (let* ((directory "/tmp/docker-image") ;temporary working directory
-         (id (docker-id prefix))
-         (time (date->string (time-utc->date creation-time) "~4"))
-         (arch (let-syntax ((cond* (syntax-rules ()
-                                     ((_ (pattern clause) ...)
-                                      (cond ((string-prefix? pattern system)
-                                             clause)
-                                            ...
-                                            (else
-                                             (error "unsupported system"
-                                                    system)))))))
-                 (cond* ("x86_64" "amd64")
-                        ("i686"   "386")
-                        ("arm"    "arm")
-                        ("mips64" "mips64le")))))
-    ;; Make sure we start with a fresh, empty working directory.
-    (mkdir directory)
-    (with-directory-excursion directory
-      (mkdir id)
-      (with-directory-excursion id
-        (with-output-to-file "VERSION"
-          (lambda () (display schema-version)))
-        (with-output-to-file "json"
-          (lambda () (scm->json (image-description id time))))
-
-        ;; Create a directory for the non-store files that need to go into the
-        ;; archive.
-        (mkdir "extra")
-
-        (with-directory-excursion "extra"
-          ;; Create non-store files.
-          (for-each (cut evaluate-populate-directive <> "./")
-                    extra-files)
-
-          (when database
-            ;; Initialize /var/guix, assuming PREFIX points to a profile.
-            (install-database-and-gc-roots "." database prefix))
-
-          (apply invoke "tar" "-cf" "../layer.tar"
-                 `(,@transformation-options
-                   ,@%tar-determinism-options
-                   ,@paths
-                   ,@(scandir "."
-                              (lambda (file)
-                                (not (member file '("." ".."))))))))
-
-        ;; It is possible for "/" to show up in the archive, especially when
-        ;; applying transformations.  For example, the transformation
-        ;; "s,^/a,," will (perhaps surprisingly) cause GNU tar to transform
-        ;; the path "/a" into "/".  The presence of "/" in the archive is
-        ;; probably benign, but it is definitely safe to remove it, so let's
-        ;; do that.  This fails when "/" is not in the archive, so use system*
-        ;; instead of invoke to avoid an exception in that case, and redirect
-        ;; stderr to the bit bucket to avoid "Exiting with failure status"
-        ;; error messages.
-        (with-error-to-port (%make-void-port "w")
-          (lambda ()
-            (system* "tar" "--delete" "/" "-f" "layer.tar")))
-
-        (delete-file-recursively "extra"))
-
-      (with-output-to-file "config.json"
-        (lambda ()
-          (scm->json (config (string-append id "/layer.tar")
-                             time arch
-                             #:environment environment
-                             #:entry-point entry-point))))
-      (with-output-to-file "manifest.json"
-        (lambda ()
-          (scm->json (manifest prefix id repository))))
-      (with-output-to-file "repositories"
-        (lambda ()
-          (scm->json (repositories prefix id repository)))))
-
-    (apply invoke "tar" "-cf" image "-C" directory
-           `(,@%tar-determinism-options
-             ,@(if compressor
-                   (list "-I" (string-join compressor))
-                   '())
-             "."))
-    (delete-file-recursively directory)))
diff --git a/guix/scripts/pack.scm b/guix/scripts/pack.scm
index 652b4c63c4..ee0395ea00 100644
--- a/guix/scripts/pack.scm
+++ b/guix/scripts/pack.scm
@@ -503,17 +503,17 @@ the image."
   (define defmod 'define-module)                  ;trick Geiser
 
   (define build
-    ;; Guile-JSON and Guile-Gcrypt are required by (guix docker).
+    ;; Guile-JSON and Guile-Gcrypt are required by (guix build docker).
     (with-extensions (list guile-json-3 guile-gcrypt)
       (with-imported-modules `(((guix config) => ,(make-config.scm))
                                ,@(source-module-closure
-                                  `((guix docker)
+                                  `((guix build docker)
                                     (guix build store-copy)
                                     (guix profiles)
                                     (guix search-paths))
                                   #:select? not-config?))
         #~(begin
-            (use-modules (guix docker) (guix build store-copy)
+            (use-modules (guix build docker) (guix build store-copy)
                          (guix profiles) (guix search-paths)
                          (srfi srfi-1) (srfi srfi-19)
                          (ice-9 match))
-- 
2.25.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/3] Make guix pack work with the new docker image gexpressions
  2020-03-21 23:24 Experiment in generating multi-layer Docker images with guix pack Christopher Baines
  2020-03-21 23:24 ` [PATCH 1/3] Rename (guix docker) to (guix build docker) Christopher Baines
@ 2020-03-21 23:24 ` Christopher Baines
  2020-03-21 23:24 ` [PATCH 3/3] Generate two layers for docker images in guix pack Christopher Baines
  2020-03-26 12:03 ` Experiment in generating multi-layer Docker images with " Ludovic Courtès
  3 siblings, 0 replies; 7+ messages in thread
From: Christopher Baines @ 2020-03-21 23:24 UTC (permalink / raw)
  To: guix-devel

---
 Makefile.am           |   1 +
 guix/build/docker.scm | 289 ++++++++++++++++++++++++++++++++++++++++++
 guix/docker.scm       | 246 +++++++++++++++++++++++++++++++++++
 guix/scripts/pack.scm | 178 +++++++++++++-------------
 4 files changed, 626 insertions(+), 88 deletions(-)
 create mode 100644 guix/build/docker.scm
 create mode 100644 guix/docker.scm

diff --git a/Makefile.am b/Makefile.am
index bce2a31184..725d68d0e8 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -76,6 +76,7 @@ MODULES =					\
   guix/utils.scm				\
   guix/sets.scm					\
   guix/modules.scm				\
+  guix/docker.scm				\
   guix/download.scm				\
   guix/discovery.scm				\
   guix/bzr-download.scm            		\
diff --git a/guix/build/docker.scm b/guix/build/docker.scm
new file mode 100644
index 0000000000..54dad749ab
--- /dev/null
+++ b/guix/build/docker.scm
@@ -0,0 +1,289 @@
+;;; GNU Guix --- Functional package management for GNU
+;;; Copyright © 2017 Ricardo Wurmus <rekado@elephly.net>
+;;; Copyright © 2017, 2018, 2019 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2018 Chris Marusich <cmmarusich@gmail.com>
+;;;
+;;; This file is part of GNU Guix.
+;;;
+;;; GNU Guix is free software; you can redistribute it and/or modify it
+;;; under the terms of the GNU General Public License as published by
+;;; the Free Software Foundation; either version 3 of the License, or (at
+;;; your option) any later version.
+;;;
+;;; GNU Guix is distributed in the hope that it will be useful, but
+;;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;;; GNU General Public License for more details.
+;;;
+;;; You should have received a copy of the GNU General Public License
+;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.
+
+(define-module (guix build docker)
+  #:use-module (gcrypt hash)
+  #:use-module (guix base16)
+  #:use-module ((guix build utils)
+                #:select (mkdir-p
+                          delete-file-recursively
+                          with-directory-excursion
+                          invoke))
+  #:use-module (gnu build install)
+  #:use-module (json)                             ;guile-json
+  #:use-module (srfi srfi-1)
+  #:use-module (srfi srfi-19)
+  #:use-module (srfi srfi-26)
+  #:use-module ((texinfo string-utils)
+                #:select (escape-special-chars))
+  #:use-module (rnrs bytevectors)
+  #:use-module (ice-9 ftw)
+  #:use-module (ice-9 match)
+  #:export (docker-id
+            schema-version
+            image-description
+
+            %tar-determinism-options
+
+            config
+            manifest
+            repositories))
+
+;; Generate a 256-bit identifier in hexadecimal encoding for the Docker image.
+(define docker-id
+  (compose bytevector->base16-string sha256 string->utf8))
+
+(define (layer-diff-id layer)
+  "Generate a layer DiffID for the given LAYER archive."
+  (string-append "sha256:" (bytevector->base16-string (file-sha256 layer))))
+
+;; This is the semantic version of the JSON metadata schema according to
+;; https://github.com/docker/docker/blob/master/image/spec/v1.2.md
+;; It is NOT the version of the image specification.
+(define schema-version "1.0")
+
+(define (image-description id time)
+  "Generate a simple image description."
+  `((id . ,id)
+    (created . ,time)
+    (container_config . #nil)))
+
+(define (canonicalize-repository-name name)
+  "\"Repository\" names are restricted to roughtl [a-z0-9_.-].
+Return a version of TAG that follows these rules."
+  (define ascii-letters
+    (string->char-set "abcdefghijklmnopqrstuvwxyz"))
+
+  (define separators
+    (string->char-set "_-."))
+
+  (define repo-char-set
+    (char-set-union char-set:digit ascii-letters separators))
+
+  (string-map (lambda (chr)
+                (if (char-set-contains? repo-char-set chr)
+                    chr
+                    #\.))
+              (string-trim (string-downcase name) separators)))
+
+(define* (manifest layer-ids #:optional (tag "guix"))
+  "Generate a simple image manifest."
+  (let ((tag (canonicalize-repository-name tag)))
+    `#(((Config . "config.json")
+        (RepoTags . #(,(string-append tag ":latest")))
+        (Layers . ,(list->vector
+                    (map (lambda (id)
+                           (string-append id "/layer.tar"))
+                         layer-ids)))))))
+
+;; According to the specifications this is required for backwards
+;; compatibility.  It duplicates information provided by the manifest.
+(define* (repositories id #:optional (tag "guix"))
+  "Generate a repositories file referencing PATH and the image ID."
+  `((,(canonicalize-repository-name tag) . ((latest . ,id)))))
+
+;; See https://github.com/opencontainers/image-spec/blob/master/config.md
+(define* (config layers time arch #:key entry-point (environment '()))
+  "Generate a minimal image configuration for the given LAYER file."
+  ;; "architecture" must be values matching "platform.arch" in the
+  ;; runtime-spec at
+  ;; https://github.com/opencontainers/runtime-spec/blob/v1.0.0-rc2/config.md#platform
+  `((architecture . ,arch)
+    (comment . "Generated by GNU Guix")
+    (created . ,time)
+    (config . ,`((env . ,(list->vector
+                          (map (match-lambda
+                                 ((name . value)
+                                  (string-append name "=" value)))
+                               environment)))
+                 ,@(if entry-point
+                       `((entrypoint . ,(list->vector entry-point)))
+                       '())))
+    (container_config . #nil)
+    (os . "linux")
+    (rootfs . ((type . "layers")
+               (diff_ids . ,(list->vector
+                             (map layer-diff-id layers)))))))
+
+(define %tar-determinism-options
+  ;; GNU tar options to produce archives deterministically.
+  '("--sort=name" "--mtime=@1"
+    "--owner=root:0" "--group=root:0"))
+
+(define directive-file
+  ;; Return the file or directory created by a 'evaluate-populate-directive'
+  ;; directive.
+  (match-lambda
+    ((source '-> target)
+     (string-trim source #\/))
+    (('directory name _ ...)
+     (string-trim name #\/))))
+
+(define (transformations->expression transformations)
+  (define (sanitize path-fragment)
+    (escape-special-chars
+     ;; GNU tar strips the leading slash off of absolute paths before applying
+     ;; the transformations, so we need to do the same, or else our
+     ;; replacements won't match any paths.
+     (string-trim path-fragment #\/)
+     ;; Escape the basic regexp special characters (see: "(sed) BRE syntax").
+     ;; We also need to escape "/" because we use it as a delimiter.
+     "/*.^$[]\\"
+     #\\))
+
+  (define transformation->replacement
+    (match-lambda
+      ((old '-> new)
+       ;; See "(tar) transform" for details on the expression syntax.
+       (string-append "s/^" (sanitize old) "/" (sanitize new) "/"))))
+
+  (let ((replacements (map transformation->replacement transformations)))
+    (string-append
+     ;; Avoid transforming link targets, since that would break some links
+     ;; (e.g., symlinks that point to an absolute store path).
+     "flags=rSH;"
+     (string-join replacements ";")
+     ;; Some paths might still have a leading path delimiter even after tar
+     ;; transforms them (e.g., "/a/b" might be transformed into "/b"), so
+     ;; strip any leading path delimiters that remain.
+     ";s,^//*,,")))
+
+;; (define* (build-docker-image image paths prefix
+;;                              #:key
+;;                              (repository "guix")
+;;                              (extra-files '())
+;;                              (transformations '())
+;;                              (system (utsname:machine (uname)))
+;;                              database
+;;                              entry-point
+;;                              (environment '())
+;;                              compressor
+;;                              (creation-time (current-time time-utc)))
+;;   "Write to IMAGE a Docker image archive containing the given PATHS.  PREFIX
+;; must be a store path that is a prefix of any store paths in PATHS.  REPOSITORY
+;; is a descriptive name that will show up in \"REPOSITORY\" column of the output
+;; of \"docker images\".
+
+;; When DATABASE is true, copy it to /var/guix/db in the image and create
+;; /var/guix/gcroots and friends.
+
+;; When ENTRY-POINT is true, it must be a list of strings; it is stored as the
+;; entry point in the Docker image JSON structure.
+
+;; ENVIRONMENT must be a list of name/value pairs.  It specifies the environment
+;; variables that must be defined in the resulting image.
+
+;; EXTRA-FILES must be a list of directives for 'evaluate-populate-directive'
+;; describing non-store files that must be created in the image.
+
+;; TRANSFORMATIONS must be a list of (OLD -> NEW) tuples describing how to
+;; transform the PATHS.  Any path in PATHS that begins with OLD will be rewritten
+;; in the Docker image so that it begins with NEW instead.  If a path is a
+;; non-empty directory, then its contents will be recursively added, as well.
+
+;; SYSTEM is a GNU triplet (or prefix thereof) of the system the binaries in
+;; PATHS are for; it is used to produce metadata in the image.  Use COMPRESSOR, a
+;; command such as '(\"gzip\" \"-9n\"), to compress IMAGE.  Use CREATION-TIME, a
+;; SRFI-19 time-utc object, as the creation time in metadata."
+;;   (define transformation-options
+;;     (if (eq? '() transformations)
+;;         '()
+;;         `("--transform" ,(transformations->expression transformations))))
+;;   (let* ((directory "/tmp/docker-image") ;temporary working directory
+;;          (id (docker-id prefix))
+;;          (time (date->string (time-utc->date creation-time) "~4"))
+;;          (arch (let-syntax ((cond* (syntax-rules ()
+;;                                      ((_ (pattern clause) ...)
+;;                                       (cond ((string-prefix? pattern system)
+;;                                              clause)
+;;                                             ...
+;;                                             (else
+;;                                              (error "unsupported system"
+;;                                                     system)))))))
+;;                  (cond* ("x86_64" "amd64")
+;;                         ("i686"   "386")
+;;                         ("arm"    "arm")
+;;                         ("mips64" "mips64le")))))
+;;     ;; Make sure we start with a fresh, empty working directory.
+;;     (mkdir directory)
+;;     (with-directory-excursion directory
+;;       (mkdir id)
+;;       (with-directory-excursion id
+;;         (with-output-to-file "VERSION"
+;;           (lambda () (display schema-version)))
+;;         (with-output-to-file "json"
+;;           (lambda () (scm->json (image-description id time))))
+
+;;         ;; Create a directory for the non-store files that need to go into the
+;;         ;; archive.
+;;         (mkdir "extra")
+
+;;         (with-directory-excursion "extra"
+;;           ;; Create non-store files.
+;;           (for-each (cut evaluate-populate-directive <> "./")
+;;                     extra-files)
+
+;;           (when database
+;;             ;; Initialize /var/guix, assuming PREFIX points to a profile.
+;;             (install-database-and-gc-roots "." database prefix))
+
+;;           (apply invoke "tar" "-cf" "../layer.tar"
+;;                  `(,@transformation-options
+;;                    ,@%tar-determinism-options
+;;                    ,@paths
+;;                    ,@(scandir "."
+;;                               (lambda (file)
+;;                                 (not (member file '("." ".."))))))))
+
+;;         ;; It is possible for "/" to show up in the archive, especially when
+;;         ;; applying transformations.  For example, the transformation
+;;         ;; "s,^/a,," will (perhaps surprisingly) cause GNU tar to transform
+;;         ;; the path "/a" into "/".  The presence of "/" in the archive is
+;;         ;; probably benign, but it is definitely safe to remove it, so let's
+;;         ;; do that.  This fails when "/" is not in the archive, so use system*
+;;         ;; instead of invoke to avoid an exception in that case, and redirect
+;;         ;; stderr to the bit bucket to avoid "Exiting with failure status"
+;;         ;; error messages.
+;;         (with-error-to-port (%make-void-port "w")
+;;           (lambda ()
+;;             (system* "tar" "--delete" "/" "-f" "layer.tar")))
+
+;;         (delete-file-recursively "extra"))
+
+;;       (with-output-to-file "config.json"
+;;         (lambda ()
+;;           (scm->json (config (string-append id "/layer.tar")
+;;                              time arch
+;;                              #:environment environment
+;;                              #:entry-point entry-point))))
+;;       (with-output-to-file "manifest.json"
+;;         (lambda ()
+;;           (scm->json (manifest prefix id repository))))
+;;       (with-output-to-file "repositories"
+;;         (lambda ()
+;;           (scm->json (repositories prefix id repository)))))
+
+;;     (apply invoke "tar" "-cf" image "-C" directory
+;;            `(,@%tar-determinism-options
+;;              ,@(if compressor
+;;                    (list "-I" (string-join compressor))
+;;                    '())
+;;              "."))
+;;     (delete-file-recursively directory)))
diff --git a/guix/docker.scm b/guix/docker.scm
new file mode 100644
index 0000000000..47bc2e8f99
--- /dev/null
+++ b/guix/docker.scm
@@ -0,0 +1,246 @@
+;;; GNU Guix --- Functional package management for GNU
+;;; Copyright © 2020 Christopher Baines <mail@cbaines.net>
+;;;
+;;; This file is part of GNU Guix.
+;;;
+;;; GNU Guix is free software; you can redistribute it and/or modify it
+;;; under the terms of the GNU General Public License as published by
+;;; the Free Software Foundation; either version 3 of the License, or (at
+;;; your option) any later version.
+;;;
+;;; GNU Guix is distributed in the hope that it will be useful, but
+;;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;;; GNU General Public License for more details.
+;;;
+;;; You should have received a copy of the GNU General Public License
+;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.
+
+(define-module (guix docker)
+  #:use-module (srfi srfi-9)
+  #:use-module (srfi srfi-19)
+  #:use-module (ice-9 match)
+  #:use-module (guix gexp)
+  #:use-module (guix modules)
+  #:use-module (gnu packages base)
+  #:use-module (gnu packages compression)
+  #:use-module (gnu packages guile)
+  #:use-module (gnu packages gnupg)
+  #:export (docker-image-layer
+            docker-image-layer-name
+            docker-image-layer-store-paths
+            docker-image-layer-transformations
+            docker-image-layer-extra-files
+            docker-image-layer-extra-gexp
+
+            docker-image
+            docker-image-name
+            docker-image-layers
+            docker-image-repository
+            docker-image-entry-point
+            docker-image-environment
+            docker-image-compressor
+            docker-image-creation-time))
+
+
+(define-record-type <docker-image-layer>
+  (%docker-image-layer name store-paths transformations extra-files extra-gexp
+                       creation-time)
+  docker-image-layer?
+  (name               docker-image-layer-name)
+  (store-paths        docker-image-layer-store-paths)
+  (transformations    docker-image-layer-transformations)
+  (extra-files        docker-image-layer-extra-files)
+  (extra-gexp         docker-image-layer-extra-gexp)
+  (creation-time      docker-image-layer-creation-time))
+
+
+(define* (docker-image-layer name store-paths
+                             #:key (transformations '())
+                             (extra-files '()) extra-gexp
+                             (creation-time (make-time time-utc 0 1)))
+  (%docker-image-layer name store-paths transformations extra-files extra-gexp
+                       creation-time))
+
+(define-gexp-compiler (docker-image-layer-compiler (layer <docker-image-layer>)
+                                                   system target)
+  (match layer
+    (($ <docker-image-layer> name store-paths transformations
+                             extra-files extra-gexp creation-time)
+     (gexp->derivation
+      name
+      (with-extensions (list guile-json-3 ;for (guix build docker)
+                             guile-gcrypt)
+        (with-imported-modules `(,@(source-module-closure
+                                    '((guix build docker)
+                                      (guix build utils)
+                                      (guix build store-copy))))
+          #~(begin
+              (use-modules (srfi srfi-26)
+                           (ice-9 ftw)
+                           (json)
+                           (guix build utils)
+                           (guix build docker))
+
+              (let ((out #$output)
+                    (store-paths (list #$@store-paths))
+                    (transformations (list #$@transformations))
+                    (time #$(date->string (time-utc->date creation-time 0) "~4")))
+
+                (define transformation-options
+                  (if (null? transformations)
+                      '()
+                      `("--transform" ,(transformations->expression transformations))))
+
+                (define layer-id
+                  (docker-id out))
+
+                (mkdir out)
+                (with-directory-excursion out
+                  (with-output-to-file "VERSION"
+                    (lambda () (display schema-version)))
+                  (with-output-to-file "json"
+                    (lambda () (scm->json (image-description layer-id time))))
+
+                  ;; Create a directory for the non-store files that need to
+                  ;; go into the archive.
+                  (mkdir "extra")
+
+                  (with-directory-excursion "extra"
+                    ;; Create non-store files.
+                    (for-each (cut evaluate-populate-directive <> "./")
+                              (list #$@extra-files))
+
+                    (apply invoke #$(file-append tar "/bin/tar")
+                           "-cf" "../layer.tar"
+                           `(,@transformation-options
+                             ,@%tar-determinism-options
+                             ,@store-paths
+                             ,@(scandir "."
+                                        (lambda (file)
+                                          (not (member file '("." ".."))))))))
+
+                  ;; It is possible for "/" to show up in the archive,
+                  ;; especially when applying transformations.  For example,
+                  ;; the transformation "s,^/a,," will (perhaps surprisingly)
+                  ;; cause GNU tar to transform the path "/a" into "/".  The
+                  ;; presence of "/" in the archive is probably benign, but it
+                  ;; is definitely safe to remove it, so let's do that.  This
+                  ;; fails when "/" is not in the archive, so use system*
+                  ;; instead of invoke to avoid an exception in that case, and
+                  ;; redirect stderr to the bit bucket to avoid "Exiting with
+                  ;; failure status" error messages.
+                  (with-error-to-port (%make-void-port "w")
+                    (lambda ()
+                      (system* #$(file-append tar "/bin/tar")
+                               "--delete" "/" "-f" "layer.tar")))
+
+                  (delete-file-recursively "extra"))))))
+      #:system system
+      #:target target))))
+
+
+(define-record-type <docker-image>
+  (%docker-image name layers repository entry-point
+                 environment compressor creation-time)
+  docker-image?
+  (name               docker-image-name)
+  (layers             docker-image-layers)
+  (repository         docker-image-repository)
+  (entry-point        docker-image-entry-point)
+  (environment        docker-image-environment)
+  (compressor         docker-image-compressor)
+  (creation-time      docker-image-creation-time))
+
+(define* (docker-image name layers
+                       #:key
+                       (repository "guix")
+                       entry-point
+                       (environment '())
+                       (compressor
+                        #~(#+(file-append gzip "/bin/gzip") "-9n"))
+                       (creation-time (make-time time-utc 0 1)))
+
+  (%docker-image name layers repository entry-point
+                 environment compressor creation-time))
+
+(define-gexp-compiler (docker-image-compiler (image <docker-image>)
+                                             system target)
+  (match image
+    (($ <docker-image> name layers repository entry-point
+                       environment compressor creation-time)
+     (gexp->derivation
+      name
+      (with-extensions (list guile-json-3 ;for (guix build docker)
+                             guile-gcrypt)
+        (with-imported-modules `(,@(source-module-closure
+                                    '((guix build docker)
+                                      (guix build utils)
+                                      (guix build store-copy))))
+          #~(begin
+              (use-modules (srfi srfi-1)
+                           (srfi srfi-26)
+                           (ice-9 ftw)
+                           (json)
+                           (guix build utils)
+                           (guix build docker))
+              (let* ((out #$output)
+                     (directory "/tmp/docker-image") ;temporary working directory
+                     (id (docker-id out))
+                     (repository #$repository)
+                     (time #$(date->string (time-utc->date creation-time 0) "~4"))
+                     (arch (let-syntax ((cond* (syntax-rules ()
+                                                 ((_ (pattern clause) ...)
+                                                  (cond ((string-prefix? pattern #$system)
+                                                         clause)
+                                                        ...
+                                                        (else
+                                                         (error "unsupported system"
+                                                                system)))))))
+                             (cond* ("x86_64" "amd64")
+                                    ("i686"   "386")
+                                    ("arm"    "arm")
+                                    ("mips64" "mips64le"))))
+                     (layers (list #$@ layers))
+                     (layer-docker-ids
+                      (map docker-id layers))
+                     (compressor
+                      (list #$@compressor)))
+
+                ;; Make sure we start with a fresh, empty working directory.
+                (mkdir directory)
+                (with-directory-excursion directory
+                  (for-each symlink
+                            layers
+                            layer-docker-ids)
+
+                  (with-output-to-file "config.json"
+                    (lambda ()
+                      (scm->json (config (map (lambda (id)
+                                                (string-append id "/layer.tar"))
+                                              layer-docker-ids)
+                                         time arch
+                                         #:environment '#$environment
+                                         #$@(if entry-point
+                                                #~(#:entry-point
+                                                   (list #$@entry-point))
+                                                '())))))
+                  (with-output-to-file "manifest.json"
+                    (lambda ()
+                      (scm->json (manifest layer-docker-ids repository))))
+                  (with-output-to-file "repositories"
+                    (lambda ()
+                      (scm->json (repositories (last layer-docker-ids)
+                                               repository)))))
+
+                (apply invoke
+                       #$(file-append tar "/bin/tar")
+                       "-cf" out
+                       "--dereference" ;; to follow the layer symlinks
+                       "-C" directory
+                       `(,@%tar-determinism-options
+                         ,@(if compressor
+                               (list "-I" (string-join compressor))
+                               '())
+                         "."))
+                (delete-file-recursively directory)))))))))
diff --git a/guix/scripts/pack.scm b/guix/scripts/pack.scm
index ee0395ea00..a9e9e7a415 100644
--- a/guix/scripts/pack.scm
+++ b/guix/scripts/pack.scm
@@ -37,6 +37,7 @@
   #:use-module (guix packages)
   #:use-module (guix profiles)
   #:use-module (guix describe)
+  #:use-module (guix docker)
   #:use-module (guix derivations)
   #:use-module (guix search-paths)
   #:use-module (guix build-system gnu)
@@ -58,7 +59,7 @@
   #:export (compressor?
             lookup-compressor
             self-contained-tarball
-            docker-image
+            docker-image-pack
             squashfs-image
 
             guix-pack))
@@ -482,14 +483,14 @@ added to the pack."
                     build
                     #:references-graphs `(("profile" ,profile))))
 
-(define* (docker-image name profile
-                       #:key target
-                       (profile-name "guix-profile")
-                       (compressor (first %compressors))
-                       entry-point
-                       localstatedir?
-                       (symlinks '())
-                       (archiver tar))
+(define* (docker-image-pack name profile
+                            #:key target
+                            (profile-name "guix-profile")
+                            (compressor (first %compressors))
+                            entry-point
+                            localstatedir?
+                            (symlinks '())
+                            archiver)   ; not sure why this is needed
   "Return a derivation to construct a Docker image of PROFILE.  The
 image is a tarball conforming to the Docker Image Specification, compressed
 with COMPRESSOR.  It can be passed to 'docker load'.  If TARGET is true, it
@@ -500,83 +501,84 @@ the image."
          (file-append (store-database (list profile))
                       "/db/db.sqlite")))
 
-  (define defmod 'define-module)                  ;trick Geiser
-
-  (define build
-    ;; Guile-JSON and Guile-Gcrypt are required by (guix build docker).
-    (with-extensions (list guile-json-3 guile-gcrypt)
-      (with-imported-modules `(((guix config) => ,(make-config.scm))
-                               ,@(source-module-closure
-                                  `((guix build docker)
-                                    (guix build store-copy)
-                                    (guix profiles)
-                                    (guix search-paths))
-                                  #:select? not-config?))
-        #~(begin
-            (use-modules (guix build docker) (guix build store-copy)
-                         (guix profiles) (guix search-paths)
-                         (srfi srfi-1) (srfi srfi-19)
-                         (ice-9 match))
-
-            (define environment
-              (map (match-lambda
-                     ((spec . value)
-                      (cons (search-path-specification-variable spec)
-                            value)))
-                   (profile-search-paths #$profile)))
-
-            (define symlink->directives
-              ;; Return "populate directives" to make the given symlink and its
-              ;; parent directories.
-              (match-lambda
-                ((source '-> target)
-                 (let ((target (string-append #$profile "/" target))
-                       (parent (dirname source)))
-                   `((directory ,parent)
-                     (,source -> ,target))))))
-
-            (define directives
-              ;; Create a /tmp directory, as some programs expect it, and
-              ;; create SYMLINKS.
-              `((directory "/tmp" ,(getuid) ,(getgid) #o1777)
-                ,@(append-map symlink->directives '#$symlinks)))
-
-            (define tag
-              ;; Compute a meaningful "repository" name, which will show up in
-              ;; the output of "docker images".
-              (let ((manifest (profile-manifest #$profile)))
-                (let loop ((names (map manifest-entry-name
-                                       (manifest-entries manifest))))
-                  (define str (string-join names "-"))
-                  (if (< (string-length str) 40)
-                      str
-                      (match names
-                        ((_) str)
-                        ((names ... _) (loop names))))))) ;drop one entry
-
-            (setenv "PATH" (string-append #$archiver "/bin"))
-
-            (build-docker-image #$output
-                                (map store-info-item
-                                     (call-with-input-file "profile"
-                                       read-reference-graph))
-                                #$profile
-                                #:repository tag
-                                #:database #+database
-                                #:system (or #$target (utsname:machine (uname)))
-                                #:environment environment
-                                #:entry-point
-                                #$(and entry-point
-                                       #~(list (string-append #$profile "/"
-                                                              #$entry-point)))
-                                #:extra-files directives
-                                #:compressor '#$(compressor-command compressor)
-                                #:creation-time (make-time time-utc 0 1))))))
+  (define symlink->directives
+    ;; Return "populate directives" to make the given symlink and its
+    ;; parent directories.
+    (match-lambda
+      ((source '-> target)
+       (let ((target (string-append profile "/" target))
+             (parent (dirname source)))
+         `((directory ,parent)
+           (,source -> ,target))))))
+
+  (define directives
+    ;; Create a /tmp directory, as some programs expect it, and
+    ;; create SYMLINKS.
+    `((directory "/tmp" ,(getuid) ,(getgid) #o1777)
+      ,@(append-map symlink->directives symlinks)))
+
+  (define tag
+    ;; Compute a meaningful "repository" name, which will show up in
+    ;; the output of "docker images".
+    (let* ((built-profile
+            (with-store store
+              (let ((output
+                     (build-derivations store (list profile)))
+                    (path
+                     (derivation-output-path
+                      (match (derivation-outputs profile)
+                        (((name . derivation-output))
+                         derivation-output)))))
+                path)))
+           (manifest (profile-manifest built-profile)))
+      (let loop ((names (map manifest-entry-name
+                             (manifest-entries manifest))))
+        (define str (string-join names "-"))
+        (if (< (string-length str) 40)
+            str
+            (match names
+              ((_) str)
+              ((names ... _) (loop names))))))) ;drop one entry
 
-  (gexp->derivation (string-append name ".tar"
-                                   (compressor-extension compressor))
-                    build
-                    #:references-graphs `(("profile" ,profile))))
+  (define environment
+    (map (match-lambda
+           ((spec . value)
+            (cons (search-path-specification-variable spec)
+                  value)))
+         (profile-search-paths
+          (with-store store
+            (let ((output
+                   (build-derivations store (list profile)))
+                  (path
+                   (derivation-output-path
+                    (match (derivation-outputs profile)
+                      (((name . derivation-output))
+                       derivation-output)))))
+              path)))))
+
+  (lower-object
+   (docker-image
+    (string-append name ".tar"
+                   (compressor-extension compressor))
+    (list (docker-image-layer
+           "pack-docker-image-layer"
+           (with-store store
+             (let ((output
+                    (build-derivations store (list profile)))
+                   (path
+                    (derivation-output-path
+                     (match (derivation-outputs profile)
+                       (((name . derivation-output))
+                        derivation-output)))))
+               (requisites store (list path))))
+           ;;#:extra-files directives
+           ))
+    #:repository tag
+    #:environment environment
+    #:entry-point (and entry-point
+                       #~(list (string-append #$profile "/"
+                                              #$entry-point)))
+    #:compressor (compressor-command compressor))))
 
 \f
 ;;;
@@ -793,7 +795,7 @@ last resort for relocation."
   ;; Supported pack formats.
   `((tarball . ,self-contained-tarball)
     (squashfs . ,squashfs-image)
-    (docker  . ,docker-image)))
+    (docker  . ,docker-image-pack)))
 
 (define (show-formats)
   ;; Print the supported pack formats.
@@ -1016,7 +1018,7 @@ Create a bundle of PACKAGE.\n"))
         (else
          (packages->manifest packages))))))
 
-  (with-error-handling
+  ;; (with-error-handling
     (with-store store
       (with-status-verbosity (assoc-ref opts 'verbosity)
         ;; Set the build options before we do anything else.
@@ -1126,4 +1128,4 @@ to your package list.")))
                                       gc-root))
                     (return (format #t "~a~%"
                                     (derivation->output-path drv))))))
-              #:system (assoc-ref opts 'system))))))))
+              #:system (assoc-ref opts 'system)))))))
-- 
2.25.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 3/3] Generate two layers for docker images in guix pack
  2020-03-21 23:24 Experiment in generating multi-layer Docker images with guix pack Christopher Baines
  2020-03-21 23:24 ` [PATCH 1/3] Rename (guix docker) to (guix build docker) Christopher Baines
  2020-03-21 23:24 ` [PATCH 2/3] Make guix pack work with the new docker image gexpressions Christopher Baines
@ 2020-03-21 23:24 ` Christopher Baines
  2020-03-26 12:03 ` Experiment in generating multi-layer Docker images with " Ludovic Courtès
  3 siblings, 0 replies; 7+ messages in thread
From: Christopher Baines @ 2020-03-21 23:24 UTC (permalink / raw)
  To: guix-devel

Split the store items in to two layers, the top layer with the profile and the
store items it directly references, and the bottom layer with the rest of the
store items.

This means that when you use pack and slightly vary the packages being packed,
for example by changing the versions or sources, the base layer will be
unchanged.
---
 guix/scripts/pack.scm | 38 +++++++++++++++++++++++++-------------
 1 file changed, 25 insertions(+), 13 deletions(-)

diff --git a/guix/scripts/pack.scm b/guix/scripts/pack.scm
index a9e9e7a415..698af73d28 100644
--- a/guix/scripts/pack.scm
+++ b/guix/scripts/pack.scm
@@ -560,19 +560,31 @@ the image."
    (docker-image
     (string-append name ".tar"
                    (compressor-extension compressor))
-    (list (docker-image-layer
-           "pack-docker-image-layer"
-           (with-store store
-             (let ((output
-                    (build-derivations store (list profile)))
-                   (path
-                    (derivation-output-path
-                     (match (derivation-outputs profile)
-                       (((name . derivation-output))
-                        derivation-output)))))
-               (requisites store (list path))))
-           ;;#:extra-files directives
-           ))
+    (with-store store
+      (let* ((output
+              (build-derivations store (list profile)))
+             (path
+              (derivation-output-path
+               (match (derivation-outputs profile)
+                 (((name . derivation-output))
+                  derivation-output))))
+             (refs
+              (references store path))
+             (reqs
+              (requisites store (list path))))
+
+        (list (docker-image-layer
+               "pack-docker-image-layer"
+               (sort (filter (lambda (path)
+                               (not (member path refs)))
+                             reqs)
+                     string<?))
+              (docker-image-layer
+               "pack-docker-image-layer"
+               (sort (filter (lambda (path)
+                               (member path refs))
+                             reqs)
+                     string<?)))))
     #:repository tag
     #:environment environment
     #:entry-point (and entry-point
-- 
2.25.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: Experiment in generating multi-layer Docker images with guix pack
  2020-03-21 23:24 Experiment in generating multi-layer Docker images with guix pack Christopher Baines
                   ` (2 preceding siblings ...)
  2020-03-21 23:24 ` [PATCH 3/3] Generate two layers for docker images in guix pack Christopher Baines
@ 2020-03-26 12:03 ` Ludovic Courtès
  2020-03-26 20:15   ` Christopher Baines
  3 siblings, 1 reply; 7+ messages in thread
From: Ludovic Courtès @ 2020-03-26 12:03 UTC (permalink / raw)
  To: Christopher Baines; +Cc: guix-devel

Hello Chris,

Christopher Baines <mail@cbaines.net> skribis:

> These patches are very rough, and not ready, but do at least work in some
> limited capacity. I've been testing with the following commands:
>
>   guix pack --format=docker guile@2.2.6
>   guix pack --format=docker guile@2.2.7
>
> With the previous Docker image generation implementation, two different ~130MB
> images would be generated. These patches mean that each .tar.gz file generated
> by guix pack contains a ~53MB layer which contains the profile and directly
> referenced store items, and then a ~77MB layer with all the other store items
> which is identical for both the 2.2.6 and 2.2.7 pack file.

Nice!

> I think it could be useful to support multiple different strategies for
> generating layers for Docker images, with different trade-offs. This approach
> using two layers should make the resulting images more efficient to use in the
> case where like the guile example above, where the packages you run guix pack
> with have exactly matching inputs.

Did you read <https://grahamc.com/blog/nix-and-layered-docker-images>?
They came up with a pretty smart algorithm that would be worth copying.

> This could often be the case if you're developing an application, packaging it
> with Guix, then using guix pack to generate a Docker image which you
> deploy. With the single layer approach, if you change the application code,
> you'll get an entirely different image. I haven't tried this out, but my hope
> is that by generating a common base layer, if you change the application code
> only the top layer of the Docker image will change, meaning you'll only have
> to deploy that, rather than having to deploy the entire image. If you're
> deploying the images across a network, having less data to send around can
> save time, and reduce the amount of space required to store the images.

Definitely.

> As well as these behaviour changes, these patches also modify the
> implementation. Rather than having some build side code that's used in the
> pack and vm module gexpressions, these patches introduce two new record types:
> <docker-image-layer> and <docker-image>. This at least structures the
> derivations so that each layer is represented by a derivation, and then
> there's a derivation for the image itself, which is a little more efficient in
> terms of computation.

Nice.

I think a layering algorithm like Graham Christensen’s above requires
knowledge of the reference graph, meaning that layering can only be
computed on the build side, using #:references-graphs.  In that case, it
could be that you can’t have a host-side <docker-image-layer> record.

> What do people think about generating multi-layer images, and using record
> types to represent the layers and image?

I think multi-layering is something we should definitely have, and
record for at least the image are a good idea.  :-)

Thanks for looking into this!

Ludo’.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Experiment in generating multi-layer Docker images with guix pack
  2020-03-26 12:03 ` Experiment in generating multi-layer Docker images with " Ludovic Courtès
@ 2020-03-26 20:15   ` Christopher Baines
  2020-03-29 14:50     ` Ludovic Courtès
  0 siblings, 1 reply; 7+ messages in thread
From: Christopher Baines @ 2020-03-26 20:15 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel

[-- Attachment #1: Type: text/plain, Size: 3859 bytes --]


Ludovic Courtès <ludo@gnu.org> writes:

> Christopher Baines <mail@cbaines.net> skribis:
>
>> I think it could be useful to support multiple different strategies for
>> generating layers for Docker images, with different trade-offs. This approach
>> using two layers should make the resulting images more efficient to use in the
>> case where like the guile example above, where the packages you run guix pack
>> with have exactly matching inputs.
>
> Did you read <https://grahamc.com/blog/nix-and-layered-docker-images>?
> They came up with a pretty smart algorithm that would be worth copying.

I'm aware of it, but I haven't read it in detail yet.

>> As well as these behaviour changes, these patches also modify the
>> implementation. Rather than having some build side code that's used in the
>> pack and vm module gexpressions, these patches introduce two new record types:
>> <docker-image-layer> and <docker-image>. This at least structures the
>> derivations so that each layer is represented by a derivation, and then
>> there's a derivation for the image itself, which is a little more efficient in
>> terms of computation.
>
> Nice.
>
> I think a layering algorithm like Graham Christensen’s above requires
> knowledge of the reference graph, meaning that layering can only be
> computed on the build side, using #:references-graphs.  In that case, it
> could be that you can’t have a host-side <docker-image-layer> record.

As I understand it, you only have to do the computation on the build
side if you're restricted to doing a single set of builds. If you first
build the store items you want to put in the image, then look at there
references and compute the derivation for building the image, then you
could do this kind of computation on the client side.

But yeah, this is important to work out, as how image generation should
work, and what behaviours we want should define the structure of the
code.

I went with records to represent layers partially because I'm familiar
with it, but also because it allows for easier manipulation of layers on
the client side. Representing different layers as different derivations
also allows them to potentially be built in parallel, although I'm not
sure how beneficial this might be.

Related to this, at the moment Docker V1 images can be generated, it
would be good in the future to also support Docker V2 images and OCI
images. All three container formats use a layered approach to managing
the files, but they are all different (as far as I'm aware).

In my mind there are three architectural approaches:

 - Image generation entirely on the build side

   - The layers and the image are constructed through one derivation
   - The code for building images is in a module available at build time
   - Different approaches for layering are implemented in the module
     available at build time, and parameters are passed in as
     data/gexpressions

 - Image generation entirely on the client side

   - Each layer is a derivation, and the image is an additional
     derivation that takes the layers as an input
   - The code for building images is inside gexp compilers for the
     record types representing the images and layers
   - Different approaches for layering manipulate the layer records on
     the client side

 - Image generation can be done both build and client side

   - Depending on the parameters, the layers and image can be a single
     derivation, or one for each layer, and another for the image
   - The code for building images is in a module available at build
     time, and this is also used by gexp compilers
   - Different approaches for layering have the option of either being
     on the build side, or the client side

What are peoples thoughts?

Thanks,

Chris

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 962 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Experiment in generating multi-layer Docker images with guix pack
  2020-03-26 20:15   ` Christopher Baines
@ 2020-03-29 14:50     ` Ludovic Courtès
  0 siblings, 0 replies; 7+ messages in thread
From: Ludovic Courtès @ 2020-03-29 14:50 UTC (permalink / raw)
  To: Christopher Baines; +Cc: guix-devel

Hi Chris,

Christopher Baines <mail@cbaines.net> skribis:

[...]

>> I think a layering algorithm like Graham Christensen’s above requires
>> knowledge of the reference graph, meaning that layering can only be
>> computed on the build side, using #:references-graphs.  In that case, it
>> could be that you can’t have a host-side <docker-image-layer> record.
>
> As I understand it, you only have to do the computation on the build
> side if you're restricted to doing a single set of builds. If you first
> build the store items you want to put in the image, then look at there
> references and compute the derivation for building the image, then you
> could do this kind of computation on the client side.
>
> But yeah, this is important to work out, as how image generation should
> work, and what behaviours we want should define the structure of the
> code.
>
> I went with records to represent layers partially because I'm familiar
> with it, but also because it allows for easier manipulation of layers on
> the client side. Representing different layers as different derivations
> also allows them to potentially be built in parallel, although I'm not
> sure how beneficial this might be.

That’s a good point, it could help.  We could also use a “dynamic
dependency” like for grafts so we can compute things on the host side
anyway (tempting, but we should probably not start using that
everywhere!).

> Related to this, at the moment Docker V1 images can be generated, it
> would be good in the future to also support Docker V2 images and OCI
> images. All three container formats use a layered approach to managing
> the files, but they are all different (as far as I'm aware).

Oh, I thought these formats were all the same.  I suppose it’d be enough
to support OCI, right?

> In my mind there are three architectural approaches:
>
>  - Image generation entirely on the build side
>
>    - The layers and the image are constructed through one derivation
>    - The code for building images is in a module available at build time
>    - Different approaches for layering are implemented in the module
>      available at build time, and parameters are passed in as
>      data/gexpressions
>
>  - Image generation entirely on the client side
>
>    - Each layer is a derivation, and the image is an additional
>      derivation that takes the layers as an input
>    - The code for building images is inside gexp compilers for the
>      record types representing the images and layers
>    - Different approaches for layering manipulate the layer records on
>      the client side
>
>  - Image generation can be done both build and client side
>
>    - Depending on the parameters, the layers and image can be a single
>      derivation, or one for each layer, and another for the image
>    - The code for building images is in a module available at build
>      time, and this is also used by gexp compilers
>    - Different approaches for layering have the option of either being
>      on the build side, or the client side
>
> What are peoples thoughts?

From a pragmatic standpoint, perhaps we can first integrate what you
propose (option #2), and later adjust the code towards #1 or #3 as we
see fit.

WDYT?

Thanks,
Ludo’.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-03-29 14:50 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-21 23:24 Experiment in generating multi-layer Docker images with guix pack Christopher Baines
2020-03-21 23:24 ` [PATCH 1/3] Rename (guix docker) to (guix build docker) Christopher Baines
2020-03-21 23:24 ` [PATCH 2/3] Make guix pack work with the new docker image gexpressions Christopher Baines
2020-03-21 23:24 ` [PATCH 3/3] Generate two layers for docker images in guix pack Christopher Baines
2020-03-26 12:03 ` Experiment in generating multi-layer Docker images with " Ludovic Courtès
2020-03-26 20:15   ` Christopher Baines
2020-03-29 14:50     ` Ludovic Courtès

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).