From mboxrd@z Thu Jan  1 00:00:00 1970
From: Ludovic Courtès <ludo@gnu.org>
Subject: Re: Experiment in generating multi-layer Docker images with guix pack
Date: Thu, 26 Mar 2020 13:03:15 +0100
Message-ID: <87zhc39vcs.fsf@gnu.org>
References: <20200321232428.31832-1-mail@cbaines.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
In-Reply-To: <20200321232428.31832-1-mail@cbaines.net> (Christopher Baines's
 message of "Sat, 21 Mar 2020 23:24:25 +0000")
List-Id: "Development of GNU Guix and the GNU System distribution."
 <guix-devel.gnu.org>
To: Christopher Baines <mail@cbaines.net>
Cc: guix-devel@gnu.org

Hello Chris,

Christopher Baines <mail@cbaines.net> skribis:

> These patches are very rough, and not ready, but do at least work in some
> limited capacity. I've been testing with the following commands:
>
>   guix pack --format=docker guile@2.2.6
>   guix pack --format=docker guile@2.2.7
>
> With the previous Docker image generation implementation, two different ~130MB
> images would be generated. These patches mean that each .tar.gz file generated
> by guix pack contains a ~53MB layer which contains the profile and directly
> referenced store items, and then a ~77MB layer with all the other store items
> which is identical for both the 2.2.6 and 2.2.7 pack file.

Nice!
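
To check that I read the split correctly, here is a rough sketch of the
two-layer idea in Python.  It is only an illustration under my own
assumptions, not the code from your patches: the graph is a made-up
dictionary mapping each store item to its direct references, and the
names `closure` and `two_layers` are hypothetical.

```python
# Hypothetical sketch: split a closure into a "top" and a "base" layer.
# GRAPH maps each store item to the items it references directly.

def closure(graph, roots):
    """Return every item reachable from ROOTS, roots included."""
    seen = set()
    stack = list(roots)
    while stack:
        item = stack.pop()
        if item not in seen:
            seen.add(item)
            stack.extend(graph.get(item, ()))
    return seen

def two_layers(graph, profile):
    """Top layer: the profile plus the items it references directly.
    Base layer: everything else in the profile's closure."""
    top = {profile} | set(graph.get(profile, ()))
    base = closure(graph, [profile]) - top
    return sorted(top), sorted(base)

# Made-up graphs for two package versions with identical inputs.
graph_a = {
    "profile": ["guile-2.2.6"],
    "guile-2.2.6": ["glibc", "libgc"],
    "libgc": ["glibc"],
    "glibc": [],
}
graph_b = {
    "profile": ["guile-2.2.7"],
    "guile-2.2.7": ["glibc", "libgc"],
    "libgc": ["glibc"],
    "glibc": [],
}

top_a, base_a = two_layers(graph_a, "profile")
top_b, base_b = two_layers(graph_b, "profile")
```

With these toy graphs the two top layers differ while the base layers
are the same, which is the sharing you describe for guile@2.2.6 and
guile@2.2.7.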

> I think it could be useful to support multiple different strategies for
> generating layers for Docker images, with different trade-offs. This approach
> using two layers should make the resulting images more efficient to use in the
> case where like the guile example above, where the packages you run guix pack
> with have exactly matching inputs.

Did you read <https://grahamc.com/blog/nix-and-layered-docker-images>?
They came up with a pretty smart algorithm that would be worth copying.

> This could often be the case if you're developing an application, packaging it
> with Guix, then using guix pack to generate a Docker image which you
> deploy. With the single layer approach, if you change the application code,
> you'll get an entirely different image. I haven't tried this out, but my hope
> is that by generating a common base layer, if you change the application code
> only the top layer of the Docker image will change, meaning you'll only have
> to deploy that, rather than having to deploy the entire image. If you're
> deploying the images across a network, having less data to send around can
> save time, and reduce the amount of space required to store the images.

Definitely.

> As well as these behaviour changes, these patches also modify the
> implementation. Rather than having some build side code that's used in the
> pack and vm module gexpressions, these patches introduce two new record types:
> <docker-image-layer> and <docker-image>. This at least structures the
> derivations so that each layer is represented by a derivation, and then
> there's a derivation for the image itself, which is a little more efficient in
> terms of computation.

Nice.

I think a layering algorithm like Graham Christensen’s above requires
knowledge of the reference graph, meaning that layering can only be
computed on the build side, using #:references-graphs.  In that case, it
could be that you can’t have a host-side <docker-image-layer> record.
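
To illustrate why the whole graph is needed, here is a simplified Python
sketch of a popularity-based heuristic in the spirit of that blog post.
It is my own approximation and not a reproduction of the published
algorithm: the function names and the `max_layers` parameter are
hypothetical, and the dictionary stands in for whatever the build side
would read from the reference graph.

```python
# Hypothetical sketch: give the most widely shared items their own layer.
# GRAPH maps each store item to the items it references directly.

def closure(graph, roots):
    """Return every item reachable from ROOTS, roots included."""
    seen = set()
    stack = list(roots)
    while stack:
        item = stack.pop()
        if item not in seen:
            seen.add(item)
            stack.extend(graph.get(item, ()))
    return seen

def popularity(graph):
    """Count how many other items depend on each item, directly or not."""
    counts = {item: 0 for item in graph}
    for item in graph:
        for dep in closure(graph, [item]) - {item}:
            counts[dep] = counts.get(dep, 0) + 1
    return counts

def layers(graph, max_layers):
    """Put the most popular items in layers of their own and lump the
    remaining items into one final layer (assumes max_layers >= 1)."""
    counts = popularity(graph)
    # Most popular first; ties are broken by name to stay deterministic.
    ranked = sorted(counts, key=lambda item: (-counts[item], item))
    if len(ranked) <= max_layers:
        return [[item] for item in ranked]
    own = ranked[:max_layers - 1]
    rest = sorted(ranked[max_layers - 1:])
    return [[item] for item in own] + [rest]

# Made-up graph, for illustration only.
graph = {
    "profile": ["guile"],
    "guile": ["glibc", "libgc"],
    "libgc": ["glibc"],
    "glibc": [],
}

result = layers(graph, 3)
```

Here `result` is `[["glibc"], ["libgc"], ["guile", "profile"]]`: the
items that the most others depend on end up in their own layers, which
can only be decided once the full graph is known.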

> What do people think about generating multi-layer images, and using record
> types to represent the layers and image?

I think multi-layering is something we should definitely have, and
records for at least the image are a good idea.  :-)

Thanks for looking into this!

Ludo’.