From: Stephen Scheck <singularsyntax@gmail.com>
To: Chris Marusich <cmmarusich@gmail.com>
Cc: help-guix <help-guix@gnu.org>
Subject: Re: Guix Docker image inflation
Date: Sun, 31 May 2020 13:50:06 -0400
Message-ID: <CAKjnHz0ngmPC3dYu6czZTLAbXMyw7HfXiLr4fhdTyDiUo42vdA@mail.gmail.com>
In-Reply-To: <87367glo7c.fsf@gmail.com>

On Sun, May 31, 2020 at 12:31 AM Chris Marusich <cmmarusich@gmail.com>
wrote:

> > Also, layers are helpful in the case of someone pulling down daily
> > Guix Docker images on a frequent basis, because then only the new,
> > ideally small layers need to be downloaded, whereas if you rebase for
> > every image build, you'd have to download the entire image every day.
>
> That is true, but suppose I have the following 3 images:
>
> - Image A: A base image created in January 2020.
> - Image B: Based on A, and I ran "guix pull" in February 2020.
> - Image C: Based on A, and I ran "guix pull" in June 2020.
>
> I would guess that the size difference between A and B is approximately
> the same as the difference between A and C.  It'll be different, of
> course, but generally the size difference between A and C should not
> grow linearly with time, since "guix pull" is only going to install at
> most the total closure of things necessary to build and run Guix, which
> doesn't increase much in size as time goes on.  However, when you
> daisy-chain the images every day, the image size will grow linearly with
> time because the contents of all the previous layers are carried forward.
>
> > My build script issues several `docker exec <container> <command>`
> > sequences, followed by a `docker commit <container>`. Intermediate
> > changes to the container file system prior to the commit do not
> > generate layers, only the net changes after the commit.
>
> There are two problems here.  One is that the image size grows without
> bound.  The other is that guix-daemon is failing to GC store items in
> the Docker container.  Although they are both concerning, the latter is
> not the cause of the former.
>
> If you install new store items (e.g., via "guix pull"), make them dead,
> and then GC them, all in the same container before running "docker
> commit", then I agree: those GC'd store items would not persist in a
> layer anywhere.  However, I don't think that's what's happening here.
> Sure, there might be a few store items like this, but in practice, there
> will be many store items from the previous image which began live but
> became dead when you ran "guix pull" and deleted your old profile
> generations.  It is those store items that are adding the most space to
> your image.
>
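
To make the distinction in the quoted paragraph concrete, here is a minimal
sketch of how layered images account for space. It is a toy model, not
Docker's actual storage driver: the `commit`, `image_size`, and
`visible_files` helpers, the store paths, and the sizes in MB are all
hypothetical. It assumes only that a commit records the net change since the
previous image, and that a deletion hides a file from a lower layer without
reclaiming its space.

```python
def commit(before, after):
    """Build a layer from the net filesystem change between two snapshots.

    Files created and removed between the snapshots never appear in
    `after`, so they leave no trace in the layer.
    """
    added = {path: size for path, size in after.items()
             if before.get(path) != size}
    deleted = [path for path in before if path not in after]
    return {"added": added, "deleted": deleted}


def image_size(layers):
    """Stored size: bytes added by every layer, even if deleted later."""
    return sum(sum(layer["added"].values()) for layer in layers)


def visible_files(layers):
    """Files seen inside a container started from the top layer."""
    files = {}
    for layer in layers:
        files.update(layer["added"])
        for path in layer["deleted"]:
            files.pop(path, None)
    return files


# Hypothetical base image holding one 300 MB store item.
base = {"/gnu/store/guix-old": 300}
layers = [commit({}, base)]

# After a pull and GC in the next container: the old item is dead and
# collected, and a new 320 MB item is live. Any temporary build output
# that was also collected before the commit is simply absent here.
after_pull = {"/gnu/store/guix-new": 320}
layers.append(commit(base, after_pull))

stored = image_size(layers)                   # counts both layers
visible = sum(visible_files(layers).values())  # counts only live files
```

In this model the container only sees the new item, but the image still
stores the old one in the lower layer, which is the effect described above.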

Yes, I get this. I never expected the container to stay constant in size, but
I was hoping daily pulls would result in relatively low image growth. It's not
clear to me whether the items that should get GC'd but don't are just
ephemeral build results; if they are, growth might be tolerable with an
occasional rebase (perhaps monthly or bi-monthly).
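
As a rough way to reason about that trade-off, here is a small Python sketch
comparing daisy-chained daily commits with an occasional rebase. The function
name and all numbers are hypothetical, and the model assumes each daily commit
adds a layer of constant size, which real pulls will not do exactly.

```python
def simulate_image_size(days, base_mb, daily_layer_mb, rebase_every=None):
    """Return the modeled image size in MB at the end of each day.

    Each day commits one new layer of `daily_layer_mb` on top of the
    previous image. If `rebase_every` is set, every `rebase_every`-th day
    the image is instead rebuilt from the base image plus a single pull,
    which discards the accumulated layers.
    """
    sizes = []
    accumulated = 0
    for day in range(1, days + 1):
        if rebase_every and day % rebase_every == 0:
            accumulated = daily_layer_mb  # fresh build: base plus one pull
        else:
            accumulated += daily_layer_mb  # one more layer carried forward
        sizes.append(base_mb + accumulated)
    return sizes


# Hypothetical figures: 1000 MB base image, 50 MB of new layer per day.
chained = simulate_image_size(60, base_mb=1000, daily_layer_mb=50)
monthly = simulate_image_size(60, base_mb=1000, daily_layer_mb=50,
                              rebase_every=30)
```

Under these assumptions the chained image grows linearly without bound, while
the rebased one is capped by the rebase interval. Whether that cap is
tolerable depends on how large the real daily layers turn out to be.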

But I'm now starting to doubt my whole approach because it seems like
there are some fundamental GC problems with running a live Guix system
inside a container.


> Besides store items, I noticed two other things about your images:
>
> - The contents of /var is growing slowly without bound, but it isn't
>   nearly as bad as the contents of /gnu/store.  This is probably due to
>   log files; consider pruning them.
>

These are presumably OK to delete, without any special handling for Guix?


> - Your script runs "docker commit" while guix-daemon (and other
>   programs) are still running.  To ensure the guix-daemon's database (or
>   other things) does not become corrupt, consider terminating all
>   processes before committing the new image.
>

`docker commit` pauses the container (unless you tell it not to), although I
guess that could still cause problems if Guix store writes aren't implemented
in an atomic way.

Thanks,
-SS
