unofficial mirror of help-guix@gnu.org 
From: Chris Marusich <cmmarusich@gmail.com>
To: Stephen Scheck <singularsyntax@gmail.com>
Cc: help-guix <help-guix@gnu.org>
Subject: Re: Guix Docker image inflation
Date: Sat, 30 May 2020 21:31:51 -0700
Message-ID: <87367glo7c.fsf@gmail.com>
In-Reply-To: <CAKjnHz1=4v5kMQ8G6+_rQpMrK183HD5JUW8YKuvFPVX-e_UfWw@mail.gmail.com> (Stephen Scheck's message of "Sat, 30 May 2020 13:02:02 -0400")

Hi Stephen,

Stephen Scheck <singularsyntax@gmail.com> writes:

> Layers certainly add some image size overhead, but I don't think that
> is the culprit here.

> Also, layers are helpful in the case of someone pulling down daily
> Guix Docker images on a frequent basis, because then only the new,
> ideally small layers need to be downloaded, whereas if you rebase for
> every image build, you'd have to download the entire image every day.

That is true, but suppose I have the following 3 images:

- Image A: A base image created in January 2020.
- Image B: Based on A, and I ran "guix pull" in February 2020.
- Image C: Based on A, and I ran "guix pull" in June 2020.

I would guess that the size difference between A and B is approximately
the same as the difference between A and C.  It'll be different, of
course, but generally the size difference between A and C should not
grow linearly with time, since "guix pull" is only going to install at
most the total closure of things necessary to build and run Guix, which
doesn't increase much in size as time goes on.  However, when you
daisy-chain the images every day, the image size will grow linearly with
time because the contents of all the previous layers are carried forward.
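
To make the arithmetic concrete, here is a rough sketch in Python.  The
function names and the sizes (500 MB base, 400 MB per "guix pull") are
made up for illustration, and the sketch assumes that every pull adds
about the same amount of new store items:

```python
def rebased_size(base_mb, pull_mb):
    """Image built directly on the base image, after one "guix pull"."""
    return base_mb + pull_mb


def chained_size(base_mb, pull_mb, days):
    """Image built on yesterday's image, once a day, for `days` days.

    Every earlier layer is carried forward, so each day's pull adds
    to the total instead of replacing the previous one.
    """
    return base_mb + pull_mb * days


if __name__ == "__main__":
    # Hypothetical sizes in MB; real numbers will differ.
    for days in (1, 30, 150):
        print(days, rebased_size(500, 400), chained_size(500, 400, days))
```

The rebased size stays flat no matter how many days pass, while the
chained size grows by one pull's worth of store items per day.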

> My build script issues several `docker exec <container> <command>`
> sequences, followed by a `docker commit <container>`. Intermediate
> changes to the container file system prior to the commit do not
> generate layers, only the net changes after the commit.

There are two problems here.  One is that the image size grows without
bound.  The other is that guix-daemon is failing to GC store items in
the Docker container.  Although they are both concerning, the latter is
not the cause of the former.

If you install new store items (e.g., via "guix pull"), make them dead,
and then GC them, all in the same container before running "docker
commit", then I agree: those GC'd store items would not persist in a
layer anywhere.  However, I don't think that's what's happening here.
Sure, there might be a few store items like this, but in practice, there
will be many store items from the previous image which began live but
became dead when you ran "guix pull" and deleted your old profile
generations.  It is those store items that are adding the most space to
your image.
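
Here is a toy model of that in Python.  It is only a sketch of how
layered images behave, not Docker's actual implementation; the store
paths, the sizes, and the WHITEOUT marker are invented for the example:

```python
WHITEOUT = None  # hypothetical marker: "this path was deleted in this layer"


def visible_size(layers):
    """Total size of the files a container started from the image sees."""
    merged = {}
    for layer in layers:
        for path, size in layer.items():
            if size is WHITEOUT:
                merged.pop(path, None)  # hidden, but lower layers keep it
            else:
                merged[path] = size
    return sum(merged.values())


def stored_size(layers):
    """Total size of all layers, i.e. what the image actually costs."""
    return sum(
        size
        for layer in layers
        for size in layer.values()
        if size is not WHITEOUT
    )


# Previous image: contains the old Guix (made-up path and size in MB).
previous = {"/gnu/store/old-guix": 300}

# New layer: "guix pull" added the new Guix, then the old one became
# dead and was GC'd.  The deletion is recorded only as a whiteout.
committed = {"/gnu/store/new-guix": 300, "/gnu/store/old-guix": WHITEOUT}

image = [previous, committed]
print(visible_size(image), stored_size(image))
```

In this model the container sees only the new Guix, but the image still
stores both, because the old one lives in the earlier layer.  An item
that is installed and GC'd within the same container before the commit
never shows up in any layer at all.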

Besides store items, I noticed two other things about your images:

- The contents of /var are growing slowly without bound, but the growth
  isn't nearly as bad as that of /gnu/store.  This is probably due to
  log files; consider pruning them.

- Your script runs "docker commit" while guix-daemon (and other
  programs) are still running.  To ensure the guix-daemon's database (or
  other things) does not become corrupt, consider terminating all
  processes before committing the new image.

>> FYI, Guix itself can build Docker images from scratch - no base image
>> required!  It can even build a Docker image of a full-blown Guix System
>> from scratch.  Sorry if you already knew that - I just wanted to point
>> it out in case you didn't!
>>
>
> Yes, thanks, I know - if you read through the thread you'll see that I make
> reference to  `guix system docker-image [...]`.

I apologize for not reading your thread more closely to begin with.  I
took a closer look, and I think I can explain what is going on now.
Please check the bug report and reply there if anything is unclear.

-- 
Chris


