From: Chris Marusich
To: Stephen Scheck
Subject: Re: Guix Docker image inflation
References: <87h7vyxqrz.fsf@gmail.com>
Date: Sun, 31 May 2020 14:04:19 -0700
In-Reply-To: (Stephen Scheck's message of "Sun, 31 May 2020 14:30:16 -0400")
Message-ID: <87h7vvx1d8.fsf@gmail.com>
Cc: help-guix

Stephen Scheck writes:

> IF any of the store files resulting from `guix pull` are ephemeral
> (i.e. intermediate build results not anchored to a profile) AND guix
> GC worked inside the container, my approach might still work - yes
> there would be image and layers growth but it might be small enough
> not to care between periodic image rebases. But I'm starting to doubt
> that, or at least it is difficult to quantify with the GC issues.

I think you're right about it being difficult to quantify the GC issues.

Basically, when you run "guix pull", the current Guix will "build"
(i.e., maybe download via substitutes, maybe build from source) the new
Guix, which puts it into the store, and updates the profile symlinks to
make it current. In the process of doing this, some intermediate builds
might be performed if substitutes are not available.
Although the new Guix will remain live in the store after the profile
symlinks are updated to make it current, (1) intermediate results might
be left dead after "guix pull" is finished, and (2) if the old Guix is
sufficiently different from the new Guix, it will also become dead after
the symlinks that were keeping it live are removed.

So, the amount of garbage that will be left over depends on a few
factors, like whether substitutes were available, and how different the
new Guix is from the old one. It can also depend on how the guix-daemon
has been started (see "--gc-keep-outputs" and "--gc-keep-derivations" in
the "Invoking guix-daemon" section of the manual).

In the case of your Docker images, most (all?) of the garbage is coming
from case (2) above: as Guix changes, the old Guix will be made dead and
GC'd (hypothetically, let's suppose GC is working), but it will still
exist on prior layers, since it came from a prior layer.

As for case (1), the intermediate results, I think they are not
contributing to your image size for two reasons: substitutes are
probably available, and even if they weren't available, the
intermediates would probably appear during "guix pull", which means
they'd be on the top layer and would be GC'd, so they wouldn't be
included in any layer of the next image.

The fact that the biggest dead paths in your latest image consist
entirely of store paths that look suspiciously like they came from prior
Guix installations is further evidence in support of this theory.

--8<---------------cut here---------------start------------->8---
root@guix /# du -Phc $(guix gc --list-dead) 2>/dev/null | sort -hk 1,1 | tail
finding garbage collector roots...
determining live/dead paths...
187M    /gnu/store/0vwg9aqzs5xrk10vcs4dl105s3f42ilf-guix-b1affd477-modules/lib/guile/3.0/site-ccache
187M    /gnu/store/47aack48aczpzm635axsy4jf2pvmwrv0-guix-ef1d475b0-modules/lib
187M    /gnu/store/47aack48aczpzm635axsy4jf2pvmwrv0-guix-ef1d475b0-modules/lib/guile
187M    /gnu/store/47aack48aczpzm635axsy4jf2pvmwrv0-guix-ef1d475b0-modules/lib/guile/3.0
187M    /gnu/store/47aack48aczpzm635axsy4jf2pvmwrv0-guix-ef1d475b0-modules/lib/guile/3.0/site-ccache
194M    /gnu/store/hz2rn2l0jixg91q4rsdcwc489y71ll29-guix-05e1edf22-modules
198M    /gnu/store/5mhn1ynxvy7jihsknsnv3yspkkvc0r5s-guix-2e59ae238-modules
210M    /gnu/store/0vwg9aqzs5xrk10vcs4dl105s3f42ilf-guix-b1affd477-modules
210M    /gnu/store/47aack48aczpzm635axsy4jf2pvmwrv0-guix-ef1d475b0-modules
3.0G    total
root@guix /#
--8<---------------cut here---------------end--------------->8---

These "guix-HASH-modules" directories, for example, are used as part of
each Guix installation:

--8<---------------cut here---------------start------------->8---
root@guix /# realpath ~/.config/guix/current/share/guile
/gnu/store/mj6pf6nf0kf03nhh7bmpc6m43v6knq6m-guix-a5374cde9-modules/share/guile
root@guix /#
--8<---------------cut here---------------end--------------->8---

Each of them has a total closure size of almost 500 MB, although since
they might share some references, each one individually is adding "only"
about 200 MB.

--8<---------------cut here---------------start------------->8---
root@guix /# guix size /gnu/store/mj6pf6nf0kf03nhh7bmpc6m43v6knq6m-guix-a5374cde9-modules
store item                                                              total   self
/gnu/store/mj6pf6nf0kf03nhh7bmpc6m43v6knq6m-guix-a5374cde9-modules      485.9  206.9  42.6%
/gnu/store/hkmsljl2sf4nk96b35f0bmfkr2lqanfq-guix-packages-base          105.7  105.7  21.8%
/gnu/store/s7izb7j0s5rzcq297nd7ba9sfiqh5zmz-guix-system                  43.2   43.2   8.9%
/gnu/store/fa6wj5bxkj5ll1d7292a70knmyl7a0cr-glibc-2.31                   38.4   36.7   7.6%
/gnu/store/01b4w3m6mp55y531kyi1g8shh722kwqm-gcc-7.5.0-lib                71.0   32.6   6.7%
/gnu/store/wcv5mscivggkygnz68nn2671fr3kapjc-guix-packages-base-source    19.4   19.4   4.0%
/gnu/store/6zygksmvzcq92xf65cna91dbf7a4zblh-guix-extra                   19.4   19.4   4.0%
/gnu/store/a7wiy24mmcilbqp39pl0jdlw10vbvavb-guix-cli                      8.0    7.3   1.5%
/gnu/store/f6k9b4grrfpip4h5lrmpnsnn2gqziihr-guix-system-tests             4.6    4.6   1.0%
/gnu/store/gbrd1laxsncb9zd218pyglisxyxymmbd-guix-system-source            1.9    1.9   0.4%
/gnu/store/mmhimfwmmidf09jw1plw3aw1g1zn2nkh-bash-static-5.0.16            1.6    1.6   0.3%
/gnu/store/5lr8miawrk380zw8yjy0crcl6vcs10s3-guix-extra-source             1.5    1.5   0.3%
/gnu/store/pwcp239kjf7lnj5i4lkdzcfcxwcfyk72-bash-minimal-5.0.16          39.4    1.0   0.2%
/gnu/store/r7k859hmcnkazf492fasqvk25jflnfk6-xz-5.2.4                     73.0    0.9   0.2%
/gnu/store/bhs4rj58v8j1narb2454raan2ps38xd8-grep-3.4                     72.9    0.8   0.2%
/gnu/store/z0572147hprpbjrcjqkgrv3f80ip2klx-guix-cli-source               0.7    0.7   0.1%
/gnu/store/a9f7wmc75hbpg520phw9z4l9asm3qvsw-bzip2-1.0.8                  72.5    0.4   0.1%
/gnu/store/7y0nin2d0j46j26a1n46bl5zl3px0zvz-guix-system-tests-source      0.3    0.3   0.1%
/gnu/store/rykm237xkmq7rl1p0nwass01p090p88x-zlib-1.2.11                  71.2    0.2   0.0%
/gnu/store/jqr5bz89gfwhxcndnhq333dyclvkq7ws-lzlib-1.11                   71.2    0.2   0.0%
/gnu/store/378zjf2kgajcfd7mfr98jn5xyc5wa3qv-gzip-1.10                    73.1    0.2   0.0%
/gnu/store/kfj1lc84v50imn3raijgih4salilmf1a-guix-packages-base-modules  125.2    0.0   0.0%
/gnu/store/lvszhqs57scb2ax18l2nrn9dwiyf6iza-guix-system-tests-modules     4.9    0.0   0.0%
/gnu/store/lr65f259z1730p7bvplsj9k6yvbkyh39-guix-system-modules          45.1    0.0   0.0%
/gnu/store/nk1x6cdif8pd9vi04nzxfqinh0ag06am-guix-extra-modules           20.9    0.0   0.0%
/gnu/store/s6vlfscnfvnrlv3yfag6qsy5j6c9pxqb-guix-cli-modules              8.0    0.0   0.0%
total: 485.9 MiB
root@guix /#
--8<---------------cut here---------------end--------------->8---

And there are still other components adding space each time you run
"guix pull", like the "guix-system" component, for example:

--8<---------------cut here---------------start------------->8---
root@guix /# du -Phc $(guix gc --list-dead | grep guix-system) 2>/dev/null | sort -hk 1,1 | tail
finding garbage collector roots...
determining live/dead paths...
44M     /gnu/store/qhbk7g8z97m37iak1s1yn2my82gv0lj5-guix-system/gnu
44M     /gnu/store/slwkzcmg6r1lr9a16x3krd2ax384p8wr-guix-system
44M     /gnu/store/slwkzcmg6r1lr9a16x3krd2ax384p8wr-guix-system/gnu
44M     /gnu/store/vwzk618h1wxy6z9i06xnhnxj4gvhkiss-guix-system
44M     /gnu/store/vwzk618h1wxy6z9i06xnhnxj4gvhkiss-guix-system/gnu
44M     /gnu/store/w47fgv8p2hvaqdwywymwvm0qlh4gw0ih-guix-system
44M     /gnu/store/w47fgv8p2hvaqdwywymwvm0qlh4gw0ih-guix-system/gnu
44M     /gnu/store/zf67wb6c0s97vwmywjq09hy9jq0w5mmi-guix-system
44M     /gnu/store/zf67wb6c0s97vwmywjq09hy9jq0w5mmi-guix-system/gnu
523M    total
root@guix /#
--8<---------------cut here---------------end--------------->8---

Anyway, the point is: you begin with a previous image. The previous
image already has these store paths from the previous installation of
Guix. Therefore, they exist on the previous layer. Because they exist
on the previous layer, they cannot be removed from the Docker image, and
they are carried forward in that previous layer to all new images.

Regardless of any changes to guix-daemon we might make, the way in which
you build your images will cause them to grow by hundreds of megabytes
every day.

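One way to watch this accumulation from the Docker side is to add up
the sizes of all layers in an image. The following is only a sketch:
the function names are made up for illustration, and it assumes a
Docker CLI whose "docker history" accepts "--human=false" (raw byte
counts) and "--format".

```shell
# Sketch only: these helper names are hypothetical, not part of Guix or Docker.

# Sum byte counts read from standard input, one number per line.
sum_bytes() {
    awk '{ total += $1 } END { printf "%.0f\n", total }'
}

# Print the combined size in bytes of every layer in the image "$1".
# "--human=false" makes "docker history" report raw byte counts.
image_layer_bytes() {
    docker history --human=false --format '{{.Size}}' "$1" | sum_bytes
}
```

Calling image_layer_bytes on each day's image (the image name is
whatever tag you use) should show the total climbing, even if "guix gc"
has already removed the old store items from the top layer.
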
> Actually, there might be another way around this, still avoiding the
> need for a custom Runner, for example mounting /var/guix and
> /gnu/store into the container instead of belonging to it. If done that
> way, layer accumulation wouldn't be an issue, and maybe GC between
> layers neither.

This sounds like a great idea, actually! "The right way" to do Docker
containers is to have a single process per container, and to not store
state in the Docker container. We're violating that principle on both
counts when we run an entire GNU/Linux distribution inside a Docker
container, especially since the guix-daemon is all about managing the
"state" of /var/guix and /gnu/store. If you can somehow move that
"state" into a Docker volume instead of the container itself, that would
definitely be an improvement.

It may be tricky, though, since if guix-daemon sees stuff in /gnu/store
that is inconsistent with its database in /var/guix, bad things can
happen. So you'll have to ensure they remain consistent with one
another.

>> Besides store items, I noticed two other things about your images:
>>
>> - The contents of /var is growing slowly without bound, but it isn't
>>   nearly as bad as the contents of /gnu/store. This is probably due to
>>   log files; consider pruning them.
>>
>
> These are presumably OK to delete, without any special handling for Guix?

I think the answer is "probably", but I would stop guix-daemon first.
Other processes may be using /var, too, so I would stop them, also.

>> - Your script runs "docker commit" while guix-daemon (and other
>>   programs) are still running. To ensure the guix-daemon's database (or
>>   other things) does not become corrupt, consider terminating all
>>   processes before committing the new image.
>>
>
> `docker commit` pauses the container (unless you tell it not to) ...
> although I guess that could still cause problems if Guix store writes
> aren't implemented in an atomic way.

I'm not sure what "pause" means in the Docker documentation, but since I
can run "docker commit" while running a shell in the container, and the
shell doesn't get terminated, it clearly doesn't terminate the
processes. It might be safe to just pause the container when
committing, but it's definitely safe if you gracefully shut down all
processes first. Doing so ensures that things like databases are left
in known good states when committing the image.

What I'm saying is that, yeah sure, you can probably get away with not
gracefully shutting down the processes. Similarly, you can often get
away with pulling the power cord out of your computer because a lot of
software and storage is pretty robust by default nowadays. However, it
increases the risk of encountering a problem like data corruption, so
it's better to shut things down gracefully if you can.

-- 
Chris
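
The "shut down first, then commit" approach mentioned above could be
sketched roughly as follows. This is an illustration, not a tested
recipe: "stop_and_commit" is a made-up helper name, and it assumes the
container's main process reacts to SIGTERM by shutting down guix-daemon
and the other processes cleanly.

```shell
# Sketch only: "stop_and_commit" is a hypothetical helper name.
# Usage: stop_and_commit CONTAINER IMAGE [GRACE_SECONDS]
stop_and_commit() {
    container=$1
    image=$2
    grace=${3:-60}
    # "docker stop" sends SIGTERM, waits up to $grace seconds, then SIGKILL.
    docker stop -t "$grace" "$container" || return 1
    # A stopped container can still be committed to a new image.
    docker commit "$container" "$image"
}
```

If the processes need longer than the grace period to exit, they are
killed anyway, so the timeout is worth setting generously.
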