From: Csepp <raingloom@riseup.net>
To: Simon Tournier <zimon.toutoune@gmail.com>
Cc: "Ludovic Courtès" <ludo@gnu.org>,
	"Andreas Enge" <andreas@enge.fr>,
	guix-devel@gnu.org
Subject: Re: How many bytes do we add (closure of guix) when adding one new package?
Date: Tue, 30 May 2023 21:10:07 +0200
Message-ID: <87r0qxvd9q.fsf@riseup.net>
In-Reply-To: <87cz2it3yk.fsf@gmail.com>


Simon Tournier <zimon.toutoune@gmail.com> writes:

> Hi,
>
> On ven., 26 mai 2023 at 18:21, Ludovic Courtès <ludo@gnu.org> wrote:
>
>> I agree that .go files are quite big (.scm files as well, but we’ve
>> improved information density somewhat by removing input labels :-)).
>>
>> The size of .go files went down when we switched to the baseline compiler
>> (aka. -O1):
>>
>>   https://lists.gnu.org/archive/html/guix-devel/2020-06/msg00071.html
>>
>> That thread has ideas of things to do to further reduce .go size.
>
> Just to put a figure on what “big” means: currently the .go files are 5
> times bigger than their associated .scm.
>
> Somehow, it’s the trap of DSL. :-) Packages are declarative and the
> information they declare is not dense.  However, because they are
> bytecompiled to a general programming language, their specificity is not
> exploited.  In an ideal world, the compiled binary representation of the
> packages should be smaller than their human-readable text-file
> counterpart.
>
> The mentioned improvement is nice.  And it’s visible:
>
> --8<---------------cut here---------------start------------->8---
> 145M /gnu/store/nqrb3g4l59wd74w8mr9v0b992bj2sd1w-guix-d62c9b267-modules/lib/guile/3.0/site-ccache/gnu
> 117M /gnu/store/s6rqlhqr750k44ynkqqj5mwjj2cs2yln-guix-a09968565-modules/lib/guile/3.0/site-ccache/gnu
> 127M /gnu/store/ndii4bpyzh2rc05ya61s89rig9hdrl4k-guix-a0178d34f-modules/lib/guile/3.0/site-ccache/gnu
> 164M /gnu/store/ni63a203jf61dwxlv8kr9b8x3vb1pdsp-guix-8e2f32cee-modules/lib/guile/3.0/site-ccache/gnu
> --8<---------------cut here---------------end--------------->8---
>
> However, it has almost no impact on the whole size; scaled by the number
> of packages.
>
>> Download size has to be treated separately though.  For example, ‘git
>> pull’ doesn’t redownload all of the repo or directory, and it uses
>> compression heavily.  Thus, a few hundred bytes of additional .scm text
>> translate into less than that.
>>
>> As for the rest, download size can be reduced for example by choosing a
>> content-address transport, like something based on ERIS.
>>
>> I think we must look precisely at what we want to optimize—on-disk size,
>> or bandwidth requirement, in particular—and look at the whole solution
>> space.
>
> I think one direction is to tackle the way *package-modules* is built.
> Because of that, Guix is building too much and the design is not optimal
> – whatever technical solutions we implement for improving after that.
>
> On my poor laptop, Guix is becoming unusable because many operations are
> becoming so slow – when it’s still acceptable with APT of Debian.  For
> instance, it’s something like 20 minutes for running “guix pull” without
> substitutes.  And when I am traveling without a fast Internet
> connection, it’s often too much for the network at hand.
>
> Currently, “guix pull” is either building too much and downloading too
> much; by design.
>
>
> Cheers,
> simon

Something I've been considering is whether Guix could apply
database-style optimizations to its packages.  Having access to Scheme
for everything is nice, but using it as a storage format is kind of
silly when we are mostly just storing structs.  Some kind of
struct-of-arrays (SoA) layout could reduce their size by a lot and
might even speed up some operations.  It makes zero sense to load full
package definitions from disk for most queries, such as guix search;
with an SoA representation we could load only the fields that we care
about.
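
To make the struct-of-arrays idea concrete, here is a minimal sketch in
Python rather than Guile.  Everything in it is an assumption made for
illustration: the four-field Package record, the sample entries, and
the to_soa and search helpers are hypothetical and do not reflect how
Guix stores or searches packages today.

```python
from dataclasses import dataclass


# Hypothetical, heavily simplified package record.  Real Guix packages
# carry many more fields (inputs, source, build system, ...).
@dataclass
class Package:
    name: str
    version: str
    synopsis: str
    description: str


# Array-of-structs: each element bundles every field of one package.
# These entries are made up for the example.
PACKAGES = [
    Package("foo", "1.0", "Example greeting program", "Long description of foo."),
    Package("bar-edit", "2.3", "Example text editor", "Long description of bar-edit."),
    Package("baz-lib", "0.4", "Example parsing library", "Long description of baz-lib."),
]


def to_soa(packages):
    """Convert a list of records into one list ("column") per field."""
    return {
        "name": [p.name for p in packages],
        "version": [p.version for p in packages],
        "synopsis": [p.synopsis for p in packages],
        "description": [p.description for p in packages],
    }


def search(columns, term):
    """Return indices of packages whose name or synopsis contains term.

    Only the "name" and "synopsis" columns are read; the other columns
    are never touched.
    """
    term = term.lower()
    pairs = zip(columns["name"], columns["synopsis"])
    return [
        index
        for index, (name, synopsis) in enumerate(pairs)
        if term in name.lower() or term in synopsis.lower()
    ]


columns = to_soa(PACKAGES)
matches = search(columns, "edit")
print([columns["name"][i] for i in matches])
```

In this in-memory version all columns still live in one dict, so the
saving is only conceptual.  The benefit would come from an on-disk
layout where each column is stored separately, so that a query like
the one above reads the name and synopsis data and skips the rest.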

PS: Now I'm even more glad that I'm using a file system with
transparent compression on all my Guix systems.

