* How many bytes do we add (closure of guix) when adding one new package?
From: Simon Tournier @ 2023-05-25 18:24 UTC
To: Guix Devel; +Cc: Ludovic Courtès, Andreas Enge
Hi,
The initial discussion was about the closure of Guix and that “guix pull”
brings graphical libraries. See #63050 [1].
Here, I would like to open a discussion about how Guix scales,
i.e. about its size. I am trying to answer the question I am asking in
the subject. ;-)
It’s another angle on Andreas and Ludo’s discussion: :-)
>> Note that I do not care so much about the closure size, but about the
>> number of packages that are needed to just build guix (although of course
>> the two are related). Or otherwise said, the dependencies for "guix pull".
>
> Yes, understood. Graphviz is not in the closure anyway, it’s a
> build-only dependency.
Somehow, the closure is increasing:
--8<---------------cut here---------------start------------->8---
$ for i in $(seq 1 4); do guix time-machine --commit=v1.$i.0 -- size guix | grep 'total:' ;done
total: 410.9 MiB
total: 496.0 MiB
total: 564.8 MiB
total: 637.2 MiB
$ guix size guix | grep 'total:'
total: 611.2 MiB
--8<---------------cut here---------------end--------------->8---
(Yeah, the package guix is not exactly the same as guix itself, but it
appears to me a good enough approximation. And my current revision is
14c0380.)
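To make the trend explicit, here is a minimal sketch that computes the growth between consecutive measurements. The totals are copied from the output above; the helper name `growth` is only for illustration.

```python
# Closure totals (MiB) copied from the `guix size guix` output above.
totals = {
    "v1.1.0": 410.9,
    "v1.2.0": 496.0,
    "v1.3.0": 564.8,
    "v1.4.0": 637.2,
    "14c0380": 611.2,
}

def growth(totals):
    """Return (label, delta in MiB, delta in %) between consecutive entries."""
    labels = list(totals)
    rows = []
    for prev, cur in zip(labels, labels[1:]):
        delta = totals[cur] - totals[prev]
        rows.append((f"{prev} -> {cur}", delta, 100 * delta / totals[prev]))
    return rows

for label, delta, percent in growth(totals):
    print(f"{label:20s} {delta:+7.1f} MiB ({percent:+5.1f}%)")
```

The last step is negative: the closure at revision 14c0380 is smaller than the one at v1.4.0.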
Compare:
--8<---------------cut here---------------start------------->8---
$ guix time-machine --commit=v1.1.0 -- size guix --sort=self | wc -l
44
$ guix time-machine --commit=v1.4.0 -- size guix --sort=self | wc -l
72
$ guix size guix --sort=self | wc -l
70
--8<---------------cut here---------------end--------------->8---
which is Andreas’s concern for “exotic” architectures: the output grew
from 44 lines to about 70. Moreover, part of the inflation (in size)
comes from some packages that are just becoming bigger.
--8<---------------cut here---------------start------------->8---
$ guix time-machine --commit=v1.1.0 -- size guix --sort=self | head
store item                                                            total   self
/gnu/store/fp16m5hkzql7jwhvnkm1j1i5qch0arhx-guix-1.1.0rc2-1.9d0d27f   410.9  221.6  53.9%
/gnu/store/1mkkv2caiqbdbbd256c4dirfi4kwsacv-guile-2.2.6               123.9   44.4  10.8%
/gnu/store/ahqgl4h89xqj695lgqvsaf6zh2nhy4pj-glibc-2.29                 37.4   35.8   8.7%
/gnu/store/2plcy91lypnbbysb18ymnhaw3zwk8pg1-gcc-7.4.0-lib              70.0   32.6   7.9%
/gnu/store/n79cf8bvy3k96gjk1rf18d36w40lkwlr-glibc-utf8-locales-2.29    13.9   13.9   3.4%
/gnu/store/k2m4q2av9hw73hw2jx6qrxqdyh855398-openssl-1.1.1c             76.4    6.4   1.6%
/gnu/store/gzp4ig4rdb1qf4i5dy1d9nl0zmj5q09y-ncurses-6.1-20190609       75.9    5.9   1.4%
/gnu/store/hfvz18igm68p5yz7z4asn6ph363blp1z-gnutls-3.6.9              130.6    5.1   1.2%
/gnu/store/b5vjmib411m74lbpf051fnwz3s9zcw79-guile-git-0.3.0            98.8    4.4   1.1%
$ guix time-machine --commit=v1.4.0 -- size guix --sort=self | head
store item                                                            total   self
/gnu/store/9nvx97hr8kkr26gzwni2fblfn0yq0xjw-guix-1.4.0rc2             637.2  330.1  51.8%
/gnu/store/qlmpcy5zi84m6dikq3fnx5dz38qpczlc-guile-3.0.8               130.0   53.0   8.3%
/gnu/store/cnfsv9ywaacyafkqdqsv2ry8f01yr7a9-guile-3.0.7               129.1   52.0   8.2%
/gnu/store/5h2w4qi9hk1qzzgi1w83220ydslinr4s-glibc-2.33                 38.3   36.6   5.7%
/gnu/store/094bbaq6glba86h1d4cj16xhdi6fk2jl-gcc-10.3.0-lib             71.7   33.4   5.2%
/gnu/store/96srhmpmxa20wmsck95g3iq4hb3lz4a0-glib-2.70.2                98.1   15.3   2.4%
/gnu/store/mw3py6smb1pk8yx298hd9ivz9lzbksqi-glibc-utf8-locales-2.33    13.9   13.9   2.2%
/gnu/store/5583c2za2jsn9g6az79rnksgvigwnsk7-util-linux-2.37.2-lib      80.7    9.0   1.4%
/gnu/store/9rrnm5hdjw7cy96a2a9rfgh6y08wsbmf-ncurses-6.2.20210619       77.6    5.9   0.9%
$ guix size guix --sort=self | head
store item                                                            total   self
/gnu/store/cgjddvw9zay626z8hyxl0zmn1354c24k-guix-1.4.0-6.dc5430c      611.2  350.2  57.3%
/gnu/store/4gvgcfdiz67wv04ihqfa8pqwzsb0qpv5-guile-3.0.9               135.0   53.1   8.7%
/gnu/store/gsjczqir1wbz8p770zndrpw4rnppmxi3-glibc-2.35                 40.6   38.8   6.3%
/gnu/store/930nwsiysdvy2x5zv1sf6v7ym75z8ayk-gcc-11.3.0-lib             75.3   34.7   5.7%
/gnu/store/nb40pwd37v6i1g4b1fq4l6q4h9px3asr-glib-2.72.3               101.3   14.9   2.4%
/gnu/store/5fmqijrs5f7vx8mc2q2pmq26yvhb74sm-glibc-utf8-locales-2.35    13.9   13.9   2.3%
/gnu/store/gwx2sf5wl9bsl21lwv35g5la63bwyy95-util-linux-2.37.4-lib      84.3    9.0   1.5%
/gnu/store/69wd3pd1hd3j84xr965jj2fk2qmxn0hl-openssl-3.0.8              83.4    8.1   1.3%
/gnu/store/bcc053jvsbspdjr17gnnd9dg85b3a0gy-ncurses-6.2.20210619       81.2    5.9   1.0%
--8<---------------cut here---------------end--------------->8---
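For anyone who wants to compare such listings programmatically, here is a rough sketch that parses rows of the shape shown above. It assumes whitespace-separated columns (store item, total, self, percentage) and that the first row is the guix item, whose total is the whole closure; the function names are made up for this example.

```python
def parse_guix_size(text):
    """Parse `guix size` rows into (store item, total MiB, self MiB) tuples."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows look like: /gnu/store/<hash>-<name> TOTAL SELF PERCENT%
        if len(fields) == 4 and fields[0].startswith("/gnu/store/"):
            rows.append((fields[0], float(fields[1]), float(fields[2])))
    return rows

def package_name(store_item):
    """Strip the '/gnu/store/<hash>-' prefix from a store item."""
    return store_item.split("/")[-1].split("-", 1)[1]

sample = """\
store item total self
/gnu/store/cgjddvw9zay626z8hyxl0zmn1354c24k-guix-1.4.0-6.dc5430c 611.2 350.2 57.3%
/gnu/store/4gvgcfdiz67wv04ihqfa8pqwzsb0qpv5-guile-3.0.9 135.0 53.1 8.7%
/gnu/store/gsjczqir1wbz8p770zndrpw4rnppmxi3-glibc-2.35 40.6 38.8 6.3%
"""

rows = parse_guix_size(sample)
closure_total = rows[0][1]  # assumption: first row is the guix item itself
for item, total, self_size in rows:
    share = 100 * self_size / closure_total
    print(f"{package_name(item):25s} self={self_size:6.1f} MiB ({share:4.1f}%)")
```

The recomputed share of the guix item (350.2 out of 611.2) agrees with the 57.3% printed by “guix size”.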
Considering Guix itself, one explanation for the increase is the number
of packages, assuming that services and the rest are negligible;
“git diff --shortstat” is a good indicator at first sight. Well, we could
be more precise about the documentation. Hum, this ugly one-liner,
--8<---------------cut here---------------start------------->8---
$ for doc in $(for ci in $(for t in $(git tag | grep v1 | grep -v rc ); do git --no-pager show $t | grep commit ;done); do for d in $(find ~/.cache/guix/inferiors/ -type l -print); do printf "$d "; $d/bin/guix --version 2>/dev/null ;done | grep $ci ;done | cut -f1 -d' '); do du -sh $(readlink -f $doc)/share/* ;done | grep info
172K /gnu/store/5pa1706ckwhn6x4mn5kl2b7h15k3in9x-profile/share/info
200K /gnu/store/z1icpkfbz59dr7k7rnb0jd8j1ii8mdph-profile/share/info
376K /gnu/store/hm0rwgcvrs85y3hgjsw8616cxy61h6si-profile/share/info
304K /gnu/store/zbrgzk7l0j7805i82sl3gmx6y2b0iz9q-profile/share/info
--8<---------------cut here---------------end--------------->8---
probably gives a clue about the assumption: the documentation only
weighs a few hundred KiB.
Ok, let us assume that the packages are the main source of the size
increase. The question is: can we evaluate the size of one package? How
many bytes do we add to the whole of Guix when we add one package? On
average, roughly.
We have the number of packages and the whole size for successive
versions. Therefore, we can take the difference between consecutive
versions. We get [2078, 1848, 4704, 1532], which means 2078 packages
were added between v1.1.0 and v1.2.0, 1848 between v1.2.0 and v1.3.0,
etc.
We can do the same for the size of the guix store item itself (the
“self” column, in MiB), [22.9, 19.4, 66.2, 20.1], and then we can
compute the ratio: the size per package. Something like:
[0.011020211742059676, 0.010497835497835485, 0.01407312925170069, 0.013120104438642276]
Let us take the average: 0.012177820232559531 MiB per package.
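The ratios and their average can be reproduced with a few lines; this is only a sketch replaying the figures quoted here, with variable names chosen for the example.

```python
from statistics import mean

# Figures quoted in the message; no new measurements.
added_packages = [2078, 1848, 4704, 1532]  # packages added between successive versions
added_mib = [22.9, 19.4, 66.2, 20.1]       # growth of the guix store item itself (MiB)

per_package_mib = [size / count for size, count in zip(added_mib, added_packages)]
average_mib = mean(per_package_mib)

print(per_package_mib)
print(f"average: {average_mib:.6f} MiB, i.e. about {average_mib * 1024:.1f} KiB per package")
```

Multiplying the average by 1024 gives the per-package figure in KiB.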
Now, let us take the number of packages for v1.1.0 and multiply it by
this average. We get: 159.6 MiB.
Ok, it means that the difference with the measured 221.6 MiB is more or
less the core of Guix, the part that we assume grows slowly. It reads
62 MiB.
Therefore, we can predict the size for the other versions using this
linear model based on the previous evaluated average.
size = mean * number_packages + core
It reads:
[246.9, 269.4, 326.7, 345.3] compared to [244.5, 263.9, 330.1, 350.2].
Hum, this quick back-of-the-envelope computation does not seem too bad,
I guess.
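As a sanity check, the linear model can be replayed as follows. This sketch does not need the absolute number of packages in v1.1.0 (not given here): it starts from the 159.6 MiB quoted for that version and adds the mean for every package added afterwards. Constant and function names are illustrative.

```python
MEAN_MIB_PER_PACKAGE = 0.012177820232559531  # average quoted above
PACKAGES_PART_V110 = 159.6                   # mean * number of packages in v1.1.0 (MiB)
MEASURED_V110 = 221.6                        # "self" size of the guix item in v1.1.0 (MiB)
CORE = MEASURED_V110 - PACKAGES_PART_V110    # about 62 MiB

added_packages = [2078, 1848, 4704, 1532]
measured = [244.5, 263.9, 330.1, 350.2]

def predict(added):
    """Apply size = mean * number_packages + core, one version after another."""
    sizes, packages_part = [], PACKAGES_PART_V110
    for count in added:
        packages_part += MEAN_MIB_PER_PACKAGE * count
        sizes.append(packages_part + CORE)
    return sizes

for predicted, observed in zip(predict(added_packages), measured):
    error = 100 * (predicted - observed) / observed
    print(f"predicted {predicted:6.1f} MiB, measured {observed:6.1f} MiB ({error:+.1f}%)")
```

With these inputs the predictions stay within a few percent of the measured sizes, which is about as much as such a crude model can claim.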
Conclusions:
1. the addition of one package leads to an increase of ~ 12 KiB
2. the core of Guix is about ~ 62 MiB
3. doubling the number of packages is doubling the size to download at
“guix pull” time.
Maybe we should rethink (guix self), especially the *package-modules*
part, and discuss again whether we could split that part. At least,
that is my understanding.
Cheers,
simon
1: https://issues.guix.gnu.org/issue/63050
* Re: How many bytes do we add (closure of guix) when adding one new package?
From: Ludovic Courtès @ 2023-05-26 16:21 UTC
To: Simon Tournier; +Cc: Guix Devel, Andreas Enge
Hello!
Thanks for the detailed analysis!
Simon Tournier <zimon.toutoune@gmail.com> skribis:
> Conclusions:
>
> 1. the addition of one package leads to an increase of ~ 12 KiB
>
> 2. the core of Guix is about ~ 62 MiB
>
> 3. doubling the number of packages is doubling the size to download at
> “guix pull” time.
I agree that .go files are quite big (.scm files as well, but we’ve
improved information density somewhat by removing input labels :-)).
The size of .go files went down when we switched to the baseline compiler
(aka. -O1):
https://lists.gnu.org/archive/html/guix-devel/2020-06/msg00071.html
That thread has ideas of things to do to further reduce .go size.
Download size has to be treated separately though. For example, ‘git
pull’ doesn’t redownload all of the repo or directory, and it uses
compression heavily. Thus, a few hundred bytes of additional .scm text
translate into less than that.
As for the rest, download size can be reduced for example by choosing a
content-addressed transport, like something based on ERIS.
I think we must look precisely at what we want to optimize—on-disk size,
or bandwidth requirement, in particular—and look at the whole solution
space.
My 2¢!
Ludo’.
* Re: How many bytes do we add (closure of guix) when adding one new package?
From: Simon Tournier @ 2023-05-30 12:10 UTC
To: Ludovic Courtès; +Cc: Guix Devel, Andreas Enge
Hi,
On ven., 26 mai 2023 at 18:21, Ludovic Courtès <ludo@gnu.org> wrote:
> I agree that .go files are quite big (.scm files as well, but we’ve
> improved information density somewhat by removing input labels :-)).
>
> The size of .go files went down when we switch to the baseline compiler
> (aka. -O1):
>
> https://lists.gnu.org/archive/html/guix-devel/2020-06/msg00071.html
>
> That thread has ideas of things to do to further reduce .go size.
Just to put a figure on what “big” means: currently the .go files are 5
times bigger than their associated .scm.
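For those who want to check this ratio on their own machine, here is a minimal sketch. The .go directory is the one used elsewhere in this thread; the location of the .scm sources under the same profile is an assumption and may need adjusting.

```python
import os

def total_size(root, suffix):
    """Sum the sizes (bytes) of regular files under `root` ending with `suffix`."""
    total = 0
    for directory, _subdirs, files in os.walk(root):
        for name in files:
            if name.endswith(suffix):
                total += os.path.getsize(os.path.join(directory, name))
    return total

def go_to_scm_ratio(go_root, scm_root):
    """Ratio of compiled .go bytes to source .scm bytes; None if no sources found."""
    scm_bytes = total_size(scm_root, ".scm")
    return total_size(go_root, ".go") / scm_bytes if scm_bytes else None

current = os.path.expanduser("~/.config/guix/current")
go_root = os.path.join(current, "lib/guile/3.0/site-ccache/gnu")
# Assumed location of the matching sources; adjust if your layout differs.
scm_root = os.path.join(current, "share/guile/site/3.0/gnu")
print(go_to_scm_ratio(go_root, scm_root))
```

The script prints None when no .scm file is found, for instance on a machine without Guix.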
Somehow, it’s the trap of a DSL. :-) Packages are declarative and the
information they declare is not dense. However, because they are
byte-compiled as general-purpose code, their specificity is not
exploited. In an ideal world, the compiled binary representation of the
packages should be smaller than their human-readable text-file
counterpart.
The mentioned improvement is nice. And it’s visible:
--8<---------------cut here---------------start------------->8---
145M /gnu/store/nqrb3g4l59wd74w8mr9v0b992bj2sd1w-guix-d62c9b267-modules/lib/guile/3.0/site-ccache/gnu
117M /gnu/store/s6rqlhqr750k44ynkqqj5mwjj2cs2yln-guix-a09968565-modules/lib/guile/3.0/site-ccache/gnu
127M /gnu/store/ndii4bpyzh2rc05ya61s89rig9hdrl4k-guix-a0178d34f-modules/lib/guile/3.0/site-ccache/gnu
164M /gnu/store/ni63a203jf61dwxlv8kr9b8x3vb1pdsp-guix-8e2f32cee-modules/lib/guile/3.0/site-ccache/gnu
--8<---------------cut here---------------end--------------->8---
However, it has almost no impact on the overall size once scaled by the
number of packages.
> Download size has to be treated separately though. For example, ‘git
> pull’ doesn’t redownload all of the repo or directory, and it uses
> compression heavily. Thus, a few hundred bytes of additional .scm text
> translate in less than that.
>
> As for the rest, download size can be reduced for example by choosing a
> content-address transport, like something based on ERIS.
>
> I think we must look precisely at what we want to optimize—on-disk size,
> or bandwidth requirement, in particular—and look at the whole solution
> space.
I think one direction is to tackle the way *package-modules* is built.
Because of that, Guix is building too much and the design is not optimal
– whatever technical solutions we implement for improving after that.
On my poor laptop, Guix is becoming unusable because many operations are
becoming so slow, whereas it’s still acceptable with Debian’s APT. For
instance, it’s something like 20 minutes for running “guix pull” without
substitutes. And when I am traveling without a fast Internet
connection, it’s often too much for the network at hand.
Currently, “guix pull” is either building too much or downloading too
much, by design.
Cheers,
simon
* Re: How many bytes do we add (closure of guix) when adding one new package?
From: Csepp @ 2023-05-30 19:10 UTC
To: Simon Tournier; +Cc: Ludovic Courtès, Andreas Enge, guix-devel
Simon Tournier <zimon.toutoune@gmail.com> writes:
> Just to put a figure on what means “big”: currently the .go files are 5
> times bigger than their associated .scm.
>
> Somehow, it’s the trap of DSL. :-) Packages are declarative and the
> information they declare is not dense. However, because they are
> bytecompiled to a general programming language, their specificity is not
> exploited. In an ideal world, the compiled binary representation of the
> packages should be smaller than their human-readable text-file
> counterpart.
Something I've been considering is if Guix could make use of database
optimizations on its packages. Having access to Scheme for everything
is nice, but using it as a storage solution is kind of silly when we are
mostly just storing structs. Some kind of struct-of-arrays optimization
could definitely reduce their size by a lot, might even speed up some
operations. It makes zero sense to load full package definitions from
disk for most queries, such as guix search; with an SoA representation
we could load only the fields that we care about.
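To illustrate the idea, here is a toy sketch of a struct-of-arrays layout. It is not how Guix stores packages; the field selection, class names and sample records are invented for the example. On disk, each column could live in its own file or section, so a query would read only the columns it needs.

```python
from dataclasses import dataclass

@dataclass
class Package:
    """Array-of-structs view: one object holding every field of a package."""
    name: str
    version: str
    synopsis: str
    build_system: str

class PackageColumns:
    """Struct-of-arrays view: one list per field, aligned by index."""

    def __init__(self, packages):
        self.name = [p.name for p in packages]
        self.version = [p.version for p in packages]
        self.synopsis = [p.synopsis for p in packages]
        self.build_system = [p.build_system for p in packages]

    def search(self, word):
        """Scan only the name and synopsis columns; other fields stay untouched."""
        word = word.lower()
        return [i for i, (n, s) in enumerate(zip(self.name, self.synopsis))
                if word in n.lower() or word in s.lower()]

    def using_build_system(self, build_system):
        """Non-textual query that reads a single column."""
        return [self.name[i] for i, b in enumerate(self.build_system)
                if b == build_system]

# Made-up records, for illustration only.
columns = PackageColumns([
    Package("hello", "2.12.1", "Example GNU package", "gnu"),
    Package("python-numpy", "1.23.2", "Array processing library", "python"),
    Package("zstd", "1.5.0", "Compression algorithm", "gnu"),
])
print([columns.name[i] for i in columns.search("compression")])
print(columns.using_build_system("gnu"))
```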
ps.: Now I'm even more glad that I'm using a file system with
transparent compression on all my Guix systems.
* Re: How many bytes do we add (closure of guix) when adding one new package?
From: Jack Hill @ 2023-05-30 20:55 UTC
To: Simon Tournier; +Cc: Ludovic Courtès, Guix Devel, Andreas Enge
On Tue, 30 May 2023, Simon Tournier wrote:
> Just to put a figure on what means “big”: currently the .go files are 5
> times bigger than their associated .scm.
>
> Somehow, it’s the trap of DSL. :-) Packages are declarative and the
> information they declare is not dense. However, because they are
> bytecompiled to a general programming language, their specificity is not
> exploited. In an ideal world, the compiled binary representation of the
> packages should be smaller than their human-readable text-file
> counterpart.
>
> The mentioned improvement is nice. And it’s visible:
>
> --8<---------------cut here---------------start------------->8---
> 145M /gnu/store/nqrb3g4l59wd74w8mr9v0b992bj2sd1w-guix-d62c9b267-modules/lib/guile/3.0/site-ccache/gnu
> 117M /gnu/store/s6rqlhqr750k44ynkqqj5mwjj2cs2yln-guix-a09968565-modules/lib/guile/3.0/site-ccache/gnu
> 127M /gnu/store/ndii4bpyzh2rc05ya61s89rig9hdrl4k-guix-a0178d34f-modules/lib/guile/3.0/site-ccache/gnu
> 164M /gnu/store/ni63a203jf61dwxlv8kr9b8x3vb1pdsp-guix-8e2f32cee-modules/lib/guile/3.0/site-ccache/gnu
> --8<---------------cut here---------------end--------------->8---
This is probably a tangent, sorry, but I was curious how well the .go files
compressed. It seems quite well, actually:
jackhill@mimolette ~/.config/guix/current/lib/guile/3.0/site-ccache/gnu [env]$ sudo compsize .
Processed 595 files, 1659 regular extents (1659 refs), 0 inline.
Type       Perc     Disk Usage   Uncompressed Referenced
TOTAL       21%       36M         173M         173M
none       100%       16K          16K          16K
zstd        21%       36M         173M         173M
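Without btrfs and compsize, a rough estimate can be obtained with a sketch like the one below. Note the swap: it uses zlib from the Python standard library instead of zstd, and it compresses each file as a whole, so the ratio will not match the file-system figures exactly.

```python
import os
import zlib

def compression_report(root, suffix=".go", level=6):
    """Return (raw bytes, compressed bytes) for files under `root` ending in `suffix`.

    Each file is compressed on its own with zlib, which only approximates
    what a file system using zstd on extents would achieve.
    """
    raw = compressed = 0
    for directory, _subdirs, files in os.walk(root):
        for name in files:
            if name.endswith(suffix):
                with open(os.path.join(directory, name), "rb") as handle:
                    data = handle.read()
                raw += len(data)
                compressed += len(zlib.compress(data, level))
    return raw, compressed

root = os.path.expanduser("~/.config/guix/current/lib/guile/3.0/site-ccache/gnu")
raw, compressed = compression_report(root)
if raw:
    print(f"{compressed / raw:.0%} of {raw / 2**20:.0f} MiB")
```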
Best,
Jack
* Faster “guix search” (was Re: How many bytes do we add (closure of guix) when adding one new package?)
From: Simon Tournier @ 2023-05-31 8:05 UTC
To: Csepp; +Cc: Ludovic Courtès, Andreas Enge, guix-devel
Hi,
On Tue, 30 May 2023 at 21:10, Csepp <raingloom@riseup.net> wrote:
> It makes zero sense to load full package definitions from
> disk for most queries, such as guix search, with an SoA representation
> we could load only the fields that we care about.
That’s already the case; see
~/.config/guix/current/lib/guix/package.cache.
For instance, “guix package -A” exploits it and the performance is
acceptable. Two summers ago (wow, already!), I tried to extend it and
exploit it for “guix search”. The implementation and benchmarks are in
#39258 [1]. Well, the whole thread of #39258 seems to me worth
considering because it spots various bottlenecks specific to “guix search”
and explains why the improvement is not straightforward.
Well, I started months ago to write a Guix extension using
guile-xapian. My aim is to tackle two annoyances: 1. the speed and
2. the relevance.
About the relevance (#2), the issue is that the current scoring considers
only the local information of one package without considering the global
information of all the others. Well, see [2,3,4] for some details. :-)
1: https://issues.guix.gnu.org/39258#119
2: https://yhetil.org/guix/CAJ3okZ3E3bhZ5pROZS68wEKdKOcZ8SpXsvdi-bnB=9Jz3mPahA@mail.gmail.com
3: https://yhetil.org/guix/CAJ3okZ3+hn0nJP98OhnZYLWJvhLGpdTUK+jB0hoM5JArQxO=zw@mail.gmail.com
4: https://yhetil.org/guix/CAJ3okZ0LaJzWDBA7bjqZew_jAmtt1rj9PJhevwrtBiA_COXENg@mail.gmail.com
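To give an idea of what “global information” can mean for relevance, here is a toy sketch where a query term counts more when few descriptions contain it (an inverse-document-frequency weight). This is an assumption made for illustration, not necessarily what the linked messages or the extension implement; the descriptions are invented.

```python
import math
from collections import Counter

def idf_scores(descriptions, query):
    """Score each description for `query`, weighting terms by their rarity."""
    documents = [Counter(text.lower().split()) for text in descriptions]
    total = len(documents)
    scores = []
    for counts in documents:
        score = 0.0
        for term in query.lower().split():
            containing = sum(1 for other in documents if term in other)
            if containing:
                # Rare terms across the whole collection weigh more.
                score += counts[term] * math.log(1 + total / containing)
        scores.append(score)
    return scores

# Made-up descriptions, for illustration only.
descriptions = [
    "library for git repositories",
    "library for image processing",
    "library for audio processing",
]
print(idf_scores(descriptions, "git library"))
```

Here “library” appears everywhere and contributes little, while “git” appears once and dominates the score of the first description.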
> ps.: Now I'm even more glad that I'm using a file system with
> transparent compression on all my Guix systems.
Did you benchmark the performance of some Guix operations on
compressed vs uncompressed file systems?
Cheers,
simon
* Re: How many bytes do we add (closure of guix) when adding one new package?
From: Simon Tournier @ 2023-05-31 8:27 UTC
To: Jack Hill; +Cc: Ludovic Courtès, Guix Devel, Andreas Enge
Hi Jack,
On Tue, 30 May 2023 at 16:55, Jack Hill <jackhill@jackhill.us> wrote:
> $ ~/.config/guix/current/lib/guile/3.0/site-ccache/gnu [env]$ sudo compsize .
> Processed 595 files, 1659 regular extents (1659 refs), 0 inline.
> Type       Perc     Disk Usage   Uncompressed Referenced
> TOTAL       21%       36M         173M         173M
> none       100%       16K          16K          16K
> zstd        21%       36M         173M         173M
Cool! Could you do (or anyone else with btrfs),
guix time-machine --commit=d62c9b2671be55ae0305bebfda17b595f33797f2 \
-- describe
guix time-machine --commit=a099685659b4bfa6b3218f84953cbb7ff9e88063 \
-- describe
then report the size (compsize) of
/gnu/store/nqrb3g4l59wd74w8mr9v0b992bj2sd1w-guix-d62c9b267-modules/lib/guile/3.0/site-ccache/gnu
/gnu/store/s6rqlhqr750k44ynkqqj5mwjj2cs2yln-guix-a09968565-modules/lib/guile/3.0/site-ccache/gnu
?
Cheers,
simon
* Re: Faster “guix search” (was Re: How many bytes do we add (closure of guix) when adding one new package?)
From: Csepp @ 2023-05-31 11:10 UTC
To: Simon Tournier; +Cc: Csepp, Ludovic Courtès, Andreas Enge, guix-devel
Simon Tournier <zimon.toutoune@gmail.com> writes:
> Hi,
>
> On Tue, 30 May 2023 at 21:10, Csepp <raingloom@riseup.net> wrote:
>
>> It makes zero sense to load full package definitions from
>> disk for most queries, such as guix search, with an SoA representation
>> we could load only the fields that we care about.
>
> That’s already the case; see
> ~/.config/guix/current/lib/guix/package.cache.
>
> For instance, “guix package -A” exploits it and the performances are
> acceptable. Two past summers, wow already! I tried to augment it and
> exploit it for “guix search”. The implementation and benchmark is in
> #39258 [1]. Well, the whole thread of #39258 appears to me worth to
> consider because it spots various bottleneck specific to “guix search”
> and explains why the improvement is not straightforward.
That's a good improvement, but it's in addition to the ELF files, so it
doesn't save any space, and as far as I know it doesn't speed up
non-textual queries, like searching for packages that use a specific
build system.
> Well, I have started months ago to write a Guix extension using
> guile-xapian. My aim is to tackle two annoyances: 1. the speed and
> 2. the relevance.
>
> About the relevance #2, the issue is that the current scoring considers
> only the local information of one package without considering the global
> information of all the others. Well, see [2,3,4] for some details. :-)
>
> 1: https://issues.guix.gnu.org/39258#119
> 2: https://yhetil.org/guix/CAJ3okZ3E3bhZ5pROZS68wEKdKOcZ8SpXsvdi-bnB=9Jz3mPahA@mail.gmail.com
> 3: https://yhetil.org/guix/CAJ3okZ3+hn0nJP98OhnZYLWJvhLGpdTUK+jB0hoM5JArQxO=zw@mail.gmail.com
> 4: https://yhetil.org/guix/CAJ3okZ0LaJzWDBA7bjqZew_jAmtt1rj9PJhevwrtBiA_COXENg@mail.gmail.com
Thanks for the links, gonna read them in more detail later.
>> ps.: Now I'm even more glad that I'm using a file system with
>> transparent compression on all my Guix systems.
>
> Did you benchmarked the performances for some Guix operations on these
> compressed vs uncompressed file system?
I haven't, but I have recently tried to move to a larger drive and
accidentally did a btrfs send without compression and the system didn't
fit.
* Re: Faster “guix search” (was Re: How many bytes do we add (closure of guix) when adding one new package?)
From: Attila Lendvai @ 2023-05-31 11:55 UTC
To: Csepp; +Cc: Simon Tournier, Ludovic Courtès, Andreas Enge, guix-devel
> It makes zero sense to load full package definitions from disk for
> most queries, such as guix search, with an SoA representation we
> could load only the fields that we care about.
i'd like to quickly point out something while we are discussing this:
when i came to guix it was rather confusing that there are two namespaces through which one can get hold of a package object:
1) module-global variables in the scheme module system
2) a reified repository of packages (i.e. a scheme hash table, and some lookup functions).
these two namespaces are quite independent from each other, and there are no formal rules that govern the relationship between the two.
there are scheme variable references in some places, some others issue a call to SPECIFICATION->PACKAGE, manifests do that implicitly (?), etc. my gut feeling is that there is potential here for unification and simplification, without limiting composability.
--
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“The greatest gift you can give someone is your own personal development. I used to say, 'If you’ll take care of me, I’ll take care of you'. Now I say, 'I will take care of me for you, if you will take care of you for me'.”
— Jim Rohn
* Re: How many bytes do we add (closure of guix) when adding one new package?
From: Guillaume Le Vaillant @ 2023-05-31 12:47 UTC
To: Simon Tournier; +Cc: Jack Hill, Ludovic Courtès, Andreas Enge, guix-devel
Simon Tournier <zimon.toutoune@gmail.com> skribis:
> Hi Jack,
>
> On Tue, 30 May 2023 at 16:55, Jack Hill <jackhill@jackhill.us> wrote:
>
>> $ ~/.config/guix/current/lib/guile/3.0/site-ccache/gnu [env]$ sudo compsize .
>> Processed 595 files, 1659 regular extents (1659 refs), 0 inline.
>> Type       Perc     Disk Usage   Uncompressed Referenced
>> TOTAL       21%       36M         173M         173M
>> none       100%       16K          16K          16K
>> zstd        21%       36M         173M         173M
>
> Cool! Could you do (or anyone else with btrfs),
>
> guix time-machine --commit=d62c9b2671be55ae0305bebfda17b595f33797f2 \
> -- describe
> guix time-machine --commit=a099685659b4bfa6b3218f84953cbb7ff9e88063 \
> -- describe
>
> then report the size (compsize) of
>
> /gnu/store/nqrb3g4l59wd74w8mr9v0b992bj2sd1w-guix-d62c9b267-modules/lib/guile/3.0/site-ccache/gnu
> /gnu/store/s6rqlhqr750k44ynkqqj5mwjj2cs2yln-guix-a09968565-modules/lib/guile/3.0/site-ccache/gnu
>
> ?
>
> Cheers,
> simon
Hi,
With a BTRFS filesystem compressed with zstd:3, I get:
--8<---------------cut here---------------start------------->8---
# compsize /gnu/store/nqrb3g4l59wd74w8mr9v0b992bj2sd1w-guix-d62c9b267-modules/lib/guile/3.0/site-ccache/gnu
Processed 503 files, 1317 regular extents (1317 refs), 0 inline.
Type       Perc     Disk Usage   Uncompressed Referenced
TOTAL       27%       40M         144M         144M
none       100%       10M          10M          10M
zstd        22%       30M         133M         133M
# compsize /gnu/store/s6rqlhqr750k44ynkqqj5mwjj2cs2yln-guix-a09968565-modules/lib/guile/3.0/site-ccache/gnu
Processed 530 files, 1169 regular extents (1169 refs), 0 inline.
Type       Perc     Disk Usage   Uncompressed Referenced
TOTAL       19%       22M         116M         116M
none       100%       32K          32K          32K
zstd        19%       22M         116M         116M
--8<---------------cut here---------------end--------------->8---
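Put side by side, the two TOTAL rows give the following reductions between the first and the second directory. This is just arithmetic on the figures above, rounded as compsize prints them; the helper name is illustrative.

```python
# (uncompressed MiB, on-disk MiB) copied from the two compsize TOTAL rows above.
measurements = {
    "d62c9b267": (144, 40),
    "a09968565": (116, 22),
}

def saving(before, after):
    """Relative reduction, in percent, when going from `before` to `after`."""
    return 100 * (before - after) / before

(raw_old, disk_old), (raw_new, disk_new) = measurements.values()
print(f"uncompressed: -{saving(raw_old, raw_new):.0f}%")
print(f"on disk:      -{saving(disk_old, disk_new):.0f}%")
```

On this file system, the on-disk size shrinks proportionally more than the uncompressed size.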