From: "Ludovic Courtès" <ludovic.courtes@inria.fr>
To: zimoun <zimon.toutoune@gmail.com>
Cc: guix-devel@gnu.org
Subject: Substitute retention
Date: Tue, 12 Oct 2021 18:04:25 +0200
Message-ID: <87y26ytek6.fsf_-_@inria.fr>
In-Reply-To: <86h7dmms8c.fsf@gmail.com> (zimoun's message of "Tue, 12 Oct 2021 12:50:59 +0200")

Hi!

(Moving to guix-devel from <https://issues.guix.gnu.org/42162#43>.)

zimoun <zimon.toutoune@gmail.com> skribis:

>> For the record, the ‘guix publish’ config on berlin is here:
>>
>>   https://git.savannah.gnu.org/cgit/guix/maintenance.git/tree/hydra/modules/sysadmin/services.scm#n485
>>
>> If I read that correctly, nars have a TTL of 180 days (this is the time
>> a nar is retained after the last time it has been requested, so it’s a
>> lower bound.)

[...]
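
As a rough illustration of the TTL semantics quoted above, here is a minimal Python sketch.  It is not the actual ‘guix publish’ code: the function name and the strict comparison are assumptions made for the example.  It only shows why the TTL is a lower bound: the clock restarts at each request, so a nar that keeps being requested is never eligible for removal.

```python
from datetime import date, timedelta

# TTL quoted above for berlin.
TTL = timedelta(days=180)


def may_be_removed(last_requested, today, ttl=TTL):
    """Return True once more than `ttl` has elapsed since the last request.

    Hypothetical helper: the retention clock starts at the *last* request,
    not at build time, which is why the TTL is only a lower bound.
    """
    return today - last_requested > ttl


today = date(2021, 10, 12)
print(may_be_removed(date(2021, 4, 14), today))  # last requested 181 days ago
print(may_be_removed(date(2021, 9, 1), today))   # requested recently
```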

> Just for the record, a back-of-the-envelope computation.  180 days before
> today was April 15th (M-x calendar C-u 180 C-b).  That amounts to 6996
> commits (35aaf1fe10 is my current last commit).
>
>     git log --format="%cd" --after=2021-04-15 | wc -l
>     6996
>
> However, these commits are pushed in batches.  Roughly, it reads:
>
>     git log --format="%cd" --after=2021-04-15 --date=unix \
>         | awk 'NR == 1{old= $1; next}{print old - $1; old = $1}' \
>         | sort -n | uniq -c | grep -e "0$" | head
>           1 -1542620
>        3388 0
>          14 10
>           6 20
>           5 30
>           2 40
>           4 50
>           1 60
>           2 70
>           2 80
>
> (Take the ’awk’ with care, I am not sure of what I am doing. :-)  And
> it is rough because of timezones, etc.)
>
> In other words, 3388/6996 = ~50% of commits are pushed at the same time,
> i.e., missed by both build farms, which use 2 different strategies to
> collect the things to build (fetch every 5 minutes or fetch from
> guix-commits).  It is a quick back-of-the-envelope estimate, so take it
> with a grain of salt. :-)
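
For readers who would rather not decipher the ‘awk’ one-liner, the same computation can be sketched in Python.  This is an illustration only: the function names are made up, and the sample timestamps are invented rather than taken from the Guix repository.  Unlike the quoted pipeline, it does not filter on gaps ending in 0; it counts every gap between consecutive commits, in the newest-first order that ‘git log’ prints.

```python
from collections import Counter
from datetime import date, timedelta


def cutoff_date(today, ttl_days=180):
    """Date `ttl_days` before `today` (what M-x calendar computes above)."""
    return today - timedelta(days=ttl_days)


def gap_histogram(timestamps):
    """Count the gaps, in seconds, between consecutive commits.

    `timestamps` are Unix commit dates, newest first, as printed by
    `git log --format=%cd --date=unix`.
    """
    gaps = (newer - older for newer, older in zip(timestamps, timestamps[1:]))
    return Counter(gaps)


def zero_gap_share(timestamps):
    """Fraction of gaps equal to 0, i.e. commits sharing a timestamp."""
    hist = gap_histogram(timestamps)
    total = sum(hist.values())
    return hist[0] / total if total else 0.0


print(cutoff_date(date(2021, 10, 12)))  # 2021-04-15

# Invented sample, newest first; real input would come from `git log`.
sample = [1000, 1000, 1000, 990, 960, 960]
print(gap_histogram(sample))   # gaps are 0, 0, 10, 30, 0
print(zero_gap_share(sample))  # 0.6
```

A negative gap, like the -1542620 in the quoted output, means a commit carries an older date than the one listed after it, so the dates are not strictly ordered.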

OK.

> Given that number, it is hard to evaluate the rate of time-machine
> queries after 180 days (6 months).  And from my experience (no numbers to
> back it up), running time-machine on a commit older than these 180 days
> means building derivations.  Or it is a lucky day. :-)

Right.

So what can we do to address this issue?  I *think* we could use a
higher TTL on berlin, and we can try that right away (9 months to begin
with?).

However, there is an upper bound anyway.  To make informed decisions on
the retention policy, we should monitor storage space on berlin/bayfront
to better estimate what can be done.  We have Zabbix but it’s not
accessible from the outside; maybe we could graph storage space
somewhere so people can grab the data and work on those estimates?

What if we decide that we need to provide substitutes for 2y old
commits?  In that case, we need a plan to scale up.  That could be
renting storage space somewhere.  That’s largely non-technical work that
needs attention.
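
To get a feel for the orders of magnitude, here is a crude Python sketch.  It assumes, purely for illustration, that new nars arrive at a constant rate and that each is kept for exactly the TTL, so storage grows linearly with the TTL.  Neither assumption is established here (the TTL is only a lower bound), and the day counts for “9 months” and “2 years” are approximations.

```python
CURRENT_TTL_DAYS = 180  # TTL quoted above for berlin


def scale_factor(new_ttl_days, current_ttl_days=CURRENT_TTL_DAYS):
    """Relative storage need under a constant-inflow, linear model.

    Hypothetical helper: real usage depends on request patterns, since
    each request restarts the retention clock.
    """
    if current_ttl_days <= 0:
        raise ValueError("current TTL must be positive")
    return new_ttl_days / current_ttl_days


# Approximate day counts, chosen for the example.
candidates = {"9 months": 270, "2 years": 730}
for label, days in candidates.items():
    print(f"{label}: about {scale_factor(days):.1f}x the 180-day footprint")
```

Under these assumptions, 9 months needs roughly 1.5 times the current footprint and 2 years roughly 4 times; actual monitoring data would be needed to replace the model.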

There are also technical tweaks that could help: distinguishing between
“important” substitutes that we want to keep, and less important
substitutes (how?); identifying “equivalence classes” for builds of a
given package; etc.  The outcome is unclear and it’ll take time.

Thoughts?

Ludo’.

