From: Simon Tournier <zimon.toutoune@gmail.com>
To: Thibault Lestang <t.lestang@imperial.ac.uk>,
guix-science <guix-science@gnu.org>
Subject: Re: Conda environments and reproducibility
Date: Mon, 28 Nov 2022 21:46:05 +0100
Message-ID: <86v8my7qpe.fsf@gmail.com>
In-Reply-To: <87pmd7ar8k.fsf@imperial.ac.uk>
Hi,
On Mon, 28 Nov 2022 at 17:28, Thibault Lestang <t.lestang@imperial.ac.uk> wrote:
> -----
> @luispedrocoelho
> Me, 6 months ago: I am going to save this conda
> environment with all the versions of all the packages so it can be
> recreated later; this is Reproducible Science!
>
> conda, today: these versions don't work together, lol.
> -----
>
> I simply can't explain how such a behavior can happen.
One thing is link rot. I do not know of any current estimate of its
extent but, for sure, we always underestimate it.
> I understand that conda ships pre-compiled binaries. I see how that's
> bad for reproducibility and provenance tracking since it's not
> straightforward to know how these binaries and dependencies were
> compiled. I'm assuming that, when conda saves an environment, it records
> version tags and "everything else required" to pull the same binaries
> later. Okay - I see how binaries could /technically/ be modified at a
> later stage whilst maintaining the same version tag (provenance tracking
> issue).
As an aside, you are also assuming that such binaries remain available. :-)
Another thing, from the old days when I used Conda, and I may be wrong,
is, I guess, the SAT solver [1]. Say that, 6 months ago, you described
your environment with constraints such as:
    1.0 <= foo
    2.0 <= bar <= 3.0
    baz <= 4.0
Since then, foo@1.1, foo@1.2 and foo@2.0 have been released. But
baz <= 4.0 only works with 0.9 <= foo <= 1.2, and the constraint on bar
implies other constraints on foo and/or baz. So the same description
can resolve to a different set of versions today, or to none at all.
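To make this concrete, here is a toy sketch in Python. It is not how
Conda resolves environments: it is a brute-force search over a made-up
package index. The version numbers listed for bar and baz, the names
INDEX_THEN, INDEX_NOW, satisfies and solve, and the "pick the newest"
rule are all assumptions for illustration only.

```python
from itertools import product

# Hypothetical package index; versions are tuples so they compare correctly.
INDEX_THEN = {
    "foo": [(0, 9), (1, 0)],
    "bar": [(2, 0), (2, 5)],
    "baz": [(3, 0), (4, 0)],
}
# Six months later: three new releases of foo, nothing else changed.
INDEX_NOW = {**INDEX_THEN, "foo": INDEX_THEN["foo"] + [(1, 1), (1, 2), (2, 0)]}


def satisfies(foo, bar, baz):
    """Check the user's description plus one inter-package constraint."""
    user = (1, 0) <= foo and (2, 0) <= bar <= (3, 0) and baz <= (4, 0)
    # Assumed metadata: baz <= 4.0 only works with 0.9 <= foo <= 1.2.
    # Every baz in this toy index is <= 4.0, so the rule always applies.
    compat = (0, 9) <= foo <= (1, 2)
    return user and compat


def solve(index):
    """Return the newest satisfying (foo, bar, baz), or None if there is none."""
    candidates = [
        combo
        for combo in product(index["foo"], index["bar"], index["baz"])
        if satisfies(*combo)
    ]
    return max(candidates, default=None)


print(solve(INDEX_THEN))
print(solve(INDEX_NOW))
```

The description of the environment is identical in both calls; only the
index changed, and the selected foo moves from 1.0 to 1.2.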
The worst-case complexity of SAT solving is exponential, IIRC (the
problem is NP-complete), so for sure really bad. I do not know the
state of the art, but I guess the problem to solve gets worse and worse
as time goes by.
In my experience, you have only one way to fight against time:
freeze. The question is then how to freeze, and what. :-)
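As a minimal sketch of what "freeze" means here, assuming a lock that
records exact versions and a fingerprint of them (the names freeze and
restore and the lock layout are hypothetical, not any tool's format):
restoring becomes a lookup, with no solver involved. Note that this
freezes version numbers only, not how the binaries were built.

```python
import hashlib
import json


def fingerprint(packages):
    """Hash a deterministic serialization of the exact versions."""
    payload = json.dumps(packages, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def freeze(packages):
    """Record the exact versions together with their fingerprint."""
    return {"packages": packages, "sha256": fingerprint(packages)}


def restore(lock):
    """Return the recorded versions, refusing a lock that was altered."""
    if fingerprint(lock["packages"]) != lock["sha256"]:
        raise ValueError("lock does not match its fingerprint")
    return lock["packages"]


lock = freeze({"foo": "1.0", "bar": "2.5", "baz": "4.0"})
print(restore(lock))
```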
One way to freeze is the binary container. Another way to freeze
is to have a “summary” capturing the whole (fixed) graph of
dependencies. With Guix, this is (usually named) the channels.scm file
(the output of guix describe -f channels). Then, the assumptions become:
  1. link rot is solved (tackled by Software Heritage),
  2. the Linux kernel API stays backward compatible,
  3. hardware stays compatible,
to be able to rebuild. If I may, here is some related reading: :-)
https://www.nature.com/articles/s41597-022-01720-9
https://simon.tournier.info/posts/2022-11-08-bluehats.html
https://simon.tournier.info/posts/2022-04-15-cafe-guix-long-term.html
Cheers,
simon
1: https://en.wikipedia.org/wiki/SAT_solver