unofficial mirror of guix-science@gnu.org 
From: Simon Tournier <zimon.toutoune@gmail.com>
To: Thibault Lestang <t.lestang@imperial.ac.uk>,
	guix-science <guix-science@gnu.org>
Subject: Re: Conda environments and reproducibility
Date: Mon, 28 Nov 2022 21:46:05 +0100
Message-ID: <86v8my7qpe.fsf@gmail.com>
In-Reply-To: <87pmd7ar8k.fsf@imperial.ac.uk>

Hi,

On Mon, 28 Nov 2022 at 17:28, Thibault Lestang <t.lestang@imperial.ac.uk> wrote:
> -----
> @luispedrocoelho
> Me, 6 months ago: I am going to save this conda
> environment with all the versions of all the packages so it can be
> recreated later; this is Reproducible Science!
>
> conda, today: these versions don't work together, lol.
> -----
>
> I simply can't explain how such a behavior can happen.

One factor is link rot.  I do not know whether it has been quantified
recently but, for sure, we always underestimate it.

> I understand that conda ships pre-compiled binaries. I see how that's
> bad for reproducibility and provenance tracking since it's not
> straightforward to know how these binaries and dependencies were
> compiled. I'm assuming that, when conda saves an environment, it records
> version tags and "everything else required" to pull the same binaries
> later. Okay - I see how binaries could /technically/ be modified at a
> later stage whilst maintaining the same version tag (provenance tracking
> issue).

As an aside, you are assuming that such binaries remain available. :-)

Another factor, from the old days when I used Conda (and I may be
wrong), is, I guess, the SAT solver [1].  Say that, 6 months ago, you
described your environment with constraints such as:

    1.0 <= foo
    2.0 <= bar <= 3.0
    baz <= 4.0

Then foo@1.1, foo@1.2 and foo@2.0 have been released in the past 6
months.  But baz <= 4.0 only works with 0.9 <= foo <= 1.2, and the
constraint on bar implies further constraints on foo and/or baz.  So the
solver today may not select the versions it selected 6 months ago.
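
To make that concrete, here is a minimal sketch in Python.  It is a
brute-force toy, not how Conda's solver works.  The package indexes
(INDEX_THEN, INDEX_NOW), the helper names, and the available versions of
bar and baz are all made up for illustration; only the constraints come
from the example above.

```python
from itertools import product


def v(s):
    """Parse '1.2' into (1, 2) so that versions compare numerically."""
    return tuple(int(part) for part in s.split("."))


# Hypothetical package index at two points in time (illustrative data only).
INDEX_THEN = {"foo": ["1.0"], "bar": ["2.0", "2.5"], "baz": ["3.0", "4.0"]}
INDEX_NOW = {"foo": ["1.0", "1.1", "1.2", "2.0"],
             "bar": ["2.0", "2.5"],
             "baz": ["3.0", "4.0"]}


def user_spec(c):
    # The spec from the message: 1.0 <= foo, 2.0 <= bar <= 3.0, baz <= 4.0
    return (v("1.0") <= c["foo"]
            and v("2.0") <= c["bar"] <= v("3.0")
            and c["baz"] <= v("4.0"))


def baz_needs_foo(c):
    # The rule from the message: baz <= 4.0 only works with 0.9 <= foo <= 1.2
    if c["baz"] <= v("4.0"):
        return v("0.9") <= c["foo"] <= v("1.2")
    return True


def resolve(index):
    """Try every combination; return the newest one satisfying all rules."""
    names = sorted(index)
    candidates = []
    for combo in product(*(index[n] for n in names)):
        c = {n: v(s) for n, s in zip(names, combo)}
        if user_spec(c) and baz_needs_foo(c):
            candidates.append(c)
    if not candidates:
        return None
    best = max(candidates, key=lambda c: tuple(c[n] for n in names))
    return {n: ".".join(map(str, best[n])) for n in names}


print(resolve(INDEX_THEN))  # foo resolves to 1.0
print(resolve(INDEX_NOW))   # foo resolves to 1.2; foo 2.0 conflicts with baz
```

With these made-up indexes, the very same spec yields foo 1.0 then and
foo 1.2 now: nothing changed in the description, only in the world
around it.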

SAT is NP-complete, so, IIRC, the worst-case running time of the known
algorithms is exponential, which is really bad.  I do not know the
state of the art, but I guess the problem to solve gets worse and worse
as time passes, since every new release adds candidates.
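
As a rough illustration of that growth, assume a naive resolver that
tries every combination (real solvers prune much more cleverly).  The
number of candidates is then the product of the number of available
versions of each package.  The counts below are made up.

```python
from math import prod


def search_space(versions_per_package):
    """Number of version combinations a naive resolver would have to try."""
    return prod(versions_per_package.values())


# Hypothetical counts of available versions, 6 months ago and today.
then = {"foo": 1, "bar": 2, "baz": 2}
now = {"foo": 4, "bar": 2, "baz": 2}

print(search_space(then))  # 4
print(search_space(now))   # 16
```

Three new releases of a single package already multiply the candidates
by four; with hundreds of packages, each releasing regularly, the
product grows very quickly.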

In my experience, there is only one way to fight against time: freeze.
The question is then how, or what, to freeze. :-)

One way to freeze is the binary container.  Another way is to keep a
“summary” that captures the whole (fixed) graph of dependencies.  With
Guix, this is usually the channels.scm file (as produced by
guix describe -f channels).  Then, the assumptions become:

 1. link rot is solved (tackled by Software Heritage),
 2. the Linux kernel API stays backward compatible,
 3. the hardware stays compatible,

for a rebuild to be possible.  If I may, here is some related material: :-)

https://www.nature.com/articles/s41597-022-01720-9
https://simon.tournier.info/posts/2022-11-08-bluehats.html
https://simon.tournier.info/posts/2022-04-15-cafe-guix-long-term.html


Cheers,
simon

1: https://en.wikipedia.org/wiki/SAT_solver

