unofficial mirror of guix-science@gnu.org 
From: Konrad Hinsen <konrad.hinsen@cnrs.fr>
To: Hugo Buddelmeijer <hugo@buddelmeijer.nl>,
	Thibault Lestang <t.lestang@imperial.ac.uk>
Cc: guix-science <guix-science@gnu.org>
Subject: Re: Conda environments and reproducibility
Date: Tue, 29 Nov 2022 14:39:46 +0100
Message-ID: <m1k03d517h.fsf@fastmail.net>
In-Reply-To: <CA+Jv8O1VzXjPgZ04HaDHpeyvuDqaU_e2FYdsckhDzyi8Dgi8Pg@mail.gmail.com>

Hi Hugo,

Hugo Buddelmeijer <hugo@buddelmeijer.nl> writes:

> Hi Konrad, Thibault and others,
>
> Konrad, is it perhaps possible for you to dig up this broken conda
> environment file?

Yes:

   https://gist.github.com/brospars/4671d9013f0d99e1c961482dab533c57

That environment was set up in 2018 on a Linux machine, and then tested
under macOS and Windows as well. It broke in early 2019.

> First, just like you all, my conclusion is that guix is the answer. The
> last two paragraphs by Simon capture it succinctly. However, conda seems
> to work fine for most people. It would therefore be instructive to have
> concrete 'failure stories' in order to show people that conda is not enough.

I have heard many stories of conda failing long-term, i.e. environments
not being reproducible after a year or two. Most use cases are probably
more short-term.

> It doesn't seem common to overwrite conda binaries. Conda takes some (not
> enough?) measures to prevent the scenario Konrad describes. In particular,
> the filenames include a 'hash' since conda 3 (~2014) [1]:

Weird. We worked with official Miniconda downloads from early 2018, and
our environment files contain no hashes.
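
A rough way to check how tightly an environment file is pinned, sketched
in Python. It assumes dependency entries in the `name=version=build` form
that `conda env export` writes; the function names and the example specs
are made up for illustration and are not taken from the gist above.

```python
from collections import Counter


def pin_level(spec: str) -> str:
    """Classify one conda dependency spec by how tightly it is pinned."""
    # Assumption: specs look like "name", "name=version" or
    # "name=version=build". Anything with a comparison operator or
    # wildcard is reported as a version range.
    if any(op in spec for op in ("<", ">", "!", "*", ",", "|")):
        return "range"
    parts = [p for p in spec.strip().split("=") if p]
    if len(parts) >= 3:
        return "build"      # version and build string both given
    if len(parts) == 2:
        return "version"    # version given, no build string
    return "name"           # package name only


def summarize(dependencies) -> Counter:
    """Count pin levels, skipping non-string entries such as a pip sub-list."""
    return Counter(pin_level(d) for d in dependencies if isinstance(d, str))


if __name__ == "__main__":
    # Hypothetical entries, as they might appear under "dependencies:".
    example = [
        "python=3.6",
        "numpy=1.15.4=py36h1d66e8a_0",
        "scipy>=1.1",
        "pip",
        {"pip": ["requests==2.20.0"]},
    ]
    print(summarize(example))
```

Entries reported as anything other than "build" leave the choice of the
exact binary to the solver at install time.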

> My realization was that improving these hashes is a goose chase and will
> ultimately lead to horrific things like "turing-complete yaml files". And
> at that point it is clear, at least to me, that guix is the answer.

Indeed. Turing-complete Scheme files :-)

My conclusion so far is that conda can never attain long-term
reproducibility, because it wants to be multi-platform. And that means
that it doesn't control the foundations on which it has to build.

From a user's point of view, a big problem with conda is the opacity of
the machinery, which in addition changes all the time as you say. With
Guix, I can understand how everything is built, and thus understand the
potential obstacles to a rebuild many years later. With conda, I don't
really know, and my understanding is that the build machinery is not
even completely public (for Anaconda at least).

> One thing that conda (or actually conda-forge) does well is its bots.
> I'm a maintainer of some conda packages and once a month or so I get a
> fully automated pull request to update my package [4], e.g. when the
> upstream package is updated, or when a dependency is updated. They even

That's nice!

> packages, such as compilers. This makes maintaining conda-forge packages a
> breeze. Having such bots also within the guix-ecosystem would probably help
> attract developers.

Indeed. More generally, I think package managers should do a better job
of reaching out to upstream maintainers. They are our allies in
providing a better UX.

Cheers,
  Konrad
-- 
---------------------------------------------------------------------
Konrad Hinsen
Centre de Biophysique Moléculaire, CNRS Orléans
Synchrotron Soleil - Division Expériences
Saint Aubin - BP 48
91192 Gif sur Yvette Cedex, France
Tel. +33-1 69 35 97 15
E-Mail: konrad DOT hinsen AT cnrs DOT fr
http://dirac.cnrs-orleans.fr/~hinsen/
ORCID: https://orcid.org/0000-0003-0330-9428
Twitter: @khinsen
---------------------------------------------------------------------

