Hi Konrad, Thibault and others,
Konrad, is it perhaps possible for you to dig up this broken conda environment file?
First, just like you all, my conclusion is that guix is the answer. The
last two paragraphs by Simon capture it succinctly. However, conda seems to work
fine for most people. It would therefore be instructive to have
concrete 'failure stories' in order to show people that conda is not
enough.
That's fair enough. Conda & pip are everywhere around me, and I'd like
to form an accurate picture of their shortcomings before mentioning
alternative approaches to people who use these tools every day!
I agree, let me share my perspective.
Konrad Hinsen <konrad.hinsen@cnrs.fr> writes:
> That's in a way what happened in my scenario: rebuilding with a new
> compilation infrastructure produces different packages that share
> version numbers and tags with the prior ones.
Okay - this is an explanation I can understand. A better approach
would have been /not/ to overwrite existing package binaries with new
ones produced from the new infrastructure.
It doesn't seem common to overwrite conda binaries. Conda takes some (not enough?) measures to prevent the scenario Konrad describes. In particular, the filenames include a 'hash' since conda 3 (~2014) [1]:
> in the past, we have had things like py27np111 in filenames. This is the
> same idea, just generalized. Since we can't readily put every possible
> constraint into the filename, we have kept the old ones, but added the
> hash as a general solution.
This hash includes information about the compiler used (~2017) [2, 3]:
> The build hash will be added to the build string if these are true for any
> dependency: [...] package uses {{ compiler() }} jinja2 function
That is, "conda env export" should contain entries like
"scipy=1.8.0=py39hee8e79c_1", where the hee8e79c should uniquely identify
the dependencies 'that matter', like which compiler is used. What goes
into the hash seems rather complicated, and grows over time.
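To make the structure of such an entry concrete, here is a minimal Python sketch that splits it into name, version and build string, and then pulls the hash out of the build string. The helper name parse_export_entry and the assumed layout of the build string (an optional prefix like "py39", then "h" plus 7 hex characters, then "_" and a build number) are illustrative assumptions based on the example above, not a guaranteed conda format.

```python
import re

# Assumed layout of a conda build string: an optional prefix such as "py39",
# then "h" plus a 7-character hex hash, then "_" and the build number.
# This is a convention inferred from the example entry, not a formal spec.
BUILD_RE = re.compile(r"^(?P<prefix>.*?)h(?P<hash>[0-9a-f]{7})_(?P<number>\d+)$")


def parse_export_entry(entry):
    """Split a 'name=version=build' entry into its parts (hypothetical helper)."""
    name, version, build = entry.split("=")
    match = BUILD_RE.match(build)
    return {
        "name": name,
        "version": version,
        "build": build,
        # None when the build string carries no recognizable hash.
        "hash": match.group("hash") if match else None,
        "build_number": int(match.group("number")) if match else None,
    }


print(parse_export_entry("scipy=1.8.0=py39hee8e79c_1"))
```

Two environments that pin the same name and version but differ in this hash were built against different 'dependencies that matter', which is exactly the information a bare version number loses.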
This hash is a great step forward in reproducibility, but it is too
fragile. I can't directly see how, but I can easily imagine that this
dependency-hash mechanism leads to the problem that Konrad faced even
when no files are overwritten, perhaps because a new dependency resolver
in conda has stricter rules on interoperability. (It is still possible
that files were indeed overwritten, though; it was probably an incident
like this that made them change the hashes.)
My realization was that improving these hashes is a wild goose chase and
will ultimately lead to horrific things like "turing-complete yaml
files". And at that point it is clear, at least to me, that guix is the
answer.
One thing that conda (or actually conda-forge) does well is its bots.
I'm a maintainer of some conda packages and once a month or so I get a
fully automated pull request to update my package [4], e.g. when the
upstream package is updated, or when a dependency is updated. They even
have a tracking system for migrating dependencies that are used by many
packages, such as compilers. This makes maintaining conda-forge packages
a breeze. Having such bots also within the guix-ecosystem would
probably help attract developers.
By the way, it is quite hard to use conda in guix, primarily because "conda activate myenvironment" tries to set PS1 by calling a bash function called 'conda'. This bash function calls the 'conda' executable, which takes PS1, modifies it, and returns it to the bash function. The bash function then sets PS1 (and makes a backup for deactivating the environment again). However, in guix the conda executable is replaced by a bash script that calls conda_real. Bash scripts eat PS1 (because they run in non-interactive mode), so conda_real gets an empty PS1, fails to modify it, and the bash function then sets PS1 to nothing.
I've got it working properly on my machine, but don't yet feel comfortable enough with Scheme / guix to provide a proper patch. The simplest fix might be to use another shell for the conda package (because I believe only bash eats PS1); I'm not sure whether that is possible in guix. I would rather make guix packages of everything and ditch conda altogether, but supporting conda properly would help more people transition.
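The PS1 behaviour described above can be checked in isolation, without conda or guix. The following Python sketch exports a PS1 and asks a non-interactive bash child process what it sees. The function name ps1_seen_by_bash_script and the "<unset>" marker are made up for illustration, and the sketch assumes a bash executable is on the PATH.

```python
import os
import shutil
import subprocess


def ps1_seen_by_bash_script(ps1="(myenvironment) $ "):
    """Return the PS1 visible inside a non-interactive bash child process."""
    bash = shutil.which("bash")
    if bash is None:
        return None  # no bash on this machine; nothing to demonstrate
    env = dict(os.environ, PS1=ps1)
    # Avoid a startup file re-setting PS1 behind our back.
    env.pop("BASH_ENV", None)
    # "bash -c" runs non-interactively, like a wrapper script would.
    # ${PS1-<unset>} expands to the marker only if PS1 is not set at all.
    result = subprocess.run(
        [bash, "-c", 'printf "%s" "${PS1-<unset>}"'],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


print(ps1_seen_by_bash_script())
```

If the exported prompt does not come back, then whatever a bash wrapper script starts, such as conda_real, has no PS1 to modify, which matches the symptom of the prompt being set to nothing after activation.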
(Oh, this reminds me of the problems of activation and deactivation scripts in conda. For another time.)
Greetings,
Hugo