From: Simon Tournier
To: Ludovic Courtès, Hugo Buddelmeijer
Cc: Thibault Lestang, Konrad Hinsen, guix-science
Subject: Re: Conda environments and reproducibility
In-Reply-To: <87k03at69n.fsf@gnu.org>
References: <87pmd7ar8k.fsf@imperial.ac.uk> <87zgcayre2.fsf@imperial.ac.uk> <87k03at69n.fsf@gnu.org>
Date: Fri, 02 Dec 2022 14:59:42 +0100
Message-ID: <878rjpsy7l.fsf@gmail.com>

Hi,

On Fri, 02 Dec 2022 at 12:05, Ludovic Courtès wrote:
> Hugo Buddelmeijer writes:
>
>> That is, "conda env export" should contain entries like
>> "scipy=1.8.0=py39hee8e79c_1", where the hee8e79c should uniquely define the
>> dependencies 'that matter', like which compiler is used. What goes into the
>> hash seems rather complicated, and grows over time.
>
> I think one source of many problems here is to think that there are
> dependencies that do not matter. Another one, which those hashes appear
> to address, is to think that a name/version pair is enough to
> unambiguously designate a software artifact.
>
> This hash is a hash of the build result, not a hash of the input, is
> that correct?

Well, the official Conda documentation explains this fairly well, IMHO.
For instance,

    https://conda.io/projects/conda/en/latest/dev-guide/deep-dives/solvers.html#matchspec-vs-packagerecord

From my understanding, if you go via MatchSpec then the SAT solver is invoked.
The SAT solver tries to satisfy all the constraints, and the solution
depends on the state of the index (the upstream repository). Besides
the fact that the SAT solver can take a very long time, and can even
fail if the constraints are too hard, there is no guarantee that it
will find the exact same combination of packages to install. Having an
equality (numpy=1.23) or some other constraint does not really change
this point.

Conda offers the option to be “explicit”, and in that case the solver
is not invoked. In a way, it is a means to deal directly with
PackageRecord. The Conda documentation then has these warnings:

    * Explicit package installs

      Since the solver is not involved, the dependencies of the
      explicit package(s) are not processed at all. This can leave the
      environment in an inconsistent state, which can be fixed by
      running conda update --all, for example.

    * Cloning an environment

      It essentially takes the source environment, generates the URLs
      for each installed packages (filtering conda, conda-env and their
      dependencies) and passes the list of URLs to explicit(). If the
      source tarballs are not in the cache anymore, it will query the
      index for the best possible match for the current channels. As
      such, there’s a slim chance that the copy is not exactly a clone
      of the original environment.

    https://conda.io/projects/conda/en/latest/dev-guide/deep-dives/solvers.html#early-exit-tasks

Therefore, the official Conda documentation itself explains that there
is no guarantee of reproducing an environment exactly.

> I think it would be great to have a blog post that walks through
> shortcomings and concrete issues one may encounter when trying to
> reproduce a software environment with Conda, contrasting it with how
> Guix does thing. This would probably make more sense for people who use
> Conda everyday than a high-level overview of Guix.

From my understanding, the main issue is that Conda works perfectly
well when you are in a short temporal window (2-3 months, say!). In
this range, people can often reproduce an environment. It becomes more
complicated outside this range, which makes it hard to demo. :-)

For sure, a blog post by people fluent in both Conda and Guix would be
very welcome. Aside from the discussion about reproducibility, even
just a Rosetta Stone comparing how to do the same things with Conda vs
Guix would help. It would smooth the migration and at least encourage
people to give Guix a try. :-)

Cheers,
simon
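
PS: To make the MatchSpec vs explicit point concrete, here is a minimal
toy sketch in Python. It is not Conda's code or API: Record, solve,
install_explicit, the two index snapshots and the build strings are all
made up for illustration, and the "solver" just picks the newest match
instead of running a SAT solver. It only shows that the same spec can
resolve differently once the index has moved, while an explicit list
names exact builds and fails if one of them is gone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Toy stand-in for a fully specified package (hypothetical type)."""
    name: str
    version: str
    build: str  # e.g. "py39_0"; the trailing number is the build number


def _sort_key(record):
    # Order by numeric version first, then by build number.
    version = tuple(int(part) for part in record.version.split("."))
    build_number = int(record.build.rsplit("_", 1)[1])
    return (version, build_number)


def solve(name, version_prefix, index):
    """Toy 'solver': newest record in the index matching name=prefix.

    Assumption: "numpy=1.23" is treated as "any 1.23.x". The result
    depends on what the index contains at the time of the call.
    """
    candidates = [
        r for r in index
        if r.name == name
        and (r.version == version_prefix
             or r.version.startswith(version_prefix + "."))
    ]
    if not candidates:
        raise LookupError(f"no match for {name}={version_prefix}")
    return max(candidates, key=_sort_key)


def install_explicit(records, index):
    """Toy 'explicit' install: no solving, exact records or failure."""
    missing = [r for r in records if r not in index]
    if missing:
        raise LookupError(f"no longer in the index: {missing}")
    return list(records)


# Two hypothetical snapshots of the same channel, a few months apart.
index_dec_2022 = [
    Record("numpy", "1.23.4", "py39_0"),
    Record("numpy", "1.23.5", "py39_0"),
]
index_later = index_dec_2022 + [Record("numpy", "1.23.5", "py39_1")]

resolved_then = solve("numpy", "1.23", index_dec_2022)
resolved_later = solve("numpy", "1.23", index_later)

# Same spec, different index, different build selected.
print(resolved_then)
print(resolved_later)

# The explicit list still designates the original build exactly.
print(install_explicit([resolved_then], index_later))
```

The explicit path is reproducible only as long as the exact builds stay
available, and, as the documentation quoted above warns, nothing checks
that their dependencies are consistent.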