From: Ludovic Courtès
To: Lars-Dominik Braun
Cc: Simon Tournier, guix-science@gnu.org, Simon Tournier, lars@6xq.net
Subject: Re: [PATCH] Add draft post "CRAN, a practical example for being reproducible at large scale using GNU Guix".
References: <86y1rkitkk.fsf@gmail.com>
Date: Tue, 13 Dec 2022 14:53:34 +0100
In-Reply-To: (Lars-Dominik Braun's message of "Wed, 7 Dec 2022 09:36:29 +0100")
Message-ID: <875yefwgtd.fsf@gnu.org>

Hello!

Lars-Dominik Braun wrote:

>> Applied, thanks. It is under drafts/ [1].
>> Last round proofread before publishing. On Friday?
> Friday sounds good. I’m attaching minor changes to the syntax
> highlighting.

We missed one Friday but there are plenty coming up. :-)

As mentioned on #guix-hpc, I think it’d be interesting to add a
reference to https://www.nature.com/articles/s41597-022-01143-6 to
illustrate the rationale. I think it’s important because R users are
likely to wonder why they’d bother with Guix in the first place.

Here’s a proposal in that direction; feel free to take it, tear it
down, change it, or whatever.

Thanks,
Ludo’.

diff --git a/drafts/reproducible-cran.md b/drafts/reproducible-cran.md
index c691163..28f6108 100644
--- a/drafts/reproducible-cran.md
+++ b/drafts/reproducible-cran.md
@@ -60,6 +60,42 @@ pre-built substitutes to speed up installation times. Additionally,
 reproducing environments would include fewer steps if the package
 recipes were available to anyone by default.
 
+## Why deploy R software with Guix anyway?
+
+At this point, perhaps you're wondering: R is stable, and tools such as
+[Packrat](https://rstudio.github.io/packrat/) let me save and restore
+the exact R package versions I need. While this might seem “good
+enough”, we can already tell this approach [has a number of
+shortcomings](https://hpc.guix.info/blog/2022/07/is-reproducibility-practical/),
+one of which is that it cannot handle dependencies not written in
+R, such as R itself.
+
+A [study published in *Nature Scientific Data* in February
+2022](https://doi.org/10.1038/s41597-022-01143-6) gives empirical
+insight into this:
+
+> _[We] retrieve and analyze more than 2000 replication datasets with
+> over 9000 unique R files published from 2010 to 2020.
+> Second, we execute the code in a clean runtime environment to assess
+> its ease of reuse. […] We find that 74% of R files failed to complete
+> without error in the initial execution, while 56% failed when code
+> cleaning was applied, showing that many errors can be prevented with
+> good coding practices._
+
+Three quarters of those R files fail to run out of the box; this is
+huge. How did the authors re-execute this code?
+
+> _We re-executed R code from each of the replication packages using
+> three R software versions, R 3.2, R 3.6, and R 4.0, in a clean
+> environment._
+
+Despite this guesswork, coupled with automatic “source cleaning”, the
+authors found that most files still fail to run.
+
+The motivation to deploy R software with Guix becomes clear: it’s the
+ability to automatically redeploy the same software environment, at
+different points in time, on different machines.
+
 ## Introducing guix-cran
 
 GNU Guix provides a mechanism called “channels”,