diff --git a/drafts/reproducible-cran.md b/drafts/reproducible-cran.md
index c691163..28f6108 100644
--- a/drafts/reproducible-cran.md
+++ b/drafts/reproducible-cran.md
@@ -60,6 +60,42 @@ pre-built substitutes to speed up installation times. Additionally,
 reproducing environments would include fewer steps if the package recipes
 were available to anyone by default.
 
+## Why deploy R software with Guix anyway?
+
+At this point, perhaps you're wondering: R is stable, and tools such as
+[Packrat](https://rstudio.github.io/packrat/) let me save and restore
+the exact R package versions I need. While this might seem “good
+enough”, we can already tell this approach [has a number of
+shortcomings](https://hpc.guix.info/blog/2022/07/is-reproducibility-practical/),
+one of which is that it cannot handle dependencies not written in
+R, such as R itself.
+
+A [study published in *Nature Scientific Data* in February
+2022](https://doi.org/10.1038/s41597-022-01143-6) gives empirical
+insight into this:
+
+> _[We] retrieve and analyze more than 2000 replication datasets with
+> over 9000 unique R files published from 2010 to 2020. Second, we
+> execute the code in a clean runtime environment to assess its ease of
+> reuse. […] We find that 74% of R files failed to complete without
+> error in the initial execution, while 56% failed when code cleaning
+> was applied, showing that many errors can be prevented with good
+> coding practices._
+
+Three fourths of those R files fail to run out of the box, which is
+huge. How did the authors re-execute this code?
+
+> _We re-executed R code from each of the replication packages using
+> three R software versions, R 3.2, R 3.6, and R 4.0, in a clean
+> environment._
+
+Despite this guesswork about the R version, coupled with automatic
+“source cleaning”, the authors found that most R files still fail to run.
+
+The motivation to deploy R software with Guix becomes clear: it’s the
+ability to automatically redeploy the same software environment, at
+different points in time, on different machines.
+
 ## Introducing guix-cran
 
 GNU Guix provides a mechanism called “channels”,