2018-05-13 7:07 GMT+02:00 Ricardo Wurmus :

> Catonano writes:
>
> > Ricardo, I don't understand the problem you're raising here (I didn't
> > read the article yet, though)
> >
> > Would you mind elaborating on that ?
> >
> > Why would you want to record the environment ?
>
> I want to record the detected build environment so that I can restore it
> at execution time.  Autoconf provides macros that probe the environment
> and record the full path to detected tools.  For example, I'm looking
> for Samtools, and the user may provide a particular variant of Samtools
> at configure time.

Thanks for clarifying ! Let me vent some thoughts on the issue !

Under Guix, the way to provide a specific version of Samtools would be to run the configuration in an environment that offers a specific Samtools package, so that the configuration tool can pick it up.

Under a traditional distro, it'd be to feed file paths to the configuration tool.

So, how much of the traditional way of doing things do we want to support in our pipelines ?

> I record the full path to the executable at
> configure time and embed that path in a configuration file that is read
> when the pipeline is run.
>
> This works fine for tools, but doesn't work very well at all for modules
> in language environments.  Take R for example.  I can detect and record
> the location of the R and Rscript executables, but I cannot easily
> record the location of build-time R packages (such as r-deseq2) in a way
> that allows me to rebuild the environment at runtime.
>
> Instead of writing an Autoconf macro that records the exact location of
> each of the detected R packages and their dependencies I chose to solve
> the problem in Guix by wrapping the pipeline executables in R_LIBS_SITE,
> because I figured that on systems without Guix you aren't likely to
> install R packages into separate unique locations anyway — on most
> systems R packages end up being installed to one and the same directory.
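The probe-and-record mechanism described above (what Autoconf's AC_PATH_PROG macro does) can be sketched outside Autoconf. This is only an illustration, not PiGx's actual configure code; the names `record_tool` and `pipeline.conf` are made up:

```python
import shutil
import configparser

def record_tool(name, config_path="pipeline.conf"):
    """Probe PATH for a tool, AC_PATH_PROG-style, and record its
    absolute path in a configuration file that the pipeline reads
    back at run time."""
    path = shutil.which(name)
    if path is None:
        raise RuntimeError(f"configure error: {name} not found in PATH")
    config = configparser.ConfigParser()
    config["tools"] = {name: path}
    with open(config_path, "w") as f:
        config.write(f)
    return path

# Whichever variant of the tool is first in PATH at configure time wins,
# which is why running the configuration inside a Guix environment that
# offers a specific Samtools makes that variant the one recorded.
```

This is exactly why the choice of environment at configure time matters: the recorded path freezes whatever the probe found.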
> I think the desire to restore the configured environment at runtime is
> valid and we do this all the time when we run binaries that have
> embedded absolute paths (to libraries or other tools).

I didn't mean to imply it's not valid. I was just trying to understand the concerns on the ground and the context.

> It's just that
> it gets pretty awkward to do this for things like R packages or Python
> modules (or Guile modules for that matter).
>
> The Guix workflow language solves this problem by depending on Guix for
> software deployment.  For PiGx we picked Snakemake early on and it does
> not have a software deployment solution (it expects to either run inside
> a suitable environment that the user provides or to have access to
> pre-built Singularity application bundles).  I don't like to treat
> pipelines like some sort of collection of scripts that must be invoked
> in a suitable environment.  I like to see pipelines as big software
> packages that should know about the environment they need, that can be
> configured like regular tools, and thus only require the packager to
> assemble the environment, not the end-user.

I understand your concern to consider pipelines as packages.

But say, for example, that a pipeline gets distributed as a .deb package with dependencies on R (or Guile) modules.

Or say that a pipeline is distributed with a bundled guix.scm file containing R modules (or Guile modules) as inputs.

Would that break the idea of a pipeline as a package ?
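The R_LIBS_SITE wrapping mentioned above can be sketched as well. This is a hedged Python illustration of what a Guix-style wrapper script does for a pipeline executable, not Guix's actual wrap-program implementation; the function name and paths are invented:

```python
import os

def wrap_environment(base_env, site_libs):
    """Return a copy of base_env in which R_LIBS_SITE points at the
    given R library directories, prepended to any pre-existing value.
    This is roughly what a Guix wrapper script does before exec'ing
    the real pipeline executable."""
    env = dict(base_env)
    existing = env.get("R_LIBS_SITE", "")
    parts = list(site_libs) + ([existing] if existing else [])
    # R_LIBS_SITE is a colon-separated search path on Unix systems.
    env["R_LIBS_SITE"] = os.pathsep.join(parts)
    return env
```

The point of wrapping rather than recording per-package paths is visible here: one environment variable covers every R package in the listed directories, so nothing package-specific needs to be frozen at configure time.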
I'm afraid that the idea of a pipeline as a package shouldn't be entrusted to the configuration tool, but rather to the package management tool.

And the pipeline author shouldn't be assumed to work in isolation, confident that any package management environment will be able to run their pipeline smoothly. Pipeline authors should be concerned with the placement of their pipeline in the package graph; that shouldn't be a concern of the packager only.

Maybe software authors should provide dependency information in a standardized format (RDF ?) and that should be leveraged by packagers in order to prepare .deb packages or guix.scm files.

And if you are a developer and you want to test the software with a specific version of a dependency, then you should run the configuration tool in an environment where that version of the dependency is available, so that the configuration tool can pick it up.

If you are on Guix, you will probably create that environment with the Guix environment tool. If you are on Debian or Fedora, you will have to rely on those distros' development tools.

On traditional distros, you can install packages in your user folder, in /opt, or in other locations, and then you can feed those to the configuration tool. On Guix, the conditions are different. So the idea of pipelines as packages will be treated differently by the configuration tool under Guix and by the configuration tool under Debian/Fedora.

So, in my view, a configuration tool should be quite dumb and assume that the package management is smarter.

You might object that this implies treating the pipeline as an ugly hack. That is not necessarily so. It's just that I don't think pipeline authors can settle the issue entirely in their configuration management. Guix introduces the idea of the whole dependency stack, and I don't think that can be a concern of packagers only.

Maybe I'm too pessimistic, I don't know.

Thanks for this discussion !
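As a rough sketch of what such a standardized dependency declaration could look like, here is a JSON manifest used as a stand-in for the RDF idea. The schema, the `guix_inputs` helper, and the pipeline name are all invented for illustration; only `specification->package` is a real Guix procedure:

```python
import json

# Hypothetical schema: the pipeline author declares dependencies in a
# machine-readable file; packagers translate it into a .deb control
# file, a guix.scm, or whatever their distro needs.
manifest = {
    "name": "example-pipeline",
    "dependencies": [
        {"kind": "tool", "name": "samtools"},
        {"kind": "r-package", "name": "r-deseq2"},
    ],
}

def guix_inputs(manifest):
    """Render the declared dependencies as a (simplified) inputs field
    of a guix.scm package definition."""
    names = " ".join(f'"{d["name"]}"' for d in manifest["dependencies"])
    return f"(inputs (map specification->package (list {names})))"

print(guix_inputs(manifest))
# → (inputs (map specification->package (list "samtools" "r-deseq2")))
```

The same manifest could just as well be rendered into Debian `Depends:` lines; the point is that the author states the dependencies once, in a tool-neutral form, and each package manager does its own translation.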