From: Ricardo Wurmus <rekado@elephly.net>
To: Catonano <catonano@gmail.com>
Cc: guix-devel <guix-devel@gnu.org>,
	"Ludovic Courtès" <ludovic.courtes@inria.fr>,
	"guix-hpc@gnu.org" <guix-hpc@gnu.org>
Subject: Re: [rb-general] Paper preprint: Reproducible genomics analysis pipelines with GNU Guix
Date: Sun, 13 May 2018 07:07:46 +0200
Message-ID: <87fu2wcfdp.fsf@elephly.net>
In-Reply-To: <CAJ98PDwVYaSH6-aSpyOcBvPyBw_k4ZaGsXe-6_9neJkDO-QBFQ@mail.gmail.com>


Catonano <catonano@gmail.com> writes:

> Ricardo, I don't understand the problem you're raising here (I didn't read
> the article yet, though)
>
> Would you mind elaborating on that?
>
> Why would you want to record the environment?

I want to record the detected build environment so that I can restore it
at execution time.  Autoconf provides macros that probe the environment
and record the full path to detected tools.  For example, I’m looking
for Samtools, and the user may provide a particular variant of Samtools
at configure time.  I record the full path to the executable at
configure time and embed that path in a configuration file that is read
when the pipeline is run.
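
A minimal sketch of this record-and-restore idea, written in Python for
illustration.  The file name `settings.json` and the helpers
`record_tool` and `recorded_tool` are hypothetical; in the setup
described above, Autoconf does the detection and writes the
configuration file.

```python
import json
import tempfile
from pathlib import Path


def record_tool(name, path, settings_file):
    """Configure time: store the absolute path of a detected tool."""
    settings = {}
    if settings_file.exists():
        settings = json.loads(settings_file.read_text())
    settings[name] = str(Path(path).resolve())
    settings_file.write_text(json.dumps(settings, indent=2))
    return settings[name]


def recorded_tool(name, settings_file):
    """Run time: return the recorded path instead of searching PATH."""
    return json.loads(settings_file.read_text())[name]


# Demonstration with a stand-in file; a real configure step would pass the
# path that the user supplied or that was detected on PATH.
with tempfile.TemporaryDirectory() as tmp:
    fake_samtools = Path(tmp) / "samtools"
    fake_samtools.touch()
    settings_file = Path(tmp) / "settings.json"
    recorded = record_tool("samtools", fake_samtools, settings_file)
    assert recorded_tool("samtools", settings_file) == recorded
```

The point of the design is that the run-time lookup never consults
PATH, so the pipeline uses the variant chosen at configure time.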

This works fine for tools, but it doesn’t work well for modules in
language environments.  Take R, for example.  I can detect and record
the location of the R and Rscript executables, but I cannot easily
record the location of build-time R packages (such as r-deseq2) in a way
that allows me to rebuild the environment at runtime.

Instead of writing an Autoconf macro that records the exact location of
each of the detected R packages and their dependencies, I chose to solve
the problem in Guix by wrapping the pipeline executables so that they
run with R_LIBS_SITE set.  I figured that on systems without Guix you
aren’t likely to install R packages into separate unique locations
anyway; on most systems R packages end up being installed to one and
the same directory.
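
What such a wrapper does can be sketched as follows.  This is an
illustration in Python, not the wrapper that Guix generates; the helper
names and the example directories are placeholders.

```python
import os
import subprocess


def wrapped_env(site_libs, base_env=None):
    """Return a copy of the environment with R_LIBS_SITE set to the
    library directories recorded at build time."""
    env = dict(os.environ if base_env is None else base_env)
    # R reads R_LIBS_SITE as a list of directories joined by the
    # platform's path separator.
    env["R_LIBS_SITE"] = os.pathsep.join(site_libs)
    return env


def run_wrapped(command, site_libs):
    """Run the real executable inside the restored environment."""
    return subprocess.run(command, env=wrapped_env(site_libs), check=True)


# Placeholder directories standing in for the recorded package locations.
example = wrapped_env(
    ["/example/r-deseq2/site-library", "/example/r-ggplot2/site-library"],
    base_env={},
)
print(example["R_LIBS_SITE"])
```

The pipeline itself stays unaware of where the packages live; only the
wrapper, written by the packager, carries that knowledge.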

I think the desire to restore the configured environment at runtime is
valid and we do this all the time when we run binaries that have
embedded absolute paths (to libraries or other tools).  It’s just that
it gets pretty awkward to do this for things like R packages or Python
modules (or Guile modules for that matter).
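
To give a sense of the awkwardness in the Python case, here is a rough
sketch of recording where named modules were found at configure time.
The helpers `module_search_dir` and `record_modules` are hypothetical.
The sketch covers only the modules named explicitly; their transitive
dependencies would have to be recorded as well.

```python
import importlib.util
import os


def module_search_dir(name):
    """Return the directory that must be on sys.path to import `name`,
    or None for built-in or frozen modules that have no file on disk."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ModuleNotFoundError(f"No module named {name!r}")
    if spec.submodule_search_locations:
        # A package: the search directory is the parent of the package dir.
        package_dir = list(spec.submodule_search_locations)[0]
        return os.path.dirname(package_dir)
    if spec.origin in (None, "built-in", "frozen"):
        return None
    return os.path.dirname(spec.origin)


def record_modules(names):
    """Map each module name to its search directory at configure time."""
    return {name: module_search_dir(name) for name in names}


# The standard-library package "json" stands in for a real dependency.
print(record_modules(["json"]))
```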

The Guix workflow language solves this problem by depending on Guix for
software deployment.  For PiGx we picked Snakemake early on and it does
not have a software deployment solution (it expects to either run inside
a suitable environment that the user provides or to have access to
pre-built Singularity application bundles).  I don’t like to treat
pipelines like some sort of collection of scripts that must be invoked
in a suitable environment.  I like to see pipelines as big software
packages that should know about the environment they need, that can be
configured like regular tools, and thus only require the packager to
assemble the environment, not the end-user.

--
Ricardo
