From: Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: help-guix@gnu.org
Subject: Re: leaky pipelines and Guix
Date: Sat, 5 Mar 2016 12:05:28 +0100
Message-ID: <87k2lh9ocn.fsf@mdc-berlin.de>
In-Reply-To: <87egci10tz.fsf@gnu.org>


Ludovic Courtès <ludo@gnu.org> writes:

> Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> skribis:
>
>> So, how could I package something like that?  Is packaging the wrong
>> approach here and should I really just be using “guix environment” to
>> prepare a suitable environment, run the pipeline, and then exit?
>
> Maybe packages are the wrong abstraction here?
>
> IIUC, a pipeline is really a function that takes inputs and produces
> output(s).  So it can definitely be modeled as a derivation.

This may be true and the basic abstraction you propose seems correct and
useful, but I was talking about existing pipelines.  They have already
been implemented using snakemake or make to keep track of individual
steps, etc.  My primary concern is making these pipelines work, not
rewriting them.

For a particularly nasty pipeline I’m using a separate profile that
holds only the pipeline dependencies.  Users build the pipeline glue code
themselves by whatever means they deem appropriate and then load the
profile in a subshell:

    bash
    source /path/to/pipeline-profile/etc/profile
    # run the pipeline here
    exit
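
The same idea also works non-interactively with a parenthesised
subshell, so that whatever the profile exports disappears once the
pipeline has finished.  This is only a minimal sketch: the profile
directory below is a stand-in created on the fly, not a real Guix
profile, and the pipeline invocation is left as a comment.

```shell
# Stand-in for /path/to/pipeline-profile (an assumption for this
# sketch); a real Guix profile ships its own etc/profile.
profile=$(mktemp -d)
mkdir -p "$profile/etc" "$profile/bin"
printf 'export PATH="%s/bin:$PATH"\n' "$profile" > "$profile/etc/profile"

outer_path=$PATH
(
  # Everything sourced here stays inside the subshell.
  . "$profile/etc/profile"
  # run the pipeline here
  printf '%s\n' "$PATH" > "$profile/inner_path"
)
# Back in the parent shell, PATH is still "$outer_path".
```

Inside the parentheses the tools from the profile come first in PATH;
outside them the environment is unchanged.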

I think that these existing bio pipelines should really be treated more
like configurable packages.  For a pipeline that we’re currently working
on I’m involved in making sure that it can be packaged and installed.
We chose to use autoconf to substitute tool placeholders at configure
time.  This allows us to install the pipeline easily with Guix as we can
treat tools just as regular runtime dependencies.  At configure time the
actual full paths to the needed tools are injected into the sources, so
we don’t need to propagate anything or make assumptions about PATH.
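
To illustrate the substitution step, here is a rough sketch of what
happens to one placeholder at configure time.  It is not our actual
build system: the file name run.sh.in and the placeholder @TOOL@ are
made up for the example, `sh` stands in for a real tool so the snippet
runs anywhere, and plain sed does the job that autoconf’s generated
config.status does for variables declared with AC_SUBST.

```shell
# Hypothetical template as it might ship in the pipeline sources.
# The quoted heredoc keeps "$@" and @TOOL@ literal.
workdir=$(mktemp -d)
cat > "$workdir/run.sh.in" <<'EOF'
#!/bin/sh
exec "@TOOL@" "$@"
EOF

# "Configure time": resolve the absolute path of the tool once.
# `sh` is only a placeholder for a real pipeline dependency.
tool_path=$(command -v sh)

# Replace the placeholder with the full path in the generated script.
sed "s|@TOOL@|$tool_path|g" "$workdir/run.sh.in" > "$workdir/run.sh"
chmod +x "$workdir/run.sh"
```

The generated run.sh calls the tool by its absolute path, so it works
no matter what PATH contains when the pipeline is run.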

Many problems with bio pipelines stem from the fact that they are not
treated as first-class applications, so they often don’t have a wrapper
script, nor a configuration or installation step.  I think the easiest
way to fix this is to encourage the design of pipelines as real software
packages rather than distributing bland Makefiles/snakefiles and
assuming that the user will arrange for a suitable environment.

~~ Ricardo

