From: Ricardo Wurmus
Subject: Re: leaky pipelines and Guix
Date: Sat, 5 Mar 2016 12:05:28 +0100
Message-ID: <87k2lh9ocn.fsf@mdc-berlin.de>
References: <87egci10tz.fsf@gnu.org>
In-Reply-To: <87egci10tz.fsf@gnu.org>
To: Ludovic Courtès
Cc: help-guix@gnu.org

Ludovic Courtès writes:

> Ricardo Wurmus skribis:
>
>> So, how could I package something like that?  Is packaging the wrong
>> approach here and should I really just be using “guix environment” to
>> prepare a suitable environment, run the pipeline, and then exit?
>
> Maybe packages are the wrong abstraction here?
>
> IIUC, a pipeline is really a function that takes inputs and produces
> output(s).  So it can definitely be modeled as a derivation.

This may be true and the basic abstraction you propose seems correct and
useful, but I was talking about existing pipelines.  They have already
been implemented using snakemake or make to keep track of individual
steps, etc.  My primary concern is with making these pipelines work, not
to rewrite them.

For a particularly nasty pipeline I’m just using a separate profile just
for the pipeline dependencies.
Users build the pipeline glue code themselves by whatever means they
deem appropriate and then load the profile in a subshell:

    bash
    source /path/to/pipeline-profile/etc/profile
    # run the pipeline here
    exit

I think that these existing bio pipelines should really be treated more
like configurable packages.  For a pipeline that we’re currently working
on I’m involved in making sure that it can be packaged and installed.
We chose to use autoconf to substitute tool placeholders at configure
time.  This allows us to install the pipeline easily with Guix as we can
treat tools just as regular runtime dependencies.  At configure time the
actual full paths to the needed tools are injected into the sources, so
we don’t need to propagate anything and make assumptions about PATH.

Many problems with bio pipelines stem from the fact that they are not
treated as first-class applications, so they often don’t have a wrapper
script, nor a configuration or installation step.  I think the easiest
way to fix this is to encourage the design of pipelines as real software
packages rather than distributing bland Makefiles/snakefiles and
assuming that the user will arrange for a suitable environment.

~~ Ricardo
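
P.S.  A minimal sketch of the configure-time substitution described
above, written in plain shell rather than autoconf so that it can be run
on its own.  With autoconf one would typically locate the tool with
AC_PATH_PROG and let AC_CONFIG_FILES replace the @VAR@ placeholder; here
"command -v" and sed stand in for both.  The file names (run.sh.in,
run.sh) and the use of "sort" as a stand-in for a real bioinformatics
tool are assumptions made for illustration only, not part of the
pipeline mentioned above.

```shell
#!/bin/sh
# Sketch only: mimics what autoconf's @VAR@ substitution does at
# configure time.  File names and the choice of "sort" are hypothetical.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# The pipeline step as shipped: the tool is a placeholder, not a bare name.
cat > "$workdir/run.sh.in" <<'EOF'
#!/bin/sh
exec @SORT@ "$@"
EOF

# "configure" step: resolve the tool to its full file name once.
# (Assumes the resolved file name contains no "|" character.)
SORT=$(command -v sort)

# Inject the full file name into the generated script.
sed "s|@SORT@|$SORT|g" "$workdir/run.sh.in" > "$workdir/run.sh"

# Run the generated script with a useless PATH: it still finds the tool,
# because nothing is looked up by name at run time.
result=$(printf 'b\na\n' | env -i PATH=/nonexistent /bin/sh "$workdir/run.sh")
printf '%s\n' "$result"
```

Because the generated script refers to the tool by its full file name,
the tool can be treated as an ordinary runtime dependency and nothing
needs to be propagated into the user's profile.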