From: Ricardo Wurmus
Subject: leaky pipelines and Guix
Date: Tue, 9 Feb 2016 12:25:23 +0100
To: help-guix@gnu.org

Hi Guix,

although I’m comfortable packaging software for Guix I’m still not confident enough to tackle bioinformatics pipelines, as they don’t play well with isolation.

In the pipeline that I’m currently working on as a consultant packager I’m trying to treat the pipeline itself as a first-class package. This means that the locations of the tools it calls out to are all configurable (thanks to auto{conf,make}) and they certainly do not have to be in the PATH. This allows us to install this pipeline (and the tools it needs) easily alongside other variants of tools. The pipeline is also not just a bare Makefile but has a wrapper script to provide a simplified user interface.

However, most pipelines do not take this approach. Pipelines are often designed as glue (written in Perl, or as Makefiles) that ties together other tools in some particular order. These tools are usually assumed to be available on the PATH. Pipelines aren’t treated enough like packages (which will be the subject of an inflammatory, click-baiting blog post that I’m working on), so they usually come without a configuration script to override implicit assumptions.

In the context of Guix this means that each pipeline would need its very own isolated environment where the PATH is set up to contain the locations of all tools that are needed at runtime (that’s what I mean by “leaky”). As many pipelines do not come with wrapper scripts there is no easy way to sneakily set up such an environment for the duration of the run.

So, how could I package something like that? Is packaging the wrong approach here and should I really just be using “guix environment” to prepare a suitable environment, run the pipeline, and then exit? I know that there is work in progress to support profile-based environments that would make this a little more feasible (as environments wouldn’t be as volatile as they are now), but it seems somewhat inconvenient.

This pains me especially in the context of multi-user systems.
I can easily create a shared profile containing the tools that are needed by a particular pipeline and provide a wrapper script that does something like this (pseudo-code):

    bash
      eval $(guix package --search-paths=prefix)
      do things
    exit

But I wouldn’t want to do this for individual users, letting them install all tools in a separate profile to run that pipeline, run something like the above to set up the environment, then fetch the tarball containing the glue code that constitutes the pipeline (because we wouldn’t offer a Guix package for something that’s not usable without so much effort to prepare an environment first), unpack it and then run it inside that environment.

To me this seems to be in the twilight zone between proper packaging and a use-case for “guix environment”. I welcome any comments about how to approach this and I’m looking forward to the many practical tricks that I must have overlooked.

~~ Ricardo
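
The shared-profile wrapper described in the message could look roughly like the sketch below. It is not the author's script. The function name `run_in_profile` is hypothetical, and the sketch assumes that the shared profile provides an `etc/profile` file exporting its search paths relative to `GUIX_PROFILE`; where no such file exists it falls back to prepending the profile's `bin` directory to PATH. Everything runs in a subshell, so the caller's environment is left untouched once the pipeline exits.

```shell
#!/bin/sh
# Sketch only: run a command with a shared profile's tools in the environment.
# The profile path is supplied by the caller; nothing here is taken from an
# actual pipeline.

# run_in_profile PROFILE COMMAND [ARGS...]
# The function body is a subshell, so none of the environment changes
# leak back into the calling shell.
run_in_profile() (
    profile=$1
    shift

    if [ -f "$profile/etc/profile" ]; then
        # Assumption: the profile ships an etc/profile file that exports
        # its search paths relative to GUIX_PROFILE.
        GUIX_PROFILE=$profile
        . "$profile/etc/profile"
    else
        # Fallback: only make the profile's executables visible.
        PATH="$profile/bin${PATH:+:$PATH}"
        export PATH
    fi

    # Replace the subshell with the pipeline command.
    exec "$@"
)
```

A site-wide wrapper would end with a single call such as `run_in_profile /path/to/shared/profile make -f pipeline.mk "$@"`, where both the profile path and the `make` invocation are placeholders for whatever the pipeline actually needs. This covers the shared-profile case only; it does not remove the per-user effort that the message objects to.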