From: zimoun
Date: Mon, 10 Feb 2020 00:33:28 +0100
Subject: Re: Preparing for a new release
To: Ricardo Wurmus
Cc: gwl-devel@gnu.org
In-Reply-To: <877e0vq5iy.fsf@elephly.net>
References: <87h801p818.fsf@elephly.net> <87h800u84z.fsf@kyleam.com> <87a75spx3i.fsf@elephly.net> <877e0vq5iy.fsf@elephly.net>

Hi,

Thank you for pushing forward.

On Sun, 9 Feb 2020 at 14:01, Ricardo Wurmus wrote:

> * inputs and outputs are not validated
>
> When a process declares that it produces an output, but then doesn’t do
> that, the next process will fail with a nasty error message.  This is
> especially nasty when using containerization as the error is about
> failing to map the input into the container.
>
> Processes should automatically validate their inputs and outputs.
> Since inputs and outputs could technically be something other than
> files I’m not sure exactly how to do this.

From my understanding, Snakemake uses only files as inputs/outputs.
But I do not know what happens if 'rule 1' claims to output 'file',
'rule 2' declares 'file' as its input, and 'rule 1' then never produces
'file'; I guess 'rule 2' is simply never run.  Hmm, I need to check...

> @Roel: can you give an example of inputs / outputs that are not files?

Since UNIX treats everything as a file... ;-)

> I remember that you suggested that inputs might be database queries,
> for example.  I wonder if we should mark inputs and outputs with
> types, so that the GWL can know if something is supposed to be a file
> or something else.  …just how would “something else” be used in a
> process?

I cannot imagine anything other than files.  But often I would like
these files to stay in memory instead of being written to disk and then
re-read by the next process.  When the IO is slow (e.g., a full disk
using RAID6), it kills the workflow performance.

> * It’s not possible to select more than one tagged item
>
> In my test workflow I’m generating a bunch of inputs by mapping over
> an argument list.  Now the problem is that I can’t select all of these
> inputs easily in a code snippet.  With the syntax we have I can only
> select the first item following a tag.
>
> To address this I’ve extended the accessor syntax, so this works now:
>
> --8<---------------cut here---------------start------------->8---
> process frobnicate
>   packages "frobnicator"
>   inputs
>     . genome: "hg19.fa"
>     . samples: "a" "b" "c"
>   outputs
>     . "result"
>   # {
>     frobnicate -g {{inputs:genome}} --files {{inputs::samples}} > {{outputs}}
>   }
> --8<---------------cut here---------------end--------------->8---
>
> Note how {{inputs::samples}} is substituted with “a b c”.  With just a
> single colon it would be just “a”.  Single colon = single item; double
> colon = more than one item.

I am confused by the syntax.  How would I select only the second
element, "b"?
Naively, I would be tempted to write {{inputs:samples:2}} or
{{inputs::samples:2}}.

All the best,
simon
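
P.S. To make the accessor question concrete, here is a minimal Python
sketch of the rules as described in the quoted text.  It is not GWL code
and says nothing about how the GWL implements substitution: the `expand`
function, the `INPUTS` dictionary, and the regular expression are all
made up for illustration.  The trailing `:N` index (assumed 1-based) is
only the naive syntax suggested above, not an existing GWL feature.

```python
import re

# Hypothetical stand-in for a process's tagged inputs:
# each tag maps to the list of values that follow it.
INPUTS = {"genome": ["hg19.fa"], "samples": ["a", "b", "c"]}

# Matches {{inputs:tag}}, {{inputs::tag}} and, as an assumed extension,
# an optional 1-based index such as {{inputs::tag:2}}.
PLACEHOLDER = re.compile(r"\{\{inputs(::?)(\w+)(?::(\d+))?\}\}")


def expand(template, inputs):
    """Replace accessor placeholders in `template` with values from `inputs`."""

    def substitute(match):
        colons, tag, index = match.groups()
        values = inputs[tag]
        if index is not None:
            # Assumed extension: pick exactly one item by 1-based position.
            return values[int(index) - 1]
        if colons == "::":
            # Double colon: all items following the tag, space-separated.
            return " ".join(values)
        # Single colon: only the first item following the tag.
        return values[0]

    return PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    print(expand("frobnicate -g {{inputs:genome}} --files {{inputs::samples}}", INPUTS))
    print(expand("second sample: {{inputs::samples:2}}", INPUTS))
```

With such an index, both {{inputs:samples:2}} and {{inputs::samples:2}}
would select "b" in this sketch; whether the single-colon or the
double-colon form should accept an index is exactly the open question.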