From: Kyle Meyer
Subject: Re: [PATCH] workflow: Consider unspecified free inputs when checking cache.
Date: Tue, 25 Jun 2019 21:31:07 -0400
Message-ID: <87ef3hceuc.fsf@kyleam.com>
To: zimoun
Cc: Ricardo Wurmus, gwl-devel@gnu.org

Hi simon,

zimoun writes:

> Hi,
>
> On Tue, 25 Jun 2019 at 06:30, Kyle Meyer wrote:
>>
>> Ricardo Wurmus writes:
>
>> > I’m not sure if we should keep picking
>> > inputs from the environment silently and by default, but your patch is
>> > anyway more correct than what we had before.
>>
>> Hmm, for my use case, taking free inputs from the file system based on
>> the current directory is the only method that I'm actually interested in
>> (i.e. I don't see myself having any use for --input). Perhaps my
>> thinking is too shaped by make/snakemake, and I don't fully grasp the
>> approach GWL is trying to take.
>
> I am not sure to fully understand the issue and all the recent changes.

No need to understand the patch :]  The quoted discussion is pretty
tangential to the issue resolved by the patch.  It's only related in
that it's about unspecified free inputs.

> One idea of GWL is to have a functional workflow: the
> multi-composition of functions/processes. And free inputs
> are--say--the argument of this function. Therefore, if you have many
> samples and you need to apply the same workflow, then you just apply
> the function to each sample with --input. I mean it is my
> understanding of the approach. Maybe I have wrong...

Thanks, that matches my understanding.

To expand on my original comment that --input doesn't fit my desired use
case: I want the workflow to be fully specified.
For me, this translates to

  (1) all scripts that aren't included in Guix packages are tracked in
      a Git repository

  (2) software dependencies are completely specified via a manifest
      and Guix inferiors

  (3) all input data files are tracked in the repository (via
      git-annex)

  (4) the workflow itself is tracked in the repository

  (5) the workflow unambiguously specifies how to generate each output.

I don't want to have to document that "to generate THIS, you should run
`guix workflow --input=THIS=THAT'".  I want to make a blanket statement
that "you can run ` TARGET' to generate any desired output, which will
do whatever is needed to make that happen".  (I know this isn't
currently possible with GWL.)

Of course, none of the above is to say that this is _the_ way to do it;
I'm just elaborating on what my use case and focus are.