From: Ricardo Wurmus
To: zimoun
Cc: Guix Devel, gwl-devel@gnu.org
Subject: Re: [GWL] (random) next steps?
Date: Fri, 21 Dec 2018 21:06:08 +0100
Message-ID: <87va3mr6fl.fsf@elephly.net>
References: <874lbfxijq.fsf@elephly.net>

Hi Simon,

>> > 6.
>> > The graph of dependencies between the processes/units/rules is written
>> > by hand.  What should be the best strategy to capture it ?  By files "à
>> > la" Snakemake ?  Other ?
>>
>> The GWL currently does not use the input information provided by the
>> user in the data-inputs field.  For the content addressible store we
>> will need to change this.  The GWL will then be able of determining that
>> data-inputs are in fact the outputs of other processes.
>
> Hum? nice but how?
> I mean, the graph cannot be deduced and it needs to be written by
> hand, somehow.  Isn't it?

We can connect a graph by joining the inputs of one process with the
outputs of another.

With a content-addressed store we would run processes in isolation and
map the declared data inputs into the environment.  Instead of working
on the global namespace of the shared file system, we can learn from
Guix and strictly control the execution environment.  After a process
has run to completion, only files that were declared as outputs end up
in the content-addressed store.

A process could declare outputs like this:

    (define the-process
      (process
        (name 'foo)
        (outputs
         '((result "path/to/result.bam")
           (meta   "path/to/meta.xml")))))

Other processes can then access these files with:

    (output the-process 'result)

i.e. the file corresponding to the declared output “result” of the
process named by the variable “the-process”.

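To make the joining of inputs and outputs a bit more concrete, here is
a rough sketch in plain Guile.  It is not the GWL API: the <proc>
record, "producer-of" and "dependency-edges" are hypothetical names
made up for illustration, and it assumes that inputs and outputs can be
matched simply by comparing file names as strings.

```scheme
;; Sketch only: plain Guile, not the GWL API.  All names are hypothetical.
(use-modules (srfi srfi-1)    ; any, find, filter-map, append-map
             (srfi srfi-9))   ; define-record-type

(define-record-type <proc>
  (make-proc name inputs outputs)
  proc?
  (name    proc-name)      ; symbol
  (inputs  proc-inputs)    ; list of file names the process reads
  (outputs proc-outputs))  ; list of (label file-name), as in the example above

(define (producer-of file procs)
  "Return the process in PROCS that declares FILE as an output, or #f."
  (find (lambda (p)
          (any (lambda (out) (string=? (cadr out) file))
               (proc-outputs p)))
        procs))

(define (dependency-edges procs)
  "Return a list of (CONSUMER . PRODUCER) name pairs, one for each pair
of processes where CONSUMER reads a file that PRODUCER declares as output."
  (delete-duplicates
   (append-map
    (lambda (consumer)
      (filter-map
       (lambda (file)
         (let ((producer (producer-of file procs)))
           (and producer
                (not (eq? producer consumer))
                (cons (proc-name consumer) (proc-name producer)))))
       (proc-inputs consumer)))
    procs)))

;; Two made-up processes: "report" reads both files that "align" writes.
(define align
  (make-proc 'align
             '("reads.fastq")
             '((result "path/to/result.bam")
               (meta   "path/to/meta.xml"))))

(define report
  (make-proc 'report
             '("path/to/result.bam" "path/to/meta.xml")
             '((summary "report.html"))))

(dependency-edges (list align report))
;; => ((report . align))
```

Inputs that no process produces ("reads.fastq" here) yield no edge, so
they would have to come from outside the workflow.
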
The question here is just how far we want to take the idea of “content
addressed”: is it enough to take the hash of all inputs, or do we need
to compute the output hash, which could be much more expensive?

--
Ricardo