From: Ricardo Wurmus
To: zimoun
Cc: Guix Devel
Subject: Re: Use guix to distribute data & reproducible (data) science
Date: Sat, 10 Feb 2018 00:17:33 +0100
Message-ID: <8760757nxu.fsf@elephly.net>

zimoun writes:

> I do not know so much, but a idea should to write a workflow: you
> fetch the data, you clean them and you check by hashing that the
> result is the expected one. Only the softwares used to do that are in
> the store. The input and output data are not, but your workflow check
> that they are the expected ones.
> However, it depends on what we are calling 'cleaning' because some
> algorithms are not deterministic.
>
> Hum? I do not know if there is some mechanism in GWL to check the hash
> of the `data-inputs' field.

In the GWL the data-inputs field is not special as far as any of the
current execution engines are concerned.
It’s up to the execution engine to implement recency checks or identity
checks as there is not one size that fits all inputs.

-- 
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
https://elephly.net
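
To illustrate the difference between an identity check and a recency
check on an input file, here is a minimal sketch in Python. It is not
GWL code and does not reflect how any existing execution engine works;
the function names `identity_check` and `recency_check` are
hypothetical and chosen only for this example.

```python
import hashlib
import os


def identity_check(path, expected_sha256):
    """Return True if the file's SHA-256 digest equals the expected hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large data files need not fit in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256


def recency_check(input_path, output_path):
    """Return True if the output exists and is at least as new as the input."""
    if not os.path.exists(output_path):
        return False
    return os.path.getmtime(output_path) >= os.path.getmtime(input_path)
```

An identity check reads the whole file but detects any change in
content; a recency check is cheap but only compares timestamps, which
is why which one is appropriate depends on the input.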