From: ludovic.courtes@inria.fr (Ludovic Courtès)
Subject: Re: Use guix to distribute data & reproducible (data) science
Date: Fri, 09 Feb 2018 18:13:44 +0100
Message-ID: <87mv0ixf07.fsf@gnu.org>
In-Reply-To: <365e13248634ac1e26cf6678611d550d@hypermove.net> (Amirouche Boubekki's message of "Fri, 09 Feb 2018 17:32:32 +0100")
To: Amirouche Boubekki
Cc: Guix Devel

Hi!

Amirouche Boubekki wrote:

> tl;dr: Distribution of data and software seems similar.
> Data is more and more important in software and reproducible
> science. Data science ecosystem lacks resources sharing.
> I think guix can help.

I think some of us, especially Guix-HPC folks, are convinced of the
usefulness of Guix as one of the tools in the reproducible science
toolchain (that was one of the themes of my FOSDEM talk).  :-)

Now, whether Guix is the right tool to distribute data, I don’t know.
Distributing large amounts of data is a job in itself, and the store
isn’t designed for that.  It could quickly become a bottleneck.  That’s
one of the reasons why the Guix Workflow Language (GWL) does not store
scientific data in the store itself.

I think data should probably be stored and distributed out-of-band using
appropriate storage mechanisms.

Ludo’.