unofficial mirror of gwl-devel@gnu.org
From: Konrad Hinsen <konrad.hinsen@fastmail.net>
To: zimoun <zimon.toutoune@gmail.com>, gwl-devel@gnu.org
Subject: Re: Managing data files in workflows
Date: Fri, 26 Mar 2021 13:46:42 +0100
Message-ID: <m1h7kyhykt.fsf@ordinateur-de-catherine--konrad.home>
In-Reply-To: <86v99ebdnz.fsf@gmail.com>

Hi Simon,

> It does not answer your concrete question but instead opens a new
> one. :-)

And a good one!

>  1. how to deal with data?
>  2. on what basis does the workflow trigger a recomputation?

Number 2 is what I had in mind with my question, and I still wonder
how GWL handles it now, or will handle it in the near future.
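
To make question 2 concrete, here is a minimal Python sketch of one
common approach: rerun a step only when the content hash of an input
differs from the hash recorded after the last successful run. This is
an illustration only, not a description of what GWL does. The names
file_digest, needs_recompute, record_inputs and the JSON stamp file are
hypothetical.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def needs_recompute(inputs, stamp_file):
    """True if any input's content hash differs from the recorded one."""
    stamp = Path(stamp_file)
    # No stamp file means the step has never run, so everything is "new".
    recorded = json.loads(stamp.read_text()) if stamp.exists() else {}
    current = {str(p): file_digest(p) for p in inputs}
    return current != recorded


def record_inputs(inputs, stamp_file):
    """Store the current input hashes after a successful run."""
    current = {str(p): file_digest(p) for p in inputs}
    Path(stamp_file).write_text(json.dumps(current, indent=2, sort_keys=True))
```

Hashing content rather than comparing timestamps means that touching a
file without changing it does not trigger a rerun, at the cost of
reading every input once per check.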

> There are 3 levels:
>
>  1- the methods for fetching: URL (http or ftp), Git, IPFS, Dat, etc.
>  2- the record representing a “data”
>  3- how to effectively locally store and deal with it
>
> And whether it makes sense for a ’data’ to be an input of a
> ’package’, and conversely, is an open question.
>
> A long time ago, we discussed a “backend” with the GWL folks, such as
> git-annex or something else. From my understanding, it would answer #3,
> and the protocols git-annex accepts would answer #1.  That leaves #2.

Perhaps a good first step is to actually use git-annex for big files,
and then integrate it more and more into Guix and/or GWL. Multiple
backends will certainly be required in the near future, because data
storage is not yet sufficiently standardized to pick one specific
technology. So why not profit from the work already done in git-annex?

One answer to #2 would be to use a git repository managed by git-annex,
with remotes pointing to the repositories that actually hold the data.
Not very elegant, but as a first step, why not?
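
As a rough illustration of how a git-annex repository could serve as
the record for a piece of data, the Python sketch below reads the
git-annex key of a file without reading its content. It assumes locked
annexed files, which git-annex represents as symlinks into
.git/annex/objects; unlocked files are not handled. With a hashing
backend such as SHA256E, the key name contains a checksum of the
content, so a changed key means changed data. The function names
annex_key and inputs_changed are hypothetical, and nothing here is an
existing Guix or GWL interface.

```python
import os


def annex_key(path):
    """Return the git-annex key a locked annexed file points to, or None."""
    if not os.path.islink(path):
        return None  # regular file, or an unlocked annexed file
    target = os.readlink(path)
    if ".git/annex/objects/" not in target.replace(os.sep, "/"):
        return None  # a symlink, but not one into the annex
    # The last path component of the link target is the key itself.
    return os.path.basename(target)


def inputs_changed(paths, recorded_keys):
    """Compare current keys with those recorded at the last run."""
    current = {p: annex_key(p) for p in paths}
    return current != recorded_keys
```

Because only the symlink is read, this check also works when the file
content is not present locally and still lives on a remote.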

Cheers,
  Konrad.

