From: Pjotr Prins <pjotr2019@thebird.nl>
To: zimoun <zimon.toutoune@gmail.com>
Cc: gwl-devel@gnu.org, Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de>
Subject: Re: Next steps for the GWL
Date: Thu, 6 Jun 2019 09:06:59 -0500
Message-ID: <20190606140659.wcwhc3bcfdkaznjw@thebird.nl>
In-Reply-To: <20190606134404.g3synqkzopqab3ue@thebird.nl>

We should also assess this article on IPFS network data replication:

https://labs.eleks.com/2019/03/ipfs-network-data-replication.html

On Thu, Jun 06, 2019 at 08:44:04AM -0500, Pjotr Prins wrote:
> IPFS is meant for data sharing and reproducibility. It also allows for
> private networks, which is rather important.
> 
> Scalability of IPFS is a concern, so either we cache using IPFS or we
> have some other caching mechanism.
> 
> git-annex is too much of a hack in my book. It also does not scale
> that well.
> 
> Pj.
> 
> On Thu, Jun 06, 2019 at 12:55:52PM +0200, zimoun wrote:
> > Hi,
> > 
> > On Thu, 6 Jun 2019 at 12:11, Ricardo Wurmus
> > <ricardo.wurmus@mdc-berlin.de> wrote:
> > 
> > > > One of the things I'd love to do
> > > > with GWL is to make it play well with git-annex, something that would
> > > > almost certainly be too specific for GWL itself.  For example
> > > >
> > > >   * Make data caching git-annex aware.  When deciding to recompute data
> > > >     files, GWL avoids computing the hash of data files, using scripts as
> > > >     the cheaper proxy, as you described in 87womnnjg0.fsf@elephly.net.
> > > >     But if the user is tracking data files with git-annex, getting the
> > > >     hash of data files becomes less expensive because we can ask
> > > >     git-annex for the hash it has already computed.
> > > >
> > > >   * Support getting annex data files on demand (i.e. 'git annex get') if
> > > >     they are needed as inputs.
> > >
> > > I wonder what the protocol should look like.  Should a workflow
> > > explicitly request a “git annex” file or should it be up to the person
> > > running the workflow, i.e. when “git annex” has been configured to be
> > > the cache backend it would simply look up the declared input/output
> > > files there?
> > >
> > > I suppose the answers would equally apply to using IPFS as a cache.
> > 
> > I agree that a mechanism such as `git-annex` would be nice.
> > But is it not a means to implement the CAS that we previously discussed?
> > 
> > I fully agree with the features and their description. Totally cool!
> > However, I am a bit reluctant about `git-annex` because it requires a
> > Haskell compiler, which is very far from being bootstrappable. I am
> > aware of Ricardo's attempt, which is AFAIK the only one. See [1] for
> > an explanation by a Haskeller.
> > 
> > My opinion: GWL should stay on the path of reproducibility,
> > end-to-end. So `git-annex` should be a transitional step, for as long
> > as the Haskell bootstrap is not solved, as a means for the CAS (cache).
> > I would find it more elegant to use the "data-oriented IPFS": IPLD [2].
> > 
> > 
> > [1] https://www.joachim-breitner.de/blog/748-Thoughts_on_bootstrapping_GHC
> > [2] https://ipld.io/
> > 
> > 
> > All the best,
> > simon
> > 
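
For reference, Kyle's first point quoted above (asking git-annex for the
hash it has already computed instead of re-hashing the data file) could
be sketched roughly as follows. This is only an illustration and not part
of GWL: the names AnnexKey, parse_annex_key and annex_content_hash are
made up for this example. It assumes the file is tracked with one of
git-annex's hash-based key backends (for example SHA256E) and relies on
`git annex lookupkey`, which prints the key of an annexed file.

```python
import re
import subprocess
from typing import NamedTuple, Optional


class AnnexKey(NamedTuple):
    """Hypothetical container for the parts of a git-annex key."""
    backend: str          # e.g. "SHA256E"
    size: Optional[int]   # value of the -s field, if present
    name: str             # digest for hash backends (extension stripped)


# Keys look like BACKEND[-sSIZE][-mMTIME][-SCHUNK-CNUM]--NAME
_KEY_RE = re.compile(
    r"^(?P<backend>[A-Z0-9_]+)(?P<fields>(?:-[smSC]\d+)*)--(?P<name>.+)$"
)

# Assumption: only these backend families carry a content digest;
# backends such as WORM or URL do not.
_HASH_PREFIXES = ("SHA", "MD5", "BLAKE2", "SKEIN")


def parse_annex_key(key: str) -> AnnexKey:
    """Split a git-annex key into backend, size and name."""
    match = _KEY_RE.match(key.strip())
    if match is None:
        raise ValueError(f"not a git-annex key: {key!r}")
    backend = match.group("backend")
    size = None
    for field in match.group("fields").split("-"):
        if field.startswith("s"):
            size = int(field[1:])
    name = match.group("name")
    # "E" variants of hash backends append the file extension to the digest.
    if backend.startswith(_HASH_PREFIXES) and backend.endswith("E"):
        name = name.split(".", 1)[0]
    return AnnexKey(backend, size, name)


def annex_content_hash(path: str) -> Optional[str]:
    """Return the digest git-annex recorded for PATH, or None.

    None means the file is not annexed or its key backend is not
    hash-based; the caller would then fall back to hashing the file.
    """
    result = subprocess.run(
        ["git", "annex", "lookupkey", path],
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        return None
    key = parse_annex_key(result.stdout)
    if not key.backend.startswith(_HASH_PREFIXES):
        return None
    return key.name
```

A cache lookup could call annex_content_hash() first and only compute
the hash itself when that returns None. Whether the workflow or the
person running it selects this behaviour is the open protocol question
raised above.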


