unofficial mirror of gwl-devel@gnu.org
 help / color / mirror / Atom feed
From: Ricardo Wurmus <rekado@elephly.net>
To: gwl-devel@gnu.org
Subject: fastest way to run a GWL workflow on AWS
Date: Mon, 06 Jul 2020 11:52:04 +0200	[thread overview]
Message-ID: <87a70dkm2j.fsf@elephly.net> (raw)

Hey there,

I had an idea to get a GWL workflow to run on AWS without having to mess
with Docker and all that.  GWL should do all of these steps when AWS
deployment is requested:

* create an EFS file system.  Why EFS?  Unlike EBS (block storage) and
  S3, one EFS can be accessed simultaneously by different virtual
  machines (EC2 instances).

* sync the closure of the complete workflow (all steps) to EFS.  (How?
  We could either mount EFS locally or use an EC2 instance as a simple
  “cloud” file server.) This differs from how other workflow languages
  handle things.  Other workflow systems have one or more Docker
  image(s) per step (sometimes one Docker image per application), which
  means that there is some duplication and setup time as Docker images
  are downloaded from a registry (where they have previously been
  uploaded).  Since Guix knows the closure of all programs in the
  workflow we can simply upload all of it.

* create as many EC2 instances as requested (respecting optional
  grouping information to keep any set of processes on the same node)
  and mount the EFS over NFS.  The OS on the EC2 instances doesn’t
  matter.

* run the processes on the EC2 instances (parallelizing as far as
  possible) and have them write to a unique directory on the shared
  EFS.  The rest of the EFS is used as a read-only store to access all
  the Guix-built tools.

The EFS either stays active or its contents are archived to S3 upon
completion to reduce storage costs.

The last two steps are obviously a little vague; we’d need to add a few
knobs to allow users to easily tweak resource allocation beyond what the
GWL currently offers (e.g. grouping, mapping resources to EC2 machine
sizes.)  To implement the last step we would need to keep track of step
execution.  We can already do this, but the complication here is to
effect execution on the remote nodes.

I also want to add optional reporting for each step.  There could be a
service that listens to events and each step would trigger events to
indicate start and stop of each step.  This could trivially be
visualized, so that users can keep track of the state of the workflow
and its processes, e.g. with a pretty web interface.

For the deployment to AWS (and eventual tear-down) we can use Guile AWS.

None of this depends on “guix deploy”, which I think would be a poor fit
as these virtual machines are meant to be disposable.

Another thing I’d like to point out is that this doesn’t lead users down
the AWS rabbit hole.  We don’t use specialized AWS services like their
cluster/grid service, nor do we use Docker, nor ECS, etc.  We use the
simplest resource types: plain EC2 and boring NFS storage.  This looks
like one of the simplest remote execution models, which could just as
well be used with other remote compute providers (or even a custom
server farm).

One of the open issues is to figure out how to sync the /gnu/store items
to EFS efficiently.  I don’t really want to shell out to rsync, nor do I
want to use “guix copy”, which would require a remote installation of
Guix.  Perhaps rsync would be the easiest route for a rough first
draft.  It would also be nice if we could deduplicate our slice of the
store to cut down on unnecessary traffic to AWS.

What do you think about this?

-- 
Ricardo


             reply	other threads:[~2020-07-06  9:52 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-06  9:52 Ricardo Wurmus [this message]
2020-07-16  0:08 ` zimoun
2020-07-16 15:17   ` Ricardo Wurmus

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.guixwl.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87a70dkm2j.fsf@elephly.net \
    --to=rekado@elephly.net \
    --cc=gwl-devel@gnu.org \
    --subject='Re: fastest way to run a GWL workflow on AWS' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).