From mboxrd@z Thu Jan  1 00:00:00 1970
From: Chris Marusich <cmmarusich@gmail.com>
Subject: Guix orchestration notes
Date: Sun, 18 Feb 2018 05:37:00 +0100
Message-ID: <871shigbgz.fsf@gmail.com>
Mime-Version: 1.0
Content-Type: multipart/signed; boundary="==-=-=";
	micalg=pgp-sha256; protocol="application/pgp-signature"
Return-path: <guix-devel-bounces+gcggd-guix-devel=m.gmane.org@gnu.org>
Received: from eggs.gnu.org ([2001:4830:134:3::10]:34012)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <cmmarusich@gmail.com>) id 1enGit-0002BZ-Dc
	for guix-devel@gnu.org; Sat, 17 Feb 2018 23:37:13 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <cmmarusich@gmail.com>) id 1enGiq-0004ej-4x
	for guix-devel@gnu.org; Sat, 17 Feb 2018 23:37:11 -0500
Received: from mail-pl0-x230.google.com ([2607:f8b0:400e:c01::230]:35453)
	by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16)
	(Exim 4.71) (envelope-from <cmmarusich@gmail.com>)
	id 1enGip-0004eH-Ow
	for guix-devel@gnu.org; Sat, 17 Feb 2018 23:37:08 -0500
Received: by mail-pl0-x230.google.com with SMTP id bb3so3876612plb.2
	for <guix-devel@gnu.org>; Sat, 17 Feb 2018 20:37:07 -0800 (PST)
Received: from garuda.local (c-24-18-253-84.hsd1.wa.comcast.net.
	[24.18.253.84]) by smtp.gmail.com with ESMTPSA id
	v2sm50514501pfe.171.2018.02.17.20.37.03 for <guix-devel@gnu.org>
	(version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256);
	Sat, 17 Feb 2018 20:37:04 -0800 (PST)
List-Id: "Development of GNU Guix and the GNU System distribution."
	<guix-devel.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/guix-devel>,
	<mailto:guix-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/archive/html/guix-devel/>
List-Post: <mailto:guix-devel@gnu.org>
List-Help: <mailto:guix-devel-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/guix-devel>,
	<mailto:guix-devel-request@gnu.org?subject=subscribe>
Errors-To: guix-devel-bounces+gcggd-guix-devel=m.gmane.org@gnu.org
Sender: "Guix-devel" <guix-devel-bounces+gcggd-guix-devel=m.gmane.org@gnu.org>
To: guix-devel@gnu.org

--==-=-=
Content-Type: multipart/mixed; boundary="=-=-="

--=-=-=
Content-Type: text/plain
Content-Transfer-Encoding: quoted-printable

Hi,

At FOSDEM, some of us discussed "orchestration", which means something
like "how to deploy services to more than 1 machine in a coordinated
fashion".  Many people contributed to the discussion.  I took notes.
I've thought about this more, reviewed the "wip-deploy" branch, and
written up my thoughts in the attached file.

It's a rough sketch of ideas, biased with my own opinions and
experience, but I think it's good enough to share.  I invite you to
improve upon it: share your own thoughts, hack some code together, and
just iterate on this a bit, so we can make some progress.

Hopefully, we can agree on a basic design and get a working proof of
concept.  Then we can make a blog post about it!

=2D-=20
Chris

--=-=-=
Content-Type: text/x-org
Content-Disposition: attachment; filename=guix-orchestration.org
Content-Transfer-Encoding: quoted-printable

* end goal: blog post, 1 webserver 1 db
We should make something showable and hackable that people can start
to play with.

Proposed goal: show one deployment that upgrades services across two
or more servers.  Maybe a web server and a backing database.  As a
bonus, show features like: "guix gc --requisites" showing ALL runtime
dependencies, even those that are cross-server dependencies.  This
would help illustrate how Guix knows the EXACT closure of what is
deployed to all hosts.
* Feature: service validation scripts
can be part of a service's start-up script for now

could split into multiple pre-defined hooks - which would encourage
consistency

* Feature: service upgrade/roll-back
make a transient "upgrade service" which manages the transition.
Maybe, a service author can define a procedure like the following:

  (define* (switch-to-service #:key (old #f) new)
    ;; Stop the old service, start the new one, transfer state,
    ;; perform validation, etc.  The body is a placeholder.
    *unspecified*)

it depends on both the old and the new service, so it can run things
in the right order (e.g., run old service's shutdown scripts, then run
new service's start-up script)

it also provides a place to implement more complex logic (e.g.,
checkpoint or back up the database on old service, then restore on new
service, or even migrate while both services are still running, then
shut down old service once migration is complete).

I'm not sure how to wire this into the normal system activation logic
(maybe extend the activation-service-type)?  But to correctly manage
upgrades/rollbacks in general (e.g., database upgrades), it seems
crucial that we be able to control both the old and the new service
during the transition.
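
As a rough sketch of what such a transition could look like (not a
design), assume the service author supplies three procedures that each
take a service: stop, start and validate.  Those keyword arguments and
the name switch-with-rollback are made up for illustration; nothing
like this exists in Guix today.

#+BEGIN_SRC scheme
(define* (switch-with-rollback #:key (old #f) new
                               stop start validate)
  "Stop OLD (if any), start NEW and validate it.  If validation
fails, stop NEW, restart OLD and return 'rolled-back; otherwise
return 'switched.  STOP, START and VALIDATE are hypothetical
procedures that each take a service."
  (when old (stop old))
  (start new)
  (cond ((validate new) 'switched)
        (else
         (stop new)
         (when old (start old))
         'rolled-back)))

;; Example with stand-in procedures that only print what they do.
(define (say verb)
  (lambda (service)
    (format #t "~a ~a~%" verb service)))

(switch-with-rollback #:old 'db-v1 #:new 'db-v2
                      #:stop (say "stop")
                      #:start (say "start")
                      #:validate (lambda (service) #t))
#+END_SRC

Because the procedure sees both the old and the new service, state
transfer or a database migration could be slotted in between the stop
and start steps.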

* Feature: <site-configuration>

We need something like a <site-configuration> (maybe
<distributed-system>?) which defines all the services and
operating-systems used in a distributed system.  The key here is that
it involves multiple hosts.

In this configuration, we define services and compose them together.

In it, we define host classes (not individual hosts!).  A host class
can be thought of as a role - web server, database, etc - but
conceptually it represents "a configuration shared by a group of
hosts".  A host class does NOT contain a list of hosts, and it does
NOT represent a single host, since the details of how many hosts exist
in a host class are not relevant to the abstract structure of the
distributed system's service dependencies.

Basically, a host class corresponds to an operating system
configuration, in that every host in the host class will use the same
operating system configuration.

We need a way to assign services to host classes.  I.e., we need to
choose what host classes a service will run on.  This could be
automatic, or this could be manually defined in the site configuration
directly.

Services depend on one another.  Currently we have a nice system for
describing service dependencies within a single host.  How do we
extend this so that a service can depend on a service on another host?
I'm not sure.  The most obvious solution is coarse-grained: provide a
way to declare that host class A depends on host class B.  But I can
see that this might be problematic, too: what if a service A in host
class X depends on a service B in host class Y, but service B also
depends on a service Z in host class X?  If dependencies go from host
class to host class, then this is a circular dependency, which might
be a problem; if dependencies go from service to service, it's not
circular, so it clearly isn't a problem.  But I'm not sure how to
extend our service dependency model across hosts, so unless you have a
better idea, I think it's reasonable to start with the coarse-grained
model of one host class depending on another host class.
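
To make the coarse-grained model concrete, here is a minimal sketch,
not an implementation.  It assumes the dependencies between host
classes are given as an association list; the names deployment-order
and %example-dependencies are made up for illustration.  A depth-first
walk yields an order in which each host class comes after the classes
it depends on, and it rejects circular dependencies:

#+BEGIN_SRC scheme
(use-modules (srfi srfi-1))

;; Hypothetical input: each entry is (CLASS DEPENDENCY ...).
(define %example-dependencies
  '((web-server database)
    (database)))

(define (deployment-order dependencies)
  "Return the host class names in DEPENDENCIES ordered so that every
class comes after the classes it depends on.  Raise an error when the
dependencies are circular."
  (define (visit class done in-progress)
    (cond ((memq class done) done)
          ((memq class in-progress)
           (error "circular host class dependency:" class))
          (else
           (let ((deps (or (assq-ref dependencies class) '())))
             (cons class
                   (fold (lambda (dep done)
                           (visit dep done (cons class in-progress)))
                         done
                         deps))))))
  (reverse
   (fold (lambda (entry done) (visit (car entry) done '()))
         '()
         dependencies)))

(deployment-order %example-dependencies)
#+END_SRC

With the example data, the database class is ordered before the
web-server class.  The same walk would work on service-to-service
dependencies if we later find a way to express those across hosts.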

To update your site/distributed system, you would run 'guix deploy' or
similar, a push-model tool for coordinating the deployment.  This tool
looks up the mapping from host class to actual nodes at runtime.  The
interface and implementation for how to do the query are decoupled;
e.g., maybe it's a procedure like host-class->hosts, so we can get the
mapping from a local config file if we want, or we can get the mapping
by querying AWS APIs to get all the instances with a specific tag.
Decoupling the implementation from the interface for host-class->hosts
makes it easy to accommodate both use cases.
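
A minimal sketch of that decoupling, assuming a "resolver" is simply a
procedure from a host class name to a list of host names.  The names
static-resolver and %hosts-by-class and the example host names are
hypothetical; a resolver that queries a cloud provider would have the
same shape but is not shown here.

#+BEGIN_SRC scheme
;; A resolver is any procedure that takes a host class name and
;; returns a list of host names.  All names below are hypothetical.

(define (static-resolver table)
  "Return a resolver backed by TABLE, an association list that maps a
host class name to a list of host names, e.g. read from a local file."
  (lambda (host-class)
    (or (assq-ref table host-class) '())))

(define %hosts-by-class
  '((web-server "web1.example.org" "web2.example.org")
    (database "db1.example.org")))

(define host-class->hosts
  (static-resolver %hosts-by-class))

;; The deployment tool only ever calls the resolver, so swapping in
;; one that queries a cloud provider needs no other change.
(host-class->hosts 'web-server)
#+END_SRC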

* What about a pull model for deployment?

In some large systems, a pull model for deployment is nice.  A push
model might be pretty slow if you're deploying to thousands of hosts.
But a pull model has drawbacks, too.  For example, if 1000 hosts are
pulling from the same place, they can brown it out (this is a common
"death spiral" scenario when e.g. an entire site loses and then
regains power).

For now, a push model seems easier and more convenient.  If the
deployments are coordinated by a single process, it makes it easier to
enforce a policy for deployments (e.g., do this set of hosts before
the others).  I'll bet it wouldn't be too hard to take a working push
model and spread it out into more of a pull-like model later, also.
For example, if you have a dozen sites, maybe you have one node that
performs automatic deployment (with a push model) to the site, and
this master node periodically polls some global configuration to get
its latest site configuration.  You get the idea.

For now, a push model seems preferable.

* baby steps: first, let's orchestrate a static fleet of existing guixsd se=
rvers
later, can add ability to deploy to fleet where some servers don't run guix.
later, can add ability to change the size of the fleet when deploying.
these features will probably follow naturally, if the first design is good.

* note: different guix versions on different machines is not necessarily a =
problem
because guix daemon is also included as a service, so it and its
dependencies are exactly described when running 'guix deploy'.  If two
host classes run a slightly different version of guix daemon, it
shouldn't be a problem (assuming, of course, that those two versions
understand how to speak to one another as needed, e.g. for
offloading).

* Feature: allow operator to specify deployment policy
e.g., "one datacenter at a time", "no more than 30% hosts at once", etc.
for starters, we can have a very simple policy

but we should plan on making it easy for operators to define and use
their own policies, similar to how they can define their own
implementation for how to query the host-class->hosts mapping
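
One way to picture a pluggable policy, as a sketch only: treat a
policy as a procedure that takes the list of hosts and returns a list
of batches to deploy one after another.  The names at-most-fraction
and one-at-a-time are made up for illustration.

#+BEGIN_SRC scheme
(use-modules (srfi srfi-1))

(define (at-most-fraction fraction)
  "Return a policy that deploys to at most FRACTION of the hosts at
once (but always at least one host).  A policy is a procedure that
takes a list of hosts and returns a list of batches."
  (lambda (hosts)
    (let ((size (max 1 (inexact->exact
                        (floor (* fraction (length hosts)))))))
      (let loop ((hosts hosts) (batches '()))
        (if (null? hosts)
            (reverse batches)
            (let ((n (min size (length hosts))))
              (loop (drop hosts n)
                    (cons (take hosts n) batches))))))))

;; The very simple starter policy: one host per batch.
(define one-at-a-time
  (lambda (hosts) (map list hosts)))

;; "No more than 30% of hosts at once" for ten hypothetical hosts
;; gives batches of three, three, three and one.
((at-most-fraction 3/10) (iota 10))
#+END_SRC

An operator-defined policy such as "one datacenter at a time" would be
just another procedure of the same shape.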

* observation: references today for service don't include dependent services
maybe they should (optionally)?
maybe they shouldn't?

What does a service "depend" on at runtime?  Obviously, it needs its
program files to run.  But it might need other services.  The former
can be described by following store references, but what about the
latter?  Dependencies between services are not reflected in the Guix
store.  Thus, some aspect of "service dependencies" (actually, a large
part of it) lives outside the store, and outside the "purely
functional software deployment model".

* thoughts on david's existing work: wip-deploy
This is the "wip-deploy" branch.

Does not build, but we can probably fix that easily.

The way he decoupled "platforms" (like "servers running on AWS",
"servers running on local VMs using qemu", etc.) from the various
actions is a very nice design that enables us to perform the same
action on different platforms without caring about the tiny details
that are different between the platforms.

seems like 1 "machine" =3D 1 os config =3D 1 host, we need to decouple
this.  Things like ip addr are overly specific, also.  Solve this with
host classes.

dependencies between services in different os configs are not modeled
need to model them to ensure dependencies are not broken during deployment.

no support (yet!) for deployments to existing, running machines.  it
only supports initial deployment.  however, if we make the 'provision'
procedure return state that describes the deployed machines, and we
save it, perhaps we could later look it up and do another deployment
on existing, running machines.  but wait, we already have this: the
deployment object describes the machines, so we would just need a
mechanism to look up their current state: are they running or not? if
they are not running, provision them; if they are running, update
(reconfigure) them.
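
The decision described above fits in a few lines.  This is a sketch
under the assumption that the platform can supply three procedures,
here called running?, provision and reconfigure; deploy-machine and
its keyword arguments are hypothetical, not the wip-deploy interface.

#+BEGIN_SRC scheme
(define* (deploy-machine machine
                         #:key running? provision reconfigure)
  "Bring MACHINE up to date.  RUNNING?, PROVISION and RECONFIGURE are
procedures supplied by the platform; all three are hypothetical here."
  (if (running? machine)
      (reconfigure machine)
      (provision machine)))

;; Stand-in procedures: this machine is not running yet, so it is
;; provisioned rather than reconfigured.
(deploy-machine 'db1
                #:running? (lambda (m) #f)
                #:provision (lambda (m) (list 'provisioned m))
                #:reconfigure (lambda (m) (list 'reconfigured m)))
#+END_SRC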

the existing code seems to anticipate this, but there is no
implementation yet for 'reconfigure'.  I'll bet that if we can figure
out a good solution for the section "service upgrade/roll-back" above,
we can use it here.

** Do 'build-deployment' and 'provision-deployment' use the same derivation?
Yes, and it's important that they do both use the same derivation,
otherwise there is no point in providing these as separate steps.

build-deployment sends the result of=20
machine-os-for-platform
(which transforms the os via virtualized-operating-system-os)
to
operating-system-derivation

provision-deployment sends the result of=20
machine-os-for-platform
(which transforms the os via virtualized-operating-system-os)
to
the platform's provision procedure, which here sends it off to:
system-qemu-image/shared-store-script
there, the os gets sent through=20
virtualized-operating-system-os
(for a second time; this procedure is apparently idempotent, which is
good because it means the os used for building is the same os used for
provisioning)
and then the os gets sent to
operating-system-derivation

so, 'build-deployment' and 'provision-deployment' wind up using the
same derivation.  this might not have been true if
virtualized-operating-system-os were not idempotent.  In general, it
seems important to ensure that the derivation used by
'build-deployment' and the derivation used by 'provision-deployment'
are in fact identical; otherwise, there isn't much point in building
before provisioning.
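
The property we rely on can be stated as a small check.  This is only
a sketch: idempotent-on? and add-marker are made-up names, and the
transformation is a stand-in that works on a plain list.  For the real
procedure, comparing the resulting derivations would probably be the
more reliable test, since operating-system records may not compare
well with 'equal?'.

#+BEGIN_SRC scheme
(define (idempotent-on? transform value)
  "Return #t if applying TRANSFORM twice to VALUE gives the same
result as applying it once, as compared with 'equal?'."
  (let ((once (transform value)))
    (equal? once (transform once))))

;; Stand-in for the real transformation: add a marker once.
(define (add-marker services)
  (if (memq 'virtualized services)
      services
      (cons 'virtualized services)))

(idempotent-on? add-marker '(ssh nginx))
#+END_SRC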
* problem: what if i don't want to deploy all services on a host at once?
in a large distributed system, it is often undesirable to deploy all
updates to all services at the same time.  How can we limit this?  A
task like "reconfiguring" a GuixSD server will currently upgrade all
services...

I'm thinking it would be nice to have something like the --upgrade and
=2D-do-not-upgrade options for "guix package", but for service
deployment.

This could be worked around by running just one primary service per
host.  Then you don't care if all the services on the host get
upgraded, since effectively you're treating it all like a single
service.
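
To illustrate the kind of selection I have in mind, here is a sketch
that filters service names by regexp, loosely modeled on how the
"guix package" options take regexps.  The procedure
services-to-upgrade and its keyword arguments are hypothetical, and it
only picks names; it says nothing about how a partial reconfigure
would actually be carried out.

#+BEGIN_SRC scheme
(use-modules (ice-9 regex) (srfi srfi-1))

(define* (services-to-upgrade names #:key
                              (upgrade '(""))
                              (do-not-upgrade '()))
  "Return the elements of NAMES, a list of strings, that match one of
the regexps in UPGRADE and none of the regexps in DO-NOT-UPGRADE."
  (define (matches-any? regexps name)
    (any (lambda (rx) (string-match rx name)) regexps))
  (filter (lambda (name)
            (and (matches-any? upgrade name)
                 (not (matches-any? do-not-upgrade name))))
          names))

;; Upgrade everything except services whose name starts with
;; "postgres"; the service names are made up.
(services-to-upgrade '("nginx" "postgresql" "openssh")
                     #:do-not-upgrade '("^postgres"))
#+END_SRC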


--=-=-=--

--==-=-=
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEEy/WXVcvn5+/vGD+x3UCaFdgiRp0FAlqJAuwACgkQ3UCaFdgi
Rp3pTg//Y45o2uwMmDAhd5ZdZ3/qSRL9Qp4EnZRZPDeRcTEMmVoTyNnOWcoHmzlx
Jcyi5ChUTSjU6uf2sekPKMCRcIQc8T4Io+2PVIhg7BuxGDJ156HglrwFJsfQzu9G
4lZY6ElvPpzftDhhF+JAGv+1J3BFdRMqJFev9ZsukhYkBGEbHFdKeuKBvl45iYsf
q21PAZxgOQFfXza4mpbqiPDkn2tIVp56uQenD2MR7/OJjhTMMjEVHsPd2yzuWs68
SgCGgvETLJsTCU96bkQB6/od6RaTsnfe6lr7QnSEHMC/C1/G7K9W/6JZGnSduzjk
ie9oFdbasdoPRBsUE0EWJgrrphbANez4ZmU/pvHGyRdJ4mOa8p8rNZ5mT89LJCTn
G6Jal+5Yj9e96a2KjnxrUHZ1+Ufc12yEfiKYP2+TeqghuVTU74D88HOUYgedIId4
HyYZcHRrjVGV3TfNSH74b213k0BSiHbL+WLeWPmq7eZE89kh3Mg/erzDYNxDTXwk
15VtkhIzE4n5VjvVXv3XnFdgeMU+xN6m+UpLWTACCJJFmD8laOJg+S4O8ps3pvLb
NgXQpoHylwUIhV/kpjLlm45rV9dh65OEQCeECL5t+bj7ZHr5+RzBcYDCKupgC4zX
Kgc6CIS2PJAQfsOVId9FSX5NrmWR2X4UJSanTfYoNYGSHN0Ij94=
=gBEj
-----END PGP SIGNATURE-----
--==-=-=--