Hi Dave,

"Thompson, David" <dthompson2@worcester.edu> writes:

> Agreed. Also this should be done in parallel eventually because
> updating 24 machines serially is silly.

Good idea. Do we have a Guix-specific API for parallelism, or should I
look to the Guile manual section on Futures?
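
To make the question concrete, here is a rough sketch of the futures
approach, written in Python rather than Guile purely for illustration.
'deploy_machine' and the host names are hypothetical placeholders, not
anything that exists in Guix today:

```python
from concurrent.futures import ThreadPoolExecutor


def deploy_machine(host):
    """Hypothetical stand-in for deploying to a single machine."""
    return f"deployed {host}"


def deploy_all(hosts, max_workers=8):
    """Deploy to every host concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(deploy_machine, hosts))


# 24 made-up hosts, matching the number mentioned above.
hosts = [f"node{i:02d}.example.org" for i in range(24)]
results = deploy_all(hosts)
```

Whatever we end up using in Guile, I would expect the shape to be about
the same: start one task per machine, then collect the results.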

> This does bring up the question of what to do upon failure. Other
> deployment systems that I've worked with (mainly AWS CodeDeploy)
> provide some options. First, the user can specify what it means for a
> deploy to succeed. Does it have to successfully deploy to each of them
> or should it allow some amount of failure? Then, upon failure, the
> user can specify whether or not a rollback should happen.

Would it make sense to let the failure behavior be defined on a
machine-by-machine basis? For example, by adding some sort of
'behavior-on-failure' field to the 'machine' type?

Other options that come to mind are a '--behavior-on-failure' option for
'guix deploy', or a more elaborate type for deployment specifications,
maybe with a 'machines->deployment' function, similar to
'packages->manifest'.
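
A per-machine field and a deployment-wide default could also be
combined. Here is a minimal sketch of that data model, again in Python
for illustration only. The three policy names and every identifier
below are assumptions of mine, not an existing or proposed Guix
interface:

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class OnFailure(Enum):
    """Hypothetical failure policies; the names are illustrative only."""
    CONTINUE = "continue"
    ABORT = "abort"
    ROLLBACK = "rollback"


@dataclass
class Machine:
    host: str
    # None means "fall back to the deployment-wide default".
    behavior_on_failure: Optional[OnFailure] = None


@dataclass
class Deployment:
    machines: List[Machine]
    default_on_failure: OnFailure = OnFailure.CONTINUE

    def policy_for(self, machine):
        """Return the machine's own policy, or the default if it has none."""
        if machine.behavior_on_failure is not None:
            return machine.behavior_on_failure
        return self.default_on_failure


def machines_to_deployment(machines, default_on_failure=OnFailure.CONTINUE):
    """Rough analogue of the 'machines->deployment' idea."""
    return Deployment(list(machines), default_on_failure)
```

With this shape, a '--behavior-on-failure' option would only need to set
the default, and individual machines could still override it.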

> My personal preference for default behavior right now is to update
> everything possible and print out a report so users can see what
> failed, but I think ultimately we'll need to provide more options.

Agreed.
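
For that default, I imagine something along these lines (a Python
sketch for illustration only; 'deploy_machine' is a hypothetical
stand-in that fails for one host so the report has something to show):

```python
def deploy_machine(host):
    """Hypothetical stand-in; pretends that hosts ending in '3' fail."""
    if host.endswith("3"):
        raise RuntimeError("connection refused")


def deploy_best_effort(hosts):
    """Try every host, recording failures instead of stopping early."""
    succeeded, failed = [], {}
    for host in hosts:
        try:
            deploy_machine(host)
            succeeded.append(host)
        except Exception as err:
            failed[host] = str(err)
    return succeeded, failed


def format_report(succeeded, failed):
    """Build a summary line followed by one line per failed host."""
    lines = [f"{len(succeeded)} succeeded, {len(failed)} failed"]
    for host, reason in sorted(failed.items()):
        lines.append(f"  {host}: {reason}")
    return "\n".join(lines)


succeeded, failed = deploy_best_effort(["node1", "node2", "node3"])
report = format_report(succeeded, failed)
```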

> We need to also keep in mind that in-place updates to machines is just
> a primitive initial use-case. Things will get really fun when we get
> to blue-green deployments in cloud environments because "rollback"
> takes on a whole new meaning. :)

Glad to hear that you're already thinking about this sort of thing :)

Regards,
Jakob