Hi Dave,

"Thompson, David" writes:

> Agreed. Also this should be done in parallel eventually because
> updating 24 machines serially is silly.

Good idea. Do we have a Guix-specific API for parallelism, or should I
look to the Guile manual section on Futures?

> This does bring up the question of what to do upon failure. Other
> deployment systems that I've worked with (mainly AWS CodeDeploy)
> provide some options. First, the user can specify what it means for a
> deploy to succeed. Does it have to successfully deploy to each of them
> or should it allow some amount of failure? Then, upon failure, the
> user can specify whether or not a rollback should happen.

Would it make sense to allow failure to be defined on a
machine-by-machine basis? For example, adding some sort of
'behavior-on-failure' field to the 'machine' type?

Other options that come to mind are a '--behavior-on-failure' option
for 'guix deploy', or to use a more elaborate type for deployment
specifications, maybe having a 'machines->deployment' function, similar
to 'packages->manifest'.

> My personal preference for default behavior right now is to update
> everything possible and print out a report so users can see what
> failed, but I think ultimately we'll need to provide more options.

Agreed.

> We need to also keep in mind that in-place updates to machines is just
> a primitive initial use-case. Things will get really fun when we get
> to blue-green deployments in cloud environments because "rollback"
> takes on a whole new meaning. :)

Glad to hear that you're already thinking about this sort of thing :)

Regards,
Jakob