all messages for Guix-related lists mirrored at yhetil.org
* watchdog triggered auto-rollback
@ 2024-05-28  0:52 Nathan Dehnel
  2024-05-28  1:46 ` Richard Sent
  0 siblings, 1 reply; 7+ messages in thread
From: Nathan Dehnel @ 2024-05-28  0:52 UTC (permalink / raw)
  To: raingloom, guix-devel

>Would others find this useful?
I would 100% use this.

>Where in the stack would this be solved?
I think there are two places for rollbacks, with two different purposes:

GRUB: https://www.gnu.org/software/grub/manual/grub/html_node/fallback.html
GRUB supports falling back to another boot entry if the machine fails
to boot. This could be integrated with Guix so that GRUB falls back to a
previous Guix System generation. This covers the case of "we can't
start a watchdog service because the system won't boot".

SSH watchdog: a Shepherd service that tests SSH connectivity and, if
the test fails, executes "guix system roll-back && reboot". SSH access
is a rough approximation for "the system is working", since the kernel,
init, and all manner of networking services (DHCP, DNS, VPN, etc.) must
work for SSH to work. And if SSH works, it gives the user a means to
fix their system anyway.
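
A minimal sketch of the watchdog logic described above, written in
Python rather than as an actual Shepherd service. The function names
(ssh_reachable, watchdog, roll_back_and_reboot), the retry count, and
the delay are assumptions for illustration; only the "guix system
roll-back" and "reboot" commands come from the message. Note that
probing 127.0.0.1 only shows that sshd answers locally; a real check
would probably need to involve another host to exercise the network
path.

```python
import socket
import subprocess
import time


def ssh_reachable(host="127.0.0.1", port=22, timeout=5.0):
    """Return True if something on host:port greets us like an SSH server."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            # SSH servers open with an identification string "SSH-...".
            return conn.recv(255).startswith(b"SSH-")
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def watchdog(check, on_failure, attempts=5, delay=60.0, sleep=time.sleep):
    """Call check() up to `attempts` times; run on_failure() if all fail.

    Returns True if a check succeeded, False if on_failure() was invoked.
    """
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(delay)  # give slow services time to come up
    on_failure()
    return False


def roll_back_and_reboot():
    """Run the command pair quoted in the message (requires root)."""
    subprocess.run(["guix", "system", "roll-back"], check=True)
    subprocess.run(["reboot"], check=True)
```

A service could call watchdog(ssh_reachable, roll_back_and_reboot) some
time after boot. Passing check, on_failure, and sleep as parameters
keeps the retry logic testable without touching the system.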


* watchdog triggered auto-rollback
@ 2024-05-24 12:50 raingloom
  2024-05-25 16:58 ` Richard Sent
  0 siblings, 1 reply; 7+ messages in thread
From: raingloom @ 2024-05-24 12:50 UTC (permalink / raw)
  To: Guix Devel

Since I've been experimenting with a foolproof unikernel-based static
website deployment lately, I realized I should write down this idea I've
been chewing on for a while:

It would be very nice to have automatic system rollbacks when certain
things break.
One example is a broken SSH config that makes a machine unreachable.
Local testing is useful, but as in the SSH example, some issues only
become apparent when you deploy to the production environment.

Would others find this useful?  Where in the stack would this be solved?
Could we, for example, catch an issue in the init system and still
perform a rollback?  Or if not a full rollback, then at least a reboot
into the previous config?  (And if that is also broken, then the one
before, and so on.)
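
The "previous config, then the one before" chain can be sketched as a
small Python function. This is only an illustration of the ordering,
not anything Guix provides: the name fallback_order is hypothetical,
and it assumes generations are identified by integers where a higher
number means a newer generation.

```python
def fallback_order(generations, current):
    """Return the generations older than `current`, newest first.

    Assumption: generations are integers and higher means newer.
    """
    return sorted((g for g in generations if g < current), reverse=True)


# Generation 4 was deleted at some point; 6 is the one that just broke.
print(fallback_order([1, 2, 3, 5, 6], current=6))  # prints [5, 3, 2, 1]
```

Each entry would be tried in turn until one passes whatever health
check is in use.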

Obviously there are a lot of edge cases and potential bugs in this
mechanism as well.  Sticking with the SSH example, rolling back to an
older generation whose authorized keys differ would also leave the
machine unreachable via SSH.



end of thread, other threads:[~2024-05-29 20:42 UTC | newest]

Thread overview: 7+ messages
2024-05-28  0:52 watchdog triggered auto-rollback Nathan Dehnel
2024-05-28  1:46 ` Richard Sent
2024-05-28 10:10   ` Attila Lendvai
2024-05-29 13:45     ` Richard Sent
2024-05-29 20:41       ` Attila Lendvai
  -- strict thread matches above, loose matches on Subject: below --
2024-05-24 12:50 raingloom
2024-05-25 16:58 ` Richard Sent
