* Implementing guix system rollback / switch-generation
From: Chris Marusich @ 2016-06-05 22:29 UTC
  To: guix-devel

Hi,

Reliable system-level rollback is a great feature of GuixSD.  Currently,
I've heard [1] that rollback can be performed using a variety of
methods.  I've experimented with these methods and come to the
conclusion that they should be improved.  However, before making any
changes, I wanted to get a second opinion.

Basically, I think there should be a command like "guix system
roll-back" which does the opposite of "guix system reconfigure
config.scm".  The rollback command should not require an operating
system configuration file.  I think this would be better than the
current rollback methods.  What do you think?

As I understand it, when I invoke "guix system reconfigure config.scm",
the following things happen (in guix/scripts/system.scm):

* A new system is built in the store (e.g.,
  /gnu/store/1qkdd4glvqjqf7azqniis7abkf7v1lng-system).

* A new symlink is created in /var/guix/profiles (e.g.,
  /var/guix/profiles/system-7-link), which points to the system in the
  store.

* The /var/guix/profiles/system symlink is updated to point to the new
  symlink.

* The new system's activation script is run, and the Shepherd services
  are upgraded.

* A new grub.cfg is copied to /boot/grub/grub.cfg.  This new grub.cfg
  updates the default menu entry to point to the new system, and it adds
  the previous system to the list of previous generations.
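
The two symlink steps in the list above can be sketched in Python, purely
for illustration.  This is not the code in guix/scripts/system.scm: the
function names are hypothetical, and it is an assumption that the
"system" symlink holds a relative target such as "system-7-link".  The
flip creates a temporary symlink and renames it over the old one, so
there is no moment at which the link is missing.

```python
import os


def switch_symlink(link, target):
    """Make LINK point at TARGET by renaming a temporary symlink over it."""
    tmp = link + ".new"
    if os.path.lexists(tmp):
        os.remove(tmp)          # leftover from an interrupted earlier run
    os.symlink(target, tmp)
    os.replace(tmp, link)       # rename(2) replaces the link itself


def register_generation(profiles_dir, number, store_item):
    """Create system-NUMBER-link for STORE_ITEM and make it the current one."""
    name = f"system-{number}-link"
    gen_link = os.path.join(profiles_dir, name)
    os.symlink(store_item, gen_link)
    # Assumption: the "system" link stores a relative target.
    switch_symlink(os.path.join(profiles_dir, "system"), name)
    return gen_link
```

Calling register_generation on a scratch directory with generation 7
leaves "system" pointing at "system-7-link"; the activation and grub.cfg
steps are outside the scope of this sketch.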

At this point, you are running a new system.  I am not sure if you need
to reboot to truly upgrade, but judging by the implementation of the
upgrade-shepherd-services procedure, it seems like you might sometimes
need to reboot to "really" complete the upgrade process.

This is great!  To upgrade your system, you just invoke one command,
and you're running the new system.  In addition, the next time you boot,
the new system will automatically be selected and booted by GRUB.  Nice.

Rollback should be just as easy.  You should be able to invoke a single
command: "guix system roll-back".  When that command succeeds, you
should be running the previous system, and the next time you boot, the
previous system should be automatically selected and booted by GRUB.  It
should be the opposite of "reconfigure".

However, there is currently no way to roll back in this manner.  Here
are the ways we can supposedly roll back, with commentary about each
one:

* Manually update the /var/guix/profiles/system symlink to point to a
  previous generation, e.g., /var/guix/profiles/system-5-link.

If you do this, it seems the running system will not actually be rolled
back.  The system pointed to by /run/current-system will not change,
since it points to a store path rather than to the
/var/guix/profiles/system symlink.  In addition, because
/boot/grub/grub.cfg has not been modified, when you reboot, GRUB will
still automatically select and boot the system from which you wanted to
roll back.  Also, the previous system's activation script will not be
run, and the shepherd services will not be downgraded.  With that in
mind, I'm not really sure what manually flipping this symlink actually
accomplishes.
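
To make the mismatch concrete, here is a small Python check (a sketch
only; the function name is hypothetical and nothing like it is claimed
to exist in Guix).  It resolves both symlinks and compares the results:
after a manual flip of the profile symlink, the two no longer agree,
because /run/current-system still resolves to the store item that was
activated.

```python
import os


def running_system_matches_profile(current_system, profile):
    """Return True if both symlinks resolve to the same store item.

    CURRENT_SYSTEM plays the role of /run/current-system and PROFILE
    the role of /var/guix/profiles/system.
    """
    return os.path.realpath(current_system) == os.path.realpath(profile)
```

On a real machine one would call it with "/run/current-system" and
"/var/guix/profiles/system"; a False result means the profile has been
switched without the running system following it.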

* Use the emacs interface to update that symlink.

This is the same as above, except that emacs does the symlink flip for
you.  It does not seem to actually roll back the system.  The elisp code
seems to assume that flipping the symlink is sufficient.  Maybe that
works for changing user profile generations, but it seems to be
insufficient for system generation changes.

* While booting, at the GRUB menu, manually select a previous
  generation.

This is the only method that seems to actually put the system back into
a previous state, and it appears to do so correctly.  I've used this
method to save my systems a few times from bad upgrades, which is great.
The only downside to this method is that it doesn't modify the grub.cfg
file: until you reconfigure your system again, every time you restart,
GRUB will continue to automatically select and boot the newer,
problematic system from which you wanted to roll back.

I'd like to implement a rollback mechanism that lets you run a single
command which actually does the opposite of "guix system reconfigure".
I've looked at the code in guix/scripts/system.scm, and at first blush
this seems like a straightforward task.  However, I've noticed that the
switch-to-system procedure requires an operating system configuration
file to do things like get the system's activation script and generate
the new grub.cfg file.  Surely a user should not have to specify the
previous system's operating system configuration file on the command
line in order to roll back the system.  Is it possible to obtain these
things (e.g., the service activation script, the previous system's
grub.cfg) without requiring the user to supply the previous system's
operating system configuration file on the command line?  If I can
figure out how to do that, implementing the rest of the command should
be pretty easy, since it will do basically the same kind of things that
reconfigure does.

More generally, are people satisfied with the way system rollback is
currently implemented in GuixSD?  Do you think that the mechanism I'm
proposing is a bad idea?  I'd hate to try to implement something that
nobody else thinks is needed.

Footnotes: 
[1]  https://lists.gnu.org/archive/html/help-guix/2016-04/msg00001.html

-- 
Chris


* Re: Implementing guix system rollback / switch-generation
From: Ludovic Courtès @ 2016-06-06  8:10 UTC
  To: Chris Marusich; +Cc: guix-devel

Hi Chris,

Chris Marusich <cmmarusich@gmail.com> skribis:

> Basically, I think there should be a command like "guix system
> roll-back" which does the opposite of "guix system reconfigure
> config.scm".  The rollback command should not require an operating
> system configuration file.  I think this would be better than the
> current rollback methods.  What do you think?

It would definitely be a welcome addition!  The refactoring that was
done to add ‘guix system list-generations’ was in the same spirit.

> As I understand it, when I invoke "guix system reconfigure config.scm",
> the following things happen (in guix/scripts/system.scm):
>
> * A new system is built in the store (e.g.,
>   /gnu/store/1qkdd4glvqjqf7azqniis7abkf7v1lng-system).
>
> * A new symlink is created in /var/guix/profiles (e.g.,
>   /var/guix/profiles/system-7-link), which points to the system in the
>   store.
>
> * The /var/guix/profiles/system symlink is updated to point to the new
>   symlink.
>
> * The new system's activation script is run, and the Shepherd services
>   are upgraded.
>
> * A new grub.cfg is copied to /boot/grub/grub.cfg.  This new grub.cfg
>   updates the default menu entry to point to the new system, and it adds
>   the previous system to the list of previous generations.
>
> At this point, you are running a new system.  I am not sure if you need
> to reboot to truly upgrade, but judging by the implementation of the
> upgrade-shepherd-services procedure, it seems like you might sometimes
> need to reboot to "really" complete the upgrade process.

Indeed.  Service upgrade is conservative, so it will not stop running
services, because it cannot know if this is something the user wants and
if it is safe; only services that are not currently running are loaded
and started upon reconfigure.

  http://bugs.gnu.org/22039

To improve on this, I think it should be possible for Shepherd services
to provide an ‘upgrade’ method in addition to start/stop, for those
services that can be upgraded with no downtime (there aren’t so many of
them, though.)

> * Manually update the /var/guix/profiles/system symlink to point to a
>   previous generation, e.g., /var/guix/profiles/system-5-link.
>
> If you do this, it seems the running system will not actually be rolled
> back.  The system pointed to by /run/current-system will not change,
> since it points to a store path rather than to the
> /var/guix/profiles/system symlink.

‘guix system roll-back’ could explicitly update /run/current-system to
point to the old system.

> In addition, because /boot/grub/grub.cfg has not been modified, when
> you reboot, GRUB will still automatically select and boot the system
> from which you wanted to roll back.  Also, the previous system's
> activation script will not be run, and the shepherd services will not
> be downgraded.  With that in mind, I'm not really sure what manually
> flipping this symlink actually accomplishes.

Not much, indeed.

The problem with system rollback, as you’ve seen, is that we lack
information about the old system, such as what its activation script is
and what its Shepherd services are.

We could add the activation script’s file name to the ‘parameters’ file
that you see in the result of ‘guix system build foo.scm’.  But it would
be hard to add a forward-compatible yet complete description of the
Shepherd services there.

> * Use the emacs interface to update that symlink.
>
> This is the same as above, except that emacs does the symlink flip for
> you.  It does not seem to actually roll back the system.  The elisp code
> seems to assume that flipping the symlink is sufficient.  Maybe that
> works for changing user profile generations, but it seems to be
> insufficient for system generation changes.
>
> * While booting, at the GRUB menu, manually select a previous
>   generation.
>
> This is the only method that seems to actually put the system back into
> a previous state, and it appears to do so correctly.  I've used this
> method to save my systems a few times from bad upgrades, which is great.
> The only downside to this method is that it doesn't modify the grub.cfg
> file: until you reconfigure your system again, every time you restart,
> GRUB will continue to automatically select and boot the newer,
> problematic system from which you wanted to roll back.

ISTR that GRUB has a mechanism to record the last selected menu entry
and to use that as the next default entry.

Now, it’s not always what one would want to do.

However, ‘guix system roll-back/switch-generation’ could generate a
grub.cfg where the default menu entry points to whatever old generation
has been selected.
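
As a rough illustration of that idea, the Python sketch below renders a
grub.cfg-like text whose default entry is whichever generation was
selected.  It is not how Guix generates grub.cfg: the function name and
the entry tuples are hypothetical, and only the GRUB directives
"set default", "menuentry", "linux" and "initrd" are taken as given.

```python
def render_grub_cfg(entries, default_index):
    """Render menu entries, marking entry DEFAULT_INDEX as the default.

    ENTRIES is a list of (label, kernel, initrd) tuples; all three are
    supplied by the caller, so nothing here guesses store paths.
    """
    if not 0 <= default_index < len(entries):
        raise ValueError("default_index out of range")
    lines = [f"set default={default_index}"]
    for label, kernel, initrd in entries:
        lines.append(f'menuentry "{label}" {{')
        lines.append(f"  linux {kernel}")
        lines.append(f"  initrd {initrd}")
        lines.append("}")
    return "\n".join(lines) + "\n"
```

Rolling back would then amount to rendering the same list of
generations with a different default_index and installing the result.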

> I'd like to implement a rollback mechanism that lets you run a single
> command which actually does the opposite of "guix system reconfigure".
> I've looked at the code in guix/scripts/system.scm, and at first blush
> this seems like a straightforward task.  However, I've noticed that the
> switch-to-system procedure requires an operating system configuration
> file to do things like get the system's activation script and generate
> the new grub.cfg file.  Surely a user should not have to specify the
> previous system's operating system configuration file on the command
> line in order to roll back the system.  Is it possible to obtain these
> things (e.g., the service activation script, the previous system's
> grub.cfg) without requiring the user to supply the previous system's
> operating system configuration file on the command line?

Currently it’s not possible to obtain the activation script of past
generations, but as I wrote above, it’d be doable.

It’s not possible to obtain past grub.cfg files, but that’s not a
problem: we can always regenerate a new grub.cfg.

What seems more difficult to me is Shepherd services.  Maybe we could
store in the system output (result of ‘guix system build’) an sexp
representation of (part of) our <shepherd-service> records:

  (shepherd-service
    (provisions (x y z))
    (requirements (a b c))
    (start-script "/gnu/store/…-start-foo.scm")
    (stop-script "/gnu/store/…-stop-foo.scm")
    …)

Then ‘upgrade-shepherd-services’ could start from this simplified
representation instead of using the full-blown <shepherd-service>
objects, and thus could work both when instantiating a new generation
and when rolling back.
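
A Python sketch of what working from such a simplified representation
might look like.  The field names mirror the sexp above; the class and
function names are hypothetical, and the selection rule is just the
conservative one described earlier in this message (services that are
already running are left alone).

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class ServiceDescription:
    """Simplified stand-in for (part of) a <shepherd-service> record."""
    provisions: Tuple[str, ...]
    requirements: Tuple[str, ...]
    start_script: str
    stop_script: str


def services_to_start(target: Iterable[ServiceDescription],
                      running: Iterable[str]) -> List[ServiceDescription]:
    """Return the services of TARGET that are not currently running.

    A service counts as running if any of its provisions appears in
    RUNNING; such services are never stopped or replaced here.
    """
    running = set(running)
    return [s for s in target if not running & set(s.provisions)]
```

Because the input is plain data, the same function could be fed the
description stored with a new generation or with an old one.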

> More generally, are people satisfied with the way system rollback is
> currently implemented in GuixSD?

Personally I’m not fully satisfied, but it’s true that it covers my main
use case, which is to recover from a broken update.

I had never thought about live “downgrade” of services when rolling
back, because the only times I’ve wanted to roll back were right
after booting (or trying to boot ;-)) into a new system generation.

> Do you think that the mechanism I'm proposing is a bad idea?  I'd hate
> to try to implement something that nobody else thinks is needed.

I think having basic delete-generations, switch-generations, and
roll-back sub-commands would be definitely welcome.

As a first step, switch-generations/roll-back commands could simply
update the symlinks and regenerate grub.cfg.

Milestone #2 would be running the previous system’s activation script,
which installs /run/current-system and adjusts the set of users and
groups.

Milestone #3 would be live service downgrade, as you describe.
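
For the first step, finding the generation to roll back to could look
like the Python sketch below.  It assumes generation links are named
system-N-link and that the "system" symlink points at one of them, as
in the examples quoted above; the function names are hypothetical.

```python
import os
import re

GENERATION_RE = re.compile(r"^system-(\d+)-link$")


def generation_numbers(profiles_dir):
    """Return the sorted generation numbers found in PROFILES_DIR."""
    numbers = []
    for name in os.listdir(profiles_dir):
        match = GENERATION_RE.match(name)
        if match:
            numbers.append(int(match.group(1)))
    return sorted(numbers)


def current_generation(profiles_dir):
    """Return the generation number the "system" symlink points at."""
    target = os.readlink(os.path.join(profiles_dir, "system"))
    match = GENERATION_RE.match(os.path.basename(target))
    if match is None:
        raise ValueError(f"unexpected symlink target: {target}")
    return int(match.group(1))


def previous_generation(profiles_dir):
    """Return the highest generation number below the current one."""
    current = current_generation(profiles_dir)
    older = [n for n in generation_numbers(profiles_dir) if n < current]
    if not older:
        raise LookupError("no previous generation to roll back to")
    return older[-1]
```

Note that generation numbers need not be contiguous, which is why the
sketch picks the highest number below the current one rather than
subtracting one.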

Thoughts?

Ludo’.


* Re: Implementing guix system rollback / switch-generation
From: Leo Famulari @ 2016-06-06 12:10 UTC
  To: Chris Marusich; +Cc: guix-devel

On Sun, Jun 05, 2016 at 03:29:22PM -0700, Chris Marusich wrote:
> More generally, are people satisfied with the way system rollback is
> currently implemented in GuixSD?  Do you think that the mechanism I'm
> proposing is a bad idea?  I'd hate to try to implement something that
> nobody else thinks is needed.

I roll back from the GRUB menu if I've totally broken the system.
Otherwise, I use Git to manage the system configuration and roll back by
checking out an earlier commit.

But, I think we should be able to switch to arbitrary system generations
and delete generations with the Guix tools.


* Re: Implementing guix system rollback / switch-generation
From: Chris Marusich @ 2016-06-09  7:19 UTC
  To: Ludovic Courtès; +Cc: guix-devel

ludo@gnu.org (Ludovic Courtès) writes:

> The problem with system rollback, as you’ve seen, is that we lack
> information about the old system, such as what its activation script is
> and what its Shepherd services are.
>
> We could add the activation script’s file name to the ‘parameters’ file
> that you see in the result of ‘guix system build foo.scm’.  But it would
> be hard to add a forward-compatible yet complete description of the
> Shepherd services there.

That makes sense.  I can't think of a better way to get the information
without access to the original operating system configuration file.

> ISTR that GRUB has a mechanism to record the last selected menu entry
> and to use that as the next default entry.
>
> Now, it’s not always what one would want to do.
>
> However, ‘guix system roll-back/switch-generation’ could generate a
> grub.cfg where the default menu entry points to whatever old generation
> has been selected.

Thank you for letting me know about that GRUB feature.  I didn't know
about it; I'll look into it more.  In any case, I do expect that a
roll-back/switch-generation command would modify the default GRUB menu
entry, since "guix system reconfigure" does the same.

> It’s not possible to obtain past grub.cfg files, but that’s not a
> problem: we can always regenerate a new grub.cfg.

I'm curious: is there a reason why /boot is not itself just another
symlink?  It might be nice if instead of overwriting the grub.cfg file,
we could just flip a symlink when rolling back.

> What seems more difficult to me is Shepherd services.  Maybe we could
> store in the system output (result of ‘guix system build’) an sexp
> representation of (part of) our <shepherd-service> records:
>
>   (shepherd-service
>     (provisions (x y z))
>     (requirements (a b c))
>     (start-script "/gnu/store/…-start-foo.scm")
>     (stop-script "/gnu/store/…-stop-foo.scm")
>     …)
>
> Then ‘upgrade-shepherd-services’ could start from this simplified
> representation instead of using the full-blown <shepherd-service>
> objects, and thus could work both when instantiating a new generation
> and when rolling back.

Yes, without access to the original operating system configuration file,
something like this seems like the best (or only?) way.

>> More generally, are people satisfied with the way system rollback is
>> currently implemented in GuixSD?
>
> Personally I’m not fully satisfied, but it’s true that it covers my main
> use case, which is to recover from a broken update.
>
> I had never thought about live “downgrade” of services when rolling
> back, because the only times I’ve wanted to roll back were right
> after booting (or trying to boot ;-)) into a new system generation.

I think the current rollback mechanism is very usable if you just need
to roll back a small number of machines, and you have a way to manually
select the GRUB menu entries.  However, I'm not sure it would be easy to
roll back many machines remotely, so it would be nice if it were easier.

>> Do you think that the mechanism I'm proposing is a bad idea?  I'd hate
>> to try to implement something that nobody else thinks is needed.
>
> I think having basic delete-generations, switch-generations, and
> roll-back sub-commands would be definitely welcome.
>
> As a first step, switch-generations/roll-back commands could simply
> update the symlinks and regenerate grub.cfg.
>
> Milestone #2 would be running the previous system’s activation script,
> which installs /run/current-system and adjusts the set of users and
> groups.
>
> Milestone #3 would be live service downgrade, as you describe.
>
> Thoughts?

I think breaking it down like that makes a lot of sense.  I'll give
milestone #1 a shot: make switch-generations/roll-back commands that
just update the symlinks and regenerate grub.cfg.

Thank you for the input!

-- 
Chris


* Re: Implementing guix system rollback / switch-generation
From: Ludovic Courtès @ 2016-06-12 16:46 UTC
  To: Chris Marusich; +Cc: guix-devel

Hello,

Chris Marusich <cmmarusich@gmail.com> skribis:

> ludo@gnu.org (Ludovic Courtès) writes:

[...]

>> It’s not possible to obtain past grub.cfg files, but that’s not a
>> problem: we can always regenerate a new grub.cfg.
>
> I'm curious: is there a reason why /boot is not itself just another
> symlink?  It might be nice if instead of overwriting the grub.cfg file,
> we could just flip a symlink when rolling back.

/boot contains GRUB, and there’s nothing that runs before GRUB that
would allow us to choose among several GRUBs.

So we assume the latest GRUB always “works”, and we generate a grub.cfg
with a menu listing all the older generations, which is rather convenient
from the UI viewpoint, I think.

>> I think having basic delete-generations, switch-generations, and
>> roll-back sub-commands would be definitely welcome.
>>
>> As a first step, switch-generations/roll-back commands could simply
>> update the symlinks and regenerate grub.cfg.
>>
>> Milestone #2 would be running the previous system’s activation script,
>> which installs /run/current-system and adjusts the set of users and
>> groups.
>>
>> Milestone #3 would be live service downgrade, as you describe.
>>
>> Thoughts?
>
> I think breaking it down like that makes a lot of sense.  I'll give
> milestone #1 a shot: make switch-generations/roll-back commands that
> just update the symlinks and regenerate grub.cfg.

Awesome.  Thanks for starting this discussion!

Ludo’.


* Re: Implementing guix system rollback / switch-generation
From: Danny Milosavljevic @ 2016-06-13  9:21 UTC
  To: Ludovic Courtès; +Cc: guix-devel

Hi,

On Sun, 12 Jun 2016 18:46:55 +0200
ludo@gnu.org (Ludovic Courtès) wrote:

> So we assume the latest GRUB always “works”, and we generate a grub.cfg
> with a menu listing all the older generations, which is rather convenient
> from the UI viewpoint, I think.

It's definitely convenient.  Back when I enabled home encryption, GuixSD
failed to boot several times.  Having the older generations there saved
me from a system reinstallation / restore.

However, I think the actual point is to have all the "update" actions be
atomic.  Something in the Linux kernel config (only on GuixSD; it works
just fine on Ubuntu) makes my laptop crashy on the first larger disk
write after waking from standby, so I have some experience with Guix
non-atomicity, and let me tell you, it's not good.  Just the other day
it broke the substitute cache, so I couldn't use substitutes at all
anymore; I've since found and deleted the cache directory contents.

Therefore, while I wouldn't replace (or re-symlink) the entire /boot on
guix reconfiguration (it might be on its own partition, too), it *may*
be useful to use include files and put these there atomically, one file
per version.  I'm not sure whether GRUB supports something like
"include *.inc" with wildcards, but that would be an idea.

For the record: atomically creating an important file means:
- write the content into a temporary file on the same file system as the
  finished file (if necessary, in a subdirectory of that file system)
- fdatasync the temporary file
- rename the temporary file to the finished name (the rename is atomic)
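
The three steps above can be sketched in Python as follows.  This is an
illustration, not Guix's implementation: the function name is
hypothetical, and os.fsync is used in place of fdatasync because it is
available more widely and is at least as strong.

```python
import os
import tempfile


def write_file_atomically(path, data):
    """Write bytes DATA to PATH so readers see the old or the new file.

    The temporary file is created in the destination directory, so the
    final rename never crosses a file system boundary.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as out:
            out.write(data)
            out.flush()
            os.fsync(out.fileno())   # force the content to disk first
        os.replace(tmp, path)        # atomic rename over the final name
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)           # do not leave a stray temporary file
        raise
```

One caveat: mkstemp creates the file with mode 0600, so a caller that
needs other permissions would have to set them before the rename.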

Even without include files, it's fine to use one larger grub.cfg and
update it as described above; I think the monolithic grub.cfg is fine.


* Atomic file updates
From: Ludovic Courtès @ 2016-06-13 15:00 UTC
  To: Danny Milosavljevic; +Cc: guix-devel

Danny Milosavljevic <dannym@scratchpost.org> skribis:

> However, I think the actual point is to have all the "update" actions be
> atomic.  Something in the Linux kernel config (only on GuixSD; it works
> just fine on Ubuntu) makes my laptop crashy on the first larger disk
> write after waking from standby, so I have some experience with Guix
> non-atomicity, and let me tell you, it's not good.  Just the other day
> it broke the substitute cache, so I couldn't use substitutes at all
> anymore; I've since found and deleted the cache directory contents.

When you stumble upon issues like this, please report them to
bug-guix@gnu.org with as many details as possible.

In the case of substitutes, files under /var/guix/substitute/cache are
written atomically; in guix/scripts/substitute.scm, it looks like this:

        (with-atomic-file-output file
          (lambda (out)
            (write (cache-entry cache-url narinfo) out)))

and effectively it writes to a temporary file, which is then rename(2)d.
I’m adding an fsync(2) call in ‘with-atomic-file-output’, which was
missing until now.

Regardless, you were very unlucky to end up with a truncated file.  ;-)

Anyway, if your “crashy” setup allows you to uncover other issues in
this area, please do report them!

> Therefore, while I wouldn't replace (or re-symlink) the entire /boot on
> guix reconfiguration (it might be on its own partition, too), it *may*
> be useful to use include files and put these there atomically, one file
> per version.  I'm not sure whether GRUB supports something like
> "include *.inc" with wildcards, but that would be an idea.

Under /boot, the only thing that GuixSD’s code controls is grub.cfg,
which is created atomically (see ‘install-grub’ in (gnu build install).)
The other files are installed by ‘grub-install’.

Thanks for your feedback,
Ludo’.
