unofficial mirror of guix-patches@gnu.org 
 help / color / mirror / code / Atom feed
* [bug#32408] [PATCH shepherd] Allow replacement of services
@ 2018-08-09 12:42 Carlo Zancanaro
  2018-08-20 20:33 ` Ludovic Courtès
  0 siblings, 1 reply; 6+ messages in thread
From: Carlo Zancanaro @ 2018-08-09 12:42 UTC (permalink / raw)
  To: 32408


[-- Attachment #1.1: Type: text/plain, Size: 602 bytes --]

This is a relatively simple patch adding a replacement slot to 
services in the Shepherd. When stopping a service, the replacement 
slot is checked and, if it has a value, is used to upgrade the 
current service.

I've chosen to modify the existing service, rather than creating a 
new one, but that was mostly because it was easier for me to 
implement quickly, and I didn't have a huge amount of time.

I'm hopeful that this, or something like it, can be used by GuixSD 
to allow people to restart services after reconfiguring without 
rebooting (or remembering to stop, reconfigure, start).

Carlo


[-- Attachment #1.2: 0001-service-Add-a-replacement-slot-for-delayed-service-r.patch --]
[-- Type: text/x-patch, Size: 6964 bytes --]

From e03290041c91813f1a301c7e9c4dbb9ee768b400 Mon Sep 17 00:00:00 2001
From: Carlo Zancanaro <carlo@zancanaro.id.au>
Date: Thu, 9 Aug 2018 22:30:38 +1000
Subject: [PATCH] service: Add a replacement slot for delayed service
 replacement.

* modules/shepherd/service.scm (<service>): Add replacement slot
(replace-service): New procedure.
(stop): Call replace-service after stopping a service.
* tests/replacement.sh: Add a test for it.
* Makefile.am (TESTS): Add the new test.
* doc/shepherd.texi (Slots of services): Document it.
---
 Makefile.am                  |   1 +
 doc/shepherd.texi            |   9 +++
 modules/shepherd/service.scm |  23 +++++++-
 tests/replacement.sh         | 106 +++++++++++++++++++++++++++++++++++
 4 files changed, 138 insertions(+), 1 deletion(-)
 create mode 100644 tests/replacement.sh

diff --git a/Makefile.am b/Makefile.am
index 8dad006..4322d7f 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -184,6 +184,7 @@ SUFFIXES = .go
 
 TESTS =						\
   tests/basic.sh				\
+  tests/replacement.sh				\
   tests/respawn.sh				\
   tests/respawn-throttling.sh			\
   tests/misbehaved-client.sh			\
diff --git a/doc/shepherd.texi b/doc/shepherd.texi
index 7946f8b..1de6d80 100644
--- a/doc/shepherd.texi
+++ b/doc/shepherd.texi
@@ -708,6 +708,15 @@ handler will not start it again.
 
 otherwise @code{#f}.
 
+@item
+@vindex replacement (slot of <service>)
+@code{replacement} specifies a service to be used to replace this one
+when it is stopped.  This service will continue to function normally
+until the @code{stop} action is invoked.  After the service has been
+successfully stopped, its definition will be replaced by the value of
+this slot, which must itself be a service.  This slot is ignored if
+its value is @code{#f}.
+
 @end itemize
 
 @c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
diff --git a/modules/shepherd/service.scm b/modules/shepherd/service.scm
index 5653388..4f62dc1 100644
--- a/modules/shepherd/service.scm
+++ b/modules/shepherd/service.scm
@@ -205,7 +205,10 @@ respawned, shows that it has been respawned more than TIMES in SECONDS."
   (stop-delay? #:init-keyword #:stop-delay?
 	       #:init-value #f)
   ;; The times of the last respawns, most recent first.
-  (last-respawns #:init-form '()))
+  (last-respawns #:init-form '())
+  ;; A replacement for when this service is stopped.
+  (replacement #:init-keyword #:replacement
+               #:init-value #f))
 
 (define (service? obj)
   "Return true if OBJ is a service."
@@ -341,6 +344,21 @@ wire."
 			 (canonical-name obj)))))
   (slot-ref obj 'running))
 
+(define (replace-service service)
+  (let ((replacement (slot-ref service 'replacement)))
+    (define (copy-slot! slot)
+      (slot-set! service slot (slot-ref replacement slot)))
+    (when replacement
+      (copy-slot! 'provides)
+      (copy-slot! 'requires)
+      (copy-slot! 'respawn?)
+      (copy-slot! 'start)
+      (copy-slot! 'stop)
+      (copy-slot! 'actions)
+      (copy-slot! 'running)
+      (copy-slot! 'docstring))
+    service))
+
 ;; Stop the service, including services that depend on it.  If the
 ;; latter fails, continue anyway.  Return `#f' if it could be stopped.
 (define-method (stop (obj <service>) . args)
@@ -385,6 +403,9 @@ wire."
                ;; Reset the list of respawns.
                (slot-set! obj 'last-respawns '())
 
+               ;; Replace the service with its replacement, if it has one
+               (replace-service obj)
+
                ;; Status message.
                (let ((name (canonical-name obj)))
                  (if (running? obj)
diff --git a/tests/replacement.sh b/tests/replacement.sh
new file mode 100644
index 0000000..585ab5a
--- /dev/null
+++ b/tests/replacement.sh
@@ -0,0 +1,106 @@
+# GNU Shepherd --- Ensure replacing services works properly
+# Copyright © 2014, 2016 Ludovic Courtès <ludo@gnu.org>
+# Copyright © 2018 Carlo Zancanaro <carlo@zancanaro.id.au>
+#
+# This file is part of the GNU Shepherd.
+#
+# The GNU Shepherd is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or (at
+# your option) any later version.
+#
+# The GNU Shepherd is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with the GNU Shepherd.  If not, see <http://www.gnu.org/licenses/>.
+
+shepherd --version
+herd --version
+
+socket="t-socket-$$"
+conf="t-conf-$$"
+rconf="t-rconf-$$"
+log="t-log-$$"
+stamp="t-stamp-$$"
+pid="t-pid-$$"
+
+herd="herd -s $socket"
+
+trap "rm -f $socket $conf $rconf $stamp $log;
+      test -f $pid && kill \`cat $pid\` || true; rm -f $pid" EXIT
+
+cat > "$conf"<<EOF
+(use-modules (srfi srfi-26))
+(register-services
+ (make <service>
+   #:provides '(test)
+   #:start (const #t)
+   #:actions (make-actions
+              (say-hello (lambda _
+                          (call-with-output-file "$stamp"
+                           (lambda (port)
+                            (display "Hello" port))))))
+   #:respawn? #f))
+EOF
+
+rm -f "$pid" "$stamp" "$socket"
+shepherd -I -s "$socket" -c "$conf" --pid="$pid" --log="$log" &
+
+while ! test -f "$pid"; do sleep 0.5 ; done
+
+$herd start test
+
+if ! $herd say-hello test; then
+    echo "say-hello failed"
+    exit 1
+fi
+
+cat - > "$rconf"<<EOF
+(let ((service (lookup-running 'test)))
+  (slot-set! service 'replacement
+             (make <service>
+               #:provides '(test)
+               #:start (const #t)
+               #:actions (make-actions
+                          (say-goodbye (lambda _
+                                         (call-with-output-file "$stamp"
+                                           (lambda (port)
+                                             (display "Goodbye" port))))))
+               #:respawn? #f)))
+EOF
+
+$herd load root "$rconf"
+
+if ! $herd say-hello test; then
+    echo "say-hello failed after setting replacement"
+    exit 1
+fi
+
+if test `cat $stamp` != "Hello"; then
+    echo "Output file had the wrong contents! Was:"
+    cat $stamp
+    exit 1
+fi
+
+$herd stop test
+
+$herd start test
+
+if $herd say-hello test; then
+    echo "say-hello should have failed after stop/start"
+    exit 1
+fi
+
+if ! $herd say-goodbye test; then
+    echo "say-goodbye should have failed"
+    exit 1
+fi
+
+if test `cat $stamp` != "Goodbye"; then
+    echo "Output file had the wrong contents! Was:"
+    cat $stamp
+    exit 1
+fi
-- 
2.18.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [bug#32408] [PATCH shepherd] Allow replacement of services
  2018-08-09 12:42 [bug#32408] [PATCH shepherd] Allow replacement of services Carlo Zancanaro
@ 2018-08-20 20:33 ` Ludovic Courtès
  2018-08-20 21:16   ` Carlo Zancanaro
  0 siblings, 1 reply; 6+ messages in thread
From: Ludovic Courtès @ 2018-08-20 20:33 UTC (permalink / raw)
  To: Carlo Zancanaro; +Cc: 32408

Hi Carlo,

Carlo Zancanaro <carlo@zancanaro.id.au> skribis:

> This is a relatively simple patch adding a replacement slot to
> services in the Shepherd. When stopping a service, the replacement
> slot is checked and, if it has a value, is used to upgrade the current
> service.
>
> I've chosen to modify the existing service, rather than creating a new
> one, but that was mostly because it was easier for me to implement
> quickly, and I didn't have a huge amount of time.
>
> I'm hopeful that this, or something like it, can be used by GuixSD to
> allow people to restart services after reconfiguring without rebooting
> (or remembering to stop, reconfigure, start).

Awesome!  This is exactly what we need to address the problem.

We’ll still want to be able to special-case things like nginx that can
be “hot-replaced”, though.  So perhaps, in addition to this patch on the
Shepherd side, we’ll need extra stuff in (gnu services shepherd).

For instance, the ‘actions’ field of <shepherd-service> could, by
default, include an “upgrade” action that simply sets the ‘replacement’
slot.  For nginx, we’d provide a custom “upgrade” action that does
“nginx -s restart” or whatever it is that needs to be done.

‘guix system reconfigure’ would automatically invoke the ‘upgrade’
action for each new service.

WDYT?

> From e03290041c91813f1a301c7e9c4dbb9ee768b400 Mon Sep 17 00:00:00 2001
> From: Carlo Zancanaro <carlo@zancanaro.id.au>
> Date: Thu, 9 Aug 2018 22:30:38 +1000
> Subject: [PATCH] service: Add a replacement slot for delayed service
>  replacement.
>
> * modules/shepherd/service.scm (<service>): Add replacement slot
> (replace-service): New procedure.
> (stop): Call replace-service after stopping a service.
> * tests/replacement.sh: Add a test for it.
> * Makefile.am (TESTS): Add the new test.
> * doc/shepherd.texi (Slots of services): Document it.

Overall LGTM.  Some comments:

> +(define (replace-service service)

Please add a docstring.

> +  (let ((replacement (slot-ref service 'replacement)))
> +    (define (copy-slot! slot)
> +      (slot-set! service slot (slot-ref replacement slot)))
> +    (when replacement
> +      (copy-slot! 'provides)
> +      (copy-slot! 'requires)
> +      (copy-slot! 'respawn?)
> +      (copy-slot! 'start)
> +      (copy-slot! 'stop)
> +      (copy-slot! 'actions)
> +      (copy-slot! 'running)
> +      (copy-slot! 'docstring))
> +    service))

Having a hardcoded list of slots sounds error-prone—surely we’ll forget
to update it down the road.  I wonder what else could be done.

One option would be to grab the block asyncs and atomically replace the
service in the ‘%services’ hash table.  Then we only need to copy the
‘last-respawns’ slot to the new service, I believe.  (This changes the
object identity of the service but I think its OK.)

Another option would be to use GOOPS tricks to iterate over the list of
slots and have a list of slots *not* to copy.  I’m not a big fan of this
option, though.

> +cat - > "$rconf"<<EOF
       ^
You can remove the dash.

> +(let ((service (lookup-running 'test)))
> +  (slot-set! service 'replacement
> +             (make <service>

I wonder if we should let users fiddle with ‘replacement’ directly, or
if we should provide a higher-level construct.

For instance, ‘register-services’ could transparently set the
‘replacement’ field for services already registered instead of doing:

    (assert (null? (lookup-services (canonical-name new))))

Not sure if there are cases where this behavior would be undesirable,
though.

Thoughts?

> +if test `cat $stamp` != "Goodbye"; then

Nitpick: "`cat $stamp`".

Thanks!

Ludo’.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [bug#32408] [PATCH shepherd] Allow replacement of services
  2018-08-20 20:33 ` Ludovic Courtès
@ 2018-08-20 21:16   ` Carlo Zancanaro
  2018-08-21 10:27     ` Ludovic Courtès
  0 siblings, 1 reply; 6+ messages in thread
From: Carlo Zancanaro @ 2018-08-20 21:16 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 32408

[-- Attachment #1: Type: text/plain, Size: 5010 bytes --]

Hey Ludo,

On Tue, Aug 21 2018, Ludovic Courtès wrote:
> We’ll still want to be able to special-case things like nginx 
> that can be “hot-replaced”, though.  So perhaps, in addition to 
> this patch on the Shepherd side, we’ll need extra stuff in (gnu 
> services shepherd).

Yeah, if we expose the replacement field directly, then we'll need 
some supporting code in (gnu services shepherd), even if it's just 
to detect whether the service is stopped or not before doing a 
replacement. Although ideally our interface wouldn't introduce 
race conditions like that. (See below for more thoughts on this.)

> For instance, the ‘actions’ field of <shepherd-service> could, 
> by default, include an “upgrade” action that simply sets the 
> ‘replacement’ slot.  For nginx, we’d provide a custom “upgrade” 
> action that does “nginx -s restart” or whatever it is that needs 
> to be done.
>
> ‘guix system reconfigure’ would automatically invoke the 
> ‘upgrade’ action for each new service.
>
> WDYT?

How many services can we meaningfully upgrade like this? My 
understanding is that most of our system services a fed an 
immutable configuration file, and thus restarting/reloading won't 
actually upgrade them. In order to make an upgrade action work the 
service would have to mutate itself into a new correct 
configuration, as well as restarting/reloading the underlying 
daemon. It's even trickier if the daemon itself has been upgraded, 
because then the process will have to be restarted anyway.

At any rate, I think the replacement mechanism (this patch) is 
just one way that a service can be reloaded. It would probably be 
a good idea to create a higher-level abstraction over it. I think 
other mechanisms (like a upgrade/reload action) should be handled 
on the Guix side of things.

>> +  (let ((replacement (slot-ref service 'replacement)))
>> +    (define (copy-slot! slot)
>> +      (slot-set! service slot (slot-ref replacement slot)))
>> +    (when replacement
>> +      (copy-slot! 'provides)
>> +      (copy-slot! 'requires)
>> +      (copy-slot! 'respawn?)
>> +      (copy-slot! 'start)
>> +      (copy-slot! 'stop)
>> +      (copy-slot! 'actions)
>> +      (copy-slot! 'running)
>> +      (copy-slot! 'docstring))
>> +    service))
>
> Having a hardcoded list of slots sounds error-prone—surely we’ll 
> forget to update it down the road.  I wonder what else could be 
> done.
>
> One option would be to grab the block asyncs and atomically 
> replace the service in the ‘%services’ hash table.  Then we only 
> need to copy the ‘last-respawns’ slot to the new service, I 
> believe.  (This changes the object identity of the service but I 
> think its OK.)
>
> Another option would be to use GOOPS tricks to iterate over the 
> list of slots and have a list of slots *not* to copy.  I’m not a 
> big fan of this option, though.

My favourite option for this would be to separate the <service> 
object into an immutable <service> and a mutable <service-state>. 
The <service-state> object would have a reference to a <service> 
object in order to invoke actions on it, and it could also hold a 
second <service> object as a replacement. Then the swap would be 
much more straightforward. I haven't done any real work towards 
this, though.

In the short term, I'd rather replace it in the %services hash 
table. I did it by copying slots because I wasn't sure I would get 
the details of the swap right and didn't have time to properly 
work out how to do it. I'll give it a go!

>> +(let ((service (lookup-running 'test)))
>> +  (slot-set! service 'replacement
>> +             (make <service>
>
> I wonder if we should let users fiddle with ‘replacement’ 
> directly, or if we should provide a higher-level construct.
>
> For instance, ‘register-services’ could transparently set the 
> ‘replacement’ field for services already registered instead of 
> doing:
>
>     (assert (null? (lookup-services (canonical-name new))))
>
> Not sure if there are cases where this behavior would be 
> undesirable, though.
>
> Thoughts?

With this current patch the replacement field is only checked at 
the point when the service is stopped, so the field could only be 
set when the service is actually running. I think it makes the 
most sense to just replace the service directly if it's not 
stopped.

I can't think of any undesirable cases, but having a higher-level 
interface is a good idea. At the very least we need to control the 
inherent race condition involved in (if running? do-x do-y) for if 
the service is stopped after the running? check. At the moment I 
think the only thing we have to worry about there is signals, but 
if we're going to move to have more parallelism through fibers 
then we might need to be even more careful.

I'll try to send through an updated patch later this week.

Carlo

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [bug#32408] [PATCH shepherd] Allow replacement of services
  2018-08-20 21:16   ` Carlo Zancanaro
@ 2018-08-21 10:27     ` Ludovic Courtès
  2018-08-23 13:45       ` Carlo Zancanaro
  0 siblings, 1 reply; 6+ messages in thread
From: Ludovic Courtès @ 2018-08-21 10:27 UTC (permalink / raw)
  To: Carlo Zancanaro; +Cc: 32408

Hey Carlo,

Carlo Zancanaro <carlo@zancanaro.id.au> skribis:

> On Tue, Aug 21 2018, Ludovic Courtès wrote:
>> For instance, the ‘actions’ field of <shepherd-service> could, by
>> default, include an “upgrade” action that simply sets the
>> ‘replacement’ slot.  For nginx, we’d provide a custom “upgrade”
>> action that does “nginx -s restart” or whatever it is that needs to
>> be done.
>>
>> ‘guix system reconfigure’ would automatically invoke the ‘upgrade’
>> action for each new service.
>>
>> WDYT?
>
> How many services can we meaningfully upgrade like this?

Probably very few.

> My understanding is that most of our system services a fed an immutable
> configuration file, and thus restarting/reloading won't actually
> upgrade them. In order to make an upgrade action work the service
> would have to mutate itself into a new correct configuration, as well
> as restarting/reloading the underlying daemon. It's even trickier if
> the daemon itself has been upgraded, because then the process will
> have to be restarted anyway.

Yeah.

> At any rate, I think the replacement mechanism (this patch) is just
> one way that a service can be reloaded. It would probably be a good
> idea to create a higher-level abstraction over it. I think other
> mechanisms (like a upgrade/reload action) should be handled on the
> Guix side of things.

Sounds good.

>>> +  (let ((replacement (slot-ref service 'replacement)))
>>> +    (define (copy-slot! slot)
>>> +      (slot-set! service slot (slot-ref replacement slot)))
>>> +    (when replacement
>>> +      (copy-slot! 'provides)
>>> +      (copy-slot! 'requires)
>>> +      (copy-slot! 'respawn?)
>>> +      (copy-slot! 'start)
>>> +      (copy-slot! 'stop)
>>> +      (copy-slot! 'actions)
>>> +      (copy-slot! 'running)
>>> +      (copy-slot! 'docstring))
>>> +    service))
>>
>> Having a hardcoded list of slots sounds error-prone—surely we’ll
>> forget to update it down the road.  I wonder what else could be
>> done.
>>
>> One option would be to grab the block asyncs and atomically replace
>> the service in the ‘%services’ hash table.  Then we only need to
>> copy the ‘last-respawns’ slot to the new service, I believe.  (This
>> changes the object identity of the service but I think its OK.)
>>
>> Another option would be to use GOOPS tricks to iterate over the list
>> of slots and have a list of slots *not* to copy.  I’m not a big fan
>> of this option, though.
>
> My favourite option for this would be to separate the <service> object
> into an immutable <service> and a mutable <service-state>. The
> <service-state> object would have a reference to a <service> object in
> order to invoke actions on it, and it could also hold a second
> <service> object as a replacement. Then the swap would be much more
> straightforward. I haven't done any real work towards this, though.

I agree that separating state is the right approach longer-term (it’s
beyond the scope of this patch.)

> In the short term, I'd rather replace it in the %services hash
> table. I did it by copying slots because I wasn't sure I would get the
> details of the swap right and didn't have time to properly work out
> how to do it. I'll give it a go!

Alright!

>>> +(let ((service (lookup-running 'test)))
>>> +  (slot-set! service 'replacement
>>> +             (make <service>
>>
>> I wonder if we should let users fiddle with ‘replacement’ directly,
>> or if we should provide a higher-level construct.
>>
>> For instance, ‘register-services’ could transparently set the
>> ‘replacement’ field for services already registered instead of
>> doing:
>>
>>     (assert (null? (lookup-services (canonical-name new))))
>>
>> Not sure if there are cases where this behavior would be
>> undesirable, though.
>>
>> Thoughts?
>
> With this current patch the replacement field is only checked at the
> point when the service is stopped, so the field could only be set when
> the service is actually running. I think it makes the most sense to
> just replace the service directly if it's not stopped.
>
> I can't think of any undesirable cases, but having a higher-level
> interface is a good idea.

Right.  Currently the Guix side of things does the equivalent of:

  herd eval root '(register-services (load "a.scm") (load "b.scm"))'

for services currently stopped, which is okay but not great.  IWBN if it
could directly use a higher-level action.

> At the very least we need to control the inherent race condition
> involved in (if running? do-x do-y) for if the service is stopped
> after the running? check. At the moment I think the only thing we have
> to worry about there is signals, but if we're going to move to have
> more parallelism through fibers then we might need to be even more
> careful.

Indeed.

Thank you,
Ludo’.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [bug#32408] [PATCH shepherd] Allow replacement of services
  2018-08-21 10:27     ` Ludovic Courtès
@ 2018-08-23 13:45       ` Carlo Zancanaro
  2018-08-25 14:20         ` Ludovic Courtès
  0 siblings, 1 reply; 6+ messages in thread
From: Carlo Zancanaro @ 2018-08-23 13:45 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 32408


[-- Attachment #1.1: Type: text/plain, Size: 972 bytes --]

Hey Ludo’,

I've attached an updated patch. I couldn't think of any unwanted 
consequences, so I took your idea of making register-services 
handle most of the details of replacement. With my patch, 
something like
>   herd eval root '(register-services (load "a.scm") (load 
>   "b.scm"))'
will deal with a conflict by either replacing the old service (if 
it's not running), arranging for the old service to be replaced 
when it's stopped, or raising an error. This seems like a sensible 
way for things to function.

>> At the very least we need to control the inherent race 
>> condition [...]
>
> Indeed.

Despite my desire to deal with the race condition, I haven't done 
anything about it in this patch. The modification of %services 
that was done in register-services was already racy, and I don't 
think this patch will make it worse. If it hasn't been a problem 
up until now, then I don't think this will make it a problem.

Carlo


[-- Attachment #1.2: 0001-service-Add-a-replacement-slot-for-delayed-service-r.patch --]
[-- Type: text/x-patch, Size: 9670 bytes --]

From 9ec5c0000e9a45441417a6ee4138cdcbf1b1f2b2 Mon Sep 17 00:00:00 2001
From: Carlo Zancanaro <carlo@zancanaro.id.au>
Date: Thu, 9 Aug 2018 22:30:38 +1000
Subject: [PATCH] service: Add a replacement slot for delayed service
 replacement.

* modules/shepherd/service.scm (<service>): Add replacement slot
(replace-service): New procedure.
(stop): Call replace-service after stopping a service.
(register-services): Replace existing services where possible, setting the new
replacement slot if they are currently running.
* tests/replacement.sh: Add a test for it.
* Makefile.am (TESTS): Add the new test.
* doc/shepherd.texi (Slots of services): Document it.
---
 Makefile.am                  |   1 +
 doc/shepherd.texi            |   9 +++
 modules/shepherd/service.scm |  68 ++++++++++++++++++-----
 tests/replacement.sh         | 105 +++++++++++++++++++++++++++++++++++
 4 files changed, 168 insertions(+), 15 deletions(-)
 create mode 100644 tests/replacement.sh

diff --git a/Makefile.am b/Makefile.am
index 8dad006..4322d7f 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -184,6 +184,7 @@ SUFFIXES = .go
 
 TESTS =						\
   tests/basic.sh				\
+  tests/replacement.sh				\
   tests/respawn.sh				\
   tests/respawn-throttling.sh			\
   tests/misbehaved-client.sh			\
diff --git a/doc/shepherd.texi b/doc/shepherd.texi
index 7946f8b..1de6d80 100644
--- a/doc/shepherd.texi
+++ b/doc/shepherd.texi
@@ -708,6 +708,15 @@ handler will not start it again.
 
 otherwise @code{#f}.
 
+@item
+@vindex replacement (slot of <service>)
+@code{replacement} specifies a service to be used to replace this one
+when it is stopped.  This service will continue to function normally
+until the @code{stop} action is invoked.  After the service has been
+successfully stopped, its definition will be replaced by the value of
+this slot, which must itself be a service.  This slot is ignored if
+its value is @code{#f}.
+
 @end itemize
 
 @c @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
diff --git a/modules/shepherd/service.scm b/modules/shepherd/service.scm
index 5653388..006309c 100644
--- a/modules/shepherd/service.scm
+++ b/modules/shepherd/service.scm
@@ -205,7 +205,10 @@ respawned, shows that it has been respawned more than TIMES in SECONDS."
   (stop-delay? #:init-keyword #:stop-delay?
 	       #:init-value #f)
   ;; The times of the last respawns, most recent first.
-  (last-respawns #:init-form '()))
+  (last-respawns #:init-form '())
+  ;; A replacement for when this service is stopped.
+  (replacement #:init-keyword #:replacement
+               #:init-value #f))
 
 (define (service? obj)
   "Return true if OBJ is a service."
@@ -341,6 +344,20 @@ wire."
 			 (canonical-name obj)))))
   (slot-ref obj 'running))
 
+(define (replace-service old-service new-service)
+  "Replace OLD-SERVICE with NEW-SERVICE in the services registry.  This
+completely removes all references to OLD-SERVICE before registering
+NEW-SERVICE."
+  (define (remove-service name)
+    (let* ((old (hashq-ref %services name))
+           (new (delete old-service old)))
+      (if (null? new)
+          (hashq-remove! %services name)
+          (hashq-set! %services name new))))
+  (when new-service
+    (for-each remove-service (provided-by old-service))
+    (register-services new-service)))
+
 ;; Stop the service, including services that depend on it.  If the
 ;; latter fails, continue anyway.  Return `#f' if it could be stopped.
 (define-method (stop (obj <service>) . args)
@@ -385,6 +402,11 @@ wire."
                ;; Reset the list of respawns.
                (slot-set! obj 'last-respawns '())
 
+               ;; Replace the service with its replacement, if it has one
+               (let ((replacement (slot-ref obj 'replacement)))
+                 (when replacement
+                   (replace-service obj replacement)))
+
                ;; Status message.
                (let ((name (canonical-name obj)))
                  (if (running? obj)
@@ -1038,25 +1060,41 @@ then disable it."
 
 ;; Add NEW-SERVICES to the list of known services.
 (define (register-services . new-services)
+  "Add NEW-SERVICES to the list of known services.  If a service has already
+been registered, arrange to have it replaced when it is next stopped.  If it
+is currently stopped, replace it immediately."
   (define (register-single-service new)
     ;; Sanity-checks first.
     (assert (list-of-symbols? (provided-by new)))
     (assert (list-of-symbols? (required-by new)))
     (assert (boolean? (respawn? new)))
-    ;; Canonical name actually must be canonical.  (FIXME: This test
-    ;; is incomplete, since we may add a service later that makes it
-    ;; non-cannonical.)
-    (assert (null? (lookup-services (canonical-name new))))
-    ;; FIXME: Verify consistency: Check that there are no circular
-    ;; dependencies, check for bogus conflicts/dependencies, whatever
-    ;; else makes sense.
-
-    ;; Insert into the hash table.
-    (for-each (lambda (name)
-		(let ((old (lookup-services name)))
-		  ;; Actually add the new service now.
-		  (hashq-set! %services name (cons new old))))
-	      (provided-by new)))
+
+    ;; FIXME: Just because we have a unique canonical name now doesn't mean it
+    ;; will remain unique as other services are added. Whenever a service is
+    ;; added it should check that it's not conflicting with any already
+    ;; registered canonical names.
+    (match (lookup-services (canonical-name new))
+      (() ;; empty, so we can safely add ourselves
+       (for-each (lambda (name)
+		   (let ((old (lookup-services name)))
+		     (hashq-set! %services name (cons new old))))
+	         (provided-by new)))
+      ((old) ;; one service registered, so it may be an old version of us
+       (cond
+        ((not (eq? (canonical-name new) (canonical-name old)))
+         (local-output
+          "Cannot register service ~a: canonical name is not unique."
+          (canonical-name new))
+         (throw 'non-canonical-name))
+        ((running? old)
+         (slot-set! old 'replacement new))
+        (#:else
+         (replace-service old new))))
+      (_ ;; in any other case, there are too many services to register
+       (local-output
+        "Cannot register service ~a: canonical name is not unique."
+        (canonical-name new))
+       (throw 'non-canonical-name))))
 
   (for-each register-single-service new-services))
 
diff --git a/tests/replacement.sh b/tests/replacement.sh
new file mode 100644
index 0000000..e06cb93
--- /dev/null
+++ b/tests/replacement.sh
@@ -0,0 +1,105 @@
+# GNU Shepherd --- Ensure replacing services works properly
+# Copyright © 2014, 2016 Ludovic Courtès <ludo@gnu.org>
+# Copyright © 2018 Carlo Zancanaro <carlo@zancanaro.id.au>
+#
+# This file is part of the GNU Shepherd.
+#
+# The GNU Shepherd is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or (at
+# your option) any later version.
+#
+# The GNU Shepherd is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with the GNU Shepherd.  If not, see <http://www.gnu.org/licenses/>.
+
+shepherd --version
+herd --version
+
+socket="t-socket-$$"
+conf="t-conf-$$"
+rconf="t-rconf-$$"
+log="t-log-$$"
+stamp="t-stamp-$$"
+pid="t-pid-$$"
+
+herd="herd -s $socket"
+
+trap "rm -f $socket $conf $rconf $stamp $log;
+      test -f $pid && kill \`cat $pid\` || true; rm -f $pid" EXIT
+
+cat > "$conf"<<EOF
+(use-modules (srfi srfi-26))
+(register-services
+ (make <service>
+   #:provides '(test)
+   #:start (const #t)
+   #:actions (make-actions
+              (say-hello (lambda _
+                          (call-with-output-file "$stamp"
+                           (lambda (port)
+                            (display "Hello" port))))))
+   #:respawn? #f))
+EOF
+
+rm -f "$pid" "$stamp" "$socket"
+shepherd -I -s "$socket" -c "$conf" --pid="$pid" --log="$log" &
+
+while ! test -f "$pid"; do sleep 0.5 ; done
+
+$herd start test
+
+if ! $herd say-hello test; then
+    echo "say-hello failed"
+    exit 1
+fi
+
+cat > "$rconf"<<EOF
+(register-services
+ (make <service>
+   #:provides '(test)
+   #:start (const #t)
+   #:actions (make-actions
+              (say-goodbye (lambda _
+                             (call-with-output-file "$stamp"
+                              (lambda (port)
+                                (display "Goodbye" port))))))
+   #:respawn? #f))
+EOF
+
+$herd load root "$rconf"
+
+if ! $herd say-hello test; then
+    echo "say-hello failed after setting replacement"
+    exit 1
+fi
+
+if test "`cat $stamp`" != "Hello"; then
+    echo "Output file had the wrong contents! Was:"
+    cat $stamp
+    exit 1
+fi
+
+$herd stop test
+
+$herd start test
+
+if $herd say-hello test; then
+    echo "say-hello should have failed after stop/start"
+    exit 1
+fi
+
+if ! $herd say-goodbye test; then
+    echo "say-goodbye failed after replacement"
+    exit 1
+fi
+
+if test "`cat $stamp`" != "Goodbye"; then
+    echo "Output file had the wrong contents! Was:"
+    cat $stamp
+    exit 1
+fi
-- 
2.18.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [bug#32408] [PATCH shepherd] Allow replacement of services
  2018-08-23 13:45       ` Carlo Zancanaro
@ 2018-08-25 14:20         ` Ludovic Courtès
  0 siblings, 0 replies; 6+ messages in thread
From: Ludovic Courtès @ 2018-08-25 14:20 UTC (permalink / raw)
  To: Carlo Zancanaro; +Cc: 32408

Hello!

Carlo Zancanaro <carlo@zancanaro.id.au> skribis:

> I've attached an updated patch. I couldn't think of any unwanted
> consequences, so I took your idea of making register-services handle
> most of the details of replacement. With my patch, something like
>>   herd eval root '(register-services (load "a.scm") (load
>> "b.scm"))'
> will deal with a conflict by either replacing the old service (if it's
> not running), arranging for the old service to be replaced when it's
> stopped, or raising an error. This seems like a sensible way for
> things to function.

Awesome.  Thus, the only thing we need to change in ‘guix system
reconfigure’ is to make the ‘register-services’ call unconditional
(currently it’s limited to services that are stopped.)

> Despite my desire to deal with the race condition, I haven't done
> anything about it in this patch. The modification of %services that
> was done in register-services was already racy, and I don't think this
> patch will make it worse. If it hasn't been a problem up until now,
> then I don't think this will make it a problem.

Yeah, sounds reasonable.

> From 9ec5c0000e9a45441417a6ee4138cdcbf1b1f2b2 Mon Sep 17 00:00:00 2001
> From: Carlo Zancanaro <carlo@zancanaro.id.au>
> Date: Thu, 9 Aug 2018 22:30:38 +1000
> Subject: [PATCH] service: Add a replacement slot for delayed service
>  replacement.
>
> * modules/shepherd/service.scm (<service>): Add replacement slot
> (replace-service): New procedure.
> (stop): Call replace-service after stopping a service.
> (register-services): Replace existing services where possible, setting the new
> replacement slot if they are currently running.
> * tests/replacement.sh: Add a test for it.
> * Makefile.am (TESTS): Add the new test.
> * doc/shepherd.texi (Slots of services): Document it.

Awesome, please push!  And sorry about the delay.

Thank you,
Ludo’.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-08-25 14:34 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-09 12:42 [bug#32408] [PATCH shepherd] Allow replacement of services Carlo Zancanaro
2018-08-20 20:33 ` Ludovic Courtès
2018-08-20 21:16   ` Carlo Zancanaro
2018-08-21 10:27     ` Ludovic Courtès
2018-08-23 13:45       ` Carlo Zancanaro
2018-08-25 14:20         ` Ludovic Courtès

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).