From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:60764) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fs3tO-0004Eu-SC for guix-patches@gnu.org; Tue, 21 Aug 2018 06:28:11 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fs3tJ-0000pl-Uq for guix-patches@gnu.org; Tue, 21 Aug 2018 06:28:06 -0400 Received: from debbugs.gnu.org ([208.118.235.43]:51182) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fs3tJ-0000ph-QN for guix-patches@gnu.org; Tue, 21 Aug 2018 06:28:01 -0400 Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1fs3tJ-0002ei-Mc for guix-patches@gnu.org; Tue, 21 Aug 2018 06:28:01 -0400 Subject: [bug#32408] [PATCH shepherd] Allow replacement of services Resent-Message-ID: From: ludo@gnu.org (Ludovic =?UTF-8?Q?Court=C3=A8s?=) References: <87wosz4spx.fsf@zancanaro.id.au> <87h8jobwwp.fsf@gnu.org> <87k1okzqkf.fsf@zancanaro.id.au> Date: Tue, 21 Aug 2018 12:27:25 +0200 In-Reply-To: <87k1okzqkf.fsf@zancanaro.id.au> (Carlo Zancanaro's message of "Tue, 21 Aug 2018 07:16:48 +1000") Message-ID: <87in44ovzm.fsf@gnu.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guix-patches-bounces+kyle=kyleam.com@gnu.org Sender: "Guix-patches" To: Carlo Zancanaro Cc: 32408@debbugs.gnu.org Hey Carlo, Carlo Zancanaro skribis: > On Tue, Aug 21 2018, Ludovic Court=C3=A8s wrote: >> For instance, the =E2=80=98actions=E2=80=99 field of = could, by >> default, include an =E2=80=9Cupgrade=E2=80=9D action that simply sets the >> =E2=80=98replacement=E2=80=99 slot. For nginx, we=E2=80=99d provide a c= ustom =E2=80=9Cupgrade=E2=80=9D >> action that does =E2=80=9Cnginx -s restart=E2=80=9D or whatever it is th= at needs to >> be done. >> >> =E2=80=98guix system reconfigure=E2=80=99 would automatically invoke the= =E2=80=98upgrade=E2=80=99 >> action for each new service. >> >> WDYT? > > How many services can we meaningfully upgrade like this? Probably very few. > My understanding is that most of our system services a fed an immutable > configuration file, and thus restarting/reloading won't actually > upgrade them. In order to make an upgrade action work the service > would have to mutate itself into a new correct configuration, as well > as restarting/reloading the underlying daemon. It's even trickier if > the daemon itself has been upgraded, because then the process will > have to be restarted anyway. Yeah. > At any rate, I think the replacement mechanism (this patch) is just > one way that a service can be reloaded. It would probably be a good > idea to create a higher-level abstraction over it. I think other > mechanisms (like a upgrade/reload action) should be handled on the > Guix side of things. Sounds good. >>> + (let ((replacement (slot-ref service 'replacement))) >>> + (define (copy-slot! slot) >>> + (slot-set! service slot (slot-ref replacement slot))) >>> + (when replacement >>> + (copy-slot! 'provides) >>> + (copy-slot! 'requires) >>> + (copy-slot! 'respawn?) >>> + (copy-slot! 'start) >>> + (copy-slot! 'stop) >>> + (copy-slot! 'actions) >>> + (copy-slot! 'running) >>> + (copy-slot! 'docstring)) >>> + service)) >> >> Having a hardcoded list of slots sounds error-prone=E2=80=94surely we=E2= =80=99ll >> forget to update it down the road. I wonder what else could be >> done. >> >> One option would be to grab the block asyncs and atomically replace >> the service in the =E2=80=98%services=E2=80=99 hash table. Then we only= need to >> copy the =E2=80=98last-respawns=E2=80=99 slot to the new service, I beli= eve. (This >> changes the object identity of the service but I think its OK.) >> >> Another option would be to use GOOPS tricks to iterate over the list >> of slots and have a list of slots *not* to copy. I=E2=80=99m not a big = fan >> of this option, though. > > My favourite option for this would be to separate the object > into an immutable and a mutable . The > object would have a reference to a object in > order to invoke actions on it, and it could also hold a second > object as a replacement. Then the swap would be much more > straightforward. I haven't done any real work towards this, though. I agree that separating state is the right approach longer-term (it=E2=80= =99s beyond the scope of this patch.) > In the short term, I'd rather replace it in the %services hash > table. I did it by copying slots because I wasn't sure I would get the > details of the swap right and didn't have time to properly work out > how to do it. I'll give it a go! Alright! >>> +(let ((service (lookup-running 'test))) >>> + (slot-set! service 'replacement >>> + (make >> >> I wonder if we should let users fiddle with =E2=80=98replacement=E2=80= =99 directly, >> or if we should provide a higher-level construct. >> >> For instance, =E2=80=98register-services=E2=80=99 could transparently se= t the >> =E2=80=98replacement=E2=80=99 field for services already registered inst= ead of >> doing: >> >> (assert (null? (lookup-services (canonical-name new)))) >> >> Not sure if there are cases where this behavior would be >> undesirable, though. >> >> Thoughts? > > With this current patch the replacement field is only checked at the > point when the service is stopped, so the field could only be set when > the service is actually running. I think it makes the most sense to > just replace the service directly if it's not stopped. > > I can't think of any undesirable cases, but having a higher-level > interface is a good idea. Right. Currently the Guix side of things does the equivalent of: herd eval root '(register-services (load "a.scm") (load "b.scm"))' for services currently stopped, which is okay but not great. IWBN if it could directly use a higher-level action. > At the very least we need to control the inherent race condition > involved in (if running? do-x do-y) for if the service is stopped > after the running? check. At the moment I think the only thing we have > to worry about there is signals, but if we're going to move to have > more parallelism through fibers then we might need to be even more > careful. Indeed. Thank you, Ludo=E2=80=99.