unofficial mirror of guix-devel@gnu.org 
* Shepherd does not recycle zombie processes
@ 2016-11-05 22:22 Dale Mellor
       [not found] ` <CAFAGzb_7ryGVJp1zgjYuT_czaS2o5RLMtfkEHGf9m3z=tXJ3mw@mail.gmail.com>
  2016-11-06 22:09 ` Ludovic Courtès
  0 siblings, 2 replies; 9+ messages in thread
From: Dale Mellor @ 2016-11-05 22:22 UTC (permalink / raw)
  To: guix-devel

I'm running shepherd stand-alone in a Debian system.  But I am seeing
zombie processes which have been kicked off by shepherd, and they do not
get re-spawned.

Any suggestions?


* Re: Shepherd does not recycle zombie processes
From: Carlo Zancanaro @ 2016-11-06 21:21 UTC (permalink / raw)
  To: Dale Mellor; +Cc: guix-devel


I've had problems with Shepherd and its daemonize action. If I run
daemonize (as the first thing when Shepherd starts) then it fails to handle
signals from child processes.

I've only been running without the daemonize call for a day or so, but it
seems to properly handle the child processes now.

On 06/11/2016 9:52 am, "Dale Mellor" <dale@rdmp.org> wrote:

> I'm running shepherd stand-alone in a Debian system.  But I am seeing
> zombie processes which have been kicked off by shepherd, and they do not
> get re-spawned.
>
> Any suggestions?



* Re: Shepherd does not recycle zombie processes
From: Ludovic Courtès @ 2016-11-06 22:09 UTC (permalink / raw)
  To: Dale Mellor; +Cc: guix-devel

Hi Dale,

Dale Mellor <dale@rdmp.org> skribis:

> I'm running shepherd stand-alone in a Debian system.  But I am seeing
> zombie processes which have been kicked off by shepherd, and they do not
> get re-spawned.

This may have to do with unreliable signal handling (SIGCHLD in this
case) in Guile:

  https://lists.gnu.org/archive/html/guile-devel/2013-07/msg00004.html

shepherd.scm has a hack to mostly work around it, but it’s not always
sufficient, as evidenced by occasional failures of the respawn test:

  http://bugs.gnu.org/23811

Does it frequently fail to respawn?

Ludo’.


* Re: Shepherd does not recycle zombie processes
From: Ludovic Courtès @ 2016-11-07  8:53 UTC (permalink / raw)
  To: Carlo Zancanaro; +Cc: Dale Mellor, guix-devel

Hi,

Carlo Zancanaro <carlo@zancanaro.id.au> skribis:

> I've had problems with Shepherd and its daemonize action. If I run
> daemonize (as the first thing when Shepherd starts) then it fails to handle
> signals from child processes.

Could it be that you invoke the ‘daemonize’ action after respawnable
processes have been started?  The manual has this caveat (info
"(shepherd) The root and unknown services"):

  ‘daemonize’
       Fork and go into the background.  This should be called before
       respawnable services are started, as otherwise we would not get the
       ‘SIGCHLD’ signals when they terminate.
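
The caveat follows from ordinary POSIX parent/child rules: a process is
notified about, and can wait for, only its own children.  A minimal shell
sketch of that rule, in which a subshell merely stands in for the forked
daemon (an illustration only, not Shepherd's code; the variable names are
made up):

```shell
#!/bin/sh
# Start a "service" as a child of this shell.
sleep 1 &
service_pid=$!

# A subshell is a forked copy of this shell, much like a process that
# forks to go into the background.  The service is its sibling, not its
# child, so the copy cannot wait for it.
forked_status=0
( wait "$service_pid" ) 2>/dev/null || forked_status=$?

# The original parent can still collect the exit status.
parent_status=0
wait "$service_pid" || parent_status=$?

echo "forked copy:     wait returned $forked_status"
echo "original parent: wait returned $parent_status"
```

POSIX specifies that `wait' treats a PID it does not know as having exited
with status 127, so the forked copy should report 127 while the original
parent reports the real status.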

HTH,
Ludo’.


* Re: Shepherd does not recycle zombie processes
From: Carlo Zancanaro @ 2016-11-07 11:35 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel




Hey Ludo!

On Mon, Nov 07 2016, Ludovic Courtès wrote
> Could it be that you invoke the ‘daemonize’ action after 
> respawnable processes have been started?  The manual has this 
> caveat (info "(shepherd) The root and unknown services"):
>
>   ‘daemonize’ 
>        Fork and go into the background.  This should be called 
>        before respawnable services are started, as otherwise we 
>        would not get the ‘SIGCHLD’ signals when they terminate.

Yeah, I saw that note in the documentation. I used to have

    (action 'shepherd 'daemonize)

as the first line in ~/.config/shepherd/init.scm. Is there some 
other way that I was supposed to do that?

With that line in place, Shepherd will leave behind a process 
every time I stop/start a service.

I have attached an example init.scm that does this for me. If I 
start: 

    shepherd -c init.scm

and then run:

    herd stop sleep
    herd start sleep
    herd stop sleep
    herd start sleep
    herd stop sleep

then I will have three zombie sleep processes underneath my 
Shepherd process. (If the service were respawnable then it also 
would fail to restart the service.)
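
One way to check for such leftovers is to list the zombie children of a
given parent process.  A minimal sketch, assuming Linux's /proc layout;
`zombie_children' is a hypothetical helper, not part of Shepherd or herd:

```shell
#!/bin/sh
# Print the PIDs of all zombie processes whose parent is PID $1.
# Reads the "State:" and "PPid:" fields of /proc/<pid>/status (Linux only).
zombie_children() {
    parent=$1
    for dir in /proc/[0-9]*; do
        [ -r "$dir/status" ] || continue
        state= ppid=
        while read -r key value rest; do
            case $key in
                State:) state=$value ;;   # "Z" marks a zombie
                PPid:)  ppid=$value ;;
            esac
        done 2>/dev/null < "$dir/status"
        if [ "$ppid" = "$parent" ] && [ "$state" = "Z" ]; then
            echo "${dir#/proc/}"
        fi
    done
}

# Example (the PID is a placeholder for the daemonized shepherd process):
#   zombie_children 10465
```

Each PID printed belongs to a child that has exited but has not yet been
waited for by its parent.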

I assume this behaviour is wrong, but if I'm doing something wrong 
then please let me know what it is.

Carlo


[-- Attachment #1.2: init.scm --]

(action 'shepherd 'daemonize)

(define sleep
  (make <service>
    #:provides '(sleep)
    #:start (make-forkexec-constructor '("sleep" "100"))
    #:stop (make-kill-destructor)))

(register-services sleep)
(start sleep)


* Re: Shepherd does not recycle zombie processes
From: Ludovic Courtès @ 2016-11-09 14:49 UTC (permalink / raw)
  To: Carlo Zancanaro; +Cc: guix-devel

Hi,

Carlo Zancanaro <carlo@zancanaro.id.au> skribis:

> Yeah, I saw that note in the documentation. I used to have
>
>    (action 'shepherd 'daemonize)
>
> as the first line in ~/.config/shepherd/init.scm. Is there some other
> way that I was supposed to do that?

No, I think that should work.

> With that line in place, Shepherd will leave behind a process every
> time I stop/start a service.
>
> I have attached an example init.scm that does this for me. If I start: 
>
>    shepherd -c init.scm
>
> and then run:
>
>    herd stop sleep
>    herd start sleep
>    herd stop sleep
>    herd start sleep
>    herd stop sleep
>
> then I will have three zombie sleep processes underneath my Shepherd
> process. (If the service were respawnable then it also would fail to
> restart the service.)

Could you run shepherd in “strace -f” and see where the SIGCHLD signals
go?

Thanks,
Ludo’.


* Re: Shepherd does not recycle zombie processes
From: Carlo Zancanaro @ 2016-11-10 13:15 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel



On Wed, Nov 09 2016, Ludovic Courtès wrote
> Could you run shepherd in “strace -f” and see where the SIGCHLD 
> signals go?

I don't really know how to read strace's output (and there's a lot 
of it), but sometimes it gives a line like this:

     --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, 
     si_pid=10496, si_status=SIGTERM, si_utime=142, si_stime=22} 
     ---

and sometimes it gives a line like this:

    [pid 10465] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, 
    si_pid=10556, si_status=SIGTERM, si_utime=84, si_stime=16} ---

If it has the [pid xxxxx] prefix then the signal is handled 
properly and the child process is reaped. If there's no prefix 
then a zombie process is left behind. (The pid in the prefix is 
the pid of the forked daemon process.)
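
To tally the two cases in a long trace rather than spotting them by eye,
the SIGCHLD lines can be counted by whether they carry the prefix.  A
minimal sketch, assuming the trace has one event per line as strace writes
it; `count_sigchld' is a hypothetical helper, and the two input lines are
the ones quoted above, joined back onto single lines:

```shell
#!/bin/sh
# Read strace output on stdin and count SIGCHLD deliveries with and
# without a leading "[pid N]" prefix.
count_sigchld() {
    prefixed=0
    unprefixed=0
    while IFS= read -r line; do
        case $line in
            "[pid "*"--- SIGCHLD "*) prefixed=$((prefixed + 1)) ;;
            "--- SIGCHLD "*)         unprefixed=$((unprefixed + 1)) ;;
        esac
    done
    echo "prefixed=$prefixed unprefixed=$unprefixed"
}

count_sigchld <<'EOF'
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=10496, si_status=SIGTERM, si_utime=142, si_stime=22} ---
[pid 10465] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=10556, si_status=SIGTERM, si_utime=84, si_stime=16} ---
EOF
```

If the observation above holds, the unprefixed count should match the
number of zombies left behind.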

Using my example init.scm file I have found that if I run `herd 
restart sleep` repeatedly and quickly (< 1 second) then it will 
consistently reap the processes. If I delay for more than about a 
second between runs then it will fail to reap the old process. 
(With strace attached this time is extended to a few seconds.)

Carlo


* Re: Shepherd does not recycle zombie processes
From: Dale Mellor @ 2016-11-24  8:27 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel, Carlo Zancanaro

On Mon, 2016-11-07 at 09:53 +0100, Ludovic Courtès wrote:
> Hi,
> 
> Carlo Zancanaro <carlo@zancanaro.id.au> skribis:
> 
> > I've had problems with Shepherd and its daemonize action. If I run
> > daemonize (as the first thing when Shepherd starts) then it fails to handle
> > signals from child processes.
> 
> Could it be that you invoke the ‘daemonize’ action after respawnable
> processes have been started?  The manual has this caveat (info
> "(shepherd) The root and unknown services"):
> 
>   ‘daemonize’
>        Fork and go into the background.  This should be called before
>        respawnable services are started, as otherwise we would not get the
>        ‘SIGCHLD’ signals when they terminate.
> 
> HTH,
> Ludo’.

  Update: I'm no longer making any use of the daemonize method, instead
simply running as a detached process: `( shepherd & )' at a bash
command-line.  It works perfectly well now, reaping (and restarting)
dead children as necessary.
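
  What the `( command & )' idiom does can be seen with any long-running
command.  A minimal sketch, assuming Linux's /proc; `sleep' stands in for
shepherd, and the pid file exists only so the example can find the
process again:

```shell
#!/bin/sh
pidfile=$(mktemp)

# The subshell starts the command in the background and exits at once,
# so the command loses its original parent and is adopted by init (or
# by a subreaper, depending on the system).
( sleep 30 >/dev/null 2>&1 & echo "$!" > "$pidfile" )

pid=$(cat "$pidfile")
ppid=$(sed -n 's/^PPid:[[:space:]]*//p' "/proc/$pid/status")
echo "detached pid $pid now has parent $ppid (this shell is $$)"

# Clean up the stand-in process.
kill "$pid"
rm -f "$pidfile"
```

  The detached command stays one and the same process throughout: it never
forks itself into the background, so everything it starts later remains
its own child.  That may be why the problem seen with daemonize does not
show up here, though the thread does not confirm the cause.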

  Another problem I see though is that if there are jobs which depend on
one which has died, those jobs are not recycled.  Probably this has not
been considered yet, and I'm thinking that a bit more thought needs to
go into the overall design of this thing... (I guess this also ties in
with the concepts of run-levels and hot-re-configuring the shepherd
core).  Wish I had time to look into it.

Tuppence,
/Dale


* Re: Shepherd does not recycle zombie processes
From: Ludovic Courtès @ 2016-11-24 13:22 UTC (permalink / raw)
  To: Dale Mellor; +Cc: guix-devel, Carlo Zancanaro

Dale Mellor <no-reply@rdmp.org> skribis:

> On Mon, 2016-11-07 at 09:53 +0100, Ludovic Courtès wrote:
>> Hi,
>> 
>> Carlo Zancanaro <carlo@zancanaro.id.au> skribis:
>> 
>> > I've had problems with Shepherd and its daemonize action. If I run
>> > daemonize (as the first thing when Shepherd starts) then it fails to handle
>> > signals from child processes.
>> 
>> Could it be that you invoke the ‘daemonize’ action after respawnable
>> processes have been started?  The manual has this caveat (info
>> "(shepherd) The root and unknown services"):
>> 
>>   ‘daemonize’
>>        Fork and go into the background.  This should be called before
>>        respawnable services are started, as otherwise we would not get the
>>        ‘SIGCHLD’ signals when they terminate.
>> 
>> HTH,
>> Ludo’.
>
>   Update: I'm no longer making any use of the daemonize method, instead
> simply running as a detached process: `( shepherd & )' at a bash
> command-line.  It works perfectly well now, reaping (and restarting)
> dead children as necessary.

OK.  (That’s also what I do for my user Shepherd.)

>   Another problem I see though is that if there are jobs which depend on
> one which has died, those jobs are not recycled.  Probably this has not
> been considered yet, and I'm thinking that a bit more thought needs to
> go into the overall design of this thing... (I guess this also ties in
> with the concepts of run-levels and hot-re-configuring the shepherd
> core).  Wish I had time to look into it.

Certainly, help welcome!  :-)

Thanks,
Ludo’.
