* bug#41429: Shepherd Sometimes Crashes
From: Katherine Cox-Buday @ 2020-05-21 2:59 UTC
To: 41429

I am running shepherd as a userspace service manager on an alien distro.
Occasionally (often enough to cause concern), Shepherd is crashing.
I am unable to narrow down a cause, but anecdotally, it seems to happen
more often when a service it's managing fails repeatedly and is
disabled.

I'm running `strace` against the Shepherd process in an attempt to
submit a better bug report, but this is all I have for now. Maybe others
have also seen this behavior.

--
Katherine

* bug#41429: Shepherd Sometimes Crashes
From: Efraim Flashner @ 2020-05-21 12:14 UTC
To: Katherine Cox-Buday; +Cc: 41429

On Wed, May 20, 2020 at 09:59:03PM -0500, Katherine Cox-Buday wrote:
> I am running shepherd as a userspace service manager on an alien distro.
> Occasionally (often enough to cause concern), Shepherd is crashing.
> I am unable to narrow down a cause, but anecdotally, it seems to happen
> more often when a service it's managing fails repeatedly and is
> disabled.
>
> I'm running `strace` against the Shepherd process in an attempt to
> submit a better bug report, but this is all I have for now. Maybe others
> have also seen this behavior.

I found it happens less often with shepherd-0.8. What version are you
running? Also possibly related, do you have mismatched versions of guile
between guix packages and your distro's native packages?

I've also sometimes found shepherd to crash when I add a service where
the start command is "wrong", as though the error were so bad that
shepherd says "Nope! That's it! I quit!"

I'd suggest looking at .config/shepherd/shepherd.log but it's rather
sparse. Still, it might have something useful.

--
Efraim Flashner <efraim@flashner.co.il> אפרים פלשנר
GPG key = A28B F40C 3E55 1372 662D 14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted

* bug#41429: Shepherd Sometimes Crashes
From: Katherine Cox-Buday @ 2020-05-21 12:51 UTC
To: Efraim Flashner; +Cc: 41429

Efraim Flashner <efraim@flashner.co.il> writes:

> I found it happens less often with shepherd-0.8. What version are you
> running? Also possibly related, do you have mismatched versions of guile
> between guix packages and your distro's native packages?

Sorry, I forgot to include the version! I am running 0.8 from a store
which I update roughly once a week.

> I've also sometimes found shepherd to crash when I add a service where
> the start command is "wrong", as though the error were so bad that
> shepherd says "Nope! That's it! I quit!"

I'm doing very standard things with `make-forkexec-constructor`, so I
wouldn't expect any problems there.

Your comment is kind of scary though! Shepherd is the thing I want to
stay up no matter what since it's responsible for monitoring and
restarting things. The idea that a misbehaving or poorly written service
could bring down the entire Shepherd process is a problem! Is there no
isolation?

> I'd suggest looking at .config/shepherd/shepherd.log but it's rather
> sparse. Still, it might have something useful.

Yes, this is the first place I looked, but unfortunately there wasn't
much usable information.

--
Katherine

* bug#41429: Shepherd Sometimes Crashes
From: Efraim Flashner @ 2020-05-21 14:04 UTC
To: Katherine Cox-Buday; +Cc: 41429

On Thu, May 21, 2020 at 07:51:54AM -0500, Katherine Cox-Buday wrote:
> I'm doing very standard things with `make-forkexec-constructor`, so I
> wouldn't expect any problems there.
>
> Your comment is kind of scary though! Shepherd is the thing I want to
> stay up no matter what since it's responsible for monitoring and
> restarting things. The idea that a misbehaving or poorly written service
> could bring down the entire Shepherd process is a problem! Is there no
> isolation?

I have a whole collection of attempts to integrate mcron with shepherd,
to create loops and add jobs only when the service is active. Attempting
to fork off and then collect the child process and then fail just enough
to make the service restart. Lots of cringe-worthy code. The more common
fail scenarios I see are shepherd fails to start because it doesn't like
my start code of one of the services or actually starting the service
somehow kills it. All of those were with straight lambdas to the start
command though.

Do you have your services writing out any logs? Maybe there's a clue
there.

--
Efraim Flashner <efraim@flashner.co.il> אפרים פלשנר

* bug#41429: Shepherd Sometimes Crashes
From: Katherine Cox-Buday @ 2020-05-21 15:59 UTC
To: Efraim Flashner; +Cc: 41429

Efraim Flashner <efraim@flashner.co.il> writes:

>> Your comment is kind of scary though! Shepherd is the thing I want to
>> stay up no matter what since it's responsible for monitoring and
>> restarting things. The idea that a misbehaving or poorly written service
>> could bring down the entire Shepherd process is a problem! Is there no
>> isolation?
>
> I have a whole collection of attempts to integrate mcron with shepherd,
> to create loops and add jobs only when the service is active. Attempting
> to fork off and then collect the child process and then fail just enough
> to make the service restart. Lots of cringe-worthy code. The more common
> fail scenarios I see are shepherd fails to start because it doesn't like
> my start code of one of the services or actually starting the service
> somehow kills it. All of those were with straight lambdas to the start
> command though.

I'm not familiar with Shepherd's internals, so I don't know why
interacting with a cron is relevant.

> Do you have your services writing out any logs? Maybe there's a clue
> there.

Not yet, but I should be enabling this soon, and if they display
anything I'll report back. Still, this seems beside the point: the bug
is that Shepherd needs to stay up regardless of what the services it's
monitoring do.

--
Katherine

* bug#41429: Shepherd Sometimes Crashes
From: Mathieu Othacehe @ 2020-05-22 17:39 UTC
To: Katherine Cox-Buday; +Cc: 41429

Hello Katherine,

> I'm running `strace` against the Shepherd process in an attempt to
> submit a better bug report, but this is all I have for now. Maybe others
> have also seen this behavior.

Yes, I have observed this behavior. This should be fixed with the
upcoming 0.8.1 release of Shepherd (hopefully!). See:
https://lists.gnu.org/archive/html/bug-guix/2020-05/msg00241.html.

Thanks for reporting,

Mathieu