From: ludo@gnu.org (Ludovic Courtès)
Subject: Re: Improving Shepherd
Date: Mon, 05 Feb 2018 14:08:25 +0100
Message-ID: <871shzeg8m.fsf@gnu.org>
References: <871si8bc5g.fsf@zancanaro.id.au> <877errn23f.fsf@zancanaro.id.au>
In-Reply-To: <877errn23f.fsf@zancanaro.id.au> (Carlo Zancanaro's message of "Mon, 05 Feb 2018 21:49:08 +1100")
To: Carlo Zancanaro
Cc: guix-devel@gnu.org

Hello!

Carlo Zancanaro writes:

> A few people came to join me on Friday to think about Shepherd. Thanks
> Alex, Efraim, and Jelle.

Thanks for summarizing!  I was hoping to chime in as well but that did
not happen.

> User services - Alex has already sent a patch to the list to allow
> generating user services from the Guix side. The idea is to generate a
> Shepherd config file, allowing a user to invoke shepherd manually to
> start their services. A further extension to this would be to have
> something like systemd's "user sessions", where the pid 1 Shepherd
> automatically starts a user's services when they log in.

After replying to Alex’ message, I realized that we could just as well
have a separate “guix service” or similar tool to take care of this?
This needs more thought (and perhaps taking a look at systemd user
sessions, which I’m not familiar with), but I think Alex’ approach is a
good starting point.

> Child process control - this is my personal frustration, where
> Shepherd loses track of processes that fork away (e.g. "emacs
> --daemon"). I barely know anything about Linux process management, but
> from my reading this can be solved through Linux namespaces (if user
> namespaces are available). Could someone who knows more about this let
> me know if that's a productive direction for me to investigate? Or
> tell me a better way to go about it?

Currently shepherd monitors SIGCHLD, and it’s not supposed to miss
those; in some cases it might handle them later than you’d expect, which
means that in the meantime you see a zombie process, but otherwise it
seems to work.

ISTR you reported an issue when using ‘shepherd --daemonize’, right?
Perhaps the issue is limited to that mode?

> Concurrency/parallelism - I think Jelle was planning to work on this,
> but I might be wrong about that. Maybe I volunteered? We're keen to
> see Shepherd starting services in parallel, where possible. This will
> require some changes to the way we start/stop services (because at the
> moment we just send a "start" signal to a single service to start it,
> which makes it hard to be parallel), and will require us to actually
> build some sort of real dependency resolution. Longer-term our goal
> should be to bring fibers into Shepherd, but Efraim mentioned that
> fibers doesn't compile on ARM at the moment, so we'll have to get that
> working first at least.

I’d really like to see that happen.
I’ve become more familiar with Fibers, and I think it’ll be perfect for
the Shepherd (and we’ll fix the ARM build issue, no doubt.)

One thing I’d like to do is to handle SIGCHLD via signalfd(2) instead of
an actual signal handler like we do now.  That would make it easy to
have signal handling part of the main event loop and thus, it would
integrate well with Fibers.

It seems that signalfd(2) is Linux-only though, which is a bummer.  The
solution might be to get over it and have it implemented on GNU/Hurd…
(I saw this discussion: ; I suspect it’s within reach.)

> I mentioned an idea to the guys on Friday about how Shepherd should
> treat enabled/disabled services. I've thought about it some more, and
> I think it might work. The general idea is that Shepherd would always
> try to run an enabled service, and it would leave a disabled service
> as-is (unless it's needed to start another service). So it would kind
> of work like this:
> - if stopped and enabled: try to start service
> - if started and enabled: monitor, and restart service if it fails
> - if retrying too often: disable this service, and all which depend on
> it
> - else: only start if another enabled service depends on this one
>
> This would mean that Shepherd could decide the best way to start/stop
> services, including doing so in parallel if possible.

Sounds good.

That’s annoyed most of us already, so if you get that fixed, you’ll make
a lot of people happy.  :-)

Ludo’.
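
To make the enabled/disabled rules quoted above a bit more concrete,
here is a rough sketch of how the decision could look.  It is written in
Python purely for illustration (the Shepherd itself is Guile Scheme), and
everything in it is an assumption: the names `Service`, `dependents`,
`decide` and `disable` are made up, and so is the `MAX_RESTARTS`
threshold standing in for "retrying too often".

```python
from dataclasses import dataclass, field

# Hypothetical threshold for "retrying too often"; not a Shepherd setting.
MAX_RESTARTS = 5


@dataclass
class Service:
    name: str
    enabled: bool = True
    running: bool = False
    restarts: int = 0  # recent restart attempts
    requires: list = field(default_factory=list)  # names of dependencies


def dependents(name, services):
    """Names of all services that require `name`, directly or transitively."""
    found = set()
    stack = [name]
    while stack:
        current = stack.pop()
        for svc in services.values():
            if current in svc.requires and svc.name not in found:
                found.add(svc.name)
                stack.append(svc.name)
    return found


def decide(name, services):
    """Return 'start', 'monitor', 'disable' or 'leave' for one service."""
    svc = services[name]
    if svc.enabled:
        if svc.restarts > MAX_RESTARTS:
            return "disable"  # retrying too often
        return "monitor" if svc.running else "start"
    # Disabled: only start it if some enabled service depends on it.
    needed = any(services[d].enabled for d in dependents(name, services))
    if needed and not svc.running:
        return "start"
    return "leave"


def disable(name, services):
    """Disable a service and everything that depends on it."""
    for n in {name} | dependents(name, services):
        services[n].enabled = False


if __name__ == "__main__":
    # Made-up service names, for illustration only.
    services = {
        "dbus": Service("dbus", enabled=False),
        "emacs": Service("emacs", requires=["dbus"]),
    }
    for name in services:
        print(name, decide(name, services))
```

Keeping the decision a pure function of the current state is one possible
design: a supervisor loop could then evaluate it for every service and
start independent ones in parallel.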